Pocket TTS — Voice Studio

foundry · pick a voice or blend Kokoro voices, generate, play. Clips save to tts/studio/.
Clone clip: overrides the voice above · longer & cleaner (~10–30s) = better clone
Temperature: ↑ more expressive · too high = unstable
Decode steps: 1 = fast · 2+ = smoother, slower
EOS threshold: ↓ lets it finish · ↑ stops sooner
Noise clamp: 0 = off · caps sampled noise; lower = steadier/flatter
ⓘ How these settings work — Pocket TTS

Voice — the reference whose prosody is cloned, grouped by persona (athena / majel / custom) plus the built-in Kyutai voices. This is the biggest lever: a deadpan reference stays deadpan no matter the sliders.

Or clone an uploaded clip — upload any audio to clone it on the fly; it overrides the dropdown. Longer & cleaner (~10–30s) clones better. The ✕ clears the upload.

Temperature (0.7) — randomness of delivery. Higher = more pitch/emotion; too high slurs. Lower = flatter but rock-stable.

Decode steps (1) — decoder refinement passes; 2+ smooths artifacts, but CPU time grows roughly linearly with the step count (subtle gain, slower render).

EOS threshold (−4) — how eagerly it stops; −2 can clip the ending, −6 may add trailing junk.

Noise clamp (off) — caps the magnitude of sampled noise. 0 = off (no clamp); any nonzero value steadies/flattens delivery (lower = tighter). Leave off unless a take is too jittery.
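
As a rough illustration of what a magnitude clamp does, the sketch below draws Gaussian noise and limits each sample to a fixed range. The function name `clamp_noise` and the use of plain Python lists are assumptions made for this example; the help text does not show how Pocket TTS applies the clamp internally.

```python
import random


def clamp_noise(samples, clamp=0.0):
    """Limit each sample to [-clamp, +clamp]; clamp == 0 means off."""
    if clamp <= 0:
        # 0 = off: return the noise untouched.
        return list(samples)
    return [max(-clamp, min(clamp, s)) for s in samples]


rng = random.Random(0)  # fixed seed so the demo is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]

loose = clamp_noise(noise, clamp=0.0)  # unchanged copy of the noise
tight = clamp_noise(noise, clamp=1.0)  # outliers pulled in to +/-1.0
```

A lower clamp cuts more of the tails, which is one way to read "lower = tighter": fewer extreme samples, so less variation from take to take.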

Quick starts: lively temp 0.9 · stable temp 0.6 · cleaner take → steps 2.
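
The defaults and quick starts above can be written down as a small settings object. This is only a sketch: the class and field names (`PocketSettings`, `decode_steps`, and so on) are hypothetical, and it assumes each quick start changes one value and leaves the others at their defaults, which the help text does not state explicitly.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PocketSettings:
    # Defaults as listed in the help text; field names are hypothetical.
    temperature: float = 0.7
    decode_steps: int = 1
    eos_threshold: float = -4.0
    noise_clamp: float = 0.0  # 0 = off


DEFAULTS = PocketSettings()

# Assumption: each quick start overrides a single field.
PRESETS = {
    "lively": replace(DEFAULTS, temperature=0.9),
    "stable": replace(DEFAULTS, temperature=0.6),
    "cleaner": replace(DEFAULTS, decode_steps=2),
}
```

Keeping the settings frozen means a "Reset defaults" action can simply swap back to `DEFAULTS` without worrying that a preset mutated it.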

ⓘ Text, generating & saved clips (all engines)

Text — what gets spoken; shared across all three tabs.

Generate renders the clip and plays it immediately — but nothing is saved yet (shown as · unsaved). Reset defaults restores the current tab's controls.

💾 save — writes the just-generated clip into the library at studio/<engine>/<voice>/; only saved clips appear under Saved clips.
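
The library layout studio/<engine>/<voice>/ could be resolved with a helper like the one below. The function name `clip_dir`, the example engine and voice names, and the validation rules are assumptions for illustration. The clip file name is not given in the help text, so the sketch stops at the directory.

```python
from pathlib import Path


def clip_dir(engine: str, voice: str, root: Path = Path("studio")) -> Path:
    """Return the directory studio/<engine>/<voice> for a saved clip."""
    for part in (engine, voice):
        # Reject empty names and anything that could escape the library root.
        if not part or "/" in part or "\\" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    return root / engine / voice


# Hypothetical engine and voice names, for illustration only.
target = clip_dir("pocket", "demo")
```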

Saved clips — grouped by engine · voice, each with a player, ➜ voice (promote the clip into a reusable reference voice under custom), and delete.

Voice library — manage the cloneable reference voices (used by the Pocket & XTTS tabs): ▶ preview, ✎ rename or move to another persona, ✕ delete, and Add a voice to upload a new WAV into voices/<persona>/. Changes refresh the voice pickers immediately.

Saved clips


Voice library

Manage cloneable reference voices (used by the Pocket & XTTS tabs). Preview ▶ · rename/move ✎ · delete ✕.
Add a voice (WAV)
Stored at voices/<persona>/<name>.wav · WAV only (no transcoder here).
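
Because uploads are WAV only, it can help to check a file before placing it at voices/<persona>/<name>.wav. The sketch below uses Python's standard `wave` module, which reads uncompressed PCM WAV files; the helper names `voice_path` and `is_wav` are hypothetical, and the actual check the studio performs is not described in the source.

```python
import wave
from pathlib import Path


def voice_path(persona: str, name: str, root: Path = Path("voices")) -> Path:
    """Return voices/<persona>/<name>.wav."""
    return root / persona / f"{name}.wav"


def is_wav(path: Path) -> bool:
    """True if the file opens as a WAV via the stdlib wave module."""
    try:
        with wave.open(str(path), "rb") as w:
            # A header that parses and reports a frame count is good enough here.
            return w.getnframes() >= 0
    except (wave.Error, EOFError, OSError):
        # Not RIFF/WAVE, truncated, or unreadable.
        return False
```

Checking the header rather than the file extension catches renamed MP3 or M4A files, which matter here since no transcoder is available to convert them.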