The open-source TTS directory.
Every model, every voice, reading the same three scripts. Compare open-source TTS side-by-side without downloading anything.
22 models · 9 with samples
Chatterbox
MIT · ElevenLabs-tier quality, MIT licensed, voice cloning from a clip
CosyVoice 2
Apache-2.0 · Alibaba's real-time TTS, 150ms streaming latency
Dia
Apache-2.0 · Ultra-realistic two-speaker dialogue, sounds like a podcast
F5-TTS
MIT · Flow-matching TTS, fast zero-shot voice cloning
Fish Speech
Apache-2.0 · Open-source TTS that benchmarks above closed-source
Higgs Audio v2
Apache-2.0 · Built on Llama 3.2 3B, 10M+ hours of training audio
IndexTTS 2
Apache-2.0 · Bilibili's TTS with multi-speaker and precise emotion control
KittenTTS
Apache-2.0 · 15M params, under 25MB, runs anywhere, even your phone
MegaTTS 3
Apache-2.0 · ByteDance's TTS with sparse alignment, robust prosody
MeloTTS
MIT · Multilingual, real-time on CPU
OpenVoice v2
MIT · Tone-color cloning + cross-lingual transfer
Orpheus TTS
Apache-2.0 · Llama-3 based, empathetic, low-latency for interactive apps
Parler-TTS
Apache-2.0 · Describe the voice in natural language ('soft female, fast, clear')
Piper
MIT · Tiny, fast, runs offline on a Raspberry Pi
Qwen3-TTS
Apache-2.0 · Alibaba's flagship, 97ms latency, 10 languages
Spark-TTS
Apache-2.0 · Built on Qwen2.5, zero-shot voice cloning + style control
StyleTTS 2
MIT · Diffusion-based, human-level naturalness on LibriTTS
VibeVoice
MIT · Microsoft's long-form TTS: 90 minutes, 4 speakers
Whisper, but inverted — TTS by 'unwrapping' OpenAI's ASR
XTTS v2
CPML (non-commercial) · Most-downloaded TTS on HF, 6-second voice cloning
Missing a model? Add it.
OpenSpeech is community-maintained. Adding a model is a single PR: edit one JSON file, drop in three audio samples, open the pull request. Models must be genuinely open-source.
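Before opening the PR, it can help to sanity-check the new entry locally. The sketch below is only an illustration: the field names (`name`, `license`, `description`, `samples`) and the sample file paths are assumptions, not the directory's actual schema, so check the real JSON file in the repo for the exact keys. It confirms that an entry has the fields you would expect and exactly three audio samples, one per script.

```python
import json

# Hypothetical schema: these field names are assumptions for illustration,
# not the directory's real format.
REQUIRED_FIELDS = {"name", "license", "description", "samples"}
SAMPLES_PER_MODEL = 3  # one sample for each of the three shared scripts


def validate_entry(entry: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    missing = REQUIRED_FIELDS - entry.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")

    license_name = entry.get("license")
    if not isinstance(license_name, str) or not license_name.strip():
        problems.append("license must be a non-empty string")

    samples = entry.get("samples", [])
    if len(samples) != SAMPLES_PER_MODEL:
        problems.append(
            f"expected {SAMPLES_PER_MODEL} samples, found {len(samples)}"
        )

    return problems


# A made-up entry; the model name and paths are placeholders.
example = json.loads("""
{
  "name": "ExampleTTS",
  "license": "MIT",
  "description": "Placeholder model used to illustrate the entry shape",
  "samples": [
    "samples/exampletts/script1.mp3",
    "samples/exampletts/script2.mp3",
    "samples/exampletts/script3.mp3"
  ]
}
""")

print(validate_entry(example))
```

An empty list means the entry passes these checks. They only cover the shape of the entry; whether a model counts as genuinely open-source is still a judgment made in review.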