Coqui TTS: an open-source deep learning TTS toolkit for training custom voices, with 20+ languages and an extensive model zoo. It suits technical teams and organizations that prefer open, auditable, self-hosted infrastructure; compare it with other open-source options on capabilities, pricing model, and integration depth.
Open Source
⭐ 45,431 stars
open-source self-hosted multilingual training
Bark: Suno's generative audio model, offering TTS with emotion, laughter, and sound effects, plus music generation. It suits technical teams and organizations that prefer open, auditable, self-hosted infrastructure; compare it with other open-source options on capabilities, pricing model, and integration depth.
Open Source
⭐ 39,141 stars
open-source expressive music sound-effects
Tortoise TTS: high-quality open-source TTS with voice cloning, known for naturalistic output but slower than real time. It suits technical teams and organizations that prefer open, auditable, self-hosted infrastructure; compare it with other open-source options on capabilities, pricing model, and integration depth.
Open Source
⭐ 14,851 stars
open-source high-quality voice-cloning slow
Amazon Polly: AWS neural TTS, with standard and neural voices in 30+ languages and real-time streaming via NTTS. Teams evaluating neural TTS services should compare it with competing platforms on capabilities, pricing model, and integration depth.
Neural TTS
aws neural enterprise streaming
Cartesia: real-time TTS with sub-100 ms latency, purpose-built for voice agent applications that require an instant response. Teams evaluating real-time TTS services should compare it with competing platforms on capabilities, pricing model, and integration depth.
Real-Time TTS
real-time low-latency api voice-agents
Deepgram Aura: Deepgram's low-latency streaming TTS, integrated with its speech platform for end-to-end voice AI. Teams evaluating real-time TTS services should compare it with competing platforms on capabilities, pricing model, and integration depth.
Real-Time TTS
real-time api deepgram voice-agents
ElevenLabs: text-to-speech and voice cloning with highly realistic voices and a low-latency streaming API. Teams evaluating commercial TTS APIs should compare it with competing platforms on capabilities, pricing model, and integration depth.
Commercial API
voice-cloning realistic api streaming
Google Cloud Text-to-Speech: Google's neural TTS with WaveNet and Studio voices, 40+ languages, SSML support, and enterprise-grade reliability. Teams evaluating neural TTS services should compare it with competing platforms on capabilities, pricing model, and integration depth.
Neural TTS
google wavenet enterprise multilingual
IBM Watson TTS: IBM Watson's text-to-speech service, offering expressive and transformative voices for enterprise applications. Teams evaluating neural TTS services should compare it with competing platforms on capabilities, pricing model, and integration depth.
Neural TTS
ibm enterprise watson expressive
LMNT: a fast TTS API designed for real-time voice applications, pairing minimal latency with high naturalness. Teams evaluating real-time TTS services should compare it with competing platforms on capabilities, pricing model, and integration depth.
Real-Time TTS
real-time low-latency developer-api voice-agents
Microsoft Azure Neural TTS: Azure's neural TTS with 400+ voices across 140 languages, plus Custom Neural Voice for creating branded voices. Teams evaluating neural TTS services should compare it with competing platforms on capabilities, pricing model, and integration depth.
Neural TTS
microsoft neural enterprise custom-neural
Murf: an AI voice generator with a studio UI for producing voiceovers for videos, presentations, and e-learning in 20+ languages. Teams evaluating commercial TTS APIs should compare it with competing platforms on capabilities, pricing model, and integration depth.
Commercial API
studio voiceover content-creation multilingual
OpenAI TTS: OpenAI's text-to-speech API, with 6 natural voices, streaming support, and simple integration for existing OpenAI users. Teams evaluating commercial TTS APIs should compare it with competing platforms on capabilities, pricing model, and integration depth.
Commercial API
openai api natural streaming
PlayHT: an AI TTS platform with voice cloning and a studio UI, offering natural-sounding voices for content and voice agents. Teams evaluating commercial TTS APIs should compare it with competing platforms on capabilities, pricing model, and integration depth.
Commercial API
voice-cloning api real-time studio
Replica Studios: an AI voice actor platform for games, film, and interactive media, with a diverse cast of licensed AI voices. Teams evaluating commercial TTS APIs should compare it with competing platforms on capabilities, pricing model, and integration depth.
Commercial API
game-audio film voice-acting commercial
Resemble AI: a custom AI voice creation platform for cloning an existing voice or building a new one, with a real-time streaming API. Teams evaluating voice cloning services should compare it with competing platforms on capabilities, pricing model, and integration depth.
Voice Cloning
voice-cloning real-time api custom-voices
Rime AI: high-fidelity TTS with natural American English voices, built for voice agent and telephony use cases. Teams evaluating real-time TTS services should compare it with competing platforms on capabilities, pricing model, and integration depth.
Real-Time TTS
american-voices realistic api telephony
Speechify: a text-to-speech app and API for accessibility and productivity, popular for listening to articles and documents. Teams evaluating commercial TTS APIs should compare it with competing platforms on capabilities, pricing model, and integration depth.
Commercial API
accessibility reading consumer productivity
Typecast: an AI text-to-speech and avatar platform for video content, multilingual and built around character-based voices. Teams evaluating commercial TTS APIs should compare it with competing platforms on capabilities, pricing model, and integration depth.
Commercial API
video studio multilingual avatar
WellSaid Labs: enterprise-grade AI voiceover with natural, studio-quality voices for content teams working at scale. Teams evaluating commercial TTS APIs should compare it with competing platforms on capabilities, pricing model, and integration depth.
Commercial API
enterprise voiceover content-creation brand-voice