Overview
Cartesia provides two TTS service implementations:
- CartesiaTTSService: WebSocket-based with streaming and word timestamps
- CartesiaHttpTTSService: HTTP-based for simpler synthesis
CartesiaTTSService is recommended for real-time applications.
API Reference
Complete API documentation and method details
Cartesia Docs
Official Cartesia documentation and features
Example Code
Working example with interruption handling
Installation
To use Cartesia services, install the required dependencies. You'll also need to set your Cartesia API key as an environment variable: CARTESIA_API_KEY.
Get your API key by signing up at Cartesia.
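As a minimal sketch of the API key setup described above, the snippet below reads CARTESIA_API_KEY from the environment and fails early with a clear message if it is missing. The helper name `load_cartesia_api_key` is hypothetical and is not part of any library.

```python
import os
from typing import Mapping, Optional


def load_cartesia_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Cartesia API key from the environment (hypothetical helper)."""
    # Default to the real process environment; a mapping can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("CARTESIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CARTESIA_API_KEY is not set; export it before starting the application."
        )
    return key
```

Failing at startup tends to be easier to diagnose than an authentication error that appears only once the first synthesis request is sent.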
Frames
Input
- TextFrame - Text content to synthesize into speech
- TTSSpeakFrame - Text that the TTS service should speak
- TTSUpdateSettingsFrame - Runtime configuration updates (e.g., voice)
- LLMFullResponseStartFrame / LLMFullResponseEndFrame - LLM response boundaries
Output
- TTSStartedFrame - Signals start of synthesis
- TTSAudioRawFrame - Generated audio data chunks
- TTSStoppedFrame - Signals completion of synthesis
- ErrorFrame - Connection or processing errors
Service Comparison
Feature | CartesiaTTSService (WebSocket) | CartesiaHttpTTSService (HTTP) |
---|---|---|
Streaming | ✅ Real-time chunks | ❌ Single audio block |
Word Timestamps | ✅ Precise timing | ❌ Not available |
Interruption | ✅ Advanced handling | ⚠️ Basic support |
Latency | 🚀 Low | 📈 Higher |
Best For | Interactive apps | Batch processing |
Language Support
Supports multiple languages through the Language enum:
Language Code | Description | Service Code |
---|---|---|
Language.DE | German | de |
Language.EN | English | en |
Language.ES | Spanish | es |
Language.FR | French | fr |
Language.HI | Hindi | hi |
Language.IT | Italian | it |
Language.JA | Japanese | ja |
Language.KO | Korean | ko |
Language.NL | Dutch | nl |
Language.PL | Polish | pl |
Language.PT | Portuguese | pt |
Language.RU | Russian | ru |
Language.SV | Swedish | sv |
Language.TR | Turkish | tr |
Language.ZH | Chinese (Mandarin) | zh |
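To make the mapping in the table concrete, here is a small standalone sketch that resolves a language tag to the service code. It does not use the real Language enum. The dictionary and the `to_cartesia_code` helper are hypothetical stand-ins, and the fallback from a regional tag such as "en-US" to its base language is an assumption rather than documented behavior.

```python
from typing import Optional

# Service codes copied from the table above, keyed by the Language member name.
CARTESIA_LANGUAGE_CODES = {
    "DE": "de", "EN": "en", "ES": "es", "FR": "fr", "HI": "hi",
    "IT": "it", "JA": "ja", "KO": "ko", "NL": "nl", "PL": "pl",
    "PT": "pt", "RU": "ru", "SV": "sv", "TR": "tr", "ZH": "zh",
}


def to_cartesia_code(language: str) -> Optional[str]:
    """Map a tag such as "EN", "en", or "en-US" to a service code, or None."""
    # Assumption: regional variants fall back to their base language.
    base = language.replace("_", "-").split("-")[0].upper()
    return CARTESIA_LANGUAGE_CODES.get(base)
```

Returning None for an unknown language lets the caller decide whether to fall back to a default or report an error.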
Usage Example
WebSocket Service (Recommended)
Initialize the WebSocket service with your API key and desired voice.
HTTP Service
Initialize the HTTP service and use it in a pipeline.
Dynamic Configuration
Make settings updates by pushing a TTSUpdateSettingsFrame for the CartesiaTTSService.
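The three steps above can be sketched roughly as follows. This assumes the services come from the Pipecat framework. The import paths, the constructor argument names (`api_key`, `voice_id`), and the `"voice"` settings key are assumptions to check against the API reference for your installed version. The helper function names are hypothetical.

```python
import os
from typing import Any, Dict, Mapping


def cartesia_service_kwargs(env: Mapping[str, str], voice_id: str) -> Dict[str, Any]:
    """Collect constructor arguments (assumed names: api_key, voice_id)."""
    return {"api_key": env["CARTESIA_API_KEY"], "voice_id": voice_id}


def voice_update_settings(new_voice_id: str) -> Dict[str, Any]:
    """Settings payload for a runtime voice change (assumed key: "voice")."""
    return {"voice": new_voice_id}


def build_websocket_tts(voice_id: str):
    # Imported lazily so the helpers above work without Pipecat installed.
    # Assumed import path; other releases may expose the class elsewhere.
    from pipecat.services.cartesia.tts import CartesiaTTSService

    return CartesiaTTSService(**cartesia_service_kwargs(os.environ, voice_id))


def build_http_tts(voice_id: str):
    # Same assumed module as the WebSocket service.
    from pipecat.services.cartesia.tts import CartesiaHttpTTSService

    return CartesiaHttpTTSService(**cartesia_service_kwargs(os.environ, voice_id))


def build_voice_update_frame(new_voice_id: str):
    # Assumed import path for the settings-update frame.
    from pipecat.frames.frames import TTSUpdateSettingsFrame

    return TTSUpdateSettingsFrame(settings=voice_update_settings(new_voice_id))
```

Either service object would be placed in the pipeline where text is turned into audio, and the frame from `build_voice_update_frame` would be queued into the running pipeline to switch voices mid-session.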
Metrics
Both services provide:
- Time to First Byte (TTFB) - Latency from text input to first audio
- Processing Duration - Total synthesis time
- Usage Metrics - Character count and synthesis statistics
Learn how to enable Metrics in your Pipeline.
Additional Notes
- WebSocket Recommended: Use CartesiaTTSService for low-latency streaming and accurate context updates with word timestamps
- Connection Management: WebSocket lifecycle is handled automatically with reconnection support
- Sample Rate: Set globally in PipelineParams rather than per-service for consistency