Overview
GroqSTTService
provides high-accuracy speech recognition using Groq’s hosted Whisper API with ultra-fast inference speeds. It uses Voice Activity Detection (VAD) to process speech segments efficiently to create speech segments to send to the API.
API Reference
Complete API documentation and method details
Groq Docs
Official Groq STT documentation and features
Example Code
Working example with Groq ecosystem integration
Installation
To use Groq services, install the required dependency:GROQ_API_KEY
.
Get your API key from the Groq Console.
Frames
Input
InputAudioRawFrame
- Raw PCM audio data (16-bit, mono)UserStartedSpeakingFrame
- VAD signal to start buffering audioUserStoppedSpeakingFrame
- VAD signal to process buffered audio
Output
TranscriptionFrame
- Final transcription results (no interim results)ErrorFrame
- API or processing errors
Models
Groq currently offers one optimized Whisper model:Model | Description | Performance |
---|---|---|
whisper-large-v3-turbo | Groq’s optimized Whisper large v3 | Ultra-fast inference with high accuracy |
Groq’s hardware acceleration makes this model significantly faster than
standard Whisper implementations while maintaining accuracy.
Language Support
Groq’s Whisper API supports 60+ languages with automatic language detection:View All Supported Languages
View All Supported Languages
Language Code | Description | Whisper Code |
---|---|---|
Language.AF | Afrikaans | af |
Language.AR | Arabic | ar |
Language.HY | Armenian | hy |
Language.AZ | Azerbaijani | az |
Language.BE | Belarusian | be |
Language.BS | Bosnian | bs |
Language.BG | Bulgarian | bg |
Language.CA | Catalan | ca |
Language.ZH | Chinese | zh |
Language.HR | Croatian | hr |
Language.CS | Czech | cs |
Language.DA | Danish | da |
Language.NL | Dutch | nl |
Language.EN | English | en |
Language.ET | Estonian | et |
Language.FI | Finnish | fi |
Language.FR | French | fr |
Language.GL | Galician | gl |
Language.DE | German | de |
Language.EL | Greek | el |
Language.HE | Hebrew | he |
Language.HI | Hindi | hi |
Language.HU | Hungarian | hu |
Language.IS | Icelandic | is |
Language.ID | Indonesian | id |
Language.IT | Italian | it |
Language.JA | Japanese | ja |
Language.KN | Kannada | kn |
Language.KK | Kazakh | kk |
Language.KO | Korean | ko |
Language.LV | Latvian | lv |
Language.LT | Lithuanian | lt |
Language.MK | Macedonian | mk |
Language.MS | Malay | ms |
Language.MR | Marathi | mr |
Language.MI | Maori | mi |
Language.NE | Nepali | ne |
Language.NO | Norwegian | no |
Language.FA | Persian | fa |
Language.PL | Polish | pl |
Language.PT | Portuguese | pt |
Language.RO | Romanian | ro |
Language.RU | Russian | ru |
Language.SR | Serbian | sr |
Language.SK | Slovak | sk |
Language.SL | Slovenian | sl |
Language.ES | Spanish | es |
Language.SW | Swahili | sw |
Language.SV | Swedish | sv |
Language.TL | Tagalog | tl |
Language.TA | Tamil | ta |
Language.TH | Thai | th |
Language.TR | Turkish | tr |
Language.UK | Ukrainian | uk |
Language.UR | Urdu | ur |
Language.VI | Vietnamese | vi |
Language.CY | Welsh | cy |
Language.EN
- English -en
Language.ES
- Spanish -es
Language.FR
- French -fr
Language.DE
- German -de
Language.IT
- Italian -it
Language.JA
- Japanese -ja
Regional variants (like
EN_US
, FR_CA
) are automatically mapped to their
base language codes.Usage Example
Basic Configuration
Initialize theGroqSTTService
and use it in a pipeline:
Advanced Configuration
Dynamic Configuration
Make settings updates by pushing anSTTUpdateSettingsFrame
for the GroqSTTService
:
Metrics
The service provides comprehensive metrics:- Time to First Byte (TTFB) - API response latency
- Processing Duration - Total transcription time
Learn how to enable Metrics in your Pipeline.
Additional Notes
- Segmented Processing: Processes complete utterances, not continuous streams
- No Interim Results: Only final transcriptions are provided (typical for batch APIs)
- Audio Buffer: Maintains 1-second buffer to capture speech before VAD detection
- Language Variants: Regional language codes automatically map to base languages
- Context Prompts: Use prompts to improve accuracy for specific domains or conversation styles
- Rate Limits: Check your Groq plan for concurrent request and usage limits
- Hardware Optimization: Leverages Groq’s custom inference chips for maximum performance