Available Transport Types
Pipecat supports multiple transport types to fit different use cases and deployment scenarios:
DailyTransport
WebRTC-based transport using Daily’s infrastructure for video calls and conferencing
FastAPIWebsocketTransport
WebSocket transport for telephony providers and custom WebSocket connections
LiveKitTransport
WebRTC transport using LiveKit’s real-time communication platform
SmallWebRTCTransport
Direct peer-to-peer WebRTC connections without cloud infrastructure
TavusTransport
Specialized transport for Tavus video generation and streaming
WebsocketTransport
General-purpose WebSocket transport for custom implementations
Pipeline Integration
Transports provide two key components for your pipeline: the input() and output() methods. These methods define how the transport interacts with the pipeline.
Transport Input and Output
- transport.input() typically goes first in the pipeline to receive user input
- transport.output() doesn’t always go last - you may want processors after it
- Post-output processing enables synchronized actions like:
- Recording with word-level accuracy
- Displaying subtitles synchronized to audio
- Capturing context information precisely timed to output
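The placement rules above can be sketched with a toy pipeline. This is a minimal illustration, not Pipecat code: the stage functions and the dict-based "frame" are hypothetical stand-ins, and the uppercase step is only a placeholder for real bot logic. It shows a processor placed after the output stage observing only what was actually sent.

```python
# Toy stand-ins, NOT Pipecat classes: each stage is a plain function that
# takes a frame (here, a dict) and returns it, possibly with a side effect.

played = []     # replies the hypothetical transport output has sent
subtitles = []  # replies the post-output processor has displayed


def transport_input(frame):
    # First stage: frames enter the pipeline from the user.
    frame["source"] = "user"
    return frame


def bot_logic(frame):
    # Placeholder for the speech-to-text -> LLM -> text-to-speech stages.
    frame["reply"] = frame["text"].upper()
    return frame


def transport_output(frame):
    # The reply is sent to the user at this point.
    played.append(frame["reply"])
    return frame


def subtitle_display(frame):
    # Runs after output, so it only sees replies that were already sent.
    subtitles.append(frame["reply"])
    return frame


def run_pipeline(stages, frame):
    # Pass the frame through each stage in order.
    for stage in stages:
        frame = stage(frame)
    return frame


pipeline = [transport_input, bot_logic, transport_output, subtitle_display]
result = run_pipeline(pipeline, {"text": "hello"})
```

Because subtitle_display sits after transport_output, anything it records corresponds to output that has already gone out, which is the property that makes synchronized subtitles or recording possible.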
Transport Modularity
Transports are modular components in your Pipeline, allowing you to flexibly change how users connect to your bot depending on the context. This modularity enables you to:
- Switch environments easily: Use P2P WebRTC for development, Daily for production
- Support multiple connection types: Same bot logic works across different transports
- Optimize for use case: Choose the best transport for your specific requirements
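As a rough sketch of this modularity, the example below keeps the bot logic fixed and swaps only the transport. Everything here is an assumption made for illustration: FakeTransport, the TRANSPORTS mapping, and the environment names are hypothetical and are not part of Pipecat.

```python
from dataclasses import dataclass


@dataclass
class FakeTransport:
    """Hypothetical stand-in for a transport; not a Pipecat class."""
    name: str

    def input(self):
        return f"{self.name}.input"

    def output(self):
        return f"{self.name}.output"


def build_pipeline(transport):
    # The middle stages (the bot logic) do not depend on the transport.
    return [transport.input(), "stt", "llm", "tts", transport.output()]


# Assumed mapping from environment name to a transport factory.
TRANSPORTS = {
    "development": lambda: FakeTransport("p2p_webrtc"),
    "production": lambda: FakeTransport("daily"),
}


def select_transport(environment):
    # Fail loudly on an unknown environment instead of guessing.
    try:
        return TRANSPORTS[environment]()
    except KeyError:
        raise ValueError(f"unknown environment: {environment!r}") from None


dev_pipeline = build_pipeline(select_transport("development"))
prod_pipeline = build_pipeline(select_transport("production"))
```

Only the first and last entries differ between the two pipelines; the stages in between are identical.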
Transport Configuration
All transports are configured using TransportParams, which provides common settings across transport types.
Each transport may have its own specialized parameters class that extends
TransportParams with transport-specific options. Check the individual
transport documentation for details.
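The relationship between common and transport-specific parameters can be sketched with plain dataclasses. This is an illustration of the extension pattern only: CommonParams and RoomParams are hypothetical stand-ins rather than Pipecat's classes, and the field names are assumptions, so consult the TransportParams reference for the real options.

```python
from dataclasses import dataclass


@dataclass
class CommonParams:
    """Stand-in for shared settings; not Pipecat's TransportParams."""
    audio_in_enabled: bool = False
    audio_out_enabled: bool = False


@dataclass
class RoomParams(CommonParams):
    """Hypothetical transport-specific class that adds one extra option."""
    room_url: str = ""


def describe(params: CommonParams) -> str:
    # Code written against the base class accepts any subclass as well.
    directions = []
    if params.audio_in_enabled:
        directions.append("in")
    if params.audio_out_enabled:
        directions.append("out")
    return "audio: " + ("+".join(directions) or "off")


base = CommonParams(audio_in_enabled=True, audio_out_enabled=True)
room = RoomParams(audio_in_enabled=True, room_url="https://example.invalid/room")
```

Since the specialized class inherits every common field, code that only reads the shared settings keeps working when a different transport's parameters are passed in.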
TransportParams Reference
Complete reference for all transport configuration options
Telephony Integration
Telephony services (phone calls) use WebSocket connections with specialized serialization.
Supported Telephony Providers
Twilio
Media Streams over WebSocket with TwilioFrameSerializer
Telnyx
Real-time media streaming with TelnyxFrameSerializer
Plivo
Voice streaming API with PlivoFrameSerializer
Exotel
Voice streaming integration with ExotelFrameSerializer
Telephony Transport Setup
Telephony requires a FrameSerializer to handle provider-specific message formats.
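To make the serializer's role concrete, here is a minimal sketch of the kind of translation it performs. It is not TwilioFrameSerializer or any Pipecat class: MediaStreamSerializerSketch is a hypothetical name, and the JSON envelope is a simplified assumption modeled on Twilio Media Streams "media" messages, where audio travels as base64 text.

```python
import base64
import json


class MediaStreamSerializerSketch:
    """Illustrative only; not a Pipecat serializer.

    Assumes a JSON envelope in which audio is carried as base64 text under
    media.payload and tagged with a stream identifier.
    """

    def __init__(self, stream_sid):
        self.stream_sid = stream_sid

    def serialize(self, audio: bytes) -> str:
        # Pipeline audio -> provider-style WebSocket text message.
        return json.dumps({
            "event": "media",
            "streamSid": self.stream_sid,
            "media": {"payload": base64.b64encode(audio).decode("ascii")},
        })

    def deserialize(self, message: str):
        # Provider-style message -> raw audio bytes, or None for
        # non-media events, which this sketch ignores.
        data = json.loads(message)
        if data.get("event") != "media":
            return None
        return base64.b64decode(data["media"]["payload"])


serializer = MediaStreamSerializerSketch("MZ-example")
wire_message = serializer.serialize(b"\x00\x7f\xff")
round_trip = serializer.deserialize(wire_message)
```

Each provider wraps audio in its own message format, which is why the serializer is chosen per provider while the rest of the pipeline stays the same.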
The development runner automatically detects and configures the appropriate serializer when using parse_telephony_websocket().
Conditional Transport Selection
The development runner provides a pattern for conditionally selecting transports based on the environment.
WebRTC vs WebSocket Considerations
Understanding when to use each connection type is crucial for building effective voice AI applications.
WebRTC (Recommended for Client Applications)
Best for: Browser apps, mobile apps, real-time conversations
Advantages:
- Low latency: Optimized for real-time media with minimal delay
- Built-in resilience: Handles packet loss and network variations
- Advanced audio processing: Echo cancellation, noise reduction, automatic gain control
- Quality monitoring: Detailed performance and media quality statistics
- Automatic timestamping: Simplifies interruption and playout logic
- Robust reconnection: Built-in connection management
Use WebRTC when:
- Building client-facing applications (web, mobile)
- Conversational latency is critical
- Users are on potentially unreliable networks
- You need built-in audio processing features
WebSocket (Good for Server-to-Server)
Best for: Telephony integration, server-to-server communication, prototyping
Limitations for real-time media:
- TCP-based: Subject to head-of-line blocking
- Network sensitivity: Less resilient to packet loss and jitter
- Manual implementation: Requires custom logic for reconnection, timestamping
- Limited observability: Harder to monitor connection quality
Use WebSocket when:
- Integrating with telephony providers (Twilio, Telnyx, etc.)
- Building server-to-server connections
- Prototyping, or when latency isn’t critical
- Working within existing WebSocket infrastructure
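The head-of-line blocking limitation listed for WebSocket can be illustrated with a toy timing model. This is a deliberately simplified sketch, not a network simulation: the 20 ms send interval, the arrival times, and the 60 ms lateness budget are made-up values, and real transports add jitter buffers, congestion control, and retransmission behavior that this ignores.

```python
SEND_INTERVAL_MS = 20  # assumed: one audio packet is sent every 20 ms


def in_order_delivery(arrivals):
    """Delivery times when every packet must wait for all earlier packets.

    Models TCP-like ordered delivery. arrivals[i] is the time in ms at
    which packet i reaches the receiver.
    """
    delivered = []
    latest = 0
    for arrival in arrivals:
        latest = max(latest, arrival)
        delivered.append(latest)
    return delivered


def independent_delivery(arrivals, max_delay_ms):
    """Delivery times when a late packet is skipped instead of awaited.

    Models datagram-style media delivery. None marks a packet dropped for
    arriving more than max_delay_ms after it was sent.
    """
    delivered = []
    for index, arrival in enumerate(arrivals):
        sent = index * SEND_INTERVAL_MS
        delivered.append(arrival if arrival - sent <= max_delay_ms else None)
    return delivered


# Packet 2 is delayed (for example by a retransmission); the rest are on time.
arrivals = [30, 50, 270, 90, 110]
ordered = in_order_delivery(arrivals)
independent = independent_delivery(arrivals, max_delay_ms=60)
```

In the ordered case, packets 3 and 4 are held until the delayed packet 2 arrives, so one late packet stalls everything behind it. In the independent case only the late packet is lost and later audio is delivered on time.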
Key Takeaways
- Transports are modular - swap them without changing bot logic
- Choose based on use case - WebRTC for clients, WebSocket for telephony
- Configuration is standardized - TransportParams work across transport types
- Pipeline placement matters - consider what processing happens after output
- Development runner helps - provides patterns for multi-transport bots
What’s Next
Now that you understand how transports connect users to your bot, let’s explore how to configure speech recognition to convert user audio into text.
Speech Input & Turn Detection
Learn how to configure speech recognition in your voice AI pipeline