Understanding the Architecture
Session initialization involves multiple components working together:
- Runner: A FastAPI server that handles incoming connection requests and manages session setup
- Pipecat Bot: Your voice AI application running as a separate server-side service
- Client Application: The user-facing app (web browser, mobile app, etc.)
Development Runner
For most development and many production use cases, Pipecat provides a development runner that handles all the session initialization complexity for you. Instead of building FastAPI servers and managing WebRTC connections yourself, you focus on your bot logic while the runner handles the infrastructure.
Using the Development Runner
Your bot needs a single entry point function that the runner will call.
When you launch the bot from the command line, -t specifies the transport type (e.g., webrtc, daily, twilio) and -x is the optional proxy domain for telephony.
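As a rough illustration of the -t and -x flags, the following sketch parses them with Python's argparse. It is not the runner's actual argument parser: the function name `parse_runner_args`, the destination names `transport` and `proxy`, the webrtc default, the restriction to three transport choices, and the sample proxy domain are all assumptions made for this example.

```python
import argparse


def parse_runner_args(argv):
    """Parse the -t and -x flags from a list of command-line tokens."""
    parser = argparse.ArgumentParser(description="Hypothetical runner CLI sketch")
    # -t selects the transport type; the choices mirror the examples in the text.
    parser.add_argument(
        "-t",
        dest="transport",
        choices=["webrtc", "daily", "twilio"],
        default="webrtc",  # assumed default, for illustration only
    )
    # -x is optional and only matters for telephony transports.
    parser.add_argument("-x", dest="proxy", default=None)
    return parser.parse_args(argv)


# Placeholder proxy domain; substitute whatever public hostname you use.
args = parse_runner_args(["-t", "twilio", "-x", "example.ngrok.io"])
print(args.transport, args.proxy)
```

Keeping the proxy optional reflects the text: only telephony transports need a publicly reachable domain, so local WebRTC runs can omit it.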
The development runner automatically:
- Creates the FastAPI server
- Sets up the appropriate endpoints
- Handles connection management
- Starts your bot instances
- Provides a web interface (for WebRTC)
Learn more about building with the development runner in the runner guide.
Connection Types Under the Hood
While the development runner handles the complexity, understanding the three connection patterns helps you choose the right approach and debug issues:
1. P2P WebRTC Connections
What happens:
- The runner serves a web interface at http://localhost:7860/client
- When you open the page and connect, the browser creates a WebRTC offer
- The runner receives the offer, establishes the connection, and starts your bot
- The browser and bot communicate directly via WebRTC
2. Room-Based WebRTC (Daily)
What happens:
1. Room Request: The user visits the client application and clicks to start a session
2. Room Creation: The runner calls Daily’s API to create a room and tokens using pipecat.runner.daily.configure()
3. Parallel Join: Both the user’s browser and your bot join the same Daily room
4. Media Handshake: Once media streams are established, the browser sends a client_ready message
5. Bot Activation: Your bot receives the event and starts the conversation
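The ordering of the Media Handshake and Bot Activation steps can be modeled with a plain `asyncio.Event`: the bot blocks until the client signals readiness, and only then sends its opening message. This is a standalone sketch, not Pipecat's event API; `bot_session`, `client`, `run_handshake`, and the `sent` log are hypothetical names used only here, and the short sleep stands in for real media negotiation.

```python
import asyncio


async def bot_session(client_ready: asyncio.Event, sent: list):
    """Join the room, then wait for the client before speaking."""
    sent.append("bot joined room")
    # Block until the client reports that its media streams are up.
    await client_ready.wait()
    sent.append("bot: Hello! How can I help?")


async def client(client_ready: asyncio.Event, sent: list):
    """Join the room, finish media setup, then signal readiness."""
    sent.append("client joined room")
    await asyncio.sleep(0.05)  # stand-in for media stream negotiation
    sent.append("client_ready")
    client_ready.set()


async def run_handshake():
    client_ready = asyncio.Event()
    sent = []
    # Both participants join in parallel, as in the Parallel Join step.
    await asyncio.gather(bot_session(client_ready, sent), client(client_ready, sent))
    return sent


log = asyncio.run(run_handshake())
print(log)
```

Because the greeting is gated on the event rather than on joining the room, the client cannot miss the opening message even if its media setup is slow.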
Room-based WebRTC can also be used for SIP or PSTN connections, which require
different connection patterns. Refer to the telephony
guide for details.
3. WebSocket Connections (Telephony)
What happens:
- The telephony provider (Twilio, etc.) receives a phone call
- The provider connects to your runner’s webhook endpoint
- The runner accepts the WebSocket connection and parses telephony-specific messages
- Your bot starts immediately with the parsed connection data
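To make "parses telephony-specific messages" concrete, here is a minimal sketch that extracts the stream and call identifiers from the opening messages of a Twilio Media Streams WebSocket. It assumes Twilio's documented message shape (a `connected` event followed by a `start` event carrying `streamSid` and `callSid`); other providers use different formats. The function name `parse_twilio_start` and the sample identifiers are hypothetical, and this is not the runner's own parsing code.

```python
import json


def parse_twilio_start(messages):
    """Return stream and call identifiers from the first 'start' message.

    `messages` is an iterable of raw JSON strings as received on the WebSocket.
    """
    for raw in messages:
        msg = json.loads(raw)
        # Skip the initial 'connected' message and anything else before 'start'.
        if msg.get("event") == "start":
            start = msg["start"]
            return {
                "stream_sid": start["streamSid"],
                "call_sid": start["callSid"],
            }
    raise ValueError("no 'start' message received")


# Placeholder messages with made-up identifiers.
sample = [
    json.dumps({"event": "connected", "protocol": "Call", "version": "1.0.0"}),
    json.dumps({
        "event": "start",
        "start": {"streamSid": "MZ123", "callSid": "CA456"},
        "streamSid": "MZ123",
    }),
]
call_data = parse_twilio_start(sample)
print(call_data)
```

The resulting dictionary is the kind of "parsed connection data" a bot would need before it can start speaking on the call.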
Starting Conversations
How and when your bot begins talking depends on the connection type.
Immediate Start (P2P WebRTC, WebSocket)
These connections are ready immediately, so your bot can start talking as soon as the client connects.
Handshake Required (Client/Server Room-based WebRTC)
For client/server applications using room-based WebRTC, a handshake ensures both sides are ready and the client won’t miss the opening message.
Process Isolation
Each session runs its own dedicated bot instance for:
- Resource Management: Dedicated CPU and memory per session
- Error Isolation: One session crash doesn’t affect others
- Clean Cleanup: Resources automatically freed when sessions end
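The isolation properties listed above can be demonstrated with the standard library alone. The sketch below starts one operating-system process per session and shows that a crash in one does not affect the other. It is an illustration of the general pattern, not a description of how the runner launches bots; `run_sessions` and the two one-line "bot" programs are hypothetical.

```python
import subprocess
import sys

# Each hypothetical "bot" is a tiny Python program run in its own process.
HEALTHY_BOT = "print('session ok')"
CRASHING_BOT = "raise RuntimeError('bot crashed')"


def run_sessions(programs):
    """Start one process per session and collect each exit code."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", code],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        for code in programs
    ]
    # wait() returns once a process has exited; the OS then reclaims its memory.
    return [p.wait() for p in procs]


exit_codes = run_sessions([CRASHING_BOT, HEALTHY_BOT])
print(exit_codes)
```

The crashing session exits with a non-zero status while the healthy one still finishes normally, which is the error-isolation behavior the list describes.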
Custom Runners: When You Need More Control
The development runner works for most cases, but sometimes you need custom behavior - specific authentication, custom endpoints, or integration with existing systems. For these cases, you can create your own FastAPI runner. The development runner source code (available on GitHub) provides examples to work from, including a Daily integration.
Refer to the development runner source code to understand these patterns before building custom runners. It handles many edge cases and provides battle-tested implementations.
Key Takeaways
- Start with the development runner for fastest development and learning
- Understand connection types to choose the right approach for your use case
- Handle startup timing correctly - immediate start vs. handshake patterns matter
- Plan for process isolation - one bot instance per session is the recommended pattern
- Reference the source code when building custom runners for production
What’s Next
Now that you understand session initialization, let’s explore the different transport options and how to configure them for your specific needs.
Pipeline & Frame Processing
Learn how Pipecat’s pipeline architecture orchestrates frame processing for voice AI applications