Voice (SIP, Twilio)
Saaya runs voice agents over SIP trunks. We provide our own carrier-tier trunks, and you can also bring your own (Twilio is the most common in production). Either way, the agent config is the same; only the number and the trunk binding differ.
Provisioning a number
You can buy a number directly from Saaya (recommended for new builds) or attach an existing Twilio number you already own. Attached numbers stay billed to your Twilio account; Saaya-issued numbers roll into your Saaya bill.
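As a minimal sketch, the body of the attach request shown next could be assembled and sanity-checked on the client before it is sent. The `AttachRequest` type and the `buildAttachRequest` helper are hypothetical names used for illustration, not part of a Saaya SDK. The field names mirror the example request, "both" is the only `direction` value shown in this guide, and the E.164 check is a generic format check rather than a documented Saaya rule. The `agt_example` id is a placeholder.

```typescript
// Hypothetical shape mirroring the fields in the attach request example.
interface AttachRequest {
  provider: "twilio";
  phoneNumber: string;
  credentialsRef: string;
  agentId: string;
  direction: string; // "both" is the only value shown in this guide
}

// Generic E.164 format: "+" followed by 2 to 15 digits, first digit non-zero.
const E164 = /^\+[1-9]\d{1,14}$/;

function buildAttachRequest(
  phoneNumber: string,
  credentialsRef: string,
  agentId: string,
  direction: string = "both",
): AttachRequest {
  if (!E164.test(phoneNumber)) {
    throw new Error(`Not an E.164 number: ${phoneNumber}`);
  }
  return { provider: "twilio", phoneNumber, credentialsRef, agentId, direction };
}

// "agt_example" is a placeholder agent id.
const attachBody = buildAttachRequest("+14155550123", "TWILIO_ACME", "agt_example");
console.log(JSON.stringify(attachBody, null, 2));
```

Rejecting a malformed number locally avoids a round trip for a request that could not succeed.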
POST /api/v1/phone-numbers/attach
{
  "provider": "twilio",
  "phoneNumber": "+14155550123",
  "credentialsRef": "TWILIO_ACME",
  "agentId": "agt_2N3rH...",
  "direction": "both"
}
Inbound routing
When a number is bound to an agent, every inbound call is routed to that agent. Saaya answers in under 800ms (p50), runs STT in parallel with TTS warm-up, and starts the session before the caller hears a hello.
Outbound calls
Outbound is just `sessions.dispatch`. Pass an agent, a "from" number you own, and a "to" number; Saaya places the call, runs the agent, and emits the same session record as it would for an inbound call.
await saaya.sessions.dispatch({
  agentId: "agt_2N3rH...",
  channel: "voice",
  fromNumber: "+14155550123",
  toNumber: "+919876543210",
  metadata: { leadId: "lead_882", source: "web-form" },
});
Latency notes
- Cold-start config is cached in Redis with a 30-min TTL; the first call after publish is ~300ms slower.
- Streaming STT begins on the first audio frame; the LLM is invoked at the first phrase boundary, not at end-of-utterance.
- TTS is streamed back to the trunk in 100ms chunks; the caller hears the agent before generation finishes.
- Pick a region close to your callers. EU, India, and US regions are available; cross-region calls add ~80ms.
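To make the notes above concrete, here is a rough latency-budget sketch. It assumes the cold-start and cross-region penalties are independent and simply add to a baseline you measure yourself, which is a simplification. The function and option names are hypothetical, and the constants are the approximate figures quoted in the notes.

```typescript
// Approximate penalties quoted in the latency notes (milliseconds).
const COLD_START_PENALTY_MS = 300; // first call after publish
const CROSS_REGION_PENALTY_MS = 80; // caller and agent in different regions

interface CallConditions {
  baselineMs: number; // your own measured baseline for a warm, same-region call
  firstCallAfterPublish: boolean;
  crossRegion: boolean;
}

// Assumption: the penalties are independent and simply add up.
function estimateLatencyMs(c: CallConditions): number {
  let total = c.baselineMs;
  if (c.firstCallAfterPublish) total += COLD_START_PENALTY_MS;
  if (c.crossRegion) total += CROSS_REGION_PENALTY_MS;
  return total;
}

// Number of TTS chunks needed to carry an utterance of a given length,
// using the 100ms chunk size from the notes as the default.
function ttsChunkCount(utteranceMs: number, chunkMs: number = 100): number {
  return Math.ceil(utteranceMs / chunkMs);
}

const worstCase = estimateLatencyMs({
  baselineMs: 800,
  firstCallAfterPublish: true,
  crossRegion: true,
});
console.log(worstCase); // 1180
console.log(ttsChunkCount(2350)); // 24
```

With an 800ms baseline, a cold, cross-region call comes out at roughly 1180ms under these assumptions, so region choice and a warm-up call after publish are the two levers within your control.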
Recording
Recording is opt-in per agent and per call. When enabled, Saaya stores a stereo WAV (caller / agent on separate tracks) and a transcript with millisecond-level alignment. Recordings live in your residency region and are exportable to S3.
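Because the caller and the agent sit on separate tracks, each side can be analysed on its own after export. The sketch below de-interleaves stereo PCM samples into two mono tracks. It assumes 16-bit PCM with the WAV header already stripped and the caller on the first (left) channel; neither detail is confirmed above, so check an actual recording before relying on it. `splitStereo` is a hypothetical helper name.

```typescript
interface SplitTracks {
  caller: Int16Array;
  agent: Int16Array;
}

// Interleaved stereo PCM is laid out as L0, R0, L1, R1, ...
// Assumption: left channel = caller, right channel = agent.
function splitStereo(interleaved: Int16Array): SplitTracks {
  if (interleaved.length % 2 !== 0) {
    throw new Error("Stereo data must contain an even number of samples");
  }
  return {
    // Even indices belong to the left channel, odd indices to the right.
    caller: interleaved.filter((_, i) => i % 2 === 0),
    agent: interleaved.filter((_, i) => i % 2 === 1),
  };
}

const demoTracks = splitStereo(Int16Array.from([10, -10, 20, -20, 30, -30]));
console.log(Array.from(demoTracks.caller)); // [10, 20, 30]
console.log(Array.from(demoTracks.agent)); // [-10, -20, -30]
```

Keeping the tracks apart makes per-speaker measurements such as talk time or interruption counts straightforward, since no speaker separation step is needed.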
Consent and disclosure