Video sessions
Video sessions take the same agent and put a face on it. Saaya runs WebRTC over LiveKit, drives an avatar (Anam, Simli, Tavus, AvatarIO, AvatarTalk, LiveAvatar, or TruGen), and streams the model output through the avatar in real time.
Starting a video session
video.ts
const session = await saaya.sessions.create({
agentId: "agt_2N3rH...",
channel: "video",
avatar: { provider: "tavus", replicaId: "rep_abc" },
});
// Hand the LiveKit token to the browser to join the room.
console.log(session.livekitUrl, session.livekitToken);
Avatar overlay
Avatars run as a video track in the LiveKit room. You can render them full-screen, in picture-in-picture, or with a custom overlay (logo, captions, presenter title). Captions are served as a separate text track so screen readers and search engines can index them.
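One way to consume that caption track is a small buffer that keeps the last few lines for the on-screen overlay. The sketch below is an illustration, not part of the Saaya SDK; the LiveKit wiring in the trailing comment uses the LiveKit JS client's text-stream API, and the `"captions"` topic name is an assumption.

```typescript
// Minimal caption overlay buffer: collects streamed caption lines and keeps
// only the most recent ones for display. Pure logic, no SDK dependency.
export class CaptionOverlay {
  private lines: string[] = [];

  constructor(private maxLines = 3) {}

  push(line: string): void {
    this.lines.push(line);
    // Drop the oldest line once we exceed the display budget.
    if (this.lines.length > this.maxLines) this.lines.shift();
  }

  render(): string {
    return this.lines.join("\n");
  }
}

// Hypothetical wiring with livekit-client (the "captions" topic is a guess):
// const overlay = new CaptionOverlay();
// room.registerTextStreamHandler("captions", async (reader) => {
//   overlay.push(await reader.readAll());
//   captionEl.textContent = overlay.render();
// });
```

Keeping the buffer pure makes it easy to unit-test the overlay independently of the room connection.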
Accessibility
- Live captions on by default, with a toggle for the user.
- Keyboard-only join + mute + leave, no mouse required.
- Avatar carries a visible label ("This is an AI agent").
- Audio-only fallback when bandwidth drops below 250 kbps.
Recording
Video recording is opt-in. Saaya stores an MP4 with mixed audio plus the per-track transcript. Recordings live in your residency region and can be exported to S3 or fetched via `sessions.exports`.
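An S3 export might look like the sketch below. The source only names `sessions.exports`; the request shape, field names, and the `create` call are assumptions here, so check the Saaya API reference for the real schema.

```typescript
// Hypothetical export request shape (field names are assumptions).
interface ExportRequest {
  sessionId: string;
  destination: { type: "s3"; bucket: string; prefix: string };
}

// Build the request object for exporting a session recording to S3.
export function buildExportRequest(
  sessionId: string,
  bucket: string,
  prefix = "recordings/",
): ExportRequest {
  return { sessionId, destination: { type: "s3", bucket, prefix } };
}

// Hypothetical usage with the Saaya SDK client:
// const job = await saaya.sessions.exports.create(
//   buildExportRequest("ses_123", "my-bucket"),
// );
```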
Tuning the avatar
Different providers make different latency/fidelity trade-offs. Anam and Simli ship the lowest time to first frame; Tavus tends to feel more lifelike over long sessions. Try two providers and switch between them in the dashboard.