Video sessions

Video sessions take the same agent and put a face on it. Saaya runs WebRTC over LiveKit, drives an avatar (Anam, Simli, Tavus, AvatarIO, AvatarTalk, LiveAvatar, or TruGen), and streams the model output through the avatar in real time.

Starting a video session

video.ts
const session = await saaya.sessions.create({
  agentId: "agt_2N3rH...",
  channel: "video",
  avatar:  { provider: "tavus", replicaId: "rep_abc" },
});

// Hand the LiveKit token to the browser to join the room.
console.log(session.livekitUrl, session.livekitToken);
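On the client, the browser joins the LiveKit room with the URL and token from the session object. A minimal sketch of the handoff, assuming only the two fields shown above — `joinInfo` is a hypothetical helper, not part of the Saaya SDK; in a real client you would pass the result to `livekit-client`'s `room.connect(url, token)`:

```typescript
// Shape of the fields the create() call above returns; other fields on
// the session object are omitted here.
interface VideoSession {
  livekitUrl: string;
  livekitToken: string;
}

// Hypothetical helper: sanity-check the join info before handing it to
// the browser. LiveKit signalling URLs are wss://.
function joinInfo(session: VideoSession): { url: string; token: string } {
  if (!session.livekitUrl.startsWith("wss://")) {
    throw new Error("expected a wss:// LiveKit URL");
  }
  if (!session.livekitToken) {
    throw new Error("missing LiveKit token");
  }
  return { url: session.livekitUrl, token: session.livekitToken };
}

const info = joinInfo({
  livekitUrl: "wss://example.livekit.cloud",
  livekitToken: "eyJ...",
});
console.log(info.url);
```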

Avatar overlay

Avatars run as a video track in the LiveKit room. You can render them full-screen, as picture-in-picture, or with a custom overlay (logo, captions, presenter title). Captions are served as a separate text track so screen readers and search engines can index them.
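The wire format of the caption text track isn't specified here. As an illustration, assuming each caption arrives as a timestamped cue, an overlay renderer might keep a rolling window of recent cues like this (the `CaptionCue` shape is an assumption, not the documented payload):

```typescript
// Hypothetical caption cue shape -- the actual payload format of the
// caption text track is not specified in these docs.
interface CaptionCue {
  speaker: string; // e.g. "agent" or "user"
  text: string;
  tMs: number;     // session-relative timestamp in milliseconds
}

// Keep only cues from the last few seconds for the on-screen overlay.
function recentCues(cues: CaptionCue[], nowMs: number, windowMs = 5000): CaptionCue[] {
  return cues.filter((c) => nowMs - c.tMs <= windowMs);
}

const cues: CaptionCue[] = [
  { speaker: "agent", text: "Hi, I'm an AI agent.", tMs: 1000 },
  { speaker: "user", text: "Hello!", tMs: 7000 },
];
console.log(recentCues(cues, 8000).length); // only the recent cue survives
```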

Accessibility

  • Live captions on by default, with a toggle for the user.
  • Keyboard-only join + mute + leave, no mouse required.
  • Avatar carries a visible label ("This is an AI agent").
  • Audio-only fallback when bandwidth drops below 250 kbps.
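The audio-only fallback above comes down to a threshold check. A minimal sketch: the 250 kbps figure is from this page, but how `downlinkKbps` is measured (e.g. from WebRTC stats) and how the video track is actually dropped are left to the client:

```typescript
const AUDIO_FALLBACK_KBPS = 250; // threshold stated in these docs

// Decide whether to drop the avatar video track and keep audio only.
// Measuring downlinkKbps (e.g. via RTCPeerConnection.getStats()) is
// up to the client implementation.
function shouldFallBackToAudio(downlinkKbps: number): boolean {
  return downlinkKbps < AUDIO_FALLBACK_KBPS;
}

console.log(shouldFallBackToAudio(180)); // true
console.log(shouldFallBackToAudio(600)); // false
```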

Recording

Video recording is opt-in. Saaya stores an MP4 with mixed audio plus the per-track transcript. Recordings live in your residency region and can be exported to S3 or fetched via `sessions.exports`.
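The exact parameters of `sessions.exports` aren't documented on this page. As a sketch, an S3 export request might be assembled like this — every field name below is an assumption, not the documented schema:

```typescript
// Hypothetical export request -- field names are illustrative, not the
// documented `sessions.exports` schema.
interface ExportRequest {
  sessionId: string;
  artifact: "mp4" | "transcript";
  destination: { type: "s3"; bucket: string; prefix: string };
}

// Build (but do not send) an export request for a session recording.
function buildExportRequest(sessionId: string, bucket: string): ExportRequest {
  return {
    sessionId,
    artifact: "mp4",
    destination: { type: "s3", bucket, prefix: `recordings/${sessionId}/` },
  };
}

const req = buildExportRequest("ses_123", "my-recordings");
console.log(req.destination.prefix); // recordings/ses_123/
```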

Tuning the avatar

Different providers have different latency / fidelity trade-offs. Anam and Simli ship the lowest first-frame time; Tavus tends to feel more lifelike on long sessions. Try two and switch in the dashboard.