Stop guessing. Cite the source.
Drop in your docs, help center, PDFs, or websites. Saaya builds a vector-grounded knowledge base that your agents query before every answer — with citations and a strict-citation default.
Hallucinations are a grounding problem. We fixed the grounding.
Every agent worth shipping is grounded. Without a knowledge base, an LLM answers from its training data — confidently, beautifully, and often wrong about your product. With Saaya Knowledge Bases, your agents answer only from the sources you connect, with citations the user can click to verify.
You can ingest help centers, PDFs, websites, Notion spaces, Confluence docs, and raw markdown. Saaya chunks, embeds, and stores everything in a managed, per-tenant vector store. The agent retrieves the most relevant passages on every turn, summarizes, and cites; it never hallucinates a feature or invents a policy.
The strict-citation default refuses to answer if the retrieval confidence is low. Instead, it escalates to a human and logs the gap so your team can fill it. No more "the bot promised something we don't actually do".
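The gate itself is simple to reason about. Here is a minimal sketch of the answer-or-escalate decision; the type names and the 0.75 threshold are illustrative assumptions, not Saaya's actual API:

```typescript
// Illustrative strict-citation gate. Names and threshold are assumptions,
// not Saaya's real SDK surface.
type Passage = { text: string; sourceUrl: string; score: number }

type Decision =
  | { action: 'answer'; citations: Passage[] }
  | { action: 'escalate'; reason: string }

const CONFIDENCE_THRESHOLD = 0.75 // assumed default; configurable in practice

function strictCitationGate(retrieved: Passage[]): Decision {
  const confident = retrieved.filter(p => p.score >= CONFIDENCE_THRESHOLD)
  if (confident.length === 0) {
    // Low confidence: hand off to a human and log the KB gap.
    return { action: 'escalate', reason: 'no passage above confidence threshold' }
  }
  return { action: 'answer', citations: confident }
}
```

The key design choice: the agent never answers from passages below threshold, so every reply it does give carries at least one citation a user can verify.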
What's under the hood.
A knowledge base that production support and sales teams can actually trust.
Multi-source ingest
Help centers (Zendesk, Intercom, Notion, Confluence), PDFs, websites, Markdown, raw text. URL-crawl with auto-refresh, or drop-and-go uploads.
Smart chunking
Heading-aware chunking that respects your doc structure. Code blocks stay intact. Tables stay readable. Cross-references survive embedding.
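To make "heading-aware" concrete, here is a sketch of the core idea: split on markdown headings, but never inside a fenced code block. This is an illustration of the technique, not Saaya's implementation:

```typescript
// Illustrative heading-aware chunker: new chunk at each heading, but code
// fences stay intact. A sketch of the idea, not Saaya's actual chunker.
function chunkByHeadings(markdown: string): string[] {
  const chunks: string[] = []
  let current: string[] = []
  let inCodeFence = false
  for (const line of markdown.split('\n')) {
    if (line.trimStart().startsWith('```')) inCodeFence = !inCodeFence
    // Start a new chunk at a heading, unless we're inside a code block.
    if (!inCodeFence && /^#{1,6} /.test(line) && current.length > 0) {
      chunks.push(current.join('\n'))
      current = []
    }
    current.push(line)
  }
  if (current.length > 0) chunks.push(current.join('\n'))
  return chunks
}
```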
Citations on every answer
Every agent reply includes the source passage and a click-through link. On voice, citations are spoken in summary form ("according to our refund policy…").
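The same citation renders differently per channel. A hypothetical shape and two renderers (field names are illustrative):

```typescript
// Hypothetical citation shape; field names are illustrative assumptions.
type Citation = { title: string; url: string; passage: string }

// Chat: show the passage with a click-through link.
function renderChatCitation(c: Citation): string {
  return `${c.passage} [${c.title}](${c.url})`
}

// Voice: speak a short attribution instead of reading a URL aloud.
function renderVoiceCitation(c: Citation): string {
  return `according to our ${c.title.toLowerCase()}, ${c.passage}`
}
```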
Freshness controls
Per-source refresh schedule (hourly, daily, on-demand). Invalidate by URL or by tag. Old answers don't survive a doc update.
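Invalidation by URL or tag amounts to filtering stale entries out of the store so they're re-embedded on the next refresh. A sketch under assumed entry shapes; Saaya's actual invalidation API may differ:

```typescript
// Illustrative invalidation: drop indexed entries by exact URL or by tag.
// Entry shape and function name are assumptions, not Saaya's API.
type Entry = { url: string; tags: string[]; embeddedAt: Date }

function invalidate(
  entries: Entry[],
  by: { url?: string; tag?: string },
): Entry[] {
  return entries.filter(e => {
    if (by.url && e.url === by.url) return false        // stale: re-embed next refresh
    if (by.tag && e.tags.includes(by.tag)) return false // stale: whole tag group
    return true
  })
}
```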
Scoped access
Tag KB entries by audience (public, internal, support-only). The agent retrieves only what the current user is allowed to see.
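Scoping works by filtering before retrieval, so a restricted passage can never reach the model's context. A minimal sketch, assuming each entry carries an explicit audience list:

```typescript
// Illustrative audience scoping; the retriever only ever sees entries the
// current user may read. Names and shapes are assumptions.
type Audience = 'public' | 'internal' | 'support-only'

function visibleTo<T extends { audiences: Audience[] }>(
  entries: T[],
  user: Audience,
): T[] {
  // Filter first, retrieve second: restricted passages never enter context.
  return entries.filter(e => e.audiences.includes(user))
}
```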
Strict-citation mode
On by default. The agent refuses to answer when retrieval confidence is below threshold — and escalates instead. No confident wrong answers.
Four steps from raw docs to grounded answers.
Connect a source
Point Saaya at your help center, Notion space, S3 bucket, or upload PDFs directly. Auto-detect structure and tags.
Chunk & embed
Saaya applies heading-aware chunking, embeds with the model you choose (OpenAI, Cohere, BGE), and stores the result in your isolated tenant vector store.
Query at runtime
On every agent turn, Saaya retrieves the top-K passages, scores them, and feeds the agent only what's relevant. Latency budget: under 80 ms.
Cite or escalate
The agent answers with citations. If confidence is low, strict-citation mode escalates to a human and flags the KB gap.
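The retrieve-and-score step in the flow above can be sketched as cosine similarity over pre-computed embeddings. Saaya manages this internally; this is only an illustration of what "retrieves the top-K passages and scores them" means:

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    na += a[i] * a[i]
    nb += b[i] * b[i]
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb))
}

// Score every passage against the query embedding, keep the best k.
function topK(
  query: number[],
  passages: { text: string; embedding: number[] }[],
  k: number,
): { text: string; score: number }[] {
  return passages
    .map(p => ({ text: p.text, score: cosine(query, p.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
}
```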
A working KB, ready to ship.
import { createKnowledgeBase } from '@saaya/sdk'

export const productDocs = await createKnowledgeBase({
  id: 'kb_product_docs',
  sources: [
    { type: 'url', url: 'https://docs.acme.com', refresh: 'daily' },
    { type: 'notion', spaceId: process.env.NOTION_SPACE },
    { type: 'pdf', url: 's3://acme-public/handbook.pdf' },
  ],
  embedding: { provider: 'openai', model: 'text-embedding-3-large' },
  chunking: { strategy: 'heading-aware', maxTokens: 800 },
  access: { audience: 'public' },
})

What you get on every tier.
1 KB · 50 docs · daily refresh · OpenAI embeddings only · 100K queries/month included.
Unlimited KBs · 10K docs each · hourly refresh · all embedding providers · 1M queries/month · scoped access · strict-citation analytics.
Self-hosted vector store option · BYO embedding model · custom chunking strategies · audit log export · 99.95% SLA on retrieval.
Pair this with the right Solution.
Frequently Asked Questions.
Saaya KBs ship the production layer most teams underestimate: heading-aware chunking, per-source refresh schedules, scoped access, citation rendering for both voice and chat, strict-citation gating, KB-gap analytics, and per-tenant isolation. You can roll your own — most teams burn 6+ weeks doing it.
Ground your agents.
Connect your first KB in under five minutes — your agents stop guessing and start citing. Free tier covers 100K queries a month.