
Knowledge base and RAG
RAG, embeddings and vector storage for the AI system
Overview
The knowledge base in Cusmato supplies the AI system with relevant documentation. Using RAG (Retrieval-Augmented Generation), the system retrieves the most relevant sources whenever it generates a response.
- RAG — Searches indexed documents based on semantic similarity
- Embeddings — Generated via Ollama
- Vector storage — Stored in Qdrant
- Filtering — Optional per-channel filtering via channelKey
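To make "semantic similarity" concrete, here is a minimal sketch of how two embedding vectors can be compared. It assumes cosine similarity as the metric; the source does not state which distance the Qdrant collection is configured with, and the toy 3-dimensional vectors below are made up and only stand in for real embeddings from Ollama.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors; real embeddings have hundreds of dimensions.
query_vec = [1.0, 0.0, 1.0]
docs = {
    "return-policy": [1.0, 0.0, 1.0],
    "shipping-times": [0.0, 1.0, 0.0],
}

# Rank document names from most to least similar to the query.
ranked = sorted(
    docs,
    key=lambda name: cosine_similarity(query_vec, docs[name]),
    reverse=True,
)
```

Documents whose vectors point in nearly the same direction as the query vector rank first, regardless of vector length.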
How it works
During first-contact
When a ticket is processed:
- The content of the customer message is used as the search query
- RAG searches for similar documents in the knowledge base
- The retrieved context is added to the AI prompt
- The response is generated with this extra context
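The steps above can be sketched roughly as follows. This is an illustration under assumptions, not Cusmato's actual implementation: `build_prompt`, the `search` callable, the `text` field on each hit, and the prompt layout are all hypothetical, and the stubbed search result is invented sample data.

```python
from typing import Callable, Dict, List


def build_prompt(
    customer_message: str,
    search: Callable[[str], List[Dict]],
    base_prompt: str,
) -> str:
    """Use the customer message as the query and add retrieved context to the prompt."""
    hits = search(customer_message)  # steps 1 and 2: query the knowledge base
    context = "\n\n".join(hit["text"] for hit in hits)
    if not context:
        # Nothing relevant was found, so the prompt carries no extra context.
        return f"{base_prompt}\n\nCustomer message:\n{customer_message}"
    # Step 3: the retrieved context is placed ahead of the customer message.
    return (
        f"{base_prompt}\n\nRelevant documentation:\n{context}"
        f"\n\nCustomer message:\n{customer_message}"
    )


def fake_search(query: str) -> List[Dict]:
    """Stand-in for the real RAG search; returns invented sample data."""
    return [{"text": "Returns are accepted within 30 days.", "score": 0.82}]


prompt = build_prompt(
    "Can I return my order?", fake_search, "You are a support assistant."
)
```

The resulting `prompt` would then be passed to the model for step 4, response generation, which is left out here.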
Search parameters
| Parameter | Default | Description |
|---|---|---|
| limit | 5 | Maximum number of results |
| scoreThreshold | 0.7 | Minimum relevance score (0–1) |
| sourceTypes | — | Filter by source type |
| channelKey | — | Filter by channel (bolcom, shopify, etc.) |
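As a rough illustration of how these parameters interact, the sketch below applies them to an in-memory list of scored hits. In the real system the filtering presumably happens inside the vector store; the function name, the `sourceType` and `channelKey` fields on each hit, the sample source type values, and the inclusive threshold comparison are all assumptions made for this example.

```python
from typing import Dict, List, Optional


def apply_search_params(
    hits: List[Dict],
    limit: int = 5,
    score_threshold: float = 0.7,
    source_types: Optional[List[str]] = None,
    channel_key: Optional[str] = None,
) -> List[Dict]:
    """Filter scored hits, then return at most `limit` of them, best first."""
    selected = []
    for hit in sorted(hits, key=lambda h: h["score"], reverse=True):
        if hit["score"] < score_threshold:
            continue  # below the minimum relevance score
        if source_types and hit.get("sourceType") not in source_types:
            continue  # source type filter is set and does not match
        if channel_key and hit.get("channelKey") != channel_key:
            continue  # channel filter is set and does not match
        selected.append(hit)
    return selected[:limit]


# Invented sample hits for illustration only.
hits = [
    {"id": "a", "score": 0.91, "sourceType": "faq", "channelKey": "bolcom"},
    {"id": "b", "score": 0.75, "sourceType": "webpage", "channelKey": "shopify"},
    {"id": "c", "score": 0.42, "sourceType": "faq", "channelKey": "bolcom"},
]

default_ids = [h["id"] for h in apply_search_params(hits)]
bolcom_ids = [h["id"] for h in apply_search_params(hits, channel_key="bolcom")]
```

With the defaults, hit `c` is dropped because its score falls below 0.7; adding the channel filter additionally drops the shopify hit.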
Managing the knowledge base
The knowledge base API supports:
- Search — Search indexed content
- Seed — Add documents to the index
- Extract — Extract content from sources
- AI-extract — AI-driven extraction
- AI-generate — AI-driven content generation
- Examples — Examples for training
- Webpages — Scrape and index web pages
Redis integration
Knowledge base data is also stored in Redis. The system prompt hash incorporates the knowledge base timestamp, so any change to the knowledge base triggers prompt invalidation.
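One way such a hash could work is sketched below. The hashing scheme, the function name, and the timestamp format are assumptions for illustration; the source only states that the hash contains the knowledge base timestamp.

```python
import hashlib


def system_prompt_hash(prompt_text: str, kb_updated_at: str) -> str:
    """Hash the prompt together with the knowledge base timestamp (hypothetical scheme)."""
    payload = f"{prompt_text}\n{kb_updated_at}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Same prompt text, but the knowledge base was updated in between.
before = system_prompt_hash("You are a support assistant.", "2024-01-01T10:00:00Z")
after = system_prompt_hash("You are a support assistant.", "2024-01-02T09:30:00Z")

# A differing hash signals that a stored prompt is stale and must be rebuilt.
is_stale = before != after
```

Because the timestamp is part of the hashed input, updating the knowledge base changes the hash even when the prompt text itself is unchanged.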
API endpoints
- POST /api/[accountSlug]/rag/search — Search in RAG (query, limit, scoreThreshold, sourceTypes)
- GET /api/[accountSlug]/rag/search — Retrieve collection statistics
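A search request might be assembled along these lines. The sketch only builds the request and does not send it. The base URL and account slug are placeholders, and the JSON body encoding, the `Content-Type` header, and passing `sourceTypes` as a list are assumptions; authentication is not covered here, so it is omitted.

```python
import json
import urllib.request
from typing import List, Optional


def build_search_request(
    base_url: str,
    account_slug: str,
    query: str,
    limit: int = 5,
    score_threshold: float = 0.7,
    source_types: Optional[List[str]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request to the RAG search endpoint."""
    body = {"query": query, "limit": limit, "scoreThreshold": score_threshold}
    if source_types:
        body["sourceTypes"] = source_types  # assumed to be a list of strings
    return urllib.request.Request(
        url=f"{base_url}/api/{account_slug}/rag/search",
        data=json.dumps(body).encode("utf-8"),  # assumed JSON body
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and slug; replace with real values before sending.
request = build_search_request("https://example.invalid", "acme", "return policy")
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and decoding the response, whose shape is not specified in this section.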
Best practices
- Relevant content — Only add documentation that directly helps with customer questions
- Channel-specific — Use channelKey filtering for channel-specific knowledge
- Keep updated — Keep the knowledge base up to date with policy changes and new products