RAG (retrieval-augmented generation) grounds LLM answers in real data by retrieving relevant document chunks and injecting them into the prompt. chainforge provides two integration patterns: auto-retrieval, which always injects context before the LLM call, and tool-based retrieval, where the LLM invokes a retrieval tool explicitly.

Core types

Type                   Role
rag.Document           A text chunk with ID, Content, Source, and Metadata
rag.Retriever          Interface: Retrieve(ctx, query, topK) ([]Document, error)
rag.DocumentStorer     Interface: Store(ctx, sessionID, docs) error
rag.Ingestor           Splits documents and stores them via a DocumentStorer
rag.QdrantRetriever    Retriever backed by Qdrant semantic search
rag.QdrantIngestor     Storer that writes chunks into Qdrant
rag.PGVectorStore      Retriever + storer backed by PostgreSQL + pgvector
rag.NewRetrieverTool   Wraps a Retriever as a core.Tool for LLM-driven retrieval

Loading documents

import "github.com/lioarce01/chainforge/pkg/rag/loader"

// Single file (.txt, .md, .json, .html)
docs, err := loader.LoadFile("handbook.md")

// All .txt files in a directory
docs, err := loader.LoadDir("./docs", "*.txt")

// HTML with automatic tag stripping
docs, err := loader.LoadHTMLFile("page.html")

// PDF (heuristic text extraction)
docs, err := loader.LoadPDF("report.pdf")

Chunking / Splitting

Two built-in splitters:
import "github.com/lioarce01/chainforge/pkg/rag/splitter"

// Fixed-size chunks (fast, predictable)
s := splitter.NewFixedSizeSplitter(512, 64)   // 512-rune chunks, 64-rune overlap

// Recursive: tries paragraph → line → sentence → word → character
s := splitter.NewRecursiveCharacterSplitter(512, 64)
Implement splitter.Splitter to use a custom strategy with rag.WithSplitter(s).

Ingesting into Qdrant

import (
    "github.com/lioarce01/chainforge/pkg/memory/qdrant"
    "github.com/lioarce01/chainforge/pkg/memory/qdrant/embedders"
    "github.com/lioarce01/chainforge/pkg/rag"
)

// 1. Create a Qdrant store and embedder.
embedder := embedders.OpenAI(apiKey)
store, _ := qdrant.New(
    qdrant.WithURL("localhost:6334"),
    qdrant.WithCollectionName("kb"),
    qdrant.WithEmbedder(embedder),
)

// 2. Build the ingestor.
qIngestor := rag.NewQdrantIngestor(store, embedder)
ingestor  := rag.NewIngestor(qIngestor)

// 3. Load and ingest.
docs, _ := loader.LoadPDF("handbook.pdf")
ingestor.Ingest(ctx, "kb", docs,
    rag.WithChunkSize(512),
    rag.WithChunkOverlap(64),
)
sessionID (here "kb") is the namespace. Use the same ID when retrieving.

Pattern 1 — Auto-retrieval (WithRetriever)

The agent automatically retrieves context before every LLM call and appends it to the system prompt. Best for Q&A chatbots where every question benefits from background knowledge.
retriever := rag.NewQdrantRetriever(store, embedder, "kb")

agent, _ := chainforge.NewAgent(
    chainforge.WithAnthropic(key, "claude-sonnet-4-6"),
    chainforge.WithSystemPrompt("Answer using the provided context."),
    chainforge.WithRetriever(retriever, rag.WithTopK(5)),
)

result, _ := agent.Run(ctx, "session-1", "What is our refund policy?")
The retrieved documents are formatted as Relevant context:\n[1] Source: ...\n... and appended to the system prompt before the first iteration.

Pattern 2 — Tool-based retrieval (RetrieverTool)

The LLM decides when to retrieve. Best for agents with multiple tools where retrieval is conditional.
retriever := rag.NewQdrantRetriever(store, embedder, "kb")

agent, _ := chainforge.NewAgent(
    chainforge.WithAnthropic(key, "claude-sonnet-4-6"),
    chainforge.WithTools(
        rag.NewRetrieverTool(retriever, rag.WithTopK(5)),
        // ... other tools
    ),
)
The LLM can call retriever_tool with {"query": "..."} to fetch context on-demand.

Ingesting into PostgreSQL + pgvector

PGVectorStore implements both rag.Retriever and rag.DocumentStorer, so a single instance handles ingestion and retrieval. It automatically creates the vector extension and the chunk table on first use.
import "github.com/lioarce01/chainforge/pkg/rag"

embedder := embedders.NewOpenAIEmbedder(apiKey, "text-embedding-3-small")

store, err := rag.NewPGVectorStore(
    "postgres://user:pass@localhost:5432/mydb",
    embedder,
    // optional:
    rag.PGVWithTable("kb_chunks"),   // default: "chainforge_rag_chunks"
    rag.PGVWithSchema("public"),      // default: "public"
    rag.PGVWithMaxConns(10),          // default: 10
    rag.PGVWithHNSW(),                // recommended for >100k rows
)
defer store.Close()

// Ingest
ingestor := rag.NewIngestor(store)
docs, _ := loader.LoadFile("handbook.md")
ingestor.Ingest(ctx, "kb", docs, rag.WithChunkSize(512), rag.WithChunkOverlap(64))

// Auto-retrieval
agent, _ := chainforge.NewAgent(
    chainforge.WithAnthropic(key, "claude-sonnet-4-6"),
    chainforge.WithRetriever(store, rag.WithTopK(5)),
)

Session scoping

Store and RetrieveBySession both accept a sessionID. Use different IDs to maintain isolated knowledge bases in the same table:
// Ingest into separate namespaces
ingestorA := rag.NewIngestor(store)
ingestorA.Ingest(ctx, "product-docs", productDocs)
ingestorA.Ingest(ctx, "support-kb",   supportDocs)

// Retrieve only from "product-docs"
docs, _ := store.RetrieveBySession(ctx, "product-docs", query, 5)

// Or retrieve globally across all sessions
docs, _ := store.Retrieve(ctx, query, 5)

HNSW index

For collections over 100k rows, enable the HNSW approximate nearest neighbor index. It trades a ~10–30s extra migration step for significantly faster ANN search:
store, _ := rag.NewPGVectorStore(dsn, embedder, rag.PGVWithHNSW())
The index is created with vector_cosine_ops, matching the <=> cosine distance operator used in all queries.

Prerequisites

  • PostgreSQL 14+ with the pgvector extension installed
  • Environment: PGVECTOR_DSN=postgres://user:pass@host:5432/db
-- One-time DB setup (or let chainforge do it automatically on first use)
CREATE EXTENSION IF NOT EXISTS vector;

Custom Retriever

Implement rag.Retriever to connect any vector database:
type MyRetriever struct{ ... }

func (r *MyRetriever) Retrieve(ctx context.Context, query string, topK int) ([]rag.Document, error) {
    // Search your vector store here and map hits into []rag.Document.
    return nil, nil // placeholder: return your results
}

core.Embedder interface

All embedders satisfy core.Embedder:
type Embedder interface {
    Embed(ctx context.Context, text string) ([]float32, error)
    Dims() uint64
}
Built-in implementations: embedders.OpenAI(apiKey), embedders.Ollama(host, model, dims).

See also

  • Vector Memory — using Qdrant as a conversation memory store
  • Tools — adding the RetrieverTool alongside other tools