All of the options below are passed to chainforge.NewAgent(...) as functional options.
## Provider & Model
| Option | Type | Description |
|---|---|---|
| WithProvider(p) | core.Provider | Required. LLM provider implementation. |
| WithModel(model) | string | Required. Model identifier (e.g. "claude-sonnet-4-6"). |
### Provider shortcuts
These one-call shortcuts set both provider and model atomically:

| Shorthand | Equivalent |
|---|---|
| WithAnthropic(apiKey, model) | WithProvider(anthropic.New(apiKey)) + WithModel(model) |
| WithOpenAI(apiKey, model) | WithProvider(openai.New(apiKey)) + WithModel(model) |
| WithGemini(apiKey, model) | WithProvider(gemini.New(apiKey, model)) + WithModel(model) |
| WithOllama(baseURL, model) | WithProvider(ollama.New(baseURL)) + WithModel(model) |
| WithOpenAICompatible(apiKey, baseURL, name, model) | WithProvider(openai.NewWithBaseURL(...)) + WithModel(model) |
Later options override earlier ones: WithAnthropic followed by WithModel("other") uses "other".
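For illustration, the first shortcut row expands as follows (a sketch assuming the package layout implied by the table above):

```go
// These two constructions are equivalent.
agent, err := chainforge.NewAgent(
	chainforge.WithAnthropic(apiKey, "claude-sonnet-4-6"),
)

agent, err = chainforge.NewAgent(
	chainforge.WithProvider(anthropic.New(apiKey)),
	chainforge.WithModel("claude-sonnet-4-6"),
)
```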
### Config file
Load provider configuration from a YAML file (conventionally config.yaml).
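A sketch of the flow; the YAML keys and the exact return shape of FromConfigFile are assumptions, since neither is shown here:

```go
// config.yaml (keys are illustrative, not the documented schema):
//
//	provider: anthropic
//	api_key: ${ANTHROPIC_API_KEY}
//	model: claude-sonnet-4-6

opts, err := chainforge.FromConfigFile("config.yaml")
if err != nil {
	// errors arrive prefixed: "chainforge: FromConfigFile: ..."
	log.Fatal(err)
}
agent, err := chainforge.NewAgent(opts...)
```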
FromConfigFile wraps all errors with "chainforge: FromConfigFile: ..." for easy identification.
## Behaviour
| Option | Default | Description |
|---|---|---|
| WithMaxIterations(n) | 10 | Maximum agent loop iterations before returning ErrMaxIterations. |
| WithToolTimeout(d) | 30s | Per-tool execution timeout. Errors are non-fatal and fed back to the LLM. |
| WithRunTimeout(d) | 0 (none) | Per-run deadline applied to the entire Run/RunWithUsage call. Returns context.DeadlineExceeded if the loop does not finish in time. 0 means no timeout. |
| WithMaxTokens(n) | 4096 | Max tokens per LLM call. |
| WithTemperature(f) | 0.7 | Sampling temperature. |
| WithMaxHistory(n) | 0 (unlimited) | Caps the number of history messages loaded from memory per Run call to the most recent n. Prevents context window overflow on long sessions. Has no effect when no memory store is attached. |
| WithRetry(n) | — | Wraps the provider with automatic retry on transient errors. n is the total number of attempts (1 = no retry, 3 = 2 retries). Uses exponential backoff: 200 ms → 400 ms → 800 ms … capped at 10 s. Context cancellation and deadline errors are never retried. |
| WithStreamBufferSize(n) | 16 | Sets the RunStream channel buffer capacity. Increase for high-throughput streaming with long tool chains to reduce back-pressure. |
| WithToolConcurrency(n) | 0 (unlimited) | Caps the number of tool goroutines that run simultaneously during one dispatchTools call. Useful when tools make external API calls and you want to avoid bursting 50+ concurrent requests. 0 restores the unlimited default. |
## Prompt & Tools
| Option | Description |
|---|---|
| WithSystemPrompt(s) | System message prepended to every conversation. |
| WithTools(tools...) | Register one or more tools. Can be called multiple times. |
| WithMemory(m) | Attach a memory store for cross-run history persistence. Built-in stores: inmemory, sqlite, postgres, redis, qdrant. See the Memory guide. |
| WithStructuredOutput(schema) | Validate every final LLM response against a JSON schema (json.RawMessage). Returns ErrInvalidOutput if the response is not valid JSON or its top-level type doesn’t match. Also injects a system prompt hint. See the Structured Output guide. |
| WithHistorySummarizer(a) | When history exceeds WithMaxHistory, compress the overflow into a single [Summary: ...] message using agent a instead of dropping old messages. Requires WithMaxHistory > 0. The summarizer runs under "<sessionID>:summarizer". Errors from the summarizer propagate immediately to the caller. See the Memory guide. |
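Putting the prompt and tool options together (a sketch; the tool values and schema contents are placeholders, not part of the documented API):

```go
schema := json.RawMessage(`{
	"type": "object",
	"properties": {"answer": {"type": "string"}},
	"required": ["answer"]
}`)

agent, err := chainforge.NewAgent(
	chainforge.WithAnthropic(apiKey, "claude-sonnet-4-6"),
	chainforge.WithSystemPrompt("You are a concise assistant."),
	chainforge.WithTools(searchTool, calcTool), // placeholder core.Tool values
	chainforge.WithStructuredOutput(schema),    // final answers must match the schema
)
```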
## RAG (Retrieval-Augmented Generation)
| Option | Description |
|---|---|
| WithRetriever(r, opts...) | Enable automatic RAG. Before the first iteration, the retriever is queried with the user message and the top-K documents are appended to the system prompt. See the RAG guide. |
opts is ...rag.RetrieveOption. Currently supported: rag.WithTopK(n int) — number of documents to retrieve (default: 5).
## HITL (Human-in-the-Loop)
| Option | Description |
|---|---|
| WithHITLGateway(g) | Gate every tool call through a hitl.Gateway before execution. Rejected calls receive an override message instead of running. See the HITL guide. |
## MCP
| Option | Description |
|---|---|
| WithMCPServer(s) | Register a single MCP server. Connection is deferred to the first Run call. |
| WithMCPServers(servers...) | Register multiple MCP servers at once. |
## Observability
| Option | Default | Description |
|---|---|---|
| WithLogger(l) | slog.Default() | Structured slog.Logger for the agent loop. |
| WithLogging(logger) | — | Wraps the provider with slog middleware. Logs every Chat/ChatStream call with latency and token counts. Falls back to slog.Default() if logger is nil. |
| WithTracing() | — | Wraps the provider with OpenTelemetry spans. Session ID is automatically added to every span. Uses the global tracer — call otel.InitTracerProvider first. No-op if the global tracer is uninitialised. |
| WithTraceAttributes(fn) | — | Appends extra OTel span attributes to every Chat/ChatStream call. fn receives the call context so it can extract request-scoped values (user ID, tenant, etc.). Requires WithTracing(). |
| WithDebugHandler(fn) | — | Fires fn synchronously at each step of the agent loop: before/after every LLM call and before/after every tool execution. Use PrettyPrintDebugHandler(os.Stderr) for instant local tracing. Intended for development — use WithLogging/WithTracing for production. |
## ProviderBuilder
ProviderBuilder is an explicit, ordered alternative to the WithLogging / WithRetry / WithTracing options. Use it when you need fine-grained control over wrapper ordering, or want to share a pre-built provider across multiple agents.
| Method | Description |
|---|---|
| NewProviderBuilder(base) | Start building from a base provider |
| .WithRetry(maxAttempts) | Add exponential-backoff retry |
| .WithRateLimit(rps, burst) | Add token-bucket rate limiting; blocks until a token is available or context is cancelled |
| .WithFallback(fallbacks...) | Add fallback chain; tries each provider in order on error |
| .WithMetrics(reg) | Add Prometheus metrics (requests, latency, tokens) |
| .WithLogging(logger) | Add slog logging (nil → slog.Default()) |
| .WithTracing() | Add OpenTelemetry spans |
| .Build() | Return the composed provider |
Build is idempotent — wrappers are applied in the order they were registered.
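A sketch of a typical builder chain, assuming only the methods listed above:

```go
// Wrappers apply in registration order: retry innermost here, tracing outermost.
provider := chainforge.NewProviderBuilder(anthropic.New(apiKey)).
	WithRetry(3).
	WithRateLimit(5, 10). // 5 req/s, burst of 10
	WithLogging(nil).     // nil falls back to slog.Default()
	WithTracing().
	Build()

// Share the composed provider across agents.
agent, err := chainforge.NewAgent(
	chainforge.WithProvider(provider),
	chainforge.WithModel("claude-sonnet-4-6"),
)
```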
## RunWithUsage
Run discards token counts. Use RunWithUsage to receive the total core.Usage accumulated across all iterations of the agent loop.
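A sketch of the call; the exact signature and the core.Usage field names are assumptions:

```go
out, usage, err := agent.RunWithUsage(ctx, sessionID, "Summarise the report")
if err != nil {
	return err
}
// Field names assumed; usage is the total across all loop iterations.
fmt.Printf("%s (in=%d out=%d tokens)\n", out, usage.InputTokens, usage.OutputTokens)
```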
## RunStreamCollect
A streaming one-liner that accumulates the full text and calls onDelta for each chunk. Useful when you want real-time display but also need the final string.

Pass nil for onDelta to collect usage without real-time display.
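A sketch of both call shapes (the exact signature is an assumption):

```go
// Stream to stdout while also collecting the final string.
text, usage, err := agent.RunStreamCollect(ctx, sessionID, prompt, func(delta string) {
	fmt.Print(delta)
})

// Pass nil to skip per-chunk callbacks and just collect text + usage.
text, usage, err = agent.RunStreamCollect(ctx, sessionID, prompt, nil)
```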
## RunStream (raw)
RunStream emits usage on the final Done event.
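A sketch of consuming the raw stream; the event type names and struct fields are assumptions:

```go
events, err := agent.RunStream(ctx, sessionID, prompt)
if err != nil {
	return err
}
for ev := range events {
	switch ev.Type { // event shape is an assumption
	case chainforge.EventDelta:
		fmt.Print(ev.Delta)
	case chainforge.EventDone:
		// Usage arrives only on the final Done event.
		fmt.Printf("\nusage: %+v\n", ev.Usage)
	}
}
```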
WithLogging and WithTracing are applied after all other options are resolved, so their position relative to WithProvider does not matter. They can be stacked.
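For example (a sketch; option order shown here is deliberately arbitrary):

```go
agent, err := chainforge.NewAgent(
	chainforge.WithLogging(logger), // position relative to WithProvider is irrelevant
	chainforge.WithTracing(),
	chainforge.WithAnthropic(apiKey, "claude-sonnet-4-6"),
	chainforge.WithRetry(3),
)
```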
## Errors
| Error | Condition |
|---|---|
| ErrMaxIterations | Agent loop reached maxIterations without a stop reason. |
| ErrToolNotFound | LLM called a tool name that isn’t registered. Non-fatal — returned as tool result. |
| ErrNoProvider | NewAgent called without WithProvider. |
| ErrNoModel | NewAgent called without WithModel. |
| ErrInvalidOutput | WithStructuredOutput is set and the final LLM response fails JSON/schema validation. |
| provider init error | A provider shortcut (e.g. WithGemini) failed to create the provider. The original error surfaces at NewAgent time instead of a bare ErrNoProvider. |
| misconfiguration | WithHistorySummarizer without WithMaxHistory returns an error at NewAgent time. |
| tool name invalid | Tool with an empty name, non-alphanumeric/underscore/hyphen characters, or a duplicate name returns an error at NewAgent time. |
### Error constructors
Use the typed constructors to build structured errors. Every constructed error implements Unwrap(), so errors.Is / errors.As work transparently.