chainforge provides two observability layers: structured logging (stdlib log/slog, zero deps) and OpenTelemetry tracing (OTLP gRPC export). Both are opt-in — pkg/core remains dependency-free.

Quick setup (agent options)

The simplest way to enable observability is via agent options. Logging needs only the stdlib log/slog; tracing additionally uses the optional otel middleware:
import (
    "log/slog"
    chainforge "github.com/lioarce01/chainforge"
    cfgotel "github.com/lioarce01/chainforge/pkg/middleware/otel"
)

// Optional: initialise a real tracer provider before NewAgent.
_, shutdown, _ := cfgotel.InitTracerProvider(ctx, "localhost:4317", "my-service", "1.0.0")
defer shutdown(ctx)

agent, _ := chainforge.NewAgent(
    chainforge.WithProvider(p),
    chainforge.WithModel("claude-sonnet-4-6"),
    chainforge.WithLogging(slog.Default()),  // structured logs per call
    chainforge.WithTracing(),                // OTel spans per call
)
  • WithLogging and WithTracing wrap the provider transparently — no changes to how you call Run or RunStream.
  • WithTracing() uses the global OTel tracer. If InitTracerProvider was not called, spans are created against a no-op tracer and silently discarded.
  • Both options can be stacked in any order. Recommended: WithLogging before WithTracing so log events nest inside the span.

Logging middleware (advanced)

For finer-grained control — or to wrap a memory store — use pkg/middleware/logging directly:
import "github.com/lioarce01/chainforge/pkg/middleware/logging"

// Wrap provider — emits debug/info/error events per call.
loggedProvider := logging.NewLoggedProvider(rawProvider, logger)

// Wrap memory store.
loggedMemory := logging.NewLoggedMemoryStore(rawStore, logger)

agent, _ := chainforge.NewAgent(
    chainforge.WithProvider(loggedProvider),
    chainforge.WithMemory(loggedMemory),
)

Log events emitted

Event | Level | Fields
provider.Chat: start | Debug | provider, model, messages
provider.Chat: done | Info | provider, model, duration, stop_reason, input_tokens, output_tokens
provider.Chat: error | Error | provider, model, duration, error
provider.ChatStream: start | Debug | provider, model, messages
provider.ChatStream: done | Info | provider, model, duration, stop_reason, text_bytes
provider.ChatStream: stream error | Error | provider, duration, error
memory.Get: done | Debug | session, messages, duration
memory.Append: done | Debug | session, count, duration
memory.Clear: done | Info | session, duration

OpenTelemetry tracing

pkg/middleware/otel wraps providers and memory with OTel spans. Requires an OTLP-compatible backend (Jaeger, Grafana Tempo, Honeycomb, etc.).

Setup

import (
    "go.opentelemetry.io/otel/trace/noop"
    cfgotel "github.com/lioarce01/chainforge/pkg/middleware/otel"
)

// In production: initialise a real tracer provider.
_, shutdown, err := cfgotel.InitTracerProvider(
    ctx,
    "localhost:4317",   // OTLP gRPC endpoint
    "my-service",
    "1.0.0",
)
if err != nil {
    panic(err) // collector unreachable, bad endpoint, etc.
}
defer shutdown(ctx)

tracer := cfgotel.Tracer()

// In tests: use the noop tracer (no real backend needed).
tracer = noop.NewTracerProvider().Tracer("")

// Wrap at startup.
tracedProvider := cfgotel.NewTracedProvider(rawProvider, tracer)
tracedMemory   := cfgotel.NewTracedMemoryStore(rawStore, tracer)

agent, _ := chainforge.NewAgent(
    chainforge.WithProvider(tracedProvider),
    chainforge.WithMemory(tracedMemory),
)

Spans and attributes

Span name | Attributes
chainforge.provider.chat | provider, model, messages, session_id (auto), stop_reason, input_tokens, output_tokens
chainforge.provider.chat_stream | provider, model, messages, session_id (auto), stop_reason, input_tokens, output_tokens
chainforge.memory.get | session_id, message_count
chainforge.memory.append | session_id, message_count
chainforge.memory.clear | session_id
The session_id attribute is injected automatically by the agent loop before every provider call — no user action required.

Custom span attributes

Use WithTraceAttributes to append arbitrary attributes to every span. The function receives the call context, allowing extraction of request-scoped values:
import (
    "go.opentelemetry.io/otel/attribute"
    chainforge "github.com/lioarce01/chainforge"
)

agent, _ := chainforge.NewAgent(
    chainforge.WithProvider(p),
    chainforge.WithModel("claude-sonnet-4-6"),
    chainforge.WithTracing(),
    chainforge.WithTraceAttributes(func(ctx context.Context) []attribute.KeyValue {
        return []attribute.KeyValue{
            attribute.String("user_id",  userIDFromCtx(ctx)),
            attribute.String("tenant",   tenantFromCtx(ctx)),
            attribute.String("session",  chainforge.SessionIDFromContext(ctx)),
        }
    }),
)
WithTraceAttributes has no effect when WithTracing is not set.

Streaming span lifecycle

chainforge.provider.chat_stream ends after the last event is drained, not after the channel is opened. This captures true end-to-end streaming latency including tool dispatch time.

Composing logging + tracing (advanced)

rawProvider    := anthropicProvider
loggedProvider := logging.NewLoggedProvider(rawProvider, logger)
tracedProvider := cfgotel.NewTracedProvider(loggedProvider, tracer)

// tracedProvider → loggedProvider → rawProvider
// Both wrappers are active; order determines span/log nesting.
Or equivalently via options (recommended):
chainforge.WithLogging(logger),  // logging wraps rawProvider
chainforge.WithTracing(),        // tracing wraps the logged provider

Prometheus metrics

pkg/middleware/metrics records three metric families for every provider call.

Setup

import (
    "github.com/prometheus/client_golang/prometheus"
    "github.com/lioarce01/chainforge/pkg/middleware/metrics"
)

// Register on any prometheus.Registerer (use NewRegistry() in tests)
mp, err := metrics.New(provider, prometheus.DefaultRegisterer)
// or panic on registration error:
mp = metrics.MustNew(provider, prometheus.DefaultRegisterer)

agent, _ := chainforge.NewAgent(chainforge.WithProvider(mp), chainforge.WithModel(model))
Via ProviderBuilder:
p := chainforge.NewProviderBuilder(provider).
    WithMetrics(prometheus.DefaultRegisterer).
    Build()

Metrics emitted

Metric | Type | Labels | Notes
chainforge_provider_requests_total | Counter | provider, status (ok|error) | incremented after each call completes
chainforge_provider_request_duration_seconds | Histogram | provider | latency in seconds; for streams, covers open → channel close
chainforge_provider_tokens_total | Counter | provider, token_type (input|output) | from Usage in the response or the Done stream event
For ChatStream, all three metrics are recorded after the channel is fully drained, not at stream-open time.

Per-tool metrics

Wrap individual tools to record per-tool latency and call counts:
import "github.com/lioarce01/chainforge/pkg/middleware/metrics"

reg := prometheus.NewRegistry() // or prometheus.DefaultRegisterer

// Share metric vectors across all tools in one agent.
toolReg, err := metrics.NewToolRegistry(reg)
if err != nil { ... }

agent, _ := chainforge.NewAgent(
    chainforge.WithProvider(p),
    chainforge.WithModel("claude-sonnet-4-6"),
    chainforge.WithTools(
        toolReg.Wrap(myTool),
        toolReg.Wrap(anotherTool),
    ),
)
Metric | Type | Labels
chainforge_tool_calls_total | Counter | tool, status (ok|error)
chainforge_tool_duration_seconds | Histogram | tool
Useful PromQL queries:
# Error rate per tool
rate(chainforge_tool_calls_total{status="error"}[5m])
  / rate(chainforge_tool_calls_total[5m])

# p99 tool latency
histogram_quantile(0.99, rate(chainforge_tool_duration_seconds_bucket[5m]))

Server logs

The HTTP server emits JSON logs for every request:
{
  "time": "2026-03-17T12:00:00Z",
  "level": "INFO",
  "msg": "http request",
  "method": "POST",
  "path": "/v1/chat",
  "status": 200,
  "duration": "1.234s",
  "remote_addr": "10.0.0.1:54321"
}

Local tracing with Jaeger

# Start the full stack
cd deploy/
ANTHROPIC_API_KEY=sk-ant-... docker-compose up -d

# Enable OTel in config.yaml
echo "otel_enabled: true" >> config.yaml

# View traces at http://localhost:16686