Embedding the server (library usage)
The fastest way to serve an agent over HTTP is chainforge.Serve:
Serve blocks until SIGINT or SIGTERM, then performs a 30-second graceful
shutdown and calls agent.Close(). No other setup required.
Custom lifecycle
For applications that manage their own signal handling, use ServeContext:
Endpoints exposed by Serve / ServeContext
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat | Synchronous chat |
| POST | /v1/chat/stream | Streaming chat via SSE |
| GET | /healthz | Liveness probe |
For /readyz, /v1/info, custom CORS origins, or TLS, use pkg/server directly.
HTTP API
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat | Synchronous chat; waits for the full response |
| POST | /v1/chat/stream | Streaming chat via SSE |
| GET | /healthz | Liveness probe |
| GET | /readyz | Readiness probe |
| GET | /v1/info | Provider/model metadata |
Request format
Synchronous response
Streaming (SSE)
Configuration
Configuration is loaded from a YAML file with environment variable overrides. API keys are env-only: they cannot appear in YAML files.
Docker
The image is based on distroless/static-debian12:nonroot (no shell, uid 65532).
docker-compose
Starts chainforge + Qdrant + OpenTelemetry Collector + Jaeger:
Kubernetes
Security posture
The Deployment enforces:
- runAsNonRoot: true (container cannot run as root)
- readOnlyRootFilesystem: true (no writes to the container filesystem)
- allowPrivilegeEscalation: false
- capabilities.drop: ["ALL"]
HPA
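
Background: the Kubernetes HorizontalPodAutoscaler picks a replica count as ceil(currentReplicas × currentMetric / targetMetric), clamped to the configured bounds. The Go sketch below applies that rule and ignores the controller's tolerance band and stabilization window. desiredReplicas and its parameters are illustrative names, and the CPU readings in main are made up.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies the basic HPA scaling rule and clamps the result
// to [minReplicas, maxReplicas]. Simplified: no tolerance, no stabilization.
func desiredReplicas(current int, cpuPercent, targetPercent float64, minReplicas, maxReplicas int) int {
	want := int(math.Ceil(float64(current) * cpuPercent / targetPercent))
	if want < minReplicas {
		return minReplicas
	}
	if want > maxReplicas {
		return maxReplicas
	}
	return want
}

func main() {
	// Bounds and target as described for the default HPA: 2 to 10 replicas, 50% CPU.
	fmt.Println(desiredReplicas(2, 100, 50, 2, 10)) // 4
	fmt.Println(desiredReplicas(4, 80, 50, 2, 10))  // 7
	fmt.Println(desiredReplicas(8, 90, 50, 2, 10))  // 10 (capped)
	fmt.Println(desiredReplicas(4, 10, 50, 2, 10))  // 2 (floor)
}
```
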
The default HPA scales between 2 and 10 replicas at 50% CPU:
Helm
Key values
| Value | Default | Description |
|---|---|---|
| config.provider.name | anthropic | Provider name |
| config.model | claude-sonnet-4-6 | Model identifier |
| config.otelEnabled | false | Enable OTel tracing |
| hpa.enabled | true | Enable horizontal pod autoscaler |
| secrets.anthropicApiKey | "" | Set at install time, never in values.yaml |
| ingress.enabled | false | Enable ingress |