Engine won’t start
Symptom: KeyError: 'LLM_API_KEY'
The Engine couldn’t find a required environment variable.
Fix: confirm .env exists and has the required keys:
Note that OpenAI credentials are still needed when LLM_PROVIDER is not OpenAI:
embeddings always route through OpenAI today.
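One way to see which keys are missing is to parse .env yourself. The sketch below is illustrative only: the helper name missing_env_keys is made up for this example, it handles just plain KEY=value lines, and the key list you pass in is your responsibility (this page only names LLM_API_KEY and LLM_PROVIDER).

```python
from pathlib import Path


def missing_env_keys(env_path, required):
    """Return the required keys that are absent or empty in a .env file.

    Hypothetical helper: handles plain KEY=value lines, an optional
    `export ` prefix, and `#` comment lines. Nothing more.
    """
    path = Path(env_path)
    if not path.is_file():
        # No file at all: every required key counts as missing.
        return list(required)

    found = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        # Drop surrounding quotes so KEY="" is treated as empty.
        found[key] = value.strip().strip("'\"")

    return [k for k in required if not found.get(k)]
```

Calling missing_env_keys(".env", ["LLM_API_KEY", "LLM_PROVIDER"]) returns an empty list when both keys are present and non-empty.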
Symptom: container exits immediately with no output
Almost always a docker-compose env issue.
Fix: confirm the engine service has the env you expect. If .env is
missing or malformed, compose silently falls back to defaults and the
Engine bails on startup.
Symptom: migration error on startup
A migration file failed to apply, usually because the brain is at a
schema version that the running Engine doesn’t recognize.
Fix:
If you’re on a fresh dev environment:
Symptom: bind: address already in use
Port 8000 is already bound by another process.
Fix: stop the other process or change the Engine’s port:
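Before changing anything, it can help to confirm that something really is listening on the port. This is a minimal sketch using only the standard library; port_in_use is a made-up helper name, and it only tells you that a listener exists, not which process owns it.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds,
        # which means a listener is already bound to the port.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 8000 in use:", port_in_use(8000))
```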
Health check fails
Symptom: /health returns 503
A subsystem is unhealthy.
Fix: look at the response body. The subsystems field tells you
which one:
- database: error means SQLite can’t be opened. Check SQLITE_DB_PATH and the volume mount.
- llm_provider: error means the configured provider is unreachable or rejecting auth. Try a curl directly to the provider.
- asset_directory: error means the AD encryption key is missing or wrong. Set AD_ENCRYPTION_KEY (or remove it if you don’t use MCP).
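If you check health from a script, the failing subsystems can be pulled out of the body programmatically. The sketch below assumes the body is JSON shaped like {"subsystems": {"database": "error", ...}}; that shape is inferred from the entries above, not confirmed, and unhealthy_subsystems is a hypothetical helper name.

```python
import json


def unhealthy_subsystems(body):
    """Return the sorted names of subsystems reporting "error".

    ASSUMPTION: `body` is a JSON string (or an already-parsed dict) with a
    `subsystems` mapping of name -> status. Adjust if your Engine differs.
    """
    data = json.loads(body) if isinstance(body, (str, bytes)) else body
    subsystems = data.get("subsystems", {})
    return sorted(name for name, status in subsystems.items() if status == "error")
```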
Symptom: /health returns nothing, connection refused
The Engine isn’t running or isn’t bound to the port you’re hitting.
Fix:
Authentication failures
Symptom: every request returns 401 INVALID_KEY
Mismatch between ENGINE_API_KEY and the value you’re sending.
Fix:
If you use ENGINE_KEY_HASH (production-style), make sure the
hash actually matches your key.
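A key and a stored hash can be compared offline. This page does not say which algorithm the Engine uses, so the sketch below ASSUMES a hex-encoded SHA-256 digest of the UTF-8 key; treat that as a placeholder and swap in the real scheme. key_matches_hash is a made-up helper name.

```python
import hashlib
import hmac


def key_matches_hash(api_key, expected_hash):
    """Compare a key against a stored hex digest.

    ASSUMPTION: the stored value is hex(SHA-256(key)). The real Engine may
    use a different algorithm, a salt, or another encoding.
    """
    digest = hashlib.sha256(api_key.encode("utf-8")).hexdigest()
    # Constant-time comparison; also tolerate case and trailing whitespace
    # in the stored hash, which are common copy/paste differences.
    return hmac.compare_digest(digest, expected_hash.strip().lower())
```

A frequent cause of mismatches is hashing the key with a trailing newline (for example, piping echo output without -n into a hashing tool), which produces a different digest from the bare key.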
Streaming issues
Symptom: SSE stream shows nothing until the turn ends
Buffering. Either curl or your reverse proxy is holding the response until it’s complete.
Fix:
- For curl: add -N.
- For requests (Python): set stream=True.
- For nginx/ALB: set proxy_buffering off.
- For corporate proxies: investigate; sometimes they’re hard-coded to buffer.
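For the Python case, the sketch below shows stream=True together with line-by-line iteration, so data is printed as it arrives rather than at the end of the turn. The url, payload, and headers arguments are placeholders you supply (this page does not specify the endpoint or the auth header), and both function names are made up for the example. It needs the third-party requests package.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the payload of each SSE `data:` line as soon as it is seen."""
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other fields
        value = line[len("data:"):]
        # The SSE format allows one optional space after the colon.
        yield value[1:] if value.startswith(" ") else value


def stream_turn(url, payload, headers):
    """POST a request and print SSE data lines incrementally.

    Hypothetical wrapper: url, payload, and headers are caller-supplied.
    """
    import requests  # third-party; imported here so the parser works without it

    # stream=True stops requests from reading the whole body up front.
    with requests.post(url, json=payload, headers=headers,
                       stream=True, timeout=(5, None)) as resp:
        resp.raise_for_status()
        for data in iter_sse_data(resp.iter_lines()):
            print(data, flush=True)
```

If output still arrives in one burst with this in place, the buffering is most likely happening in a proxy between you and the Engine rather than in the client.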
Symptom: stream ends abruptly with no thread_lifecycle: completed
The Engine crashed or was killed mid-turn.
Fix: check docker compose logs engine for the crash. Then
reproduce the request in isolation. If it’s reproducible, file an
issue with the trace.
In the meantime, you can resume with /execute/replay — but if the
Engine truly crashed, the channel state may be stale.
Sandbox issues
Symptom: bash tool calls all fail with bwrap: ...
Bubblewrap isn’t available or the kernel doesn’t support the namespace
configuration.
Fix:
The default Engine image includes bubblewrap. If you customized the
image, make sure it’s still installed:
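A quick presence check can be scripted and run inside the Engine container. This sketch only confirms that a bwrap binary is on PATH and executes; it does not prove the kernel allows the namespace configuration the sandbox needs. bwrap_status is a made-up helper name.

```python
import shutil
import subprocess


def bwrap_status():
    """Return (ok, detail) describing the bubblewrap binary on PATH."""
    path = shutil.which("bwrap")
    if path is None:
        return False, "bwrap not found on PATH"
    try:
        # `bwrap --version` only proves the binary runs; it creates no namespaces.
        out = subprocess.run([path, "--version"], capture_output=True,
                             text=True, timeout=5)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, f"bwrap found at {path} but failed to run: {exc}"
    return out.returncode == 0, (out.stdout or out.stderr).strip()


if __name__ == "__main__":
    print(bwrap_status())
```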
Symptom: every command needs permission
ALLOWED_ROOTS is set too narrowly.
Fix: widen it:
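To see what "too narrowly" means in practice, the sketch below tests whether a path falls under any of a list of roots. It is only an illustration of the idea: how the Engine actually parses ALLOWED_ROOTS and applies it is not described on this page, so the roots are passed in as a plain list and covered_by_roots is a hypothetical helper. Requires Python 3.9 or later.

```python
from pathlib import PurePosixPath


def covered_by_roots(path, roots):
    """Return True if `path` equals one of `roots` or lies beneath one.

    Comparison is by path component, so /workspace/project does not cover
    /workspace/project-2. Paths are not resolved: `..` and symlinks are
    compared literally.
    """
    target = PurePosixPath(path)
    return any(target.is_relative_to(PurePosixPath(root)) for root in roots)
```

For example, with roots set to ["/workspace/project"], a command touching /workspace/other/file.txt is not covered; widening the roots to ["/workspace"] covers both.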
Apple Silicon issues
Symptom: tests fail with seccomp probe errors
Some sandbox tests require x86 syscall semantics. The default test
container forces linux/amd64:
Symptom: image build is glacial
Docker for Mac on Apple Silicon emulates x86 builds. To speed up:
- Use the arm64 image when possible (the default Dockerfile builds natively on arm64 for the prod target).
- For the test target, accept the slowness.
Performance
Symptom: turns are slow even with a fast model
Common causes:
- No prompt caching. Check usage.cache_hit_tokens. If it’s always 0, the system prompt is changing between calls.
- Compaction firing every turn. Check for compaction_event events. If they fire constantly, the task’s context is too long.
- Slow tools. Check tool_call/tool_result timing.
- Container CPU limit. Docker for Mac defaults are tight; bump in Docker Desktop settings.
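The prompt-caching check is easy to automate once you have collected the usage records from a few turns. The sketch below assumes each record is a dict with a cache_hit_tokens count, mirroring the usage.cache_hit_tokens field named above; how you collect those records, and the helper name cache_hit_ratio, are assumptions of this example.

```python
def cache_hit_ratio(usages):
    """Return the fraction of turns that reported any cached prompt tokens.

    ASSUMPTION: each item is a dict-like usage record with a
    `cache_hit_tokens` count; a missing or null value is treated as 0.
    """
    usages = list(usages)
    if not usages:
        return 0.0
    hits = sum(1 for u in usages if (u.get("cache_hit_tokens") or 0) > 0)
    return hits / len(usages)
```

A ratio of 0.0 across many turns matches the "always 0" case described in the list: the system prompt is probably changing between calls.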
Symptom: brain queries are slow
Memory tables grow unbounded. Periodic cleanup helps.
When all else fails
- Check the logs. docker compose logs engine. The error is usually there.
- Reproduce on a clean brain. docker compose down --volumes && docker compose up engine. Eliminates “stale state” as a variable.
- Bisect. If something used to work and now doesn’t, what changed? Last config change? Last code change? Last upgrade?
- File an issue. Include the version, the env (sanitized), the request, the response, and the relevant log lines.
See also
- Common tasks — the inverse of this page.
- Environment variables — every knob.
- Architecture overview — the mental model that helps with debugging.

