What you can see
| Source | What it tells you |
|---|---|
| SSE stream | Per-turn behavior in real time. |
| usage events | Token counts and cache-hit ratio per call. |
| /health | Subsystem status. |
| observability_events table | Persistent event log queryable via API. |
| Engine logs | Structured logs with request IDs. |
In-stream observability
Every /execute call emits these as part of the stream:
usage
Emitted after each model call. Sum the usage events to get the turn’s total.
cache_hit_tokens is part of input_tokens: it’s the portion that
was billed at the cache rate.
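The summing described above can be sketched in a few lines. This is an illustration only: it assumes each usage event has already been parsed into a dict, and that the dict exposes input_tokens and cache_hit_tokens as integer fields. The function name summarize_usage is hypothetical and not part of the Engine.

```python
from typing import Iterable, Mapping


def summarize_usage(usage_events: Iterable[Mapping[str, int]]) -> dict:
    """Sum the usage events of one turn and derive the cache hit ratio."""
    input_tokens = 0
    cache_hit_tokens = 0
    for event in usage_events:
        # Assumed field names; missing fields count as zero.
        input_tokens += event.get("input_tokens", 0)
        cache_hit_tokens += event.get("cache_hit_tokens", 0)
    # cache_hit_tokens is a subset of input_tokens, so the ratio stays in [0, 1].
    ratio = cache_hit_tokens / input_tokens if input_tokens else 0.0
    return {
        "input_tokens": input_tokens,
        "cache_hit_tokens": cache_hit_tokens,
        "cache_hit_ratio": ratio,
    }


# Two model calls in one turn (made-up numbers).
turn = summarize_usage([
    {"input_tokens": 1000, "cache_hit_tokens": 800},
    {"input_tokens": 1000, "cache_hit_tokens": 700},
])
print(turn["cache_hit_ratio"])  # 0.75
```

Because cache_hit_tokens is already counted inside input_tokens, the two must not be added together when computing totals.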
tool_call and tool_result
Tells you which tools the agent used and what they returned. Useful for
“why did the agent do X?” investigations.
compaction_event
Emitted when compaction triggers.
heartbeat
Periodic. Mostly useful for liveness; not interesting per-turn.
Per-turn metrics to capture
For each turn, your application should capture:
- task_id
- Turn duration (start to thread_lifecycle: completed)
- Model calls count
- Tool calls (names and counts)
- Total token usage (sum of usage events)
- Cache hit ratio (cache_hit_tokens / input_tokens)
- Compaction count
- HITL request count
- Final stop reason
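One way to collect the fields listed above is a small accumulator fed with each stream event. The sketch below rests on assumptions: events arrive as dicts with a type key, tool calls carry a name, the HITL event is called hitl_request, and the completed lifecycle event carries status and stop_reason. None of these field names is confirmed by this page, so adjust them to the real payloads. Only input-side token fields are summed, because those are the only ones named here.

```python
import time
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TurnMetrics:
    """Accumulates per-turn metrics from stream events (hypothetical shapes)."""
    task_id: str
    started_at: float = field(default_factory=time.monotonic)
    duration_s: Optional[float] = None
    model_calls: int = 0
    tool_calls: Counter = field(default_factory=Counter)
    input_tokens: int = 0
    cache_hit_tokens: int = 0
    compactions: int = 0
    hitl_requests: int = 0
    stop_reason: Optional[str] = None

    def observe(self, event: dict) -> None:
        kind = event.get("type")  # assumed discriminator field
        if kind == "usage":
            # One usage event is emitted per model call.
            self.model_calls += 1
            self.input_tokens += event.get("input_tokens", 0)
            self.cache_hit_tokens += event.get("cache_hit_tokens", 0)
        elif kind == "tool_call":
            self.tool_calls[event.get("name", "unknown")] += 1
        elif kind == "compaction_event":
            self.compactions += 1
        elif kind == "hitl_request":  # assumed event name
            self.hitl_requests += 1
        elif kind == "thread_lifecycle" and event.get("status") == "completed":
            self.duration_s = time.monotonic() - self.started_at
            self.stop_reason = event.get("stop_reason")

    @property
    def cache_hit_ratio(self) -> float:
        return self.cache_hit_tokens / self.input_tokens if self.input_tokens else 0.0


metrics = TurnMetrics(task_id="task-123")
for event in [
    {"type": "usage", "input_tokens": 400, "cache_hit_tokens": 100},
    {"type": "tool_call", "name": "search"},
    {"type": "compaction_event"},
    {"type": "usage", "input_tokens": 600, "cache_hit_tokens": 400},
    {"type": "thread_lifecycle", "status": "completed", "stop_reason": "end_turn"},
]:
    metrics.observe(event)
```

Keeping the accumulator separate from the stream reader means the same class can be driven by live events or by replayed logs.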
The observability_events table
The Engine writes a row to observability_events for every meaningful
event:
- Errors.
- Slow tool calls.
- Permission denials.
- Migration runs.
- Catalog reloads.
- Cache-hit-ratio alerts (when CACHE_HIT_MONITOR_ENABLED=true).
Rows are queryable through the raven query API.
The raven query API
The Engine exposes an internal observability API at src/raven/api.py.
It’s not currently surfaced as a public HTTP endpoint, but the
underlying functions are stable. To get events into an external backend, the flow is:
- Engine writes to observability_events.
- A sidecar (or scheduled job) reads new rows.
- Sidecar pushes to your observability backend (Datadog, Honeycomb, etc.).
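The sidecar step can be sketched as a polling function that remembers the last row it forwarded. Everything about storage here is an assumption made for illustration: an in-memory SQLite table stands in for the real database, the columns id, kind, and payload are invented, and push is a placeholder for whatever client your backend provides. The functions in src/raven/api.py are not shown on this page, so the sketch does not use them.

```python
import sqlite3
from typing import Callable


def forward_new_rows(
    conn: sqlite3.Connection,
    last_seen_id: int,
    push: Callable[[dict], None],
) -> int:
    """Push rows newer than last_seen_id and return the new high-water mark."""
    rows = conn.execute(
        "SELECT id, kind, payload FROM observability_events "
        "WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    for row_id, kind, payload in rows:
        push({"id": row_id, "kind": kind, "payload": payload})
        # Advance only after a successful push so a failure can be retried.
        last_seen_id = row_id
    return last_seen_id


# Stand-in table with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observability_events "
    "(id INTEGER PRIMARY KEY, kind TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO observability_events (kind, payload) VALUES (?, ?)",
    [("error", "{}"), ("slow_tool_call", "{}")],
)

sent = []
cursor = forward_new_rows(conn, 0, sent.append)       # forwards both rows
cursor_again = forward_new_rows(conn, cursor, sent.append)  # nothing new
```

Persisting the returned cursor between runs lets a scheduled job resume where it left off without re-sending rows.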
Engine logs
Set ENGINE_LOG_LEVEL=INFO for production. Logs are JSON-structured.
The request_id is the same across all logs for one HTTP request. The
task_id is the same across all logs for one task.
DEBUG logs include MCP request/response traces, model call
parameters, and context-assembly steps. Useful for development; too
noisy for production.
WARNING and ERROR are what you want to alert on.
Tracing
The Engine doesn’t currently emit OpenTelemetry traces, but the request_id propagated through logs gives you a poor man’s trace:
grep for the request ID across the logs to see the full request path.
For real distributed tracing, add OTel instrumentation as a wrapper
around the Engine’s HTTP client and the Asset Directory’s MCP client.
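The request-ID grep can also be done structurally, since the logs are JSON. The sketch below assumes one JSON object per line with a request_id field. The level and message fields in the sample lines are invented for the example, and trace_request is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def trace_request(log_lines: Iterable[str], request_id: str) -> Iterator[dict]:
    """Yield parsed log records that belong to one request, in log order."""
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(record, dict) and record.get("request_id") == request_id:
            yield record


sample = [
    '{"level": "INFO", "request_id": "req-1", "message": "request received"}',
    '{"level": "INFO", "request_id": "req-2", "message": "request received"}',
    "not json",
    '{"level": "WARNING", "request_id": "req-1", "message": "slow tool call"}',
]
path = [record["message"] for record in trace_request(sample, "req-1")]
print(path)  # ['request received', 'slow tool call']
```

The same filter works on task_id when you want every request that belongs to one task.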
What to alert on
For external observability:
- Error rate > 1%. Something is broken.
- Cache hit ratio drops sharply. Indicates a cache invalidation bug, often after a deploy.
- Per-turn token count spikes. Often a sign of unbounded loops or failing compaction.
- HITL count spikes. Either the user is being asked too much (annoying) or the agent’s permission policy is misconfigured.
- /health returns non-ok. Engine is unhealthy.
- MIGRATION_REQUIRED errors. Engine is at an old schema version.
- MCP failure rate spikes. A connector is broken.
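Two of the alert conditions above can be expressed as simple predicates. The 1% error-rate threshold comes from the list. The 20% relative drop used for "drops sharply" is an assumption chosen for illustration, since this page gives no number, and both function names are hypothetical.

```python
def error_rate_breached(errors: int, total: int, threshold: float = 0.01) -> bool:
    """True when the error rate is strictly above the threshold (1% by default)."""
    if total == 0:
        return False  # no traffic, nothing to alert on
    return errors / total > threshold


def dropped_sharply(current: float, baseline: float,
                    max_relative_drop: float = 0.2) -> bool:
    """True when current fell more than max_relative_drop below baseline.

    The 0.2 default is an assumed value; tune it to your traffic.
    """
    if baseline <= 0:
        return False
    return (baseline - current) / baseline > max_relative_drop


# 2 errors in 100 requests breaches the 1% threshold; 1 in 100 does not.
print(error_rate_breached(2, 100), error_rate_breached(1, 100))  # True False
# Cache hit ratio falling from 0.80 to 0.50 is a 37.5% relative drop.
print(dropped_sharply(0.50, 0.80))  # True
```

Comparing against a baseline rather than a fixed value keeps the cache-hit alert meaningful for workloads whose normal ratio differs.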
What to dashboard
For day-to-day visibility:
- Tasks per minute (rate).
- Per-turn duration distribution.
- Per-turn cost distribution.
- Cache hit ratio trend.
- Tool-call distribution.
- Top error codes.
- Compaction frequency.
- HITL frequency by category.
See also
- Operations → Observability: the ops view.
- Streaming events: usage, compaction_event, etc.
- GET /health: liveness endpoint.

