The Engine emits structured logs, observability events, and per-turn usage data. This page covers what’s available to your application and how to use it to understand what the agent is doing in production. For the operational view (dashboards, alerts, log shipping), see Operations → Observability.

What you can see

Source                        What it tells you
SSE stream                    Per-turn behavior in real time.
usage events                  Token counts and cache-hit ratio per call.
/health                       Subsystem status.
observability_events table    Persistent event log queryable via API.
Engine logs                   Structured logs with request IDs.

In-stream observability

Every /execute call emits the following event types as part of the SSE stream:

usage

After each model call:
{
  "input_tokens": 1240,
  "output_tokens": 312,
  "cache_hit_tokens": 980
}
Sum across all usage events to get the turn’s total. cache_hit_tokens is included in input_tokens; it is the portion of the input that was billed at the cache rate.
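
A minimal aggregation sketch in Python, assuming you have already parsed each usage event from the stream into a dict:

def summarize_usage(usage_events: list[dict]) -> dict:
    """Sum usage events for one turn and derive the cache hit ratio."""
    input_tokens = sum(e["input_tokens"] for e in usage_events)
    output_tokens = sum(e["output_tokens"] for e in usage_events)
    cache_hit_tokens = sum(e["cache_hit_tokens"] for e in usage_events)
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        # cache_hit_tokens is a subset of input_tokens, so this ratio is <= 1.0
        "cache_hit_ratio": cache_hit_tokens / input_tokens if input_tokens else 0.0,
    }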

tool_call and tool_result

Tells you which tools the agent used and what they returned. Useful for “why did the agent do X?” investigations.

compaction_event

If compaction triggered:
{
  "before_tokens": 24000,
  "after_tokens": 8500,
  "strategy": "summary"
}
Frequent compaction events suggest your task is too long for the context window. Worth investigating.

heartbeat

Periodic. Mostly useful for liveness; not interesting per-turn.

Per-turn metrics to capture

For each turn, your application should capture:
  • task_id
  • Turn duration (start to thread_lifecycle: completed)
  • Model calls count
  • Tool calls (names and counts)
  • Total token usage (sum of usage events)
  • Cache hit ratio (cache_hit_tokens / input_tokens)
  • Compaction count
  • HITL request count
  • Final stop reason
Logging these per turn lets you build dashboards over time.
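
A sketch of a collector for these metrics, assuming your SSE client yields parsed events as (event_type, data) pairs. The tool-name field ("name"), the HITL event type ("hitl_request"), and the status and stop-reason fields are illustrative assumptions, not confirmed names:

import time
from collections import Counter

def collect_turn_metrics(task_id: str, events) -> dict:
    """Aggregate one turn's SSE events into the metrics listed above."""
    start = time.monotonic()
    metrics = {
        "task_id": task_id,
        "model_calls": 0,
        "tool_calls": Counter(),
        "input_tokens": 0,
        "output_tokens": 0,
        "cache_hit_tokens": 0,
        "compaction_count": 0,
        "hitl_count": 0,
        "stop_reason": None,
    }
    for event_type, data in events:
        if event_type == "usage":
            metrics["model_calls"] += 1
            metrics["input_tokens"] += data["input_tokens"]
            metrics["output_tokens"] += data["output_tokens"]
            metrics["cache_hit_tokens"] += data["cache_hit_tokens"]
        elif event_type == "tool_call":
            metrics["tool_calls"][data.get("name", "unknown")] += 1  # "name" is assumed
        elif event_type == "compaction_event":
            metrics["compaction_count"] += 1
        elif event_type == "hitl_request":  # assumed event name
            metrics["hitl_count"] += 1
        elif event_type == "thread_lifecycle" and data.get("status") == "completed":
            metrics["stop_reason"] = data.get("stop_reason")  # assumed field
            break
    metrics["duration_s"] = time.monotonic() - start
    metrics["cache_hit_ratio"] = (
        metrics["cache_hit_tokens"] / metrics["input_tokens"]
        if metrics["input_tokens"]
        else 0.0
    )
    return metrics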

The observability_events table

The Engine writes a row to observability_events for every meaningful event:
  • Errors.
  • Slow tool calls.
  • Permission denials.
  • Migration runs.
  • Catalog reloads.
  • Cache-hit-ratio alerts (when CACHE_HIT_MONITOR_ENABLED=true).
The schema:
CREATE TABLE observability_events (
  id INTEGER PRIMARY KEY,
  event_type TEXT,
  data_json TEXT,
  created_at TEXT
);
Query it directly via SQLite or via the raven query API.
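
For example, a direct SQLite query for recent rate-limit events; the database path is an assumption, so point it at wherever your Engine stores its database:

import json
import sqlite3

conn = sqlite3.connect("engine.db")  # assumed path to the Engine's database
rows = conn.execute(
    """
    SELECT event_type, data_json, created_at
    FROM observability_events
    WHERE event_type = ? AND created_at >= ?
    ORDER BY created_at DESC
    LIMIT 100
    """,
    ("LLM_RATE_LIMITED", "2026-04-20T00:00:00Z"),
).fetchall()

for event_type, data_json, created_at in rows:
    print(created_at, event_type, json.loads(data_json))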

The raven query API

The Engine exposes an internal observability API at src/raven/api.py. It’s not currently surfaced as a public HTTP endpoint, but the underlying functions are stable:
from raven import api

events = api.query(
    event_type="LLM_RATE_LIMITED",
    since="2026-04-20T00:00:00Z",
    limit=100,
)
For external dashboards, the typical pattern is:
  1. Engine writes to observability_events.
  2. A sidecar (or scheduled job) reads new rows.
  3. Sidecar pushes to your observability backend (Datadog, Honeycomb, etc.).
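
A minimal sketch of steps 2 and 3, polling by id so the sidecar only reads rows it hasn’t seen; send_to_backend is a placeholder for your backend’s client, not an Engine API:

import json
import sqlite3
import time

def send_to_backend(event_type: str, data: dict, created_at: str) -> None:
    # Placeholder: replace with your Datadog/Honeycomb/etc. client call.
    print(created_at, event_type, data)

def run_sidecar(db_path: str, poll_interval: float = 10.0) -> None:
    """Poll observability_events for new rows and push them out."""
    conn = sqlite3.connect(db_path)
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, event_type, data_json, created_at "
            "FROM observability_events WHERE id > ? ORDER BY id",
            (last_id,),
        ).fetchall()
        for row_id, event_type, data_json, created_at in rows:
            send_to_backend(event_type, json.loads(data_json), created_at)
            last_id = row_id
        time.sleep(poll_interval)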

Engine logs

Set ENGINE_LOG_LEVEL=INFO for production. Logs are JSON-structured:
{
  "timestamp": "2026-04-27T12:34:56.123Z",
  "level": "INFO",
  "logger": "engine_core.coordinator",
  "request_id": "req-abc123",
  "task_id": "task-001",
  "message": "Starting execution",
  "extra": { ... }
}
The request_id is shared by all logs for a single HTTP request, and the task_id by all logs for a single task. DEBUG logs include MCP request/response traces, model-call parameters, and context-assembly steps; they are useful in development but too noisy for production. WARNING and ERROR are what you want to alert on.
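
Because the logs are JSON lines, they are easy to filter programmatically; a small sketch that prints every log line for one request:

import json
import sys

# Usage: python filter_logs.py req-abc123 < engine.log
request_id = sys.argv[1]
for line in sys.stdin:
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        continue  # skip any non-JSON lines
    if record.get("request_id") == request_id:
        print(record["timestamp"], record["level"], record["message"])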

Tracing

The Engine doesn’t currently emit OpenTelemetry traces, but the request_id propagated through the logs gives you a poor man’s trace: grep for the request ID to see the full request path. For real distributed tracing, add OTel instrumentation as a wrapper around the Engine’s HTTP client and the Asset Directory’s MCP client.
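
A sketch of that wrapper using the OpenTelemetry Python API, assuming your application calls /execute with requests; the span and attribute names are illustrative:

import requests
from opentelemetry import trace

tracer = trace.get_tracer("engine-client")

def execute_task(base_url: str, payload: dict) -> requests.Response:
    """Wrap the /execute call in a span so it shows up in your tracing backend."""
    with tracer.start_as_current_span("engine.execute") as span:
        span.set_attribute("engine.task_id", payload.get("task_id", ""))
        response = requests.post(f"{base_url}/execute", json=payload)
        span.set_attribute("http.status_code", response.status_code)
        return response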

What to alert on

For external observability:
  • Error rate > 1%. Something is broken.
  • Cache hit ratio drops sharply. Indicates a cache invalidation bug, often after a deploy.
  • Per-turn token count spikes. Often a sign of unbounded loops or failing compaction.
  • HITL count spikes. Either the user is being asked too much (annoying) or the agent’s permission policy is misconfigured.
  • /health returns non-ok. Engine is unhealthy.
  • MIGRATION_REQUIRED errors. Engine is at an old schema version.
  • MCP failure rate spikes. A connector is broken.
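
A sketch of checking a few of these conditions against aggregated metrics; apart from the 1% error-rate threshold above, the field names and thresholds are illustrative assumptions:

def check_alerts(metrics: dict) -> list[str]:
    """Return the alert conditions that fire for one aggregation window."""
    alerts = []
    if metrics["error_rate"] > 0.01:  # the 1% threshold from this page
        alerts.append("error rate above 1%")
    # Assumed threshold: flag a fall to below half of the baseline ratio.
    if metrics["cache_hit_ratio"] < 0.5 * metrics["baseline_cache_hit_ratio"]:
        alerts.append("cache hit ratio dropped sharply")
    if metrics["health_status"] != "ok":
        alerts.append("/health reports non-ok")
    return alerts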

What to dashboard

For day-to-day visibility:
  • Tasks per minute (rate).
  • Per-turn duration distribution.
  • Per-turn cost distribution.
  • Cache hit ratio trend.
  • Tool-call distribution.
  • Top error codes.
  • Compaction frequency.
  • HITL frequency by category.

See also