Tasks live longer than HTTP connections. A /execute call can run for
minutes; the user’s laptop can sleep; the SSE stream can drop. The
Engine is designed so none of those break the agent’s progress. This
page covers the durability primitives and how to use them from your
client.
What’s durable
In the brain database:
- Task state. Every conversation thread, by task_id. Persists indefinitely.
- Channel state. Mid-execution checkpoints. TTL-bound (default 3 h active, 72 h HITL).
- Working memory. Within-task transient context. TTL-bound.
- Long-term memory. Episodes, knowledge, social graph. Indefinite.
What’s not durable:
- The HTTP connection. It can drop at any time.
- Sandbox state across tasks. Sandbox workspaces reset between tasks.
- Background processes. The watchdog kills orphans when the task ends.
Three failure scenarios
1. Client disconnects mid-stream
The client’s network drops, or the laptop sleeps, or the user closes the tab. The Engine is still running the turn.
What to do: reconnect with replay, asking for the events emitted after the last timestamp you received. Pass 0 to replay everything available.
The events come from the channel state, which persists for
CHANNEL_STATE_TTL_ACTIVE (default 3 hours). After that, replay
returns nothing — the task continues but its event buffer is gone.
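As a minimal sketch of the reconnect step, the helper below builds a replay URL from the last timestamp a client saw. The /execute/replay path and the after parameter appear elsewhere on this page; the task_id query parameter and the base URL are assumptions made for illustration, so check the API reference for how the task is actually identified.

```python
from typing import Optional
from urllib.parse import urlencode


def build_replay_url(base_url: str, task_id: str, last_ts: Optional[float]) -> str:
    """Build a replay URL for a task.

    last_ts=None means the client has seen nothing yet, so it asks for
    everything still available by sending after=0.
    NOTE: the `task_id` query parameter is a hypothetical name.
    """
    after = 0 if last_ts is None else last_ts
    query = urlencode({"task_id": task_id, "after": after})
    return f"{base_url.rstrip('/')}/execute/replay?{query}"
```

A client would call this on every reconnect with the timestamp of the last event it processed.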
2. Engine restarts mid-turn
The Engine container gets killed (deploy, OOM, host failure) while a turn is in flight.
What to do: nothing special. The Engine’s graceful shutdown drains in-flight turns.
If the shutdown was abrupt (kill -9), the turn dies mid-stream. The task and all prior memory are intact; the in-flight turn is lost. The client should detect this (the SSE stream closes with no proper thread_lifecycle: completed) and re-issue the call. The user’s
message is whatever you sent; resending it picks up where things were
before the half-completed turn.
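The detect-and-reissue behaviour can be sketched as follows. The event shape (a dict with "type" and "status" keys) is an assumption, as is the start_turn callable, which stands in for whatever your client uses to call /execute and collect the streamed events until the stream closes.

```python
from typing import Callable, Iterable, List, Mapping


def turn_completed(events: Iterable[Mapping]) -> bool:
    """Return True if the stream contained a thread_lifecycle: completed event.

    The dict layout checked here is hypothetical; adapt it to the real payload.
    """
    for event in events:
        if event.get("type") == "thread_lifecycle" and event.get("status") == "completed":
            return True
    return False


def run_turn_with_retry(
    start_turn: Callable[[], Iterable[Mapping]],
    max_attempts: int = 3,
) -> List[Mapping]:
    """Re-issue the call when the stream closes without a completion event."""
    for _ in range(max_attempts):
        events = list(start_turn())  # consume until the stream closes
        if turn_completed(events):
            return events
    raise RuntimeError(f"turn did not complete after {max_attempts} attempts")
```

Capping the attempts keeps a persistently failing Engine from trapping the client in a retry loop.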
3. HITL request, no immediate answer
The agent emits a hitl_request and waits. The user is at lunch.
What to do: nothing — the channel state holds the paused state for
CHANNEL_STATE_TTL_HITL (default 72 hours). When the user comes back
and your client posts to /hitl/respond, the loop resumes.
If the user doesn’t come back within the TTL, the task continues to
exist but the in-flight turn is lost. The next interaction with that
task starts a fresh turn.
Channel state, in depth
The Engine snapshots execution state to channel_state_snapshots at
key boundaries:
- After every model call.
- After every tool result.
- When emitting a HITL request.
- On graceful shutdown.
Each snapshot is keyed by task_id and timestamped. Replay reads
events emitted after a given timestamp.
Two TTLs apply:
| TTL | Default | Used for |
|---|---|---|
| CHANNEL_STATE_TTL_ACTIVE | 10800 s (3 h) | Active turns: the agent is running. |
| CHANNEL_STATE_TTL_HITL | 259200 s (72 h) | Paused turns waiting on HITL. |
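A client can use the two defaults to estimate whether replay is still worth attempting. The sketch below assumes the TTL is measured from the time of the last snapshot the client knows about, which this page does not state, and it hard-codes the defaults from the table rather than reading your deployment's configured values.

```python
# Defaults from the table above; a deployment may override them.
ACTIVE_TTL_S = 10800   # CHANNEL_STATE_TTL_ACTIVE
HITL_TTL_S = 259200    # CHANNEL_STATE_TTL_HITL


def snapshot_expired(snapshot_ts: float, now: float, waiting_on_hitl: bool) -> bool:
    """Estimate whether channel state has likely expired.

    ASSUMPTION: the TTL clock starts at snapshot_ts (seconds since epoch).
    """
    ttl = HITL_TTL_S if waiting_on_hitl else ACTIVE_TTL_S
    return now - snapshot_ts > ttl
```

Treat the result as a hint for the UI only; the Engine's replay response remains the source of truth.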
Implementing resilient streaming
A robust client tracks last_ts per task; on disconnect, it replays from there.
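One way to structure such a client is a generator that remembers the newest timestamp it has yielded and switches to replay when the connection drops. This is a sketch under assumptions: open_stream and open_replay are hypothetical callables wrapping your HTTP layer, each event is assumed to be a dict with a numeric "ts" field, and a dropped connection is assumed to surface as ConnectionError.

```python
from typing import Callable, Iterable, Iterator, Mapping


def stream_with_replay(
    open_stream: Callable[[], Iterable[Mapping]],
    open_replay: Callable[[float], Iterable[Mapping]],
    max_reconnects: int = 5,
) -> Iterator[Mapping]:
    """Yield events from the live stream, replaying from last_ts on disconnect."""
    last_ts = 0.0
    opener = open_stream
    for _ in range(max_reconnects + 1):
        try:
            for event in opener():
                # Track the newest timestamp so a reconnect resumes after it.
                last_ts = max(last_ts, event["ts"])
                yield event
            return  # the source ended without a transport error
        except ConnectionError:
            # Next attempt replays events emitted after the last one seen.
            opener = lambda: open_replay(last_ts)
    raise ConnectionError("gave up after repeated disconnects")
```

Because last_ts only advances when an event is actually yielded, an event lost in transit is requested again on the next replay.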
Resuming after the user closes the app
For mobile or tab-style apps where the user might close and reopen hours later:
- Persist task_id and last_ts to the user’s device.
- When the user reopens, call /execute/replay?after=<last_ts>.
- If the response is empty (TTL expired), the in-flight turn (if any) is gone. Surface “your last task expired” and let them start fresh.
- If a HITL request comes back through replay, surface the prompt to the user.
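The steps above can be sketched with a small JSON file standing in for device storage. The file layout, the function names, and the event shape (a dict whose "type" may be "hitl_request") are all assumptions for illustration; a mobile app would use its platform's own storage instead.

```python
import json
from pathlib import Path
from typing import List, Mapping, Optional


def save_resume_state(path: Path, task_id: str, last_ts: float) -> None:
    """Persist the two values needed to resume later."""
    path.write_text(json.dumps({"task_id": task_id, "last_ts": last_ts}))


def load_resume_state(path: Path) -> Optional[dict]:
    """Return the saved state, or None if nothing was saved."""
    if not path.exists():
        return None
    return json.loads(path.read_text())


def classify_replay(events: List[Mapping]) -> str:
    """Decide what to show the user after calling replay on reopen."""
    if not events:
        return "expired"      # surface "your last task expired"
    if any(e.get("type") == "hitl_request" for e in events):
        return "needs_input"  # surface the HITL prompt
    return "resumed"          # render the replayed events as usual
```

Saving after every processed event keeps last_ts close to the truth even if the app is killed without warning.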
Active concurrency
Per task_id, only one /execute runs at a time. A second concurrent
call returns 409 Conflict.
If you legitimately need to run two things at once for the same user,
use two task_ids. They share the brain (so they share long-term
memory) but their channel states are independent.
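A client can treat 409 Conflict as a distinct "busy" outcome rather than a generic failure. In this sketch, post_execute is a hypothetical callable that sends the request and returns the HTTP status code; the return labels are also illustrative.

```python
from typing import Callable


def start_turn(
    post_execute: Callable[[str, str], int],
    task_id: str,
    message: str,
) -> str:
    """Start a turn and report whether the task was already running one."""
    status = post_execute(task_id, message)
    if status == 409:
        # Another /execute is in flight for this task_id.
        return "busy"
    if 200 <= status < 300:
        return "started"
    raise RuntimeError(f"unexpected status {status}")
```

On "busy", the caller can either wait for the running turn to finish or, if the work is genuinely independent, start it under a second task_id.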
Cleanup
Tasks accumulate in the brain. Periodic cleanup is your responsibility or a future Engine feature:
- Old channel states expire automatically (TTL).
- Old trajectories are processed by the Learning Centre but not deleted by default. Cap retention with a periodic cleanup job.
- Conversations persist until you delete them.
See also
- Streaming — the client side of all this.
- POST /execute — the request surface.
- Environment variables — TTL configuration.

