

This page collects questions we hear repeatedly. The list is short on purpose — every recurring question is a sign the docs missed something, and we prefer to fix the docs. If your question isn’t here, the answer probably lives in the section linked at the bottom of this page.

Is the Engine multi-tenant?

No. Each Engine instance is single-tenant — one brain, one user. To serve many users, run many Engines and route at the upstream layer. This is a deliberate design choice. See Architecture overview for the reasoning.

Does the Engine host LLM models?

No. The Engine talks to external LLM providers (Anthropic, OpenAI, Google Gemini) over their public APIs. You provide the credentials.

Can I run the Engine without an internet connection?

You can run the Engine itself locally, but model calls need outbound HTTPS to reach the LLM provider. For an air-gapped deployment, you’d need a self-hosted LLM endpoint that implements one of the supported APIs.

Can I use a different LLM provider than the three you list?

If the provider implements OpenAI’s chat completions API, set LLM_PROVIDER=openai and configure the base URL. Otherwise, you’d need to add a provider adapter to src/herald/providers/.
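As a sketch, pointing the Engine at an OpenAI-compatible endpoint might look like this. LLM_PROVIDER comes from the answer above; the base-URL and key variable names are assumptions, so confirm them against the configuration reference.

```shell
# LLM_PROVIDER=openai is documented above; the other two variable
# names and the endpoint value are assumptions for illustration.
export LLM_PROVIDER=openai
export OPENAI_BASE_URL="https://llm.example.internal/v1"   # hypothetical self-hosted endpoint
export OPENAI_API_KEY="sk-local-placeholder"
```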

Why SQLite and not Postgres?

SQLite is single-file, single-process, and embedded — it matches the single-tenant Engine model exactly. No separate DB to provision, back up, or manage. For multi-tenant workloads, you run multiple Engines each with their own SQLite, not one Engine against a multi-tenant Postgres. The Postgres adapter that briefly existed (src/infra/pg.py) was removed in v2.3.0.

How do I update the agent’s prompt without restarting?

Edit the agent definition in the catalog, then call:

```shell
curl -X POST "$ENGINE_URL/admin/reload-catalog" \
  -H "X-Engine-Key: $KEY"
```

The new catalog is live for the next request.

Does the Engine remember things across sessions?

Yes. Memory persists in the brain database. Two pieces:
  • Same task_id — the conversation history is on disk; reuse the ID to continue the thread.
  • Different task_id — long-term memory (episodes, knowledge, soul, social graph) is shared across all tasks for this user. The agent retrieves relevant past memory at the start of each turn.
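To illustrate the first bullet: reusing a task_id continues the same thread. Only the /execute path and the task_id concept come from these docs — the JSON body shape below is an assumption.

```shell
ENGINE_URL="${ENGINE_URL:-http://localhost:8080}"
TASK_ID="report-123"   # illustrative ID

# Two turns with the same task_id: the second continues the first's
# conversation history. A fresh task_id would start a new thread, but
# would still see the shared long-term memory.
for msg in "Draft the weekly report" "Now shorten it to one page"; do
  payload=$(printf '{"task_id": "%s", "message": "%s"}' "$TASK_ID" "$msg")
  echo "$payload"
  # curl -s -X POST "$ENGINE_URL/execute" -H "X-Engine-Key: $KEY" \
  #   -H "Content-Type: application/json" -d "$payload"
done
```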

How do I make the agent forget something?

Four cases:
  • One episode or knowledge fact: delete the row directly via SQL (there’s no API for this today).
  • A whole task: start over with a new task_id.
  • Everything: wipe the brain volume and let the Learning Centre re-fill it as new conversations happen.
  • One user’s data in a multi-Engine deployment: delete that user’s Engine instance.
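For the first case, here is a sketch of the direct-SQL route, run against a scratch database. The table and column names (knowledge, id, content) are invented for illustration — inspect your actual schema first (e.g. with sqlite3’s .tables) and back up the brain volume before touching a live database.

```shell
# Scratch DB standing in for the brain database; this schema is
# invented for illustration — your real table/column names will differ.
DB=$(mktemp)
sqlite3 "$DB" "CREATE TABLE knowledge (id INTEGER PRIMARY KEY, content TEXT);
INSERT INTO knowledge VALUES (42, 'stale fact the agent should forget');"

# The forget operation itself: find the row, then delete it.
sqlite3 "$DB" "SELECT id, content FROM knowledge WHERE content LIKE '%stale%';"
sqlite3 "$DB" "DELETE FROM knowledge WHERE id = 42;"
sqlite3 "$DB" "SELECT count(*) FROM knowledge;"
```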

How long does context persist?

  • Conversation history — indefinitely; lives in conversations.
  • Working memory — TTL-bound (configurable; defaults to a few hours).
  • Long-term memory — indefinitely, with confidence-based aging on knowledge facts.
  • Channel state — TTL-bound: 3 hours while active, 72 hours while waiting on human-in-the-loop (HITL) input.

Can the agent run code that takes minutes?

Yes. Code execution inside the sandbox can run for a long time, and the SSE stream stays open while it does. There’s a per-command timeout (configurable); beyond it, the command is killed. For genuinely long-running work (hours), structure it as multiple turns: the agent kicks off the work, and you check status periodically.

What happens if my client disconnects mid-stream?

The Engine keeps running. Reconnect with GET /execute/replay?after=<timestamp> to catch up on missed events. Channel state survives for CHANNEL_STATE_TTL_ACTIVE (default 3 h). See Durability.
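A reconnect sketch, assuming the X-Engine-Key header from the reload-catalog example above also applies here; the timestamp value is illustrative.

```shell
ENGINE_URL="${ENGINE_URL:-http://localhost:8080}"
LAST_TS="2024-05-01T12:00:00Z"   # timestamp of the last event you processed (illustrative)

# Replay every event after that point to catch up.
REPLAY_URL="$ENGINE_URL/execute/replay?after=$LAST_TS"
echo "$REPLAY_URL"
# curl -N "$REPLAY_URL" -H "X-Engine-Key: $KEY"   # -N disables buffering, needed for SSE
```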

How do I make the agent NOT do something?

Three layers, from highest to lowest authority:
  1. Permission policy. The Engine intercepts dangerous operations and prompts the user. Configure with ALLOWED_ROOTS, sandbox feature flags, or requires_permission on tool definitions.
  2. System prompt. Add an explicit refusal: “You will not help with X.”
  3. Tool removal. If the agent doesn’t have a tool, it can’t use it. Curate the catalog.
The model’s trained-in refusals are real but not reliable. Don’t depend on them alone for anything important.
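A sketch of layer 1: ALLOWED_ROOTS and requires_permission are named in the list above, but the value formats shown here are assumptions — check the configuration reference for the real ones.

```shell
# Layer 1 (permission policy): confine filesystem access to one root.
# ALLOWED_ROOTS is named above; this path and value format are illustrative.
export ALLOWED_ROOTS="/srv/engine/workspace"

# A tool definition in the catalog can also be gated behind a prompt,
# e.g. (hypothetical JSON shape):
#   { "name": "delete_file", "requires_permission": true }
```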

How much does it cost to run the Engine?

The Engine itself is free. The cost is the LLM provider’s per-token charge plus your infrastructure (compute, storage, MCP server APIs). For order-of-magnitude estimates by task type, see Cost and latency.

Can the agent send emails?

Yes — connect a Gmail MCP server (or any other email provider with an MCP integration). After OAuth, the email actions register in the catalog and the agent can use them.

Does the Engine support voice?

Not natively today. To build voice, transcribe upstream (Whisper or similar) and pass the text to /execute. Synthesize the response upstream too. For Gemini deployments, the model can directly handle audio in the media field — see Multimodal.

Where do I report a bug?

For Engine code, open an issue on the engine repo. For docs, open one on engine-docs. For security reports, see SECURITY.md.

Where do I get help?

For now, GitHub issues and email (hello@september.wtf). A community forum is in the works.

See also

If your question isn’t here, the answer is probably one of: