
The Engine deploys as a small set of cooperating containers. This page is the C4 Level 2 view — what runs, what they share, and what crosses the process boundary between them.

The picture

(Container diagram: the client talks to the engine process, which owns brain.sqlite and makes outbound calls to the LLM provider, MCP servers, and the sandbox.)

The containers

The Engine ships as one process and one database file. The “containers” in the C4 sense are the logical units that could be split apart, even though we run them as a single process today.

engine — the API server

A single Uvicorn process running src.server:app (FastAPI), listening on port 8000 by default. Stateless from the request’s point of view: all durable state lives in the brain database. Every request is authenticated with X-Engine-Key (see the client sketch after this list). Inside this process, the major subsystems are:
  • HTTP layer — FastAPI routes, CORS middleware, Unicode sanitization.
  • Agent loop / coordinator — drives execution, manages task slots, handles compaction.
  • Herald — unified LLM abstraction across Anthropic, OpenAI, and Gemini. Adds streaming, prompt caching, quota tracking, token budgets.
  • Memory subsystem — episodes, knowledge, social graph, working memory, soul.
  • Asset Directory — MCP connector registry, credential vault (Fernet), OAuth flow, refresh scheduling.
  • Sandbox — process isolation for tool execution.
  • Learning Centre — background batch scheduler that consolidates trajectories into knowledge.
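
A minimal client sketch under the assumptions this page states (default port 8000, the X-Engine-Key header). The /health path and the ENGINE_URL / ENGINE_KEY variables are illustrative assumptions, not a documented endpoint:

```python
import os

import httpx

# Hypothetical health check against a local Engine. Only the default
# port and the X-Engine-Key header come from this page; the /health
# path is assumed for illustration.
engine_url = os.environ.get("ENGINE_URL", "http://localhost:8000")

resp = httpx.get(
    f"{engine_url}/health",
    headers={"X-Engine-Key": os.environ["ENGINE_KEY"]},
)
resp.raise_for_status()
print(resp.text)
```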

brain.sqlite — the database

A single SQLite file (default path: /data/brain.sqlite) loaded with two extensions:
  • sqlite-vec for vector search (1536-dim embeddings).
  • FTS5 for full-text keyword search.
The file holds 20+ tables managed by the migration runner in src/infra/migrations.py. Migrations are SQL files in migrations/ and run on startup. A single Engine process holds an async connection pool against this file (see src/infra/sqlite_pool.py). The brain is single-user: multi-user deployments run multiple Engine processes, each with its own brain file.
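
For ad-hoc inspection, the same file can be opened with the same extensions. A sketch, assuming the sqlite-vec Python package; the table name in the commented query is illustrative, not the Engine's actual schema:

```python
import sqlite3

import sqlite_vec  # pip install sqlite-vec

# Open the brain file with the vector extension the Engine loads.
db = sqlite3.connect("/data/brain.sqlite")
db.enable_load_extension(True)
sqlite_vec.load(db)  # vec0 virtual tables, 1536-dim embeddings
db.enable_load_extension(False)

# FTS5 is compiled into most modern SQLite builds, so keyword search
# needs no extra loading. A hypothetical query (table name assumed):
# db.execute(
#     "SELECT rowid FROM knowledge_fts WHERE knowledge_fts MATCH ?",
#     ("compaction",),
# )
```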

External LLM provider

The Engine never embeds model weights. All inference is a network call to one of:
  • Anthropic (api.anthropic.com), used when LLM_PROVIDER=anthropic.
  • OpenAI (api.openai.com), used when LLM_PROVIDER=openai. Also used for embeddings regardless of the chat provider.
  • Google Gemini (generativelanguage.googleapis.com), used when LLM_PROVIDER=gemini.
Provider selection is a deploy-time decision, not a per-request one. Embedding calls always go to OpenAI’s text-embedding model in the current implementation.
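
A sketch of what deploy-time selection implies. The host mapping comes from the list above; the functions, the fallback default, and the names are illustrative assumptions, not the Engine's config code:

```python
import os

# Hosts from the list above. LLM_PROVIDER is read once at startup;
# there is no per-request provider switching.
PROVIDER_HOSTS = {
    "anthropic": "api.anthropic.com",
    "openai": "api.openai.com",
    "gemini": "generativelanguage.googleapis.com",
}

def chat_host() -> str:
    # Default assumed for illustration; the real default is not stated here.
    return PROVIDER_HOSTS[os.environ.get("LLM_PROVIDER", "anthropic")]

def embedding_host() -> str:
    # Embeddings always go to OpenAI in the current implementation.
    return "api.openai.com"
```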

External MCP servers

Optional. When the user connects a service (Slack, Gmail, etc.), the Asset Directory stores the OAuth credentials and proxies tool calls to the remote MCP server during execution. Connections are per-user, and credentials are encrypted at rest with Fernet using AD_ENCRYPTION_KEY.
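
A minimal sketch of the at-rest encryption pattern. Only the use of Fernet keyed by AD_ENCRYPTION_KEY comes from this page; the payload shape is an assumption:

```python
import os

from cryptography.fernet import Fernet

# AD_ENCRYPTION_KEY must be a urlsafe base64-encoded 32-byte key,
# e.g. generated once with Fernet.generate_key().
fernet = Fernet(os.environ["AD_ENCRYPTION_KEY"])

ciphertext = fernet.encrypt(b'{"access_token": "..."}')  # stored at rest
credentials = fernet.decrypt(ciphertext)                 # recovered at call time
```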

How they talk

| From | To | Protocol | Notes |
| --- | --- | --- | --- |
| Client | engine | HTTPS + SSE | All requests carry X-Engine-Key |
| engine | brain.sqlite | aiosqlite + sqlite-vec | Async pool |
| engine | LLM provider | HTTPS streaming | Streaming SSE / chunked |
| engine | MCP server | varies (HTTP / WebSocket / stdio) | OAuth or static credentials |
| engine | sandbox | unix pipes / signals | Subprocess with bwrap |

Internal to the engine process, subsystems talk through:
  • Direct Python function calls.
  • An in-process event bus (engine_core/events.py) for the SSE stream; see the sketch after this list.
  • The brain database for any state that has to survive a restart.
There is no internal queue, broker, or RPC system. Every internal call is in-process. This is a deliberate constraint: it keeps the Engine small, simple to reason about, and easy to deploy as a single container.
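
The event bus is worth a sketch because it is the only indirection inside the process. The code below illustrates the pattern, not the actual API of engine_core/events.py: asyncio queues fanned out to SSE subscribers, with no external broker.

```python
import asyncio

class EventBus:
    """In-process pub/sub: each SSE connection gets its own queue."""

    def __init__(self) -> None:
        self._subscribers: list[asyncio.Queue] = []

    def subscribe(self) -> asyncio.Queue:
        q: asyncio.Queue = asyncio.Queue()
        self._subscribers.append(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self._subscribers.remove(q)

    async def publish(self, event: dict) -> None:
        # Fan out to every open SSE stream; order per subscriber is FIFO.
        for q in self._subscribers:
            await q.put(event)
```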

Deployment topologies

Single-node (default)

One Engine process, one brain file, one local sandbox. This is the model for local development and for solo users running their own Engine. The docker-compose.yml in the engine repo deploys this configuration.

Hosted, per-user

A control plane upstream creates one Engine process per active user. The upstream router (BAP, the Backend Application Platform) holds the user-to-engine mapping and routes API calls. Each Engine sees only its assigned brain file. This is how the public Engine product runs.
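
The routing contract is small enough to sketch. Everything below is an illustrative assumption, not BAP's actual API; it only encodes the one-Engine-per-user invariant described above:

```python
# One Engine base URL per active user; each Engine owns one brain file.
routes: dict[str, str] = {
    "user-123": "http://engine-user-123:8000",
}

def engine_for(user_id: str) -> str:
    # In the real system the control plane provisions Engines on demand;
    # this sketch just looks up an existing mapping.
    return routes[user_id]
```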

Hybrid (in production)

A shared coordinator — bap-engine — provisions and manages many Engine workers. The coordinator handles lifecycle, health, quotas, and routing. Engines remain stateless from the coordinator’s perspective. See BAP Engine for the full docs; the source lives at bap-engine.

What lives where

Here’s the mapping from “subsystem” to “code path” so the rest of the docs can refer to a canonical location.
| Subsystem | Path |
| --- | --- |
| HTTP server | src/server.py |
| Bootstrap & state | src/bootstrap.py |
| Configuration | src/config.py |
| Authentication | src/auth.py, src/key_rotation.py |
| Agent loop | src/engine_core/coordinator.py, src/engine_core/agent_loop.py |
| Compaction | src/engine_core/compaction_orchestrator.py |
| Channel state | src/engine_core/channel_state.py |
| Events | src/engine_core/events.py |
| Memory | src/memory/ |
| Asset Directory | src/asset_directory/ |
| Utility Directory | src/utility_directory/ |
| Herald (LLM) | src/herald/ |
| Sandbox | src/sandbox/ |
| Learning Centre | src/learning_centre/ |
| Infrastructure | src/infra/ |
| Observability | src/raven/ |
| Security | src/security/ |

Component-level detail for each of these lives in Components.