This page captures the Engine’s threat model: who we’re defending against, what they can do, what stops them, and what we accept as residual risk. It’s a living document. When the system changes, the threat model changes; update this page in the same PR.
Trust boundaries
Three boundaries: the Engine process, the sandbox, and the host.
Adversary model
We consider three adversaries.
1. Malicious end user
A user calls /execute with crafted prompts trying to make the agent do something it shouldn’t (exfiltrate other users’ data, persist a back door, escape the sandbox).
What stops them:
- Each Engine instance is single-tenant. There’s no other user’s data on this brain to exfiltrate.
- The sandbox prevents most filesystem-escape attempts.
- Permission prompts surface dangerous operations to the user, who is also the principal, so the attack model degenerates to “user attacks themselves,” which is mostly self-harm (see the sketch after this list).
- The user can instruct the agent to call MCP connectors they own. We don’t try to prevent self-harm at this layer.
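To make the permission-prompt layer concrete, here is a minimal sketch of how a destructive command could be classified and routed to the user for approval. The patterns, the Decision type, and the classify helper are illustrative assumptions; the real rules live in src/sandbox/permissions.py.

```python
# Illustrative sketch only; the real rules live in src/sandbox/permissions.py.
import re
from dataclasses import dataclass

# Hypothetical patterns for operations we never auto-approve.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-rf?\b",   # recursive / forced deletes
    r"\bshred\b",       # secure wipe
    r"\bmkfs\.",        # reformatting a filesystem
]

@dataclass
class Decision:
    allowed: bool       # safe to run without asking
    needs_prompt: bool  # surface to the user as HITL
    reason: str

def classify(command: str) -> Decision:
    """Classify a shell command the agent wants to run."""
    for pattern in DESTRUCTIVE_PATTERNS:
        if re.search(pattern, command):
            # Destructive: halt and ask the user (the principal) to approve.
            return Decision(False, True, f"matched destructive pattern {pattern!r}")
    # Everything else is still confined by the sandbox and ALLOWED_ROOTS.
    return Decision(True, False, "no destructive pattern matched")
```

Because the caller and the principal are the same person on a single-tenant brain, a prompt like this turns most “attacks” into self-harm the user has to confirm.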
2. Malicious model output
The model (possibly steered by a prompt injection in retrieved content or in tool output) emits tool calls trying to do something dangerous.
What stops them:
- The sandbox confines tool execution to the brain’s data directory and pre-approved paths in ALLOWED_ROOTS. seccomp blocks dangerous syscalls. landlock enforces filesystem ACLs.
- Permission prompts halt destructive operations (rm, shred, writes outside allowed roots) and surface them as HITL.
- Secret scanning (src/security/secret_scanner.py) scrubs likely secrets out of model output before it streams to the client (see the sketch after this list).
- A clever model could chain allowed operations to produce a result we consider harmful. We rely on permission prompts at the human boundary for this class.
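As a rough illustration of the secret-scanning step, the sketch below scrubs likely secrets from a chunk of model output before it is streamed. The patterns and the scrub_chunk name are assumptions about shape, not the contents of src/security/secret_scanner.py.

```python
# Sketch only; real logic lives in src/security/secret_scanner.py.
import re

LIKELY_SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access-key-id shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private-key header
    re.compile(r"\b(?:api[_-]?key|token)\s*[:=]\s*\S+", re.IGNORECASE),
]

def scrub_chunk(chunk: str) -> str:
    """Replace likely secrets in a streamed chunk with a redaction marker."""
    for pattern in LIKELY_SECRET_PATTERNS:
        chunk = pattern.sub("[REDACTED]", chunk)
    return chunk
```

One practical wrinkle: because output is streamed, a secret can straddle chunk boundaries, so a production scanner has to buffer a short tail of each chunk before releasing it.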
3. Malicious MCP server
A connected MCP server returns crafted output trying to inject into the prompt or steal credentials.
What stops them:
- Tool output is plain text from the agent’s perspective; it carries no authority.
- Credentials are encrypted at rest with AD_ENCRYPTION_KEY and only decrypted to make outbound calls. They never appear in the prompt or the SSE stream.
- The model is instructed (in the system prompt) to treat tool output as data, not instructions. This is best-effort, not a guarantee.
- A circuit breaker disconnects MCP servers that error repeatedly (sketched after this list).
- Prompt injection from tool output is an open problem industry-wide. We mitigate, we don’t eliminate.
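The circuit breaker above can be as simple as a per-server failure counter with a cooldown; the thresholds and names in this sketch are assumptions, not the Engine’s actual implementation.

```python
# Sketch of a per-MCP-server circuit breaker; thresholds and names are assumptions.
import time

class McpCircuitBreaker:
    def __init__(self, max_failures: int = 5, cooldown_s: float = 300.0):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when the breaker opened

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            # Open the breaker: stop calling this server for a while.
            self.opened_at = time.monotonic()

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def allow_call(self) -> bool:
        if self.opened_at is None:
            return True
        # After the cooldown, allow a probe call (half-open state).
        return time.monotonic() - self.opened_at >= self.cooldown_s
```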
Defenses by layer
Network
- HTTPS terminated upstream (BAP / load balancer). The Engine itself serves HTTP and assumes TLS is already done.
- Outbound: the Engine talks to LLM providers and (per-user) MCP servers. Outbound destinations are not currently allowlisted at the network layer in the default deployment; rely on the upstream firewall if your environment requires that.
Process
- The Engine process runs as a non-root user inside its container.
- The Engine never execs untrusted binaries. Tool execution happens inside the sandbox, in a separate process.
Sandbox
- bubblewrap provides namespaced isolation: separate filesystem, network, IPC, and PID namespaces (see the sketch after this list).
- A seccomp filter blocks dangerous syscalls (mount, pivot_root, most module operations).
- landlock enforces filesystem ACLs at the kernel layer; even if a sandboxed process bypasses bwrap, landlock holds.
- Permission prompts (src/sandbox/permissions.py) gate operations the static rules can’t classify as safe: destructive shell commands, writes to high-value paths.
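A minimal sketch of what launching a tool process under bubblewrap can look like, assuming a helper that binds the brain’s data directory read-write and ALLOWED_ROOTS read-only; the flag set, mounts, and function name are illustrative, not the Engine’s sandbox code.

```python
# Sketch only; flag set, mounts, and wrapper name are illustrative assumptions.
import subprocess

def run_in_sandbox(command: list[str], brain_dir: str, allowed_roots: list[str]):
    bwrap = [
        "bwrap",
        "--unshare-all",        # fresh filesystem, network, IPC, and PID namespaces
        "--die-with-parent",    # tear down the sandbox if the Engine process dies
        "--ro-bind", "/usr", "/usr",
        "--proc", "/proc",
        "--dev", "/dev",
        "--bind", brain_dir, brain_dir,  # the brain's data directory stays writable
    ]
    for root in allowed_roots:           # pre-approved ALLOWED_ROOTS, read-only
        bwrap += ["--ro-bind", root, root]
    return subprocess.run(bwrap + command, capture_output=True, text=True)
```

The seccomp and landlock layers are applied on top of this (bwrap accepts a compiled seccomp filter via --seccomp), which is why filesystem ACLs still hold even if the namespace setup is bypassed.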
Storage
- Credentials encrypted at rest with Fernet using AD_ENCRYPTION_KEY (see the sketch after this list).
- The brain database is single-user; a process gets exactly one brain.
- Migrations are versioned and run on startup; mismatched versions fail fast.
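For the credential encryption above, Fernet usage looks roughly like this; where AD_ENCRYPTION_KEY is read and which fields get encrypted are assumptions here, while the key name and the choice of Fernet come from this page.

```python
# Sketch of Fernet at-rest encryption keyed by AD_ENCRYPTION_KEY.
# Helper names are illustrative assumptions.
import os
from cryptography.fernet import Fernet

def _fernet() -> Fernet:
    # AD_ENCRYPTION_KEY must be a url-safe base64 32-byte key (Fernet.generate_key()).
    return Fernet(os.environ["AD_ENCRYPTION_KEY"].encode())

def encrypt_credential(plaintext: str) -> bytes:
    """Called when a credential is stored; only ciphertext ever hits disk."""
    return _fernet().encrypt(plaintext.encode())

def decrypt_credential(ciphertext: bytes) -> str:
    """Called just before an outbound MCP call; never logged or streamed."""
    return _fernet().decrypt(ciphertext).decode()
```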
Auth
- API key validated on every request (X-Engine-Key).
- Hash-only deployment supported via ENGINE_KEY_HASH so plaintext never appears in the Engine’s environment.
- Key rotation with overlap window prevents in-flight requests from failing across a rotation (see the sketch after this list).
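A sketch of the key check under hash-only deployment, including the rotation overlap; the header name and ENGINE_KEY_HASH come from this page, while the hash algorithm (SHA-256) and the ENGINE_KEY_HASH_PREVIOUS variable are assumptions.

```python
# Sketch: constant-time check of X-Engine-Key against ENGINE_KEY_HASH.
# SHA-256 and the *_PREVIOUS variable are assumptions, not the Engine's contract.
import hashlib
import hmac
import os

def is_valid_engine_key(presented_key: str) -> bool:
    presented_hash = hashlib.sha256(presented_key.encode()).hexdigest()
    valid_hashes = [os.environ["ENGINE_KEY_HASH"]]
    # During a rotation, the outgoing key's hash stays accepted for the overlap window.
    previous = os.environ.get("ENGINE_KEY_HASH_PREVIOUS")
    if previous:
        valid_hashes.append(previous)
    return any(hmac.compare_digest(presented_hash, h) for h in valid_hashes)
```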
What we don’t defend against
- The host kernel. If the host is compromised, the Engine is.
- Side-channel attacks against the model provider (e.g. timing leaks from streaming).
- DoS by paying customers. The Engine doesn’t impose its own rate limit beyond what the upstream LLM provider does.
- Determined social engineering of the human in the HITL loop.
Process
Threat-model changes happen through PRs to this file. Anything that materially expands the attack surface (a new public endpoint, a new external integration, a new credential type) needs a security review before merge.
See also
- Authentication — how the API key enforces the auth boundary.
- Sandbox — what’s inside the sandbox boundary in code.
- Data handling — what we keep, where we keep it, and for how long.

