Where data lives
| Data | Where | Encryption at rest |
|---|---|---|
| Conversation history | Brain SQLite, conversations + memory tables | Volume-level (file system) |
| Episodes, knowledge facts, social graph | Brain SQLite, episodes_*, knowledge_store_*, social_graph_* | Volume-level |
| Working memory | Brain SQLite, working_memory_log, TTL-bound | Volume-level |
| Soul | Brain SQLite | Volume-level |
| Trajectories | Brain SQLite, trajectories | Volume-level |
| MCP credentials | Brain SQLite, connections.credentials | Application-level (Fernet via AD_ENCRYPTION_KEY) |
| Channel state snapshots | Brain SQLite, channel_state_snapshots, TTL-bound | Volume-level |
| Observability events | Brain SQLite, observability_events | Volume-level |
| Engine logs | stdout → log shipping pipeline | Pipeline-dependent |
| Config (incl. secrets) | Container env, secret manager | Secret-manager-dependent |
What we store
For each user:
Identity
- A soul object summarizing the user’s core identity, values, and patterns.
- Social graph nodes for people the user has mentioned.
Activity
- Every /execute call’s message and the agent’s response, in conversation history.
- Tool calls and results.
- Permission decisions.
Inferences
- Episodes summarizing past events.
- Knowledge facts inferred from observations.
- Social graph edges inferred from conversations.
External access
- MCP server connections and their (encrypted) credentials.
- Active scopes per connection.
What we don’t store
- The raw data passing through MCP servers (e.g. the body of every Slack message the agent reads). We store only what gets surfaced into context during a turn.
- LLM provider responses beyond what’s in conversation history. We don’t keep a separate “model call log.”
- User credentials for the Engine itself (we keep only the hash, if using ENGINE_KEY_HASH).
- Plaintext API keys for LLM providers (those live in env, not in data).
Retention
| Data | Default retention | Configurable |
|---|---|---|
| Conversation history | Indefinite | Yes (manual cleanup) |
| Long-term memory | Indefinite (with confidence aging on knowledge facts) | Yes |
| Working memory | TTL — typically a few hours | Yes (CHANNEL_STATE_TTL_*) |
| Trajectories | Indefinite by default | Recommended cleanup at 30 days |
| Channel state | TTL (3 h active, 72 h HITL) | Yes (CHANNEL_STATE_TTL_*) |
| Observability events | Indefinite by default | Recommended cleanup at 90 days |
| MCP credentials | Until disconnected or expired | Manual revocation via DELETE /assets/connections/{ref_id} |
| Engine logs | Pipeline-dependent | Yes |
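The recommended cleanups in the table can be scripted against the brain database. The following is a minimal sketch, not the Engine's own tooling. It assumes that the trajectories and observability_events tables each have a created_at column holding ISO-8601 UTC timestamps written in one consistent format; that column name is an assumption, so adjust it to the real schema.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Retention windows (in days) taken from the recommendations in the table above.
RETENTION_DAYS = {
    "trajectories": 30,
    "observability_events": 90,
}


def purge_expired(conn, now=None, ts_column="created_at"):
    """Delete rows older than each table's retention window.

    ts_column is an assumed column name holding ISO-8601 UTC timestamps.
    Returns a dict mapping table name to the number of rows deleted.
    """
    now = now or datetime.now(timezone.utc)
    deleted = {}
    for table, days in RETENTION_DAYS.items():
        cutoff = (now - timedelta(days=days)).isoformat()
        # Table and column names come from the fixed mapping above,
        # never from user input, so interpolating them is safe here.
        cur = conn.execute(
            f"DELETE FROM {table} WHERE {ts_column} < ?", (cutoff,)
        )
        deleted[table] = cur.rowcount
    conn.commit()
    return deleted
```

Run it from a scheduled job with a connection to the brain file, for example `purge_expired(sqlite3.connect(path_to_brain))`, and take a backup first if the rows may still be needed.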
Per-user isolation
The Engine is single-tenant. Each instance has one brain. There is no multi-user data in a single brain. For multi-user products, isolation is achieved by running separate Engine processes (and separate brains) per user. The upstream router (BAP) holds the user-to-engine mapping and ensures requests route to the correct Engine.
If a routing layer makes a mistake and routes user A’s request to user B’s Engine, user A would see user B’s memory. This is a routing-layer bug; the Engine has no defense against being addressed incorrectly. It trusts that the request authenticated against this Engine’s API key belongs here.
PII
Memory unavoidably contains PII:
- The user’s name, in the soul and in social-graph nodes.
- Names of people the user mentions, in social-graph nodes.
- Email addresses, phone numbers, addresses if the user shares them.
- Quotes from emails, messages, or documents the user processed through the agent.
Treat the brain as PII-bearing storage:
- Encrypt the volume at rest.
- Restrict access to the brain volume to the Engine’s service account only.
- Don’t ship logs containing user content to third-party log pipelines.
- Consider redaction at the log level for sensitive fields.
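Log-level redaction can be done with a standard Python logging filter. The sketch below is illustrative only: the two regular expressions are deliberately simple assumptions, will miss some formats, and will over-match some numeric strings (dates, for instance). The names redact and RedactingFilter are hypothetical, not part of the Engine.

```python
import logging
import re

# Deliberately simple patterns; this is not a complete PII detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text):
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


class RedactingFilter(logging.Filter):
    """Logging filter that redacts the fully formatted message of each record."""

    def filter(self, record):
        # Format first so that values passed as %-style args are redacted too.
        record.msg = redact(record.getMessage())
        record.args = ()  # the message is already formatted
        return True
```

Attach it to the handler that feeds the log pipeline, for example `handler.addFilter(RedactingFilter())`, so that every record is redacted before it leaves the process.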
Deletion
One user’s data
Wipe their Engine’s brain volume.
One conversation thread
There’s no API for this today; use direct SQL against the brain database.
One specific memory item
Use direct SQL on episodes, knowledge_store, or social_graph_*. The embeddings tables (*_vec) cascade-delete via foreign-key triggers configured in migrations.
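The direct-SQL deletions described above could be wrapped in a small helper like the one below. This is a sketch under stated assumptions: the table names come from this page, but the key columns (thread_id, id) are hypothetical placeholders, so check them against the real schema before running anything. The helper does not touch the *_vec tables and relies on the migration triggers for those.

```python
import sqlite3

# Tables named on this page; the key columns are hypothetical placeholders.
ALLOWED = {
    "conversations": "thread_id",  # assumed column identifying a thread
    "episodes": "id",              # assumed primary key
    "knowledge_store": "id",       # assumed primary key
}


def delete_by_key(conn, table, key):
    """Delete matching rows from one allow-listed table in a transaction.

    Returns the number of rows removed.
    """
    if table not in ALLOWED:
        raise ValueError(f"refusing to delete from unknown table: {table}")
    column = ALLOWED[table]
    with conn:  # commits on success, rolls back if the DELETE raises
        cur = conn.execute(
            f"DELETE FROM {table} WHERE {column} = ?", (key,)
        )
    return cur.rowcount
```

The allow-list keeps the interpolated table and column names fixed, while the key itself is always passed as a bound parameter.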
Right-to-erasure (GDPR / DPDP)
A complete deletion covers:
- The user’s brain.
- Backups containing their data.
- Log lines containing their data.
To carry it out:
- Wipe the brain volume.
- Mark backup snapshots covering the user’s data for deletion at the next lifecycle cycle.
- Submit a redaction request to the log pipeline (most providers support this; it can take hours to days to complete).
- Confirm deletion with the user via your application’s user-facing compliance flow.
Incident response
If the brain is compromised:
- Take the affected Engine offline. Stop the container; preserve the volume for forensics.
- Revoke MCP credentials. Delete every connection in the affected brain so the credentials in the volume become inactive.
- Rotate all secrets that touched the affected environment.
- Investigate. What was compromised? What was extracted?
- Communicate. Notify the user and (if required by jurisdiction) regulators.
- Restore from a clean snapshot if appropriate, or stand up a fresh Engine with a clean brain.
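The credential-revocation step above can be scripted against the DELETE /assets/connections/{ref_id} endpoint listed in the retention table. The sketch below only builds the requests and sends nothing. The base URL, the X-API-Key header name, and the way the caller obtains the list of ref_id values are all assumptions for illustration.

```python
from urllib.parse import quote
from urllib.request import Request


def build_revocation_requests(base_url, ref_ids, api_key):
    """Build one DELETE request per connection; nothing is sent here.

    base_url and the X-API-Key header name are assumptions, not
    confirmed details of the Engine API.
    """
    base = base_url.rstrip("/")
    requests = []
    for ref_id in ref_ids:
        # Percent-encode the id so it cannot alter the URL path.
        url = f"{base}/assets/connections/{quote(str(ref_id), safe='')}"
        requests.append(
            Request(url, method="DELETE", headers={"X-API-Key": api_key})
        )
    return requests
```

Each request can then be sent with `urllib.request.urlopen(req)`. Record the outcome for every connection so the incident report can show which credentials were revoked and when.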
Compliance posture
The Engine itself is a building block. Compliance is achieved at the deployment level:
- GDPR. Right-to-erasure flows above. Data minimization via retention policy. Encryption at rest.
- India DPDP. Same as GDPR plus data localization for Indian users (host the Engine in an Indian region).
- SOC 2. Volume encryption + secret management + audit logging (observability_events).
- HIPAA. Not a default fit; the Engine is not designed for PHI. If required, additional controls would need to be added (BAA with the LLM provider, additional encryption, audit retention).
See also
- Threat model — what we defend against.
- Access control — who can do what.
- Database — backup, restore, cleanup.
- Secrets — handling sensitive config.

