Memory model
Memory with provenance, scope, and a lifecycle.
Every record in UrBrain knows where it came from, which workspace can see it, and what state it is in. Retrieval composes those facts into a single answer that an agent can trust.
Scopes
UrBrain is multi-tenant on purpose. Memory belongs to an account, lives inside an organization, and is partitioned across workspaces that map to a product, a team, a customer engagement, or any other boundary the operator chooses to draw.
Workspace scope is part of the data model, not a filter applied at the edge. Object paths in storage are references, never authorization boundaries. When an agent calls search_memory, the service resolves the caller's account, workspace, and any agent or actor scoping before the query reaches the index. A query that names a document the workspace cannot see returns nothing; no code path bypasses that check.
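As a rough illustration of scope being resolved before matching, here is a minimal in-memory sketch. The Scope and Memory classes, the visible helper, and the substring matcher are hypothetical stand-ins, not UrBrain's schema or API; only the name search_memory comes from the text above. The rule that an agent-scoped caller sees only that agent's records is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    account_id: str
    workspace_id: str
    agent_id: Optional[str] = None  # optional narrower scoping


@dataclass(frozen=True)
class Memory:
    memory_id: str
    account_id: str
    workspace_id: str
    agent_id: Optional[str]
    text: str


def visible(memory: Memory, scope: Scope) -> bool:
    """True only if the record sits inside the caller's resolved scope."""
    if memory.account_id != scope.account_id:
        return False
    if memory.workspace_id != scope.workspace_id:
        return False
    # Assumption: an agent-scoped caller sees only that agent's records.
    if scope.agent_id is not None and memory.agent_id != scope.agent_id:
        return False
    return True


def search_memory(store: list[Memory], scope: Scope, query: str) -> list[Memory]:
    # Scope is applied first, so the matcher never sees out-of-scope rows.
    candidates = [m for m in store if visible(m, scope)]
    needle = query.lower()  # toy substring match standing in for the index
    return [m for m in candidates if needle in m.text.lower()]
```

In this sketch a query from another workspace returns an empty list even when the text would match, because the out-of-scope record never becomes a candidate.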
Source identity is first class
Memories carry three columns that travel with them through every ingest, embed, retrieve, and tombstone path:
- source_system: the system that produced the memory (for example, structuredmerge, a connector name, or an agent runtime).
- source_ref: the stable external ID for the chunk, document, transcript event, or artifact.
- source_hash: the content hash that lets the importer skip unchanged work and detect supersession without reprocessing an entire corpus.
That triplet makes ingestion idempotent, makes deletes propagate, and makes audit, replay, and export tractable instead of forensic.
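One way the triplet could drive an idempotent import is sketched below. This is an assumption-laden toy, not UrBrain's importer: the Record class, the ingest function, the in-memory dict, and the choice of SHA-256 for source_hash are all hypothetical. Only the three column names come from the text.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Record:
    source_system: str
    source_ref: str
    source_hash: str
    text: str
    state: str = "active"


def content_hash(text: str) -> str:
    # SHA-256 is an assumption; the text only says "content hash".
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def ingest(store: dict, source_system: str, source_ref: str, text: str) -> str:
    """Upsert one chunk; returns 'created', 'skipped', or 'superseded'."""
    key = (source_system, source_ref)
    new_hash = content_hash(text)
    versions = store.setdefault(key, [])  # version history per source identity
    current = versions[-1] if versions else None
    if current is not None and current.state == "active":
        if current.source_hash == new_hash:
            return "skipped"  # unchanged content: no re-embedding needed
        current.state = "superseded"  # keep the old version for audit
        outcome = "superseded"
    else:
        outcome = "created"
    versions.append(Record(source_system, source_ref, new_hash, text))
    return outcome
```

Running the same import twice leaves the store unchanged, and a changed chunk keeps its earlier version in the history instead of overwriting it.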
Lifecycle states
A memory progresses through explicit states rather than disappearing silently into a vector store. The states are part of the public contract for retrieval and audit:
- Active — currently retrievable. The default state for newly imported chunks and authored notes.
- Superseded — replaced by a newer version of the same source. Kept for audit and replay; excluded from default retrieval.
- Archived — intentionally retired by a workspace operator. Recoverable, but not surfaced.
- Tombstoned — explicitly removed because the underlying source chunk no longer exists, propagated from a StructuredMerge delete artifact or an authored deletion.
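The four states can be modeled as a small state machine. The sketch below is illustrative only: the State enum mirrors the names above, but the ALLOWED transition table is an assumption inferred from the descriptions (archived is recoverable, tombstoned is final), and transition and default_retrievable are hypothetical helpers.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    SUPERSEDED = "superseded"
    ARCHIVED = "archived"
    TOMBSTONED = "tombstoned"


# Assumed transitions; the source does not spell these out.
ALLOWED = {
    State.ACTIVE: {State.SUPERSEDED, State.ARCHIVED, State.TOMBSTONED},
    State.SUPERSEDED: {State.TOMBSTONED},
    State.ARCHIVED: {State.ACTIVE, State.TOMBSTONED},  # "recoverable"
    State.TOMBSTONED: set(),  # treated as terminal in this sketch
}


def transition(current: State, target: State) -> State:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move {current.value} -> {target.value}")
    return target


def default_retrievable(states_by_id: dict) -> list:
    # Default retrieval surfaces only active memories.
    return [mid for mid, state in states_by_id.items() if state is State.ACTIVE]
```

Making the transitions explicit means a record can be excluded from default retrieval while remaining available for audit and replay.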
Retrieval
Retrieval combines vector similarity, lexical matching, and structural filters. Embeddings live in pgvector with cosine distance and HNSW indexing. Lexical recall comes from PostgreSQL trigram lookup over the same memories. The two ranked lists are fused with reciprocal rank fusion, then filtered by workspace, memory type, lifecycle state, and any caller-supplied scopes before results are returned.
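For readers unfamiliar with reciprocal rank fusion, a minimal sketch follows. It covers only the fusion step, not the pgvector or trigram queries or the filters applied afterwards. The function name is hypothetical, and the constant k=60 is a commonly used default, not a value confirmed for UrBrain.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):  # ranks are 1-based
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))


vector_hits = ["m1", "m2", "m3"]   # e.g. ordered by cosine distance
lexical_hits = ["m3", "m1", "m4"]  # e.g. ordered by trigram similarity
fused = reciprocal_rank_fusion([vector_hits, lexical_hits])
# fused == ["m1", "m3", "m2", "m4"]
```

Because the method uses ranks and ignores raw scores, cosine distances and trigram similarities never need to be normalized onto a common scale.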
Because the index lives next to the policy in the same database, retrieval queries can join across them without crossing a service boundary. That keeps the latency budget honest and keeps permission checks from being bolted on after the fact.
Memory types
Memory in UrBrain is heterogeneous. The schema separates explicit authored notes from chunks promoted out of source documents and from events captured by runtime adapters, so retrieval can ask for "what a teammate wrote about onboarding" without competing with every automated ingest in the workspace.
- Explicit memory written by a user or agent through MCP remember.
- Artifact chunks promoted from source documents through the StructuredMerge ingest path.
- Observed events captured by runtime adapters as durable provenance for what an agent did and saw.
- Authored overlays for human-written context that should rank above derived material.
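The four types above could be used at query time roughly as sketched here. The MemoryType enum values and the select helper are hypothetical, and placing authored overlays ahead of every other type is a simplifying assumption; the text only says overlays should rank above derived material.

```python
from enum import Enum


class MemoryType(Enum):
    EXPLICIT = "explicit"
    ARTIFACT_CHUNK = "artifact_chunk"
    OBSERVED_EVENT = "observed_event"
    AUTHORED_OVERLAY = "authored_overlay"


def select(results, wanted=None):
    """results: list of (memory_id, MemoryType) already in fused-rank order."""
    kept = [r for r in results if wanted is None or r[1] in wanted]
    # sorted() is stable, so the fused order survives within each group;
    # overlays get key False and therefore sort first.
    return sorted(kept, key=lambda r: r[1] is not MemoryType.AUTHORED_OVERLAY)
```

Passing wanted={MemoryType.EXPLICIT} restricts a query to authored notes, so it does not compete with automated ingest in the same workspace.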