Audit-grade memory backbone for agent teams. Bi-temporal facts (event time + transaction time, with recall(as_of=...) replay), 6-step deterministic retrieval (no LLM in the critical path), conversation ingest with speaker-locked dual-pass extraction, per-tenant Postgres row-level security, and Ed25519-signed provenance. Postgres + pgvector + Neo4j defaults.
The memory layer for agent teams. Self-hosted, deterministic retrieval, zero LLM in the critical path.
pip install attestor
| | |
|---|---|
| Version | 4.0.0a1 (alpha; greenfield rebuild — no v3 migration path) |
| PyPI | attestor |
| Import | attestor |
| Live site | https://attestor.dev/ |
| Repo | https://github.com/bolnet/attestor |
| License | MIT |
Attestor is a memory store for agent teams that need a shared, tenant-isolated memory with bi-temporal replay, deterministic retrieval, and an auditable supersession chain. It runs as a Python library, a Starlette REST service, or an MCP server — same API in all three.
It is built around three claims, each grounded in code:
- Bi-temporal facts. Event time (valid_from / valid_until) and transaction time (t_created / t_expired). Nothing is deleted; everything is queryable forever (attestor/temporal/manager.py:43-73, core.py:888-890).
- Deterministic retrieval. A fixed 6-step pipeline with no LLM in the critical path (attestor/retrieval/orchestrator.py:1-14).
- Auditable supersession. A four-decision (ADD / UPDATE / INVALIDATE / NOOP) resolver per fact. Every supersession carries an evidence_episode_id (attestor/extraction/conflict_resolver.py:98).

pip install attestor # or: pipx install attestor
attestor setup local # writes attestor/infra/local/docker-compose.yml
docker compose -f attestor/infra/local/docker-compose.yml up -d
Postgres 16 ships with pgvector (document + vector roles). Neo4j 5 ships with GDS (graph role: PageRank, BFS, Leiden).
ollama pull bge-m3 # 1024-D, 8K context, local-first default
The provider chain in attestor/store/embeddings.py checks http://localhost:11434 first; cloud providers are fallbacks. Override via ATTESTOR_EMBEDDING_PROVIDER / ATTESTOR_EMBEDDING_MODEL.
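A minimal sketch of that local-first selection order, with `probe` standing in for the real Ollama health check (the function names and exact probe here are illustrative, not Attestor's internals):

```python
import os
import urllib.request

def ollama_reachable(url: str = "http://localhost:11434", timeout: float = 0.5) -> bool:
    """Probe the local Ollama endpoint; any HTTP response counts as reachable."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except OSError:
        return False

def pick_provider(probe=ollama_reachable) -> str:
    """Local-first provider selection mirroring the documented chain (simplified)."""
    if os.environ.get("ATTESTOR_EMBEDDING_PROVIDER"):   # explicit override wins
        return os.environ["ATTESTOR_EMBEDDING_PROVIDER"]
    if not os.environ.get("ATTESTOR_DISABLE_LOCAL_EMBED") and probe():
        return "ollama"                                 # bge-m3, 1024-D
    return "openai"                                     # cloud fallback
```

Injecting the probe keeps the fallback logic testable without a live Ollama daemon.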
attestor doctor
All four checks must be green for the default install: Document Store, Vector Store, Graph Store, Retrieval Pipeline. Graph (Neo4j) is required — the 6-step retrieval pipeline narrows on graph neighborhoods and the conversation ingest path writes typed edges (uses, authored-by, supersedes). The only hard dependency that cannot be down is the document store (Postgres); transient vector-probe failures are surfaced in the response trace rather than swallowed (retrieval/orchestrator.py — vector_error field).
from attestor import AgentMemory, AgentContext, AgentRole
mem = AgentMemory() # picks up env / ~/.attestor.toml automatically
ctx = AgentContext(
agent_id="researcher-1",
role=AgentRole.RESEARCHER,
namespace="acme-prod",
)
mem.add(
content="Alice is the engineering manager",
entity="alice",
category="role",
context=ctx,
)
results = mem.recall(query="who runs engineering?", context=ctx)
for r in results:
print(r.score, r.memory.content)
SOLO mode (zero-config). In v4, `AgentMemory().add('foo')` auto-provisions a singleton `local` user, an Inbox project (`metadata.is_inbox=true`), and a daily session — so the snippet above works on a fresh database without configuring identity (core.py:179-209). For multi-tenant production use, pass an explicit `AgentContext` with a real `namespace`.
Verify your install end-to-end against a tiny LongMemEval slice. Defaults match the canonical benchmark stack: openai/gpt-5.2 answerer, dual judges (openai/gpt-5.2 + anthropic/claude-sonnet-4.6), openai/gpt-5.2 distiller, OpenAI text-embedding-3-large truncated to 1024-D.
export OPENAI_API_KEY=...
.venv/bin/python scripts/lme_smoke_local.py --n 2
Every model and parameter is overridable via env var or CLI flag. See --help for the full table.
Every memory carries two time axes:
| Axis | Columns | Meaning |
|---|---|---|
| Event time | valid_from, valid_until | When the fact is true in the world |
| Transaction time | t_created, t_expired | When the row landed in the store |
Plus a superseded_by chain. Old facts are never deleted — they remain queryable forever (attestor/temporal/manager.py:30-66).
# What did we believe on March 1?
mem.recall(query="who runs engineering?", as_of="2026-03-01T00:00:00Z", context=ctx)
# Show me everything we knew about Alice between Feb and Apr
mem.recall(query="alice", time_window=("2026-02-01", "2026-04-01"), context=ctx)
as_of and time_window propagate end-to-end through the orchestrator and document store. Auto-supersession on write is wired into core.py:add() (core.py:762, 784-785): on every add, the temporal manager finds active rows with the same (entity, category, namespace) and different content, marks them superseded, sets valid_until=now, and links superseded_by=<new_id>. Detection is rule-based string equality today.
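The rule above can be sketched with a toy in-memory model (names and structure are illustrative, not Attestor's internals): on each add, close active rows with the same key but different content, then replay against both time axes.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

@dataclass
class Row:
    content: str
    entity: str
    category: str
    t_created: datetime
    t_expired: Optional[datetime] = None    # transaction-time close
    superseded_by: Optional[int] = None

def add_fact(rows: List[Row], content: str, entity: str, category: str, now: datetime) -> int:
    """Rule-based supersession: expire active rows with the same key but different content."""
    new_id = len(rows)
    for r in rows:
        if (r.entity, r.category) == (entity, category) and r.t_expired is None and r.content != content:
            r.t_expired = now
            r.superseded_by = new_id
    rows.append(Row(content, entity, category, t_created=now))
    return new_id

def recall_as_of(rows: List[Row], entity: str, as_of: datetime) -> List[Row]:
    """Replay: rows already created at as_of and not yet expired at as_of."""
    return [r for r in rows
            if r.entity == entity and r.t_created <= as_of
            and (r.t_expired is None or r.t_expired > as_of)]
```

Old rows stay in the list forever; only `t_expired` and `superseded_by` change, so every past belief remains reconstructible.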
Every tenant table (users, projects, sessions, episodes, memories, user_quotas, deletion_audit) carries a tenant_isolation_* policy keyed off the attestor.current_user_id session variable. An empty / unset value fails closed — no rows visible (attestor/store/schema.sql:311-327).
Honest disclosure. Enforcement lives in Postgres, not Python. The `AgentRole` enum in `attestor/context.py:49-56` is metadata that flows onto memories for provenance; it does not gate operations in Python. RLS is what actually controls access. This is the correct architecture for a memory backend, but worth knowing if you read the Python alone.
attestor/retrieval/orchestrator.py runs the same six steps for every query. Typed graph edges (uses, authored-by, supersedes) are injected as synthetic memories during graph expansion.

Every call writes a JSONL trace to logs/attestor_trace.jsonl (disable via ATTESTOR_TRACE=0).
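The trace file is plain JSONL, one record per retrieval call, so it can be written and read back with nothing but the standard library (field names here are illustrative, not the exact trace schema):

```python
import json
from pathlib import Path
from typing import List

def write_trace(path: Path, record: dict) -> None:
    """Append one retrieval trace record as a single JSON line."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")

def read_traces(path: Path) -> List[dict]:
    """Read every trace record back, one per line."""
    return [json.loads(line) for line in path.read_text(encoding="utf-8").splitlines()]
```

Append-only JSONL keeps writes cheap and lets you tail the file live while debugging recall quality.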
| Role | Purpose | Default | Alternatives |
|---|---|---|---|
| Document | Source of truth (content, tags, entity, ts, provenance, confidence) | Postgres 16 | AlloyDB, ArangoDB, DynamoDB, Cosmos DB |
| Vector | Dense embedding per memory | pgvector | AlloyDB ScaNN, ArangoDB, OpenSearch Serverless, Cosmos DiskANN |
| Graph | Entity nodes + typed edges | Neo4j 5 + GDS | Apache AGE on AlloyDB, ArangoDB, Neptune, NetworkX (Azure) |
Postgres is the source of truth. Neo4j is derived state, rebuildable from Postgres — but it's required for the canonical install: graph expansion is step 2 of the retrieval pipeline and conversation ingest writes typed edges. The only role that cannot be down is the document store; the orchestrator records transient vector-probe failures in the response trace (vector_error) instead of swallowing them.
A trigger-maintained content_tsv tsvector + GIN index lifts queries that embeddings under-recall (acronyms, IDs, rare proper nouns). Enabled when v4 schema is detected; fuses with the vector lane via Reciprocal Rank Fusion (RRF, k=60). Graceful no-op on backends without the column (core.py:122-130).
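Reciprocal Rank Fusion itself is a few lines: each lane contributes `1 / (k + rank)` per document, and the sums are sorted. A sketch over plain id lists (Attestor fuses the lexical and vector lanes this way; the function shape here is illustrative):

```python
from typing import List

def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Reciprocal Rank Fusion: score(d) = sum over lanes of 1 / (k + rank_d)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):   # ranks are 1-based
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

With k=60 a top rank in one lane cannot drown out consistent mid-rank agreement across both lanes, which is why RRF is a robust default for fusing lexical and dense results.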
The heavyweight write path that turns conversation turns into auditable memories. core.py:ingest_round(turn) orchestrates four passes:
turn → extract_user_facts(user_turn) ┐
extract_agent_facts(assistant_turn) ┘ → resolve_conflicts → apply
attestor/extraction/round_extractor.py:216, 258 — separate prompts for user vs assistant turns. The user-turn extractor only emits facts attributable to the user; the assistant-turn extractor only emits facts the assistant introduced. Stops cross-attribution. The "+53.6 over Mem0" delta in our LongMemEval scores comes from this split.
attestor/extraction/conflict_resolver.py:40, 98 — for each newly-extracted fact, an LLM call against existing similar memories returns one of:
| Decision | Effect |
|---|---|
| ADD | New info, no existing match — write fresh memory |
| UPDATE | Same entity + predicate, refined value — keep existing id |
| INVALIDATE | Old memory contradicted — mark superseded (timeline replays) |
| NOOP | Already represented — skip |
Each Decision carries evidence_episode_id. Every supersession is auditable. Failsafe: parse failure on a single fact yields ADD-by-default — better a duplicate-ish row than a silent drop.
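The ADD-by-default failsafe reduces to a guarded parse: anything malformed or out of contract degrades to ADD rather than dropping the fact. A sketch assuming the resolver returns one JSON object per fact (field names are assumptions, not the library's exact schema):

```python
import json

VALID_DECISIONS = {"ADD", "UPDATE", "INVALIDATE", "NOOP"}

def parse_decision(raw: str) -> dict:
    """Parse one resolver decision; fail safe to ADD on any parse or validation error."""
    try:
        d = json.loads(raw)
        if d.get("decision") in VALID_DECISIONS and "evidence_episode_id" in d:
            return d
    except (json.JSONDecodeError, AttributeError):
        pass
    # Better a duplicate-ish row than a silent drop.
    return {"decision": "ADD", "evidence_episode_id": None}
```

Failing open to ADD preserves recall; the consolidation pass can later merge any duplicates it creates.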
Two write paths, two contracts. `mem.add(...)` runs the lightweight rule-based supersession (§Bi-temporal). `mem.ingest_round(turn)` runs the full four-decision pipeline. Pick `ingest_round` for conversational data; pick `add` for structured writes where you've already done the conflict reasoning.
mem.consolidate() (core.py:526) re-extracts and synthesizes facts from recent episodes with a stronger model. Currently a Python-API-only call — no CLI command. Schedule it from your application (cron, systemd timer, ECS scheduled task) when you want fresher facts than the streaming extractor produces.
attestor/consolidation/reflection.py runs periodic synthesis across N episodes for one user. Outputs:
- stable_preferences — patterns appearing in 3+ episodes
- stable_constraints — rules the user repeatedly invokes
- changed_beliefs — preferences that shifted (old → new, with explicit invalidate)
- contradictions_for_review — flagged for HUMAN REVIEW, not auto-resolved

The "do not auto-resolve" stance is the load-bearing piece for regulated chat systems. The prompt is explicit (reflection.py:35-66): "Do NOT auto-resolve contradictions. Flag them for human review."
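The "appearing in 3+ episodes" threshold is just counting distinct episodes per normalized fact. A toy version of that aggregation (the real implementation is prompt-driven; this only illustrates the stability criterion):

```python
from collections import defaultdict
from typing import List, Tuple

def stable_preferences(observations: List[Tuple[str, str]], min_episodes: int = 3) -> List[str]:
    """observations: (episode_id, fact) pairs.
    A fact is stable if it appears in at least min_episodes distinct episodes."""
    episodes_per_fact = defaultdict(set)
    for episode_id, fact in observations:
        episodes_per_fact[fact].add(episode_id)    # set dedupes repeats within an episode
    return sorted(f for f, eps in episodes_per_fact.items() if len(eps) >= min_episodes)
```

Counting distinct episodes (not raw mentions) is what prevents one chatty session from promoting a passing remark to a stable preference.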
pack = mem.recall_as_pack(query="who runs engineering?", context=ctx)
# pack.memories : list of {id, content, validity_window, confidence, source_episode_id}
# pack.prompt : default Chain-of-Note prompt with NOTES → SYNTHESIS → CITE → ABSTAIN → CONFLICT structure
The default prompt has explicit ABSTAIN and CONFLICT clauses — every frontier model defaults to confabulation otherwise.
AgentRole: ORCHESTRATOR, PLANNER, EXECUTOR, RESEARCHER, REVIEWER, MONITOR (attestor/context.py:49-56). The role flows onto every memory's metadata for provenance. Access enforcement happens at the Postgres RLS layer (see §Tenant isolation).
orchestrator = AgentContext.from_env(agent_id="orchestrator", namespace="project:acme")
planner = orchestrator.as_agent("planner", role=AgentRole.PLANNER)
executor = planner.as_agent("executor", role=AgentRole.EXECUTOR)
# Each child carries parent_agent_id + accumulating agent_trail.
# All three share the same scratchpad: Dict[str, Any] for typed handoff data.
as_agent() creates a child context with parent_agent_id, full agent_trail, and a shared scratchpad. The trail accumulates — useful for proving "this answer came from agent X who got it from agent Y."
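The accumulating-trail plus shared-scratchpad pattern can be sketched with a small dataclass (this mirrors the described behavior, not the exact `AgentContext` API):

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class Ctx:
    agent_id: str
    parent_agent_id: Optional[str] = None
    agent_trail: List[str] = field(default_factory=list)
    scratchpad: Dict[str, Any] = field(default_factory=dict)

    def as_agent(self, agent_id: str) -> "Ctx":
        return Ctx(
            agent_id=agent_id,
            parent_agent_id=self.agent_id,
            agent_trail=self.agent_trail + [self.agent_id],  # trail is copied and extended
            scratchpad=self.scratchpad,                       # same dict: shared handoff state
        )
```

The trail is copy-on-extend (each child owns its provenance chain) while the scratchpad is deliberately the same object, so typed handoff data written by one agent is visible to all.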
AgentContext.token_budget (default 20 000) is enforced — recall() packs results greedily until the budget is exhausted (scorer.py:fit_to_budget). token_budget_used accumulates across calls in a session.
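Greedy budget packing as described: take results in score order and keep what fits. A sketch with a whitespace token count standing in for the real tokenizer (the skip-and-continue policy is one reasonable reading of "greedy", not necessarily `scorer.py`'s exact rule):

```python
from typing import List, Tuple

def fit_to_budget(results: List[Tuple[float, str]], budget: int) -> List[Tuple[float, str]]:
    """Pack (score, text) results highest-score-first until the token budget is exhausted."""
    packed, used = [], 0
    for score, text in sorted(results, key=lambda r: r[0], reverse=True):
        cost = len(text.split())          # whitespace stand-in for a real token count
        if used + cost > budget:
            continue                       # skip oversized items; smaller ones may still fit
        packed.append((score, text))
        used += cost
    return packed
```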
mem.set_quota(user_id, daily_writes=...) → enforced on add against the v4 user_quotas table (core.py:592-621). Optional; unset means unlimited.
Cross-link to §Tenant isolation. RLS policies are the access-control surface; the Python layer trusts them. Set attestor.current_user_id per connection.
Every memory carries agent_id, session_id, source_episode_id. The supersession chain (superseded_by) is preserved forever. Conversation episodes are stored verbatim, separate from the memories extracted from them — meaning you can always reconstruct which conversation turn produced which fact.
Hard deletes (e.g., GDPR purges) write a row to deletion_audit before the cascade — what was deleted, when, why, by whom. This is the carve-out for the otherwise-immutable schema.
mem.export_user(external_id="user-42") # full data export (memories + episodes + sessions + projects)
mem.purge_user(external_id="user-42", # cascading hard delete with audit trail
reason="GDPR right-to-erasure request 2026-04-27")
mem.deletion_audit_log(limit=100) # forensic readback
core.py:557-590. v4 only. Returns / writes everything Subject Access requires for Art. 15 / Art. 17.
Enable via config (signing.enabled = true). On every add, attestor signs the canonical payload id || agent_id || t_created || content_hash with an Ed25519 key. mem.verify_memory(memory_id) returns bool (core.py:623-640). Optional, off by default — turn on for adversarial-write contexts where you need cryptographic non-repudiation.
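Building the canonical payload `id || agent_id || t_created || content_hash` can be sketched with the standard library; the hash algorithm and field separator below are assumptions (Attestor's exact canonicalization may differ), and the actual Ed25519 signature over these bytes is omitted since it needs a key implementation outside the stdlib:

```python
import hashlib

def content_hash(content: str) -> str:
    # SHA-256 is an assumption; the library may canonicalize differently.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def canonical_payload(memory_id: str, agent_id: str, t_created: str, content: str) -> bytes:
    """id || agent_id || t_created || content_hash, joined with an explicit
    separator so field boundaries are unambiguous (separator choice is illustrative)."""
    parts = [memory_id, agent_id, t_created, content_hash(content)]
    return "\x1f".join(parts).encode("utf-8")
```

Whatever the exact encoding, the key property is that the signed bytes are a deterministic function of the fields, so `verify_memory` can rebuild and check them later.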
Same API across all three. Only configuration changes.
| Mode | Shape | When to use |
|---|---|---|
| A — Embedded library | AgentMemory(config) in-process; talks directly to Postgres + Neo4j | Single-process agents, scripts, notebooks |
| B — Sidecar | attestor api on localhost:8080; language-agnostic HTTP client shares the same Postgres + Neo4j | Polyglot agents on one box (Python + TS + Go) |
| C — Shared service | One Attestor service in front of an agent mesh (App Runner / Cloud Run / Container Apps) backed by managed Postgres + Neo4j | Production multi-agent platforms |
attestor api --port 8080 # Mode B / C — Starlette ASGI REST (HTTP)
attestor mcp --path ~/.attestor # MCP stdio server (zero-config; for Claude Desktop / Cursor / Windsurf)
attestor serve ~/.attestor # MCP stdio server (positional-path variant; equivalent transport)
| Backend | Document | Vector | Graph | Status |
|---|---|---|---|---|
| Postgres + Neo4j (default) | ✓ | pgvector | Neo4j + GDS | Production-ready |
| ArangoDB | ✓ | ✓ | ✓ | Production-ready (one engine, all 3 roles) |
| AWS | DynamoDB | OpenSearch Serverless | Neptune | Backend code + Terraform shipped |
| Azure | Cosmos DB | Cosmos DiskANN | NetworkX (in-process) | Backend code shipped, Terraform forthcoming |
| GCP | AlloyDB | AlloyDB ScaNN | AGE on AlloyDB | Backend code shipped, Terraform forthcoming |
Override the default via config:
# ~/.attestor.toml
backend = "postgres+neo4j" # or "arangodb" | "aws" | "azure" | "gcp"
Reference Terraform lives under attestor/infra/.
Provider auto-detect (attestor/store/embeddings.py:get_embedding_provider), in this order:
1. Ollama bge-m3 — 1024-D, 8K context — used when http://localhost:11434 is reachable
2. OpenAI text-embedding-3-large (3072-D native; pin OPENAI_EMBEDDING_DIMENSIONS=1024 for schema compat)

Local-first by design. Override:
export ATTESTOR_DISABLE_LOCAL_EMBED=1 # skip the Ollama probe entirely
export ATTESTOR_EMBEDDING_PROVIDER=openai
export ATTESTOR_EMBEDDING_MODEL=text-embedding-3-large
attestor --help lists everything. The most useful commands:
| Command | Purpose |
|---|---|
| attestor init | Create a starter config |
| attestor setup local | Generate Docker Compose for Postgres + Neo4j |
| attestor doctor | Health-check every store + the retrieval pipeline |
| attestor add / recall / search / list | CRUD-ish memory ops |
| attestor timeline | Entity timeline (uses bi-temporal manager) |
| attestor stats | Store statistics |
| attestor export / import | JSON dump / restore |
| attestor compact | Remove archived memories |
| attestor update / forget | Mutate / archive a memory |
| attestor inspect | Inspect raw database state |
| attestor api | Start the Starlette REST API |
| attestor serve <path> | Start MCP stdio server (positional-path variant) |
| attestor mcp [--path …] | Start MCP stdio server (zero-config; default for Claude Desktop / Cursor / Windsurf) |
| attestor ui | Read-only browser UI for the store |
| attestor hook {session-start, post-tool-use, stop} | Run a Claude Code lifecycle hook |
| attestor lme / locomo / mab | Built-in benchmark runners (see §Evaluation) |
attestor mcp (or attestor serve <path>) exposes an MCP stdio server with eight tools:
| Tool | Purpose |
|---|---|
| memory_add | Write a memory with provenance |
| memory_get | Fetch one memory by id |
| memory_recall | Run the full retrieval pipeline |
| memory_search | Filtered list (entity / category / time / namespace) |
| memory_forget | Archive a memory by id |
| memory_timeline | Chronology for an entity |
| memory_stats | Store statistics |
| memory_health | Per-role health snapshot — call this first when integrating |
Plus MCP resources (memory listings) and prompts (canned recall prompts for IDE assistants).
Three lifecycle hooks ship in attestor/hooks/:
- session_start — injects relevant memories into the session context based on cwd / repo
- post_tool_use — auto-captures useful artifacts from Write / Edit / Bash
- stop — writes a session summary on exit

Wire them up via the installer (next section) or by hand in ~/.claude/settings.json.
Single instruction users can give Claude Code:
install attestor
(Or run /install-attestor.) The installer interviews you on:
- scope: user (~/.claude/.mcp.json) vs project (.mcp.json)
- backend: postgres+neo4j, or arangodb / aws / azure / gcp
- hooks: session-start / post-tool-use / stop

Then it installs attestor via pipx, writes the MCP config, optionally writes settings.json hooks, and runs attestor doctor to verify.
Boundary statement. The dual-LLM judge stack is a benchmarking mechanism, not the runtime contract. Recall in production is single-pipeline and deterministic. Multiple judges score answers in evaluation only — never in user-facing reads.
| Runner | Source | Measures |
|---|---|---|
| attestor lme | LongMemEval (Google's long-memory benchmark) | answer accuracy under long history, distillation, dual-judge cross-family |
| attestor locomo | LoCoMo | conversational long-memory consistency |
| attestor mab | MultiAgentBench | multi-agent coordination |
| AbstentionBench (CI gate) | internal | when not to answer — known unknowns |
| scripts/lme_smoke_local.py | dual-LLM smoke | quick install verification (see Quick Start §6) |
The smoke driver mirrors the canonical published-benchmark stack exactly. See --help for the full env-var / CLI-flag override matrix.
attestor/
core.py -- AgentMemory (main public API)
client.py -- MemoryClient (HTTP drop-in for remote Attestor)
context.py -- AgentContext, AgentRole, Visibility
models.py -- Memory, RetrievalResult, ContextPack
cli.py -- attestor CLI entry point
api.py -- Starlette ASGI REST API
longmemeval.py -- LongMemEval benchmark runner (dual-judge)
locomo.py -- LoCoMo runner
doctor_v4.py -- v4 schema + invariant validator
init_wizard.py -- interactive install flow
store/
base.py -- DocumentStore / VectorStore / GraphStore protocols
registry.py -- backend selection
connection.py -- config layering / env resolution
embeddings.py -- provider auto-detect (Ollama / OpenAI / Bedrock / Vertex / Azure)
postgres_backend.py -- pgvector (document + vector roles)
neo4j_backend.py -- Neo4j + GDS (graph role)
arango_backend.py -- all 3 roles in one
aws_backend.py -- DynamoDB + OpenSearch Serverless + Neptune
azure_backend.py -- Cosmos DB DiskANN + NetworkX
gcp_backend.py -- AlloyDB pgvector + AGE + ScaNN
schema.sql -- v4 Postgres schema (RLS, bi-temporal columns, content_tsv)
conversation/
ingest.py -- ingest_round() pipeline
extraction/
round_extractor.py -- 2-pass speaker-locked extraction
conflict_resolver.py -- 4-decision contract (ADD/UPDATE/INVALIDATE/NOOP)
rule_based.py -- deterministic fact extraction (no LLM)
prompts.py -- shared prompt templates
consolidation/
consolidator.py -- sleep-time re-extraction
reflection.py -- cross-thread synthesis (stable patterns + flagged contradictions)
graph/
extractor.py -- entity / relation extraction
retrieval/
orchestrator.py -- 6-step semantic-first pipeline
tag_matcher.py
scorer.py -- MMR, confidence decay, entity boost, fit-to-budget
trace.py -- JSONL trace writer
temporal/
manager.py -- timelines, supersession, contradiction detection, as_of replay
identity/
signing.py -- Ed25519 provenance signing (optional)
defaults.py -- SOLO mode auto-provisioning
mcp/
server.py -- MCP server (tools, resources, prompts)
hooks/
session_start.py
post_tool_use.py
stop.py
ui/
app.py -- Starlette read-only viewer
static/, templates/ -- Evidence Board UI
utils/
config.py, tokens.py
infra/
local/ -- Docker Compose (Postgres + Neo4j)
aws_arango/ -- Reference Terraform
tests/ -- Unit tests; live cloud tests env-gated
evals/ -- LongMemEval / LoCoMo / MultiAgentBench / AbstentionBench harnesses
docs/ -- Architecture notes, ADRs
commands/ -- /install-attestor, etc.
scripts/ -- lme_smoke_local.py, etc.
poetry install
poetry run pytest tests/ -q # unit tests, no external services needed
ATTESTOR_LIVE_PG=1 poetry run pytest tests/live -q # live integration (env-gated)
Style: black formatting, isort imports, ruff lint, mypy types. PEP 8, type-annotated signatures, dataclasses for DTOs. Many small files (200–400 lines typical, 800 max).
Conventions worth knowing:
- Config layering: ~/.attestor.toml → in-code overrides.
- add() for structured writes (lightweight rule-based supersession), ingest_round() for conversational data (full 2-pass + 4-decision contract).

Always call this first when integrating:
attestor doctor # CLI
mem = AgentMemory()
print(mem.health()) # Python API
// MCP
{ "tool": "memory_health" }
It probes Document Store (Postgres), Vector Store (pgvector), Graph Store (Neo4j), and the retrieval pipeline. All four are required for the default topology — graph expansion is step 2 of the canonical pipeline, not an optional accelerator. Transient vector-probe failures surface in the recall() trace (vector_error) so callers can distinguish a degraded result from a clean one.
io.github.bolnet/attestor

MIT. See LICENSE.
Add this to claude_desktop_config.json and restart Claude Desktop.
{
"mcpServers": {
"attestor": {
      "command": "attestor",
      "args": ["mcp"]
}
}
}