loading…
Search for a command to run...
loading…
Persistent memory and session intelligence for AI coding assistants. Auto-tracks mistakes, decisions, and context via hooks. Mines your full session history for
Persistent memory and session intelligence for AI coding assistants. Auto-tracks mistakes, decisions, and context via hooks. Mines your full session history for patterns, predictions, and cross-session search.
Persistent memory and session intelligence for AI coding assistants. Hooks into Claude Code's lifecycle to auto-track mistakes, decisions, and context — then mines your full session history to surface patterns, predict what you'll need, and search across everything you've ever discussed.
Zero manual effort. Works with any MCP-compatible client.
Automatic (hooks — zero invocation):
<engram-precheck> with closest-name suggestions<engram-blast-radius>)session_mine(reflect)) — and feeds it back: a bounded per-kind multiplier (0.8-1.2) tunes how eagerly memories injectSession Mining (automatic, background):
session_mine(reflect) reports that precision + LLM-synthesized patternsOn-demand (MCP tools):
memory — store, search, archive, and manage memoriessession_mine — search past conversations (taggable by kind: decision / next-step / error), find decisions, replay file history, detect patterns, and surface what you said you'd do this session (commitments)work — log decisions and mistakes with reasoningcheckpoint_save/restore/list; handoff_* are deprecated aliases), convention tracking, impact analysis, symbol lookup (deps_map(symbol="X") — defining file, signature, importers from the code index)All MCP tools carry annotations (read-only / idempotent hints + a title), so clients and permission systems know which are safe to call without a prompt.
How I actually use it, since I built it:
Mostly it just works in the background — you don't have to think about it. The few things worth doing on purpose:
/engram when you want Claude to actively reach for the tools — the command loads the reference so Claude knows what's there and uses it. (Background tracking happens either way; this is for the on-demand stuff.)The less you poke at it, the better it works.
This is a work in progress — if something's off or you hit a bug, please open an issue.
Claude Code
|
+-- Hooks (remind.py) <- Intercepts every tool call
| SessionStart / Edit / Bash / Error / Compact / Stop
|
+-- Session Mining (mining/) <- Background intelligence
| JSONL parser -> Extractors -> Search index -> Pattern detection
|
+-- MCP Server (server.py) <- Tools for manual operations
| memory, session_mine, work, scope, context, ...
|
+-- Scorer/Hook Daemon (scorer_server.py) <- Persistent encoder + warm hook dispatch
TCP localhost, batch embeddings; runs on GPU automatically when
torch+CUDA is present (~1.1GB RAM on CPU with the default model)
High-frequency hooks run as thin clients (one round trip, full
in-process fallback when the daemon is down)
Hooks fire on every tool call (1-2s budget each). Heavy processing happens in a background subprocess after session end. The scorer server stays loaded in memory for fast semantic scoring.
Retrieval (recall@k): LongMemEval 0.966 R@5 / 0.982 R@10 (500 questions), ConvoMem 0.960 (250 items), LoCoMo 0.649 R@10 (~2k questions); ~43ms/query, 112ms cross-session over 7,310 chunks.
Product behavior: eight integration suites green — decision capture (97.8% precision), error auto-capture (100% recall), compaction survival (6/6), multi-project isolation (11/11), edit-loop detection (12/12), session mining (27/27), Obsidian-vault compat (25/25).
Full tables and the tests/bench_*.py reproduction commands are in the library-book.
| Platform | What Works | Auto-Capture |
|---|---|---|
| Claude Code (CLI, desktop, VS Code, JetBrains) | Everything | Full — hooks + session mining |
| Cursor | MCP tools (memory, search, etc.) | No hooks |
| Windsurf | MCP tools | No hooks |
| Continue.dev | MCP tools | No hooks |
| Zed | MCP tools | No hooks |
| Any MCP client | MCP tools | No hooks |
| Obsidian vaults | Full (with CLAUDE.md at root) | Full with Claude Code |
git clone https://github.com/20alexl/claude-engram.git
cd claude-engram
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -e . # Core
pip install -e ".[semantic]" # + embedding model for vector search and semantic scoring
python install.py # Configure hooks, MCP server, and /engram skill
python install.py --setup /path/to/your/project
Or copy .mcp.json to your project root.
Note: The CLAUDE.md in this repo is engram-specific documentation — it's not required for engram to work. Hooks fire automatically and the /engram skill provides a quick reference on demand. If you already have a CLAUDE.md for your project, keep it as-is and don't copy ours over it. If you want engram docs alongside your project rules, rename it to CLAUDE-ENGRAM.md (or similar) so it doesn't clobber your existing file — Claude will see it when relevant.
cd claude-engram
git pull
pip install -e ".[semantic]" # Reinstall if dependencies changed
python install.py # Re-run to update hooks and /engram skill
Hooks and MCP tools pick up code changes immediately (editable install). Reconnect the MCP server in Claude Code (/mcp) to reload the server process.
Data migrations run automatically: a cheap inline check fires on the next SessionStart, and a full migration runs in the background. install.py also runs migrations synchronously (step 9). Migrations are forward-only, idempotent, and downgrade-safe — no data is lost.
Already deep in a project? Install normally. On first session, engram auto-detects your existing Claude Code session history and mines it in the background — extracting decisions, mistakes, and patterns from all past conversations. No manual effort.
Memory — hybrid search (keyword + vector + rerank, no ChromaDB); path-aware scored injection (top 3 by file/tags/recency/importance, with age shown); tiered hot/cold storage (rules and manual mistakes never archive; stale auto-captured one-off mistakes self-archive to keep banners high-signal); per-sub-project scoping with cascading workspace rules.
Session mining — structural extraction (conversation flow, not template matching) over conversation and tool content; cross-session semantic/keyword/hybrid search, typed by kind (decision / next-step / error) and filterable; session_mine(commitments) reads the live transcript for open loops the post-session index can't see; pattern detection, predictive context, cross-project learning; retroactive bootstrap on first install.
Lifecycle — auto-captured decisions + mistakes; survives compaction (per-project checkpoints in a durable ring); edit-loop detection; subagent-aware; automatic, idempotent, downgrade-safe migrations on upgrade.
Internals, the full feature list, gotchas, and API reference live in the library-book.
| Variable | Default | Description |
|---|---|---|
CLAUDE_ENGRAM_MODEL |
gemma3:12b |
Ollama model — optional. Used only by scout_search, memory(consolidate), and session_mine(reflect) insight synthesis |
CLAUDE_ENGRAM_EMBED_MODEL |
BAAI/bge-base-en-v1.5 |
sentence-transformers embedding model (~440MB on first use, ~1.1GB scorer RAM). Decision-capture semantic F1 measured 37.7% (all-MiniLM-L6-v2) vs 72.7% (default). Set all-MiniLM-L6-v2 for a ~90MB lightweight setup. google/embeddinggemma-300m scores 67.3% but is license-gated (HF account + token + sentence-transformers>=5). Also settable via embed_model in ~/.claude_engram/config.json |
CLAUDE_ENGRAM_EMBED_DIM |
model native | Matryoshka truncation dim (e.g. 256 for embeddinggemma). Embedding stores are signature-stamped: changing the model rebuilds them automatically, never mixes vector spaces |
CLAUDE_ENGRAM_DEVICE |
auto | Embedding device. Auto-detects cuda when a CUDA-enabled torch is installed (a broken CUDA runtime degrades to cpu). Install GPU torch with e.g. pip install torch --index-url https://download.pytorch.org/whl/cu128. Vectors are device-identical — switching never rebuilds stores |
CLAUDE_ENGRAM_LIVE_MINE |
300 |
Live mining tick interval in seconds — at most one incremental mine per interval, triggered at turn end, keeps search/extractions/code-index fresh during long sessions. 0/off disables (mining then happens only at SessionEnd) |
CLAUDE_ENGRAM_ARCHIVE_DAYS |
14 |
Days until inactive memories archive |
CLAUDE_ENGRAM_SCORER_TIMEOUT |
1800 |
Embedding server idle timeout (seconds) |
CLAUDE_ENGRAM_DIR |
~/.claude_engram |
Override the storage location (also the test-isolation seam) |
CLAUDE_ENGRAM_SESSION_RETENTION_DAYS |
0 (keep all) |
Prune session-search embedding shards older than N days |
CLAUDE_ENGRAM_LAST_FILE_PATH |
unset | If set, the Read hook mirrors the last-read file path to this file (statusline integration) |
CLAUDE_ENGRAM_HOOK_DEBUG |
unset | Set to 1 to print a stderr breadcrumb per hook (daemon vs fallback) |
If search quality is poor or you want to rebuild after an update:
python scripts/reindex.py "/path/to/your/workspace" --force # rebuild search index
python scripts/reindex.py "/path/to/your/workspace" --force --extract # also re-extract decisions/mistakes
Or via MCP: session_mine(operation="reindex", mode="bootstrap")
Library Book — design philosophy, internals, full usage guide, API reference, gotchas, and changelog.
/engram — slash command with quick tool reference (installed by install.py).
MIT
Выполни в терминале:
claude mcp add claude-engram -- npx CSA PROJECT - FZCO © 2026 IFZA Business Park, DDP, Premises Number 31174 - 001
Безопасность
Низкий рискАвтоматическая эвристика по публичным данным — не гарантия безопасности.