A local MCP memory server giving LLMs a persistent, auditable memory fabric with temporal awareness, relationship tracking, and contradiction detection.
PyPI Python License: MIT CI Docker GHCR
Temporal Hierarchical Object Union & Graph Hybrid Toolkit — a local MCP memory server that gives any LLM a persistent, auditable memory fabric on your own machine.
OB1 stores your thoughts. Karpathy's wiki compiles your knowledge. THOUGHT remembers with provenance, understands relationships, detects contradictions, never forgets what used to be true — and routes every query to the right mathematical structure before touching a single byte of data.
v0.2 specialises the same architecture for the workflow with the strongest natural fit: AI-assisted coding. THOUGHT now parses your source with tree-sitter, builds a real function-call graph as typed edges, and stamps every fact with its git commit. The bi-temporal as_of queries you already had now answer "what did the codebase look like at commit X?" for free.
thought ingest-code src/ # tree-sitter ingest, multi-language
thought ingest-git . --mode full # stamp every commit
thought callers GraphLayer.personalized_pagerank
# # score type entity file
# 1 0.0132 method Dispatcher._dispatch_code Dispatcher
# 2 0.0130 method Dispatcher._dispatch_fact Dispatcher
# 3 0.0122 method CodeLayer.impact_set CodeLayer
# 4 0.0110 method CodeLayer.callers_of CodeLayer
thought impact authenticate_user # what's affected if I change this?
thought diff --from v1.0 --to HEAD # set diff between two commits
Real measurement on this codebase: 38 files → 425 entities → 575 CALLS edges in <250 ms. The killer-demo query "who calls GraphLayer.personalized_pagerank" returns the four real callers ranked by Personalized PageRank in 60 ms on a 1086-edge graph.
What's new:
- CALLS, IMPORTS, INHERITS_FROM, OVERRIDES, DEFINES as typed edges. The Graph Layer's HippoRAG-style PageRank then powers ranked callers / impact-set queries.
- Every fact is stamped with code_commit_sha. thought diff --from <sha1> --to <sha2> returns the set difference of functions between two commits.
- New CLI commands: ingest-code, ingest-git, callers, impact, diff.

See CHANGELOG.md for the full v0.2 list. The v0.1 horizontal-memory surface below is unchanged — v0.2 is purely additive.
THOUGHT is a memory server for LLMs. You install it on your machine, wire it into your AI coding assistant (Claude Code, Cursor, Cline, Continue, Windsurf), and now your assistant has a brain that persists across conversations and across projects.
Everything runs locally — your memory is a single SQLite file on your laptop. No cloud, no account, no sync service, no API key.
Out of the box, AI coding assistants have goldfish memory. Every new conversation starts blank. If you told it last week to "always use Postgres for v2 features," you'll be telling it again today. If you decided in March that "the auth module is being rewritten," by April that context is gone.
Existing fixes don't really solve this — they trade one problem for another:
| Common workaround | What goes wrong |
|---|---|
| Stuff context into your system prompt | You hit token limits fast, and the model can't tell what's current vs. obsolete. |
| Cloud memory (ChatGPT, Claude Projects) | Locked to one vendor, no audit log, can't query "as of last week," no contradiction handling. |
| RAG over your notes (mem0, Letta, …) | Stores facts as flat vectors. No relationships between facts, no time tracking, no provenance, no notion of "this used to be true." |
| An LLM-maintained Markdown wiki (Karpathy's gist) | Lossy by design (the LLM summarises everything), grows linearly, no semantic search, no temporal queries. |
THOUGHT fixes the structural issues, not just the symptoms.
Once installed, your AI assistant gains two new tools it can call automatically when the conversation implies it:
- remember(content) — "note that we decided X." THOUGHT extracts the entities and relationships, embeds them for similarity search, and links everything to its source so you can audit later.
- recall(query) — "what did we decide about X?" THOUGHT figures out what kind of question you asked, routes it to the right retrieval strategy, and returns at most 10 hits — each tagged with how trustworthy it is.

You can also drive it from your terminal (CLI) or use the Python API directly.
The TL;DR, in plain English:
It knows when facts changed. Every fact carries two timestamps: when it was true in the world, and when the system learned it. "What did we say about pricing on Jan 15?" actually works — even if pricing changed on Feb 3.
It tracks how facts relate. Functions, classes, people, projects, decisions — they're all entities in a typed graph (CALLS, OWNS, INHERITS_FROM, CONTRADICTS, …). Asking "who calls authenticate_user?" is a real graph query, not a fuzzy text match.
It refuses to hallucinate relationships. Every edge has a mandatory pointer back to the source document that produced it. If a fact has no source, it doesn't exist. No more "the model invented a connection that was never in the data."
It surfaces contradictions instead of silently overwriting. When you say "auth is now using sessions" after previously saying "auth is JWT," both facts stay. A CONTRADICTS edge is created. recall can then answer "what facts about auth are currently disputed?"
It picks the right retrieval method per question. Fuzzy associative queries hit vector similarity. Relationship queries hit graph traversal. Time-travel queries hit the temporal layer. The wrong question never hits the wrong index.
It bounds output. No matter how big the knowledge base gets, recall returns at most 10 hits. Your context window doesn't get blown up by a runaway retrieval.
It's append-only. Nothing is ever deleted. When facts go stale, they're retired (their validity window closes), not erased. Full forensic audit of every change.
It's natively multi-user. scope='shared' for project-wide facts, scope='private' with owner_id for personal notes. Five devs on one repo each get their own private memory plus a shared common pool.
Plus eleven cutting-edge retrieval techniques from 2024–2026 literature (Anthropic Contextual Retrieval, HippoRAG-style PageRank, bi-temporal Graphiti, CRAG, MetaRAG confidence, …) stacked on top — see the Frontier techniques section below for the full list with citations.
The technical capability matrix vs. the closest comparable systems:
| Capability | OB1 (pgvector) | Karpathy LLM-Wiki | THOUGHT |
|---|---|---|---|
| Relationship logic | flat rows | flat markdown | typed graph edges |
| Temporal awareness | none | none | bi-temporal (world-time + learned-time) |
| Provenance | informal tag | informal citation | mandatory source_ref on every edge |
| Multi-user | RLS bolted on | single-user | native two-zone graph |
| Query routing | always vector | always inject | VIBE / FACT / CHANGE / CODE / HYBRID router |
| Contradiction model | absent | LLM lint only | CONTRADICTS typed edge, write-time |
| Bounded result size | unbounded | unbounded | ≤10 enforced |
This section walks through everything from install to advanced workflows, with explanations of why each step exists. If you just want the 30-second version, skip to Quickstart.
Three ways. Pick one:
# Option 1 (recommended) — full bundle, everything you'll use
pip install 'thought-mcp[all]'
# Option 2 — minimal: CLI + MCP server only (no production embeddings)
pip install thought-mcp
# Option 3 — zero install: uvx fetches it on demand
uvx thought-mcp install --client cursor
uvx is what the MCP client configs use internally, so option 3 is fine if you don't want a global install. After install, verify with:
thought doctor
You should see all green. Any red items will print the exact command to fix them.
The one-line happy path for connecting THOUGHT to your AI client:
thought start --client cursor # or claude-code, cline, continue, windsurf
Then restart your AI client (close every window, reopen). Done. The next conversation will have the remember and recall tools available.
If you're not sure which client to pick, run thought install --detect first — it shows every supported client's config path and whether it's installed on your machine.
What thought start actually does

Knowing what changed makes troubleshooting easier later:

- Creates .thought/thought.db in your current directory. This is your memory. Back it up like any database.
- Writes CLAUDE.md in your current directory. This tells your AI assistant how the memory tools work and when to use them. You can edit it to add project-specific conventions like "always tag finance decisions with scope=private."
- Writes thought.toml with sensible defaults. Most people never need to touch it.
- Registers thought as a server in your MCP client config. The previous config is backed up to <config>.thought.bak.
- Starts the MCP server on 127.0.0.1:8765.

After your AI client restarts, it discovers thought and gains the two new tools.
Once wired up, your AI assistant calls remember and recall automatically when the conversation implies it. You usually don't need to invoke them explicitly. Here's what that looks like:
Telling it to remember something:
You: We've decided to use Postgres for the v2 release.
AI: Got it. [calls remember internally]
Stored. We can recall this with "database plan" or similar later.
The AI parses out the facts (v2 → Postgres), links them to the conversation as the source, and writes them to memory. You don't see the tool call.
Asking about something it remembered:
You: What's our database plan?
AI: [calls recall internally]
Based on what's in memory: for v2 we're using Postgres
(source_grounded, recorded 3 days ago).
The AI also sees how trustworthy each hit is via its confidence_class (source_grounded, inferred, or hallucination_risk). If a hit comes back as hallucination_risk, a well-prompted AI will flag the uncertainty rather than presenting it as fact.
Time travel — what was true at some past date:
You: What was our pricing on January 15?
AI: [calls recall with as_of="2026-01-15"]
On Jan 15: Free / Pro $19 / Team $99.
Pricing changed on Feb 3 to Free / Pro $25 / Team $129, so the
current prices are different.
This is THOUGHT's most distinctive capability. Existing memory tools structurally can't answer this — they overwrite old facts. THOUGHT keeps both, marks the older one retired, and as_of queries see the world as it was.
Catching contradictions automatically:
You: Remember that auth is now using sessions.
AI: Noted. [calls remember]
Heads up: a previously-stored fact says "auth is JWT."
Both are now in memory, linked by a CONTRADICTS edge.
Want me to retire the JWT fact?
The AI sees the conflict at write time and prompts you. The JWT fact isn't deleted — it's marked as superseded but still queryable for audit.
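The write-time check can be sketched in a few lines. This is an illustration of the mechanism, not THOUGHT's actual code: the Fact/Store names and the tuple shape of the CONTRADICTS edge are assumptions. The rule is simply "same subject and predicate, different object, and neither fact is retired."

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    retired: bool = False

@dataclass
class Store:
    facts: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (type, old_fact, new_fact, detected_at)

    def remember(self, fact: Fact) -> list:
        # Write-time check: same (subject, predicate) but a different object
        # is a contradiction. The old fact stays (append-only); we only link it.
        conflicts = [f for f in self.facts
                     if not f.retired
                     and f.subject == fact.subject
                     and f.predicate == fact.predicate
                     and f.obj != fact.obj]
        self.facts.append(fact)
        for old in conflicts:
            self.edges.append(("CONTRADICTS", old, fact, datetime.now(timezone.utc)))
        return conflicts

store = Store()
store.remember(Fact("auth", "uses", "JWT"))
clash = store.remember(Fact("auth", "uses", "sessions"))
print([f.obj for f in clash])  # → ['JWT'] — both facts remain queryable
```

Because detection happens at write time, the conflict can be surfaced in the same turn that created it, rather than discovered later by a lint pass.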
Private vs. shared scope (multi-user / multi-project):
You: Remember as a private note: I prefer 4-space indentation.
AI: Stored in your private scope. Won't surface in shared recalls.
Use scope='private' for personal preferences. Use scope='shared' for project decisions everyone on the team should see. A shared recall returns public facts plus the requester's own private facts; never another user's.
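The visibility rule compiles down to a single predicate at the storage layer. Here is a minimal sketch against a toy SQLite schema (the real THOUGHT schema is richer; the table and column names here are illustrative):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE facts (content TEXT, scope TEXT, owner_id TEXT)")
db.executemany("INSERT INTO facts VALUES (?, ?, ?)", [
    ("v2 uses Postgres",       "shared",  None),
    ("alice prefers 4 spaces", "private", "alice"),
    ("bob prefers tabs",       "private", "bob"),
])

def visible_to(requester: str) -> list:
    # Shared facts OR the requester's own private facts — never another user's.
    rows = db.execute(
        "SELECT content FROM facts WHERE scope = 'shared' "
        "OR (scope = 'private' AND owner_id = ?)", (requester,))
    return [r[0] for r in rows]

print(visible_to("alice"))  # shared fact + alice's private note only
```

Because the filter lives at the storage layer, every retrieval path (vector, graph, temporal) inherits it for free.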
If your AI is being lazy and skipping recall, try phrases that name the memory explicitly:
To insist on storing something:
The single highest-leverage thing is the CLAUDE.md file that thought init drops in your project. Edit it to add project-specific conventions. The AI reads it on every session start, so rules like "always remember architectural decisions, never remember code snippets" are honored consistently.
If you're using THOUGHT for AI-assisted coding (the v0.2 specialisation), there's a separate ingest path that parses your source files via tree-sitter and builds a real function-call graph:
# Ingest a codebase — entities are functions / classes / methods / modules
thought ingest-code src/
# Ingest with full git history so as_of queries work for code
thought ingest-git . --mode full
# Ask who calls a function (ranked by importance)
thought callers authenticate_user
# Ask what's affected if you change a function
thought impact authenticate_user
# Show the set difference of entities between two commits
thought diff --from v1.0 --to HEAD
After ingestion, the AI's regular recall tool also gains code awareness. Natural-language questions like "who calls authenticate_user?" route through the call-graph machinery automatically.
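The ranked-callers idea can be sketched with a plain power-iteration Personalized PageRank over a toy CALLS graph (function names borrowed from the example output earlier). This is an illustration of the technique, not THOUGHT's scipy.sparse implementation; the dangling-mass-to-seed choice is one common convention.

```python
# Toy call graph: caller → callee, mirroring THOUGHT's CALLS typed edges.
CALLS = {
    "cli.main":                  ["Dispatcher._dispatch_fact"],
    "Dispatcher._dispatch_fact": ["GraphLayer.personalized_pagerank"],
    "Dispatcher._dispatch_code": ["GraphLayer.personalized_pagerank"],
    "CodeLayer.callers_of":      ["GraphLayer.personalized_pagerank"],
    "CodeLayer.impact_set":      ["CodeLayer.callers_of"],
}
nodes = sorted({n for k, vs in CALLS.items() for n in [k, *vs]})

def ppr(seed, alpha=0.85, iters=50):
    # Walk the reversed graph so importance flows callee → caller.
    rev = {n: [] for n in nodes}
    for caller, callees in CALLS.items():
        for callee in callees:
            rev[callee].append(caller)
    score = {n: (1.0 if n == seed else 0.0) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - alpha) * (1.0 if n == seed else 0.0) for n in nodes}
        for n in nodes:
            out = rev[n]
            if out:
                share = alpha * score[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                nxt[seed] += alpha * score[n]  # dangling mass returns to the seed
        score = nxt
    return score

target = "GraphLayer.personalized_pagerank"
scores = ppr(target)
direct = [c for c, cs in CALLS.items() if target in cs]
for name in sorted(direct, key=scores.get, reverse=True):
    print(f"{scores[name]:.4f}  {name}")
```

The direct callers come from a plain graph lookup; the PageRank pass only supplies the ranking, so a hot path like a dispatcher surfaces above a rarely-reached utility.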
THOUGHT works fine from a terminal without an AI assistant. The CLI is most useful for bulk operations and inspection:
# Add a single fact
thought ingest "Alice owns Acme Corp. Acme is part of HoldCo."
# Bulk-ingest a directory of Markdown notes
thought ingest --glob 'docs/**/*.md'
# Pipe in from any tool that emits one fact per line
git log --since='1 week ago' --format='%s' | thought ingest --stdin
# Query directly
thought recall "who owns Acme"
# Open an interactive REPL — type queries, type +text to add facts
thought repl
# See what's currently in memory
thought stats
# Soft-delete entities matching a SQL LIKE pattern (audit-logged, not destroyed)
thought forget "kendra%"
When a new version of THOUGHT ships:
pip install --upgrade thought-mcp # pull the new package
thought upgrade --all # re-pin every MCP client config to the new version
# Restart your AI client to pick up the new server.
thought upgrade --all solves the "uvx is still using its cached old version" problem by re-pinning your MCP client configs to the exact version you just installed (with the required extras included).
Manual config (if --detect can't find your client)

If thought install --detect doesn't find a client you have installed, the JSON block to add manually is:
{
"mcpServers": {
"thought": {
"command": "uvx",
"args": ["--from", "thought-mcp[mcp,sqlite-vec]", "thought", "serve"]
}
}
}
Per-client locations:
- Claude Code: ~/.claude.json (top-level mcpServers block)
- Cursor: ~/.cursor/mcp.json
- Cline: globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json (or ~/.cline/cline_mcp_settings.json)
- Continue: ~/.continue/config.json
- Windsurf: ~/.codeium/windsurf/mcp_config.json

THOUGHT exists because of:
| # | Technique | Source |
|---|---|---|
| 1 | Contextual Retrieval — LLM-generated chunk context prepended before embedding | Anthropic, Sept 2024 |
| 2 | HippoRAG 2 — Personalized PageRank memory | Gutiérrez et al., NeurIPS 2024 (repo) |
| 3 | Bi-temporal Graphiti — separate valid-time and transaction-time | Zep, arXiv 2501.13956 (repo) |
| 4 | Atomic fact decomposition + Jaccard dedup | Wanner et al., 2024 |
| 5 | BGE-M3 hybrid embeddings (sparse + dense + ColBERT) | BAAI |
| 6 | Matryoshka two-pass retrieval | Kusupati et al.; OpenAI text-embedding-3 |
| 7 | CRAG (Corrective RAG) — retrieval evaluator + fallback | Yan et al., 2024 |
| 8 | MetaRAG epistemic uncertainty — confidence_class per hit | arXiv 2504.14045 |
| 9 | Ebbinghaus decay scoring — strength × e^(−λ·days) × recall-boost | @sachitrafa/YourMemory |
| 10 | Context-engineering budget per query class | Karpathy & community, 2025 |
| 11 | Append-only writes (Mem0 2026) — never UPDATE/DELETE | Mem0 State of Memory 2026 |
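Row 9's scoring formula is small enough to show directly. The λ value and the shape of the recall-boost term below are illustrative assumptions, not THOUGHT's actual constants; only the strength × e^(−λ·days) × boost structure comes from the table.

```python
import math

def decay_score(strength, days_since_access, recalls, lam=0.05):
    # Ebbinghaus-style forgetting: exponential decay in days since last access,
    # counteracted by a boost for facts that keep getting recalled.
    recall_boost = 1.0 + math.log1p(recalls)  # assumed boost shape
    return strength * math.exp(-lam * days_since_access) * recall_boost

fresh   = decay_score(1.0, days_since_access=1,  recalls=0)
stale   = decay_score(1.0, days_since_access=60, recalls=0)
revived = decay_score(1.0, days_since_access=60, recalls=5)
print(f"{fresh:.3f} {stale:.3f} {revived:.3f}")
```

The useful property: an old fact that is recalled often keeps outscoring an old fact nobody asks about, which is exactly what drives WARM→COLD demotion during consolidation.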
Built on: MCP Python SDK (@modelcontextprotocol), sqlite-vec (Alex Garcia), pgvector (Andrew Kane), Pydantic, Typer, structlog. spaCy (Explosion AI) is an optional extra.
Claude Code · Cursor · Cline · Continue · Windsurf
┬───────────────────────────────────────────────────
│ (auto-wired by `thought install`)
▼
┌──────────────────────────────────────────────────────────────────┐
│ MCP server (Streamable HTTP · async handlers) │
│ remember(content, ...) recall(query, ...) │
└──────────────────────────┬───────────────────────────────────────┘
│
▼
┌───────────────────────────┐ LRU recall cache
│ Router │ (write-version keyed)
│ VIBE FACT CHANGE HYBRID│ ↳ rules.yaml (user-editable)
│ + CRAG confidence eval │
└───────────┬───────────────┘
┌───────────┼───────────────┐
▼ ▼ ▼
┌─────────────┐ ┌──────────┐ ┌────────────┐
│ Vector L. │ │ Graph L. │ │ Temporal L.│
│ Matryoshka │ │ HippoRAG │ │ bi-temporal│
│ + GraphRAG │ │ PPR (+ │ │ as_of │
│ + sqlite- │ │ scipy. │ │ (valid + │
│ vec MATCH │ │ sparse + │ │ learned) │
│ │ │ local │ │ │
│ │ │ push) │ │ │
└──────┬──────┘ └────┬─────┘ └─────┬──────┘
│ │ │
▼ ▼ ▼
┌───────────────────────────────────────┐
│ StorageBackend (ABC) │
│ SQLite + sqlite-vec | pgvector │
│ sources · entities · edges · triples │
│ embeddings · strength_cache · log │
│ + bulk source-provenance JOIN │
│ + touch-access flush queue │
└──────────────┬────────────────────────┘
│
▼
┌─────────────────────────┐
│ Consolidation Engine │ background thread
│ Ebbinghaus · cold/warm │ + `thought consolidate` CLI
│ · dedup · audit log │
└─────────────────────────┘
Bi-temporal axis: every entity and edge tracks (valid_from, valid_until) (world-time) and (learned_at, unlearned_at) (transaction-time). "What did we know about X on date Y" and "what was true about X on date Y" are different queries; THOUGHT answers both via recall(..., as_of=Y, as_of_kind='valid' | 'learned').
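The as_of semantics reduce to a half-open interval check on whichever timestamp pair the query names. A minimal sketch over a toy SQLite table (this is the idea, not THOUGHT's real schema):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE facts (
    content TEXT, valid_from TEXT, valid_until TEXT,
    learned_at TEXT, unlearned_at TEXT)""")
db.executemany("INSERT INTO facts VALUES (?, ?, ?, ?, ?)", [
    # Pro was $19 from Jan 1; retiring it just closes valid_until on Feb 3.
    ("Pro is $19", "2026-01-01", "2026-02-03", "2026-01-02", None),
    ("Pro is $25", "2026-02-03", None,         "2026-02-03", None),
])

def recall_as_of(day, kind="valid"):
    start, end = (("valid_from", "valid_until") if kind == "valid"
                  else ("learned_at", "unlearned_at"))
    rows = db.execute(
        f"SELECT content FROM facts WHERE {start} <= ? "
        f"AND ({end} IS NULL OR {end} > ?)", (day, day))
    return [r[0] for r in rows]

print(recall_as_of("2026-01-15"))  # ['Pro is $19'] — the world as it was
print(recall_as_of("2026-03-01"))  # ['Pro is $25']
```

Switching kind to "learned" runs the same interval check over (learned_at, unlearned_at), which is how "what did we know on date Y" differs from "what was true on date Y".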
These are capabilities neither OB1 nor the Karpathy wiki structurally supports — adding them would require rewriting their data layer:
- recall(query, as_of=<past>) returns the world as it was, not as it is.
- Every hit carries confidence_class ∈ {source_grounded, inferred, hallucination_risk} so the LLM knows what to trust.
- Contradictions are a CONTRADICTS typed edge with detected_at and confidence_score — queryable, not LLM lint notes.
- Multi-user isolation is a (scope, owner_id) filter at the storage layer, inherited by every retrieval path.
- Retiring a fact is a valid_until close, never an UPDATE/DELETE — full forensic audit is guaranteed.

These numbers come from tests/comparison/run.py — same workload, same deterministic embedder, three architectures. Reproducible: python -m tests.comparison.run.
| System | VIBE | FACT | CHANGE | HYBRID | overall |
|---|---|---|---|---|---|
| THOUGHT | 100% | 100% | 68% | 66% | 83.5% |
| OB1 | 100% | 100% | 32% | 100% | 83.0% |
| Karpathy wiki | 100% | 30% | 0% | 100% | 57.5% |
THOUGHT and OB1 tie on overall recall@10, but the CHANGE column (68% vs 32%) is the headline number — THOUGHT is 2.1× more accurate on the queries where temporal correctness matters. Karpathy wiki is 0% on temporal: it has no notion of time.
| System | rate |
|---|---|
| THOUGHT | 68% |
| OB1 | 32% |
| Karpathy wiki | 0% |
| System | count |
|---|---|
| THOUGHT | 2 |
| OB1 | 0 |
| Karpathy wiki | 0 |
(From python -m tests.comparison.ablation → docs/ablation.md)
| Variant | Overall | FACT | CHANGE | HYBRID |
|---|---|---|---|---|
| Full v0.1 (all Tier A) | 83.5% | 100% | 68% | 66% |
| − HippoRAG bidirectional PPR | 66.0% | 30% | 68% | 66% |
| − Bi-temporal edge retirement | 75.0% | 100% | 34% | 66% |
| − Query router (force VIBE) | 65.5% | 30% | 32% | 100% |
Each disabled technique costs THOUGHT real, measurable accuracy on the dimension it was added to improve. HippoRAG is worth +70pp on FACT queries; bi-temporal supersession is worth +34pp on CHANGE; the router is worth +18pp overall.
THOUGHT went through three performance passes. Each one targeted the bottleneck the previous one exposed.
v0.2 pass — architectural (sqlite-vec + scipy.sparse + local push PPR):
1. sqlite-vec MATCH ANN queries in place of the brute-force vector scan.
2. Binary quantization — use_binary_quantization=True; another ~8-16× over the float path on production models.
3. scipy.sparse vectorised Personalized PageRank — one CSR matvec per iteration in place of the dict-of-lists power loop.
4. Local push PPR — touches only O(1/(ε·(1−α))) nodes, automatically used when the in-scope KB exceeds 5k entities.

v0.3 pass — system + UX:
5. Batched ingest — all writes from one remember() in one transaction; remember_many() batches across N items in one transaction with one embed_many call → 2-4× ingest throughput.
6. LRU recall cache keyed by (write_version, query, ...) — repeat queries become µs-scale (~130,000× over cold-recall p50).
7. Touch-access batched flush queue — eliminates the per-hit UPDATE on the recall hot path, batches into one executemany periodically.
8. PPR transition-matrix cache with write_version invalidation — repeat FACT recalls skip the COO→CSR matrix rebuild entirely.
9. One-query bulk source-provenance fetch — replaced N+M roundtrips (edges_to per hit + SELECT per source) with a single JOIN.
10. WAL tuning — 64 MiB page cache, 256 MiB mmap, synchronous=NORMAL, busy_timeout=5s.
11. Async MCP tool handlers — asyncio.to_thread lets the Streamable HTTP transport service concurrent recalls.
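Item 6's invalidation trick is worth spelling out: instead of evicting cache entries on write, the cache key includes a global write version, so every write makes all old keys unreachable. A minimal sketch with assumed names (THOUGHT's real cache also keys on recall parameters):

```python
from functools import lru_cache

write_version = 0
misses = 0
FACTS = []

def remember(fact):
    global write_version
    FACTS.append(fact)
    write_version += 1          # every write shifts the cache key space

@lru_cache(maxsize=1024)
def _recall_cached(version, query):
    global misses
    misses += 1                 # count cold recalls for the demo
    return tuple(f for f in FACTS if query in f)

def recall(query):
    return _recall_cached(write_version, query)

remember("v2 uses Postgres")
recall("Postgres")
recall("Postgres")              # served from cache — no second cold recall
remember("auth uses sessions")  # version bump → old cache keys unreachable
recall("Postgres")              # recomputed against the new version
print(misses)  # → 2
```

No invalidation pass ever runs; stale entries simply age out of the LRU. That is what makes repeat queries µs-scale while guaranteeing a write is never masked by a cached result.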
Same workload (Entity{i} owns Company{i%50} Corp.), same Windows laptop, deterministic embedder, 30 unique queries (no cache hits) for cold recall measurement:
| KB size | v0.1 recall p50 | v0.2 recall p50 | v0.3 recall p50 | v0.3 ingest (bulk) | v0.3 cache-hit p50 |
|---|---|---|---|---|---|
| 1,000 | 50.3 ms | 12.3 ms | 8.5 ms | 0.67 s | 0.7 µs |
| 5,000 | 261.6 ms | 42.5 ms | 37.8 ms | 3.73 s | 0.7 µs |
| 10,000 | 521.4 ms | 61.6 ms | 93.6 ms¹ | 7.47 s | 0.7 µs |
| 25,000 | ~1,300 ms² | 171.8 ms | 186.0 ms | 17.18 s | 0.7 µs |
¹ v0.3 honest-cold-cache numbers are slightly higher than v0.2's warm-cache numbers at the same KB size — v0.2 measured 20 repeats of the same query without a cache, which our profiler flattered. With the v0.3 LRU cache, repeated queries become essentially free (0.7 µs), so the real-world latency curve is the cold-cache row for first-time queries and the cache-hit column for everything else.
² Original v0.1 took >10s per recall at 25k entities; numbers extrapolated from the linear growth pattern.
Overall vs v0.1: 5-7× faster cold recalls, ~10,000-130,000× faster cache hits, 2-4× faster ingest (bulk).
Growth pattern: 25× more data → ~22× more latency in v0.3 — closer to linear at the high end because the deterministic embedder is itself O(N) on the brute-force fallback; with sentence-transformers/all-MiniLM-L6-v2 (production embedder, dense vectors), sqlite-vec's index becomes sub-linear and you get the full architectural win.
Also unchanged:
- len(hits) ≤ 10 always, verified at every KB size.

| Capability | THOUGHT | OB1 | Karpathy wiki |
|---|---|---|---|
| bi-temporal as_of | ✅ | ✗ | ✗ |
| source-grounded confidence class | ✅ | ✗ | ✗ |
| contradiction as typed edge | ✅ | ✗ | ✗ |
| multi-user scope isolation | ✅ | partial (RLS) | ✗ |
| append-only audit log | ✅ | ✗ | ✗ |
| Personalized PageRank retrieval | ✅ | ✗ | ✗ |
| Ebbinghaus decay scoring | ✅ | ✗ | ✗ |
| CRAG-style low-confidence flag | ✅ | ✗ | ✗ |
| Matryoshka 2-pass ANN | ✅ | ✗ | ✗ |
| Anthropic Contextual Retrieval | ✅ | ✗ | ✗ |
| query router (VIBE/FACT/CHANGE) | ✅ | ✗ | ✗ |
| forecasting (TLogic, v0.2) | planned | ✗ | ✗ |
Full architectural discussion in plan.md. Short version of the philosophy:
A memory system should know what kind of question is being asked before it searches anything, store facts with their origin and validity, and never lose history in the act of updating.
The three-layer split (Vector / Graph / Temporal) plus the Router is the architectural answer: each query class is dispatched to the mathematical structure that fits it. The eleven frontier techniques stack 1.5-3× gains on orthogonal axes; together they take the system from "pgvector wrapper" to "memory fabric."
Honest framing: no single 2024-2026 technique gives a 10× recall jump. The "1000× more useful" claim isn't about recall@10; it's about capabilities competitors structurally cannot have (the matrix above) compounded with stacked accuracy gains (the ablation table).
Default config (thought.toml, written by thought init):
db_path = ".thought/thought.db"
[embedding]
choice = "auto" # "auto" picks sentence-transformers if installed,
# else deterministic (zero-dep test embedder).
# Override: "minilm" | "bge-m3" | "openai" | "deterministic"
dim = 384
[server]
host = "127.0.0.1"
port = 8765
[consolidation]
enabled = true
cycle_seconds = 60.0
cold_demotion_days = 30
staleness_days = 30
batch_size = 100
[llm] # optional — enables Contextual Retrieval enrichment
enabled = false
provider = "none" # "anthropic" | "openai" | "ollama"
thought walks the directory tree (git-style) looking for a thought.toml, so you don't need a --config flag when running from a subfolder of your project.
Environment overrides: THOUGHT_DB_PATH, THOUGHT_EMBEDDER.
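The git-style config discovery described above can be sketched as a short upward walk. This mirrors the documented behaviour but is not THOUGHT's actual code; the function name is illustrative.

```python
import tempfile
from pathlib import Path

def find_config(start):
    """Return the nearest thought.toml at or above `start`, or None."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        candidate = directory / "thought.toml"
        if candidate.is_file():
            return candidate
    return None

# Demo: a config at the project root is found from a nested subfolder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "thought.toml").write_text('db_path = ".thought/thought.db"\n')
    sub = root / "src" / "deep"
    sub.mkdir(parents=True)
    print(find_config(sub) == root / "thought.toml")  # → True
```

An environment override like THOUGHT_DB_PATH would then take precedence over whatever file the walk finds.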
thought init [--quick] [--embedder auto|minilm|deterministic]
# write config + db + CLAUDE.md
thought install --detect # show every detected MCP client config path
thought install --client cursor # wire one client (with backup, idempotent)
thought install --all # wire every detected client
thought start [--client cursor] # init-if-needed + install + serve in one command
thought serve [--host ... --port ... --skip-precheck]
# start MCP server on Streamable HTTP
thought doctor # deep environment health check
thought --version
thought ingest "Alice owns Acme Corp."
thought ingest --file notes.md
thought ingest --glob 'docs/**/*.md'
cat changelog.txt | thought ingest --stdin
# Per-item scope
thought ingest --file private-notes.md --scope private --owner-id alice
thought recall "who owns Acme"
thought recall "what did we say about pricing" --as-of 2026-01-01
thought recall "auth changes" --as-of 2026-01-01 --as-of-kind learned
thought recall "alice" --json # raw JSON for piping into other tools
thought stats # entities / edges / sources / contradictions / top accessed
thought repl # interactive shell — type queries, +text to remember
thought forget 'kendra%' # soft-delete by SQL LIKE pattern (audit-logged)
thought consolidate # run one consolidation cycle
thought ingest-code <path> [--glob '**/*.py'] [--lang python|typescript|auto]
# tree-sitter ingest — functions / classes / methods as entities
thought ingest-git <repo> [--mode snapshot|full] [--paths '*.py,*.ts']
# commit-stamped ingest; --mode full walks every commit
thought callers <name> [--file path] [--limit 10]
# direct callers ranked by HippoRAG PageRank
thought impact <name> [--file path] [--limit 20]
# transitive impact set: what's affected if you change <name>
thought diff --from <sha1> --to <sha2> [--file path]
# set diff of entities between two ingested commits
docker build -t thought-mcp .
docker run --rm -p 8765:8765 -v thought-data:/data thought-mcp
The image runs as a non-root user, exposes :8765, persists state at /data, and runs thought serve as the default command. Once tagged releases are pushed, an upstream image is published at ghcr.io/<owner>/thought-mcp:<version> and :latest.
thought install --detect says my client path doesn't exist

Most clients only create their config file after first launch. Open the client once, then re-run thought install --client <name>. The installer will create the file if its parent directory exists.
sqlite enable_load_extension reports NO in thought doctor

You're on a Python build without loadable-extension support — most commonly Anaconda's bundled Python. Two fixes:
# Option A — install python.org Python and use that interpreter
# Option B — use pysqlite3-binary
pip install pysqlite3-binary
THOUGHT falls back to a pure-Python ANN path automatically, so this is a performance issue, not a correctness one.
low_confidence: true with no results

The CRAG evaluator flags this when the top hit's score is below threshold. Common causes:
- The knowledge base may simply be empty — run thought stats to confirm.
- You may be on the deterministic test embedder — set embedder = "auto" in thought.toml and reinstall sentence-transformers: pip install 'thought-mcp[embeddings-local]'.
- The query phrasing may not match anything stored — use thought repl to iterate.

The MCP server fails to start in the client

thought doctor # confirm MCP SDK + vec extension load
thought serve --skip-precheck # try without the precheck
# Then inspect the client's MCP logs — most surface "failed to start" with a path
If uvx thought-mcp serve is in your mcpServers config and uvx isn't on PATH for the GUI client, switch the command to an absolute path to the thought entrypoint (which thought / where thought).
First recall after startup is slow

The first call lazy-loads the embedder (downloads all-MiniLM-L6-v2, ~80 MB, on first run). After that it's warm. Use thought init (without --quick) to pre-download.
The CLI reconfigures stdout/stderr to UTF-8 at startup. If you're piping through a tool that still uses cp1252, set PYTHONIOENCODING=utf-8 in your shell.
pytest tests/unit -q # 56 unit tests
pytest tests/perf -m perf # 4 performance benchmarks
python -m tests.comparison.run # rebuilds docs/comparison.md
python -m tests.comparison.ablation # rebuilds docs/ablation.md
Coverage target: 85% on src/thought. CI matrix runs Python 3.11/3.12/3.13 × Ubuntu/macOS/Windows on every push (see .github/workflows/ci.yml). Tagging v* triggers release.yml (PyPI trusted publishing) and docker.yml (multi-arch GHCR image).
Current (shipped) — 11 Tier A frontier techniques (Contextual Retrieval, HippoRAG PageRank, bi-temporal Graphiti, atomic-fact triples + Jaccard dedup, BGE-M3 hybrid embeddings, Matryoshka 2-pass retrieval, CRAG evaluator, MetaRAG confidence class, Ebbinghaus decay, context-engineering budget per query class, append-only writes); comparison + ablation harnesses; two MCP tools; multi-platform CLI with auto-install for five MCP clients; LRU recall cache + PPR matrix cache + sqlite-vec + scipy.sparse PageRank + local push PPR + batched ingest (the three perf passes described above); Docker + PyPI release workflows.
v0.2 fast-follow — RAPTOR hierarchical summary trees at WARM→COLD demotion (Sarthi et al., ICLR 2024); sleep-time compute pre-computation (Letta + UCB, April 2025); TLogic temporal-rule forecasting (arXiv 2112.08025); Reflexion-style self-edit (Shinn et al., NeurIPS 2023); multi-hop deep recall (IRCoT/PRISM); introspective thought audit (transformer-circuits, 2025).
v0.3+ — RankZephyr local reranker, PIKE-RAG domain rationale extraction, DSPy-learned retrieval policies, real Postgres backend, REST API alongside MCP, encryption-at-rest (SQLCipher / pgcrypto), tenant isolation, OpenTelemetry traces/metrics.
MIT — see LICENSE.