loading…
Search for a command to run...
loading…
A local, SQLite-backed code index for Claude Code, exposed over MCP, enabling targeted code retrieval without external APIs.
A local, SQLite-backed code index for Claude Code, exposed over MCP, enabling targeted code retrieval without external APIs.
A local, SQLite-backed code index for Claude Code, exposed over MCP. It
replaces blind Read / Grep / Glob exploration with targeted retrieval —
"where is parseAuthToken defined", "what calls Indexer.reindex_all", "find
the rate-limiting code" — answered in milliseconds against an offline index.
No API keys. No external services. The embedder runs locally on your machine.
getUserAuthToken → get user auth token) so search matches both styles.jina-embeddings-v2-base-code (768-dim) via sentence-transformers..claude/index.db (SQLite + sqlite-vec + FTS5).PostToolUse hook that incrementally re-indexes touched files.| Tool | Purpose |
|---|---|
code_search |
Hybrid (vector + FTS) search for conceptual queries (e.g., "auth flow", "where do we parse JSON"). |
symbol_lookup |
Exact-name lookup of functions / classes / methods / types. Prefer over code_search for identifiers. |
file_outline |
Symbols (with signatures) in a file, in source order. Use instead of Read when you only need shape. |
module_outline |
Symbols across a directory subtree in one call. Use instead of looping file_outline. |
where_am_i |
Given path + line, returns the innermost symbol and the full enclosing chain. |
get_symbol_body |
Full chunk for a symbol_id from symbol_lookup / code_search / file_outline. |
get_symbol_bodies |
Batch version of get_symbol_body (up to 20 ids per call). |
callers |
Symbols that CALL the given symbol. depth (1-5) expands transitively. |
callees |
Symbols that the given symbol CALLS. depth (1-5) expands transitively. |
references |
Non-call uses (subclasses, free identifier references). Companion to callers / callees. |
trace |
Build a call-graph tree from an entry symbol; flat=true returns nodes/edges for cheap LLM scans. |
file_imports |
Files this file imports (direction=imports) or that import it (direction=imported_by). |
recent_changes |
Files touched in the last N git commits. |
propose_rename |
v1: same-file rename. Returns an edit list the agent applies via its own Edit tool; refuses on clash. |
| Tool / op | Purpose |
|---|---|
admin op=init |
Build or refresh the index. Incremental by default; force=true rebuilds from scratch. |
admin op=setup_check |
Diagnose hook wiring + embedder + host. Round-trip-tests the hook end-to-end. |
admin op=install_hook |
Wire the auto-reindex PostToolUse hook into .claude/settings.json. Idempotent. |
admin op=stats |
Read-only: file counts by language, symbol totals, embed model fingerprint, last-index time. |
admin op=verify |
Integrity sweep: orphan rows, parse-failure files, dangling edges. |
embed_query_debug is a dev-only ranking diagnostic, hidden from list_tools
unless CODE_INDEX_DEBUG=1 is set.
All tools return bounded JSON; large bodies use get_symbol_body rather than
inlining whole files.
sqlite-vec).PYTHON_CONFIGURE_OPTS=--enable-loadable-sqlite-extensions pyenv install 3.12.x.uv / uvx (install) — recommended runner. Or pip if you prefer a permanent install.One command, no API keys:
claude mcp add-json -s user code-index "$(cat <<'JSON'
{
"type": "stdio",
"command": "uvx",
"args": ["--refresh", "--from", "mcp-code-index", "code-index-mcp"]
}
JSON
)"
Then open Claude Code in any repo and ask:
"Build the code index for this repo."
Claude calls the init MCP tool, which writes .claude/index.db. From then on,
ask things like "where is parseAuthToken defined?" or "what calls
Indexer.reindex_all?" — Claude routes them through symbol_lookup /
callers / code_search instead of grepping.
What
--refreshdoes — fetches the latest PyPI release on every Claude Code launch. Convenient during preview; drop it once you want to pin a version (saves ~1s of startup).Project-only install — drop
-s userto register the server in the current project's.claude/settings.jsoninstead of the global~/.claude.json.First-run model download — the first
initpullsjina-embeddings-v2-base-code(~600 MB) into~/.cache/huggingfaceand caches it forever. Subsequent runs are fully offline. If your network blocks Hugging Face, pre-warm the cache from a machine that has access.Already installed without
--refresh? Runclaude mcp remove code-indexfirst, then re-run the command above.
pip install mcp-code-index
claude mcp add -s user code-index -- code-index-mcp
Without a hook, the index drifts when files change outside the agent (mv,
git checkout, IDE saves) until you call init again. With one, every
Edit / Write / MultiEdit Claude performs triggers an incremental reindex
of the touched file.
Easiest path: ask Claude. On first use in a new project, ask "set up the
code-index" — Claude calls setup_check → install_hook → init. The hook
command is derived from how the MCP server was launched (uvx-aware), so it
uses the same Python toolchain. Hook output goes to .claude/code-index-hook.log
so failures are debuggable.
Manual install — add this block to the project's .claude/settings.json
under hooks.PostToolUse (the version you want depends on how you launch the
server — install_hook derives the right one for you):
{
"matcher": "Edit|Write|MultiEdit",
"hooks": [
{
"type": "command",
"command": "uvx --with 'sentence-transformers<5' --with 'numpy<2' --from mcp-code-index code-index-hook"
}
]
}
The server speaks standard MCP over stdio, so any client that supports MCP
servers works (Cursor, Continue, Cody, Zed, etc.). Configure the client to
launch uvx --refresh --from mcp-code-index code-index-mcp (or
code-index-mcp after pip install mcp-code-index). Once connected, call the
init tool from inside the client to bootstrap the index. Drop --refresh
when you want to pin to a stable version instead of always pulling latest.
git clone https://github.com/achreftlili/code-index
cd code-index
pip install -e .
code-index init # CLI alternative to the `init` MCP tool
code-index-mcp # starts the MCP server on stdio (for manual wiring)
All settings are optional — the defaults work out of the box. Override them via
environment variables. Inside Claude Code, set them in the env block of your
code-index server entry in ~/.claude.json (then reconnect the MCP server).
Common knobs (most users only ever touch these):
| Var | Default | When to set it |
|---|---|---|
CODE_INDEX_EMBED_DEVICE |
auto | Force the torch device: cpu, mps, or cuda. Set cpu on Apple Silicon if init fails with MPS out-of-memory. |
CODE_INDEX_EMBED_BATCH |
32 |
Encode batch size. Lower (e.g. 8 or 4) to cut peak GPU memory while staying on mps/cuda. |
CODE_INDEX_DB |
.claude/index.db |
Override the SQLite index path (e.g. to share an index across sibling worktrees). |
Advanced (rarely needed):
| Var | Default | Notes |
|---|---|---|
CODE_INDEX_EMBEDDER |
jina |
Only jina (local sentence-transformers) is supported today; the variable exists for future expansion. |
CODE_INDEX_EMBED_MODEL |
jinaai/jina-embeddings-v2-base-code |
HuggingFace model id. Only override if you know the model is dim-compatible (768d). |
CODE_INDEX_EMBED_DIM |
768 |
Must match the embedding model's output dimension. |
init fails with MPS backend out of memory on Apple Silicon. A large
file produced a chunk batch bigger than your GPU's free VRAM. Quickest fix —
re-run on CPU (slower but bulletproof):
"env": {
"CODE_INDEX_EMBED_DEVICE": "cpu"
}
To stay on the GPU, shrink the batch instead: "CODE_INDEX_EMBED_BATCH": "8".
Reconnect the MCP server (/mcp → reconnect, or restart Claude Code) so the
new env takes effect. init is incremental — already-embedded files are
skipped on the retry.
init fails with a Hugging Face network error on first run. Your network
is blocking model downloads. Pre-warm the cache on a machine that has access:
huggingface-cli download jinaai/jina-embeddings-v2-base-code
# then copy ~/.cache/huggingface/ to the offline machine
sqlite3.OperationalError: not authorized or sqlite-vec fails to load.
Your Python build doesn't have loadable SQLite extensions. See
Requirements — install via python.org or a pyenv build with
PYTHON_CONFIGURE_OPTS=--enable-loadable-sqlite-extensions.
code_search / symbol_lookup returns stale paths after a refactor or
branch checkout. The auto-reindex hook only fires on Claude's Edit /
Write / MultiEdit. After bulk file moves outside the agent (mv,
git checkout, IDE rename), re-run init (it's incremental). Or wire up the
hook so the index keeps up with
agent edits automatically.
src/code_index/
db.py SQLite schema, connection, sqlite-vec loading
parser.py Tree-sitter wrapper, symbol + edge extraction
imports.py Per-language import target → file path resolution
chunker.py Per-symbol chunks, identifier expansion
embedder.py Local Jina (sentence-transformers) backend
indexer.py Pipeline: walk → parse → chunk → embed → write
reindexer.py Per-root engine cache; one entry point for "reindex one file"
retriever.py Hybrid search (vector + FTS5) with RRF
watcher.py File watcher (watchdog)
admin.py setup_check / install_hook / init logic (pure, no MCP state)
mcp_server.py MCP wiring, shared helpers, schema fragments
tool_registry.py Shared `@_tool` decorator + `_TOOLS` registry
tools/ Per-domain MCP handlers (graph, paths, refactor, …)
hook.py `code-index-hook` console script — the PostToolUse entry point
cli.py init / reindex / watch / stats
Run in your terminal:
claude mcp add code-index -- npx Security
Low riskAutomated heuristic from public metadata — not a security guarantee.