An MCP server for efficient code indexing and symbol retrieval using tree-sitter AST parsing to fetch specific functions or classes without loading entire files. It significantly reduces AI token costs by providing O(1) byte-offset access to code components across multiple programming languages.
MCP server for efficient code indexing and symbol retrieval. Index GitHub repos or local folders once with tree-sitter AST parsing, then let AI agents retrieve only the specific symbols they need — instead of loading entire files.
Distributed as a single-file binary, so deployment is trivial.
Cut code-reading token costs by up to 99%.
The index is stored locally in ~/.code-index/ (configurable). Incremental re-indexing only re-parses changed files.
The server automatically indexes the working directory on startup (incremental, non-blocking). Optionally set ASTLLM_WATCH=1 to also watch for file changes and re-index automatically. The watcher skips noisy directories (node_modules, .git, dist, .next, etc.) by default — see Watch excludes below.
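Incremental re-indexing comes down to deciding which files changed since the last run. The sketch below shows one way to make that decision with per-file content fingerprints. How astllm-mcp actually detects changes (hashes, mtimes, or something else) is not stated here, so `fingerprint`, `planReindex`, and the FNV-1a hash are illustrative assumptions rather than the project's API.

```typescript
// Hypothetical sketch: all names and the hashing scheme are assumptions.

/** 32-bit FNV-1a over UTF-16 code units; a stand-in for a real content hash. */
function fingerprint(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

interface ReindexPlan {
  reparse: string[]; // new or modified files
  remove: string[];  // files that disappeared since the last index
  keep: string[];    // unchanged files whose symbols can be reused
}

function planReindex(
  previous: Map<string, string>,     // path -> fingerprint from the last index
  currentFiles: Map<string, string>, // path -> current file contents
): ReindexPlan {
  const plan: ReindexPlan = { reparse: [], remove: [], keep: [] };
  for (const [path, contents] of currentFiles) {
    if (previous.get(path) === fingerprint(contents)) plan.keep.push(path);
    else plan.reparse.push(path);
  }
  for (const path of previous.keys()) {
    if (!currentFiles.has(path)) plan.remove.push(path);
  }
  return plan;
}

const previousIndex = new Map<string, string>([
  ["src/a.ts", fingerprint("export const a = 1;")],
  ["src/b.ts", fingerprint("export const b = 2;")],
  ["src/old.ts", fingerprint("export {};")],
]);
const currentFiles = new Map<string, string>([
  ["src/a.ts", "export const a = 1;"],   // unchanged
  ["src/b.ts", "export const b = 3;"],   // modified
  ["src/new.ts", "export const c = 4;"], // added
]);
const reindexPlan = planReindex(previousIndex, currentFiles);
console.log(reindexPlan);
```

Only the files listed under `reparse` would need to go through tree-sitter again; everything under `keep` can reuse the stored symbols.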
Python, JavaScript, TypeScript, TSX, Go, Rust, Java, PHP, Dart/Flutter, C#, C, C++, Swift
Download the binary for your platform from the GitHub Releases page:
| Platform | File |
|---|---|
| macOS ARM (M1/M2/M3) | astllm-mcp-macosx-arm |
| Linux x86-64 | astllm-mcp-linux-x86 |
| Linux ARM64 | astllm-mcp-linux-arm |
# Example for Linux x86-64
curl -L https://github.com/tluyben/astllm-mcp/releases/latest/download/astllm-mcp-linux-x86 -o astllm-mcp
chmod +x astllm-mcp
./astllm-mcp # runs as an MCP stdio server
The pre-built binary needs no Node.js, npm, or build tools.
Building from source requires Node.js 18+ and a C++20-capable compiler (for the tree-sitter native bindings).
git clone https://github.com/tluyben/astllm-mcp
cd astllm-mcp
CXXFLAGS="-std=c++20" npm install --legacy-peer-deps
npm run build
Note on Node.js v22+: The `CXXFLAGS="-std=c++20"` flag is required because the V8 headers in Node.js v22+ mandate C++20. The `--legacy-peer-deps` flag is needed because the tree-sitter grammar packages target slightly different tree-sitter core versions.
Option A — claude mcp add CLI (easiest):
# Pre-built binary, project-scoped (.mcp.json)
claude mcp add astllm /path/to/astllm-mcp-linux-x86 --scope project
# Pre-built binary, user-scoped (~/.claude.json)
claude mcp add astllm /path/to/astllm-mcp-linux-x86 --scope user
# From source (Node.js), project-scoped
claude mcp add --scope project astllm -- node /path/to/astllm-mcp/dist/index.js
Option B — manual JSON config:
Add to ~/.claude.json (global) or .mcp.json in your project root (project-scoped):
Pre-built binary:
{
"mcpServers": {
"astllm": {
"command": "/path/to/astllm-mcp-linux-x86",
"type": "stdio"
}
}
}
From source (Node.js):
{
"mcpServers": {
"astllm": {
"command": "node",
"args": ["/path/to/astllm-mcp/dist/index.js"],
"type": "stdio"
}
}
}
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
Pre-built binary:
{
"mcpServers": {
"astllm": {
"command": "/path/to/astllm-mcp-macosx-arm"
}
}
}
From source (Node.js):
{
"mcpServers": {
"astllm": {
"command": "node",
"args": ["/path/to/astllm-mcp/dist/index.js"]
}
}
}
`index_repo`: Index a GitHub repository. Fetches source files via the GitHub API, parses ASTs, and stores symbols locally.

| Parameter | Description |
|---|---|
| `repo_url` | GitHub URL or "owner/repo" slug |
| `generate_summaries` | Generate one-line AI summaries (requires API key, default: false) |
| `incremental` | Only re-index changed files (default: true) |
| `storage_path` | Custom storage directory |

`index_folder`: Index a local folder recursively.

| Parameter | Description |
|---|---|
| `folder_path` | Path to index |
| `generate_summaries` | AI summaries (default: false) |
| `extra_ignore_patterns` | Additional gitignore-style patterns |
| `follow_symlinks` | Follow symlinks (default: false) |
| `incremental` | Only re-index changed files (default: true) |
| `storage_path` | Custom storage directory |

`list_repos`: List all indexed repositories with file count, symbol count, and last-indexed time.

`get_repo_outline`: High-level overview: directory breakdown, language distribution, symbol kind counts.

| Parameter | Description |
|---|---|
| `repo` | Repository identifier ("owner/repo" or short name if unique) |

`get_file_tree`: File and directory structure with per-file language and symbol count. Much cheaper than reading files.

| Parameter | Description |
|---|---|
| `repo` | Repository identifier |
| `path_prefix` | Filter to a subdirectory |
| `include_summaries` | Include per-file summaries |
`get_file_outline`: All symbols in a file as a hierarchical tree (methods nested under their class).

| Parameter | Description |
|---|---|
| `repo` | Repository identifier |
| `file_path` | File path relative to repo root |
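The nesting that `get_file_outline` describes can be pictured as grouping symbols by their dotted qualified names. The following is a sketch under assumptions: the `FlatSymbol` and `OutlineNode` shapes and the `buildOutline` helper are hypothetical, and the real tool's response format may differ.

```typescript
// Hypothetical shapes; not the server's actual response format.
interface FlatSymbol {
  qualifiedName: string; // e.g. "AuthService.login"
  kind: string;          // e.g. "method"
}

interface OutlineNode {
  name: string;
  kind: string;
  children: OutlineNode[];
}

/** Nest symbols under their parents using the dotted qualified name. */
function buildOutline(symbols: FlatSymbol[]): OutlineNode[] {
  const depth = (s: FlatSymbol): number => s.qualifiedName.split(".").length;
  // Shallowest first, so every parent exists before its children are placed.
  const ordered = [...symbols].sort((a, b) => depth(a) - depth(b));

  const roots: OutlineNode[] = [];
  const byName = new Map<string, OutlineNode>();
  for (const sym of ordered) {
    const dot = sym.qualifiedName.lastIndexOf(".");
    const node: OutlineNode = {
      name: sym.qualifiedName.slice(dot + 1),
      kind: sym.kind,
      children: [],
    };
    byName.set(sym.qualifiedName, node);
    const parent =
      dot === -1 ? undefined : byName.get(sym.qualifiedName.slice(0, dot));
    // Symbols whose parent is not in the list fall back to the top level.
    const siblings = parent ? parent.children : roots;
    siblings.push(node);
  }
  return roots;
}

const outline = buildOutline([
  { qualifiedName: "AuthService.login", kind: "method" },
  { qualifiedName: "AuthService", kind: "class" },
  { qualifiedName: "AuthService.logout", kind: "method" },
  { qualifiedName: "parseURL", kind: "function" },
]);
console.log(JSON.stringify(outline, null, 2));
```

Here `AuthService` ends up at the top level with `login` and `logout` as children, and `parseURL` sits beside it with no children.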
`get_symbol`: Full source code for a single symbol, retrieved by byte-offset seek (O(1)).

| Parameter | Description |
|---|---|
| `repo` | Repository identifier |
| `symbol_id` | Symbol ID from get_file_outline or search_symbols |
| `verify` | Check content hash for drift detection (default: false) |
| `context_lines` | Lines of context around the symbol (0–50, default: 0) |
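To illustrate why a byte-offset lookup avoids re-reading or re-parsing a file, here is a minimal in-memory sketch. It assumes the index stores a UTF-8 byte offset and length per symbol; the `SymbolSpan` shape and the `spanOf` and `readSymbol` helpers are hypothetical, and a `Uint8Array` stands in for the stored file copy that a real implementation would seek into on disk.

```typescript
// Hypothetical sketch; field and function names are assumptions.
interface SymbolSpan {
  byteOffset: number; // where the symbol starts, in UTF-8 bytes
  byteLength: number; // how many bytes it spans
}

const encoder = new TextEncoder();
const decoder = new TextDecoder();

/** Index time: locate `snippet` in `source` and record its position in bytes. */
function spanOf(source: string, snippet: string): SymbolSpan {
  const charIndex = source.indexOf(snippet);
  if (charIndex < 0) throw new Error("snippet not found");
  return {
    byteOffset: encoder.encode(source.slice(0, charIndex)).length,
    byteLength: encoder.encode(snippet).length,
  };
}

/** Retrieval time: slice the symbol out of the raw bytes, no parsing needed. */
function readSymbol(fileBytes: Uint8Array, span: SymbolSpan): string {
  const end = span.byteOffset + span.byteLength;
  return decoder.decode(fileBytes.subarray(span.byteOffset, end));
}

const sourceText = "// café\nfunction a() {}\nfunction b() { return 1; }\n";
const fileBytes = encoder.encode(sourceText);
const spanB = spanOf(sourceText, "function b() { return 1; }");
console.log(spanB, readSymbol(fileBytes, spanB));
```

Note that the byte offset (25) differs from the character index (24) because `é` takes two bytes in UTF-8, which is why an index of this kind has to work in bytes rather than characters.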
`get_symbols`: Batch retrieval of multiple symbols in one call.

| Parameter | Description |
|---|---|
| `repo` | Repository identifier |
| `symbol_ids` | Array of symbol IDs |

`search_symbols`: Search symbols by name, kind, language, or file pattern. Returns signatures and summaries; no source is loaded until you call get_symbol.

| Parameter | Description |
|---|---|
| `repo` | Repository identifier |
| `query` | Search query |
| `kind` | Filter: function, class, method, type, constant, or interface |
| `file_pattern` | Glob pattern, e.g. "src/**/*.ts" |
| `language` | Filter by language |
| `limit` | Max results 1–100 (default: 50) |

`search_text`: Full-text search across indexed file contents. Useful for string literals, comments, config values.

| Parameter | Description |
|---|---|
| `repo` | Repository identifier |
| `query` | Case-insensitive substring |
| `file_pattern` | Glob pattern to restrict files |
| `limit` | Max matching lines (default: 100) |

`invalidate_cache`: Delete a repository's index, forcing a full re-index on the next operation.

| Parameter | Description |
|---|---|
| `repo` | Repository identifier |
Symbol IDs have the format file/path::qualified.Name#kind, for example:
src/auth/login.ts::AuthService.login#method
src/utils.go::parseURL#function
lib/models.py::User#class
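A small parser makes the three parts of the ID format explicit. This is a sketch only: `parseSymbolId` and the `SymbolId` shape are hypothetical helpers, and it assumes that file paths never contain `::` and that the kind is whatever follows the last `#`.

```typescript
// Hypothetical helper; not part of the server's API.
interface SymbolId {
  filePath: string;      // e.g. "src/auth/login.ts"
  qualifiedName: string; // e.g. "AuthService.login"
  kind: string;          // e.g. "method"
}

function parseSymbolId(id: string): SymbolId {
  const sep = id.indexOf("::");      // first "::" ends the file path
  const hash = id.lastIndexOf("#");  // last "#" starts the kind
  if (sep <= 0 || hash <= sep + 2 || hash === id.length - 1) {
    throw new Error(`Malformed symbol ID: ${id}`);
  }
  return {
    filePath: id.slice(0, sep),
    qualifiedName: id.slice(sep + 2, hash),
    kind: id.slice(hash + 1),
  };
}

const parsedId = parseSymbolId("src/auth/login.ts::AuthService.login#method");
console.log(parsedId);
```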
Get IDs from get_file_outline or search_symbols, then pass them to get_symbol.
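MCP clients invoke tools through JSON-RPC `tools/call` requests, so the search-then-fetch flow amounts to two messages. The sketch below only builds the request payloads; the `toolCall` helper and the `local/my-project` repo name are hypothetical, while the argument names come from the tool reference above. In practice the `symbol_id` would be taken from the `search_symbols` response rather than hard-coded.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

/** Hypothetical helper that wraps a tool name and arguments in a request. */
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Step 1: find candidate symbols (signatures and summaries only).
const searchRequest = toolCall("search_symbols", {
  repo: "local/my-project", // hypothetical repo identifier
  query: "login",
  kind: "method",
  limit: 5,
});

// Step 2: fetch the source of one result by its symbol ID.
const fetchRequest = toolCall("get_symbol", {
  repo: "local/my-project",
  symbol_id: "src/auth/login.ts::AuthService.login#method",
  context_lines: 2,
});

console.log(JSON.stringify(searchRequest));
console.log(JSON.stringify(fetchRequest));
```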
Every response includes a _meta envelope:
{
"_meta": {
"timing_ms": 2.1,
"tokens_saved": 14823,
"total_tokens_saved": 89412,
"cost_avoided_claude_usd": 0.222345,
"cost_avoided_gpt_usd": 0.148230,
"total_cost_avoided_claude_usd": 1.34118
}
}
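The sample figures are consistent with a flat per-token price: 0.222345 / 14823 works out to $15 per million tokens for Claude, and 0.148230 / 14823 to $10 per million for GPT. Those rates are inferred from the sample above, not documented, and `costAvoidedUsd` is a hypothetical name used only to show the arithmetic.

```typescript
// Rates inferred from the sample envelope; treat them as assumptions.
const USD_PER_MILLION_TOKENS = { claude: 15, gpt: 10 } as const;

/** Cost avoided = tokens saved x price per token. */
function costAvoidedUsd(tokensSaved: number, usdPerMillion: number): number {
  return (tokensSaved * usdPerMillion) / 1e6;
}

const claudeUsd = costAvoidedUsd(14823, USD_PER_MILLION_TOKENS.claude);
const gptUsd = costAvoidedUsd(14823, USD_PER_MILLION_TOKENS.gpt);
const totalClaudeUsd = costAvoidedUsd(89412, USD_PER_MILLION_TOKENS.claude);

console.log({ claudeUsd, gptUsd, totalClaudeUsd });
```

The three results reproduce `cost_avoided_claude_usd`, `cost_avoided_gpt_usd`, and `total_cost_avoided_claude_usd` from the sample.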
Set one of these environment variables to enable one-line symbol summaries:
# Anthropic Claude Haiku (recommended)
export ANTHROPIC_API_KEY=sk-ant-...
# Google Gemini Flash
export GOOGLE_API_KEY=...
# OpenAI-compatible (Ollama, etc.)
export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_MODEL=llama3
Summaries use a three-tier fallback: docstring first-line → AI → signature.
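The fallback order can be sketched as a single function. The `SymbolInfo` shape, the `AiSummarizer` callback, and `summarize` itself are hypothetical; a real AI call would be asynchronous, and it is kept synchronous here purely for brevity.

```typescript
// Hypothetical shapes; the server's internal types are not shown in this README.
interface SymbolInfo {
  signature: string;
  docstring?: string;
}

type AiSummarizer = (symbol: SymbolInfo) => string | undefined;

/** Tier 1: first docstring line. Tier 2: AI summary. Tier 3: signature. */
function summarize(symbol: SymbolInfo, ai?: AiSummarizer): string {
  const firstDocLine = symbol.docstring
    ?.split("\n")
    .map((line) => line.trim())
    .find((line) => line.length > 0);
  if (firstDocLine) return firstDocLine;

  const aiSummary = ai?.(symbol)?.trim();
  if (aiSummary) return aiSummary;

  return symbol.signature;
}

const loginSignature = "login(user: string): Promise<Session>";
const fakeAi: AiSummarizer = () => "Authenticates a user and opens a session.";

const fromDocstring = summarize(
  { signature: loginSignature, docstring: "\n  Log a user in.\n  More detail." },
  fakeAi,
);
const fromAi = summarize({ signature: loginSignature }, fakeAi);
const fromSignature = summarize({ signature: loginSignature });

console.log({ fromDocstring, fromAi, fromSignature });
```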
| Variable | Default | Description |
|---|---|---|
| `CODE_INDEX_PATH` | `~/.code-index` | Index storage directory |
| `GITHUB_TOKEN` | (none) | GitHub API token (higher rate limits, private repos) |
| `ASTLLM_MAX_INDEX_FILES` | `500` | Max files to index per repo |
| `ASTLLM_MAX_FILE_SIZE_KB` | `500` | Max file size to index (KB) |
| `ASTLLM_LOG_LEVEL` | `warn` | Log level: debug, info, warn, error |
| `ASTLLM_LOG_FILE` | (none) | Log to file instead of stderr |
| `ASTLLM_WATCH` | `0` | Watch working directory for source file changes and re-index automatically (1 or true to enable). Excluded dirs are never watched; see Watch excludes. |
| `ASTLLM_PERSIST` | `0` | Persist the index to ~/.astllm/{path}.json after every index, and pre-load it on startup (1 or true to enable) |
| `ANTHROPIC_API_KEY` | (none) | Enable Claude Haiku summaries |
| `GOOGLE_API_KEY` | (none) | Enable Gemini Flash summaries |
| `OPENAI_BASE_URL` | (none) | Enable local LLM summaries |
Legacy `JASTLLM_*` variable names are also accepted for compatibility with the original Python version's indexes.
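As an illustration of how the defaults and the "1 or true" convention from the table could be read, here is a minimal sketch. `loadConfig`, `AstllmConfig`, and the helper functions are hypothetical names, the exact parsing rules (for example case sensitivity of `true`) are assumptions, and the legacy `JASTLLM_*` names are left out.

```typescript
// Hypothetical sketch; not the server's actual configuration loader.
type Env = Record<string, string | undefined>;

interface AstllmConfig {
  indexPath: string;
  maxIndexFiles: number;
  maxFileSizeKb: number;
  logLevel: string;
  watch: boolean;
  persist: boolean;
}

/** Flags are enabled by "1" or "true" (exact match is an assumption). */
function isEnabled(value: string | undefined): boolean {
  return value === "1" || value === "true";
}

/** Parse a positive integer, falling back to the documented default. */
function intOr(value: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(value ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

function loadConfig(env: Env): AstllmConfig {
  return {
    indexPath: env["CODE_INDEX_PATH"] ?? "~/.code-index",
    maxIndexFiles: intOr(env["ASTLLM_MAX_INDEX_FILES"], 500),
    maxFileSizeKb: intOr(env["ASTLLM_MAX_FILE_SIZE_KB"], 500),
    logLevel: env["ASTLLM_LOG_LEVEL"] ?? "warn",
    watch: isEnabled(env["ASTLLM_WATCH"]),
    persist: isEnabled(env["ASTLLM_PERSIST"]),
  };
}

const sampleConfig = loadConfig({
  ASTLLM_WATCH: "1",
  ASTLLM_MAX_INDEX_FILES: "2000",
});
console.log(sampleConfig);
```

In a Node.js process the same function would be called with `process.env`.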
When ASTLLM_WATCH=1, the watcher walks the directory tree selectively — it opens one inotify watch per non-excluded directory, not per file, and skips the following by default:
node_modules .git dist .next .nuxt
build out __pycache__ .cache target
vendor venv .venv .tox coverage
.nyc_output .gradle .idea .vscode .DS_Store
eggs .mypy_cache .pytest_cache .ruff_cache
All hidden directories (.foo) are also skipped, except .github.
To add custom excludes, create ~/.astllm/exclude — one name per line, # for comments:
# ~/.astllm/exclude
my_large_assets_dir
some_vendor_folder
generated
Each name is matched against directory basenames anywhere in the tree, so generated excludes src/generated, lib/generated, etc.
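The matching rules above (basename match anywhere in the tree, hidden directories skipped except `.github`, `#` comments in the exclude file) can be sketched as follows. `parseExcludeFile` and `isExcludedDir` are hypothetical helpers, only a subset of the default list is included, and only whole-line `#` comments are assumed to be supported.

```typescript
// Subset of the default excludes listed above, for illustration only.
const DEFAULT_EXCLUDES = new Set<string>([
  "node_modules", "dist", "build", "target", "vendor", "venv", "coverage",
]);

/** Read exclude names: one per line, blank lines and "#" comments ignored. */
function parseExcludeFile(text: string): string[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith("#"));
}

/** True if any directory name along the path is excluded or hidden. */
function isExcludedDir(dirPath: string, custom: string[] = []): boolean {
  const excludes = new Set<string>([...DEFAULT_EXCLUDES, ...custom]);
  return dirPath
    .split("/")
    .filter((segment) => segment !== "" && segment !== ".")
    .some(
      (segment) =>
        excludes.has(segment) ||
        (segment.startsWith(".") && segment !== ".github"),
    );
}

const customExcludes = parseExcludeFile(
  "# ~/.astllm/exclude\nmy_large_assets_dir\n\nsome_vendor_folder\ngenerated\n",
);

console.log(customExcludes);
console.log(isExcludedDir("src/generated", customExcludes)); // custom name matches
console.log(isExcludedDir("src/generated"));                 // not a default
console.log(isExcludedDir(".github/workflows"));             // hidden, but allowed
console.log(isExcludedDir("packages/app/node_modules"));     // default exclude
```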
By default Claude will use Grep/Glob/Read to explore code. To make it prefer the MCP tools, add the following to your project's CLAUDE.md:
## Code search
An astllm-mcp index is available for this project. Prefer MCP tools over Grep/Glob/Read for all code exploration:
- `search_symbols` — find functions, classes, methods by name (use this first)
- `get_file_outline` — list all symbols in a file before deciding to read it
- `get_repo_outline` — understand project structure without reading files
- `get_symbol` — read a specific function/class source (O(1), much cheaper than reading the file)
- `get_symbols` — batch-read multiple symbols in one call
- `search_text` — full-text search for strings, comments, config values
- `get_file_tree` — browse directory structure with symbol counts
Only fall back to Grep/Read when the MCP tools cannot cover the case (e.g. a file type not indexed by tree-sitter).
The repo identifier to pass to MCP tools is local/<folder-name> for locally indexed folders (e.g. local/src). Use list_repos if unsure.
.env, *.pem, *.key, credentials, etc.)
Uses Bun to produce self-contained executables. All JS and the native tree-sitter `.node` addons are embedded, so users just download and run; no npm install or Node.js is needed.
Prerequisites: install Bun once (curl -fsSL https://bun.sh/install | bash), then:
npm run build:macosx-arm # → dist/astllm-mcp-macosx-arm (run on macOS ARM)
npm run build:linux-x86 # → dist/astllm-mcp-linux-x86 (run on Linux x86)
npm run build:linux-arm # → dist/astllm-mcp-linux-arm (run on Linux ARM)
Each build script must run on the matching platform. The grammar packages ship prebuilt `.node` files for all platforms, but the `tree-sitter` core is compiled from source on install. `scripts/prep-bun-build.mjs` (run automatically before each binary build) copies the compiled `.node` into the location Bun expects. For CI, use a matrix: Linux x86 and Linux ARM can both build on Linux via Docker/QEMU; macOS ARM requires a macOS runner.
How it works: tree-sitter and all grammar packages support `bun build --compile` via a statically-analyzable `require()` path. Bun embeds the correct native addon for the target and extracts it to a temp directory on first run.
npm run build # compile TypeScript → dist/
npm run dev # run directly with tsx (no compile step)
The project is TypeScript ESM. All local imports use .js extensions (TypeScript NodeNext resolution).
~/.code-index/
  <owner>/
    <repo>/
      index.json    # symbol index with byte offsets
      files/        # raw file copies for byte-offset seeking
        src/
          auth.ts
          ...
  _savings.json     # cumulative token savings
This tool was inspired by: https://github.com/jgravelle/jcodemunch-mcp
I needed simpler distribution and a number of features that project did not have.
Add this to claude_desktop_config.json and restart Claude Desktop.
{
"mcpServers": {
"astllm-mcp": {
"command": "npx",
"args": []
}
}
}