A semantic proxy that reduces AI agent token usage by exposing only four core tools and using local vector embeddings to search for and execute hundreds of underlying MCP tools. It streamlines communication between agents and MCP Router by identifying relevant tools through natural language queries.
A semantic MCP proxy that sits between AI agents and MCP Router, exposing only 4 tools instead of hundreds. Uses local vector embeddings to find the right tool on demand — no OpenAI key required.
When you have 150+ MCP tools, passing all of them to an AI agent costs ~30,000 tokens per request. This proxy exposes just 4 tools (discover_tools, execute_tool, batch_execute, refresh_tools). The agent searches semantically for what it needs, then calls it — reducing token usage by ~93%.
Without proxy: 151 tools × ~204 tokens ≈ 30,860 tokens per request
With proxy: 4 tool definitions + search results = ~500 tokens
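The arithmetic behind these figures can be checked with a short sketch. The baseline of 30,860 tokens and the ~500-token proxy figure come from the lines above; the 2,160-token case is a hypothetical larger response (for example, a discover_tools call that returns several full tool schemas), included only to show where a ~93% saving would sit.

```typescript
// Percentage of tokens saved relative to sending every tool definition.
function tokenSavingsPercent(baselineTokens: number, proxyTokens: number): number {
  if (baselineTokens <= 0) {
    throw new RangeError("baselineTokens must be positive");
  }
  return (1 - proxyTokens / baselineTokens) * 100;
}

const baselineTokens = 30860; // figure quoted above for 151 tools
const averagePerTool = baselineTokens / 151;

console.log(averagePerTool.toFixed(1)); // average tokens per tool definition
console.log(tokenSavingsPercent(baselineTokens, 500).toFixed(1)); // 4 definitions + small result set
console.log(tokenSavingsPercent(baselineTokens, 2160).toFixed(1)); // hypothetical larger result set
```

The actual saving per request depends on how many search results the agent pulls in, so treat these numbers as bounds rather than measurements.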
MCP Router (all your servers)
│ stdio
▼
mcp-vector-proxy (tray-managed background process, port 3456)
- Local embeddings: EmbeddingGemma-300M q8 (~150MB, runs offline)
- LanceDB vector store (persistent, handles 1M+ tools, no server)
- Hybrid search: dense vector + BM25 keyword + RRF fusion
- Auto-syncs when tools change (MCP notifications + polling)
- HTTP: Streamable HTTP + SSE legacy
│
├── Claude Code / other agents (HTTP → :3456/mcp)
│
└── Claude Desktop (stdio-bridge → HTTP)
System tray (node dist/tray.js, auto-starts on login)
- Green = connected, N tools indexed
- Yellow = MCP Router reconnecting
- Red = proxy down / crashed (auto-restarts)
- Right-click → Restart Proxy / Open Health URL / Exit
First run: EmbeddingGemma-300M (~150MB) downloads automatically on first startup and is cached to .model-cache/. Subsequent starts are instant.
cp .env.example .env
# Edit .env and replace "your-mcp-router-token-here" with your real MCPR_TOKEN
The .env file is gitignored. Alternatively, set MCPR_TOKEN as a system environment variable — it takes precedence over the .env file.
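The precedence rule can be sketched as follows. This is an illustration of the behaviour described above, not the proxy's actual loader: `parseDotenv` and `resolveToken` are hypothetical helper names, and the parser handles only simple KEY=VALUE lines.

```typescript
// Parse KEY=VALUE lines from .env-style text, skipping blanks and # comments.
function parseDotenv(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    out[key] = value;
  }
  return out;
}

// A system environment variable wins over the .env file.
function resolveToken(
  systemEnv: Record<string, string | undefined>,
  dotenvText: string
): string {
  const token = systemEnv["MCPR_TOKEN"] ?? parseDotenv(dotenvText)["MCPR_TOKEN"];
  if (!token || token === "your-mcp-router-token-here") {
    throw new Error("MCPR_TOKEN is not set");
  }
  return token;
}
```

Rejecting the placeholder value from .env.example is a design choice of this sketch; it turns a forgotten edit into an immediate, readable error.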
npm install
npm run build
Windows:
npm run setup
# or: powershell -ExecutionPolicy Bypass -File setup.ps1
macOS / Linux:
npm run setup
# or: bash setup.sh
This registers the tray to start on every login and launches it immediately.
Claude Code (~/.claude.json):
{
"mcpServers": {
"mcp-vector-proxy": {
"type": "http",
"url": "http://127.0.0.1:3456/mcp"
}
}
}
Claude Desktop (%APPDATA%\Claude\claude_desktop_config.json on Windows, ~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
"mcpServers": {
"mcp-vector-proxy": {
"command": "node",
"args": ["/absolute/path/to/mcp-proxy/dist/stdio-bridge.js"],
"env": { "PROXY_URL": "http://127.0.0.1:3456/mcp" }
}
}
}
Any other agent — point it at http://127.0.0.1:3456/mcp (Streamable HTTP) or http://127.0.0.1:3456/sse (SSE legacy).
| Variable | Default | Description |
|---|---|---|
| MCPR_TOKEN | (from .env) | MCP Router auth token (required) |
| HTTP_PORT | (none = stdio mode) | Port for HTTP server |
| HTTP_HOST | 127.0.0.1 | Bind address |
| POLL_INTERVAL_MS | 15000 | Tool change polling interval, in milliseconds |
| DISCOVER_LIMIT | 10 | Default max results from discover_tools |
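As a rough illustration of how these variables and defaults fit together, the sketch below resolves them from an environment map. The `ProxyConfig` shape and the `loadConfig` and `intOr` names are assumptions made for this example; only the variable names and default values come from the table.

```typescript
interface ProxyConfig {
  token: string;
  httpPort: number | undefined; // undefined means stdio mode
  httpHost: string;
  pollIntervalMs: number;
  discoverLimit: number;
}

// Parse an integer, falling back when the value is missing or not numeric.
function intOr(value: string | undefined, fallback: number): number {
  const n = Number.parseInt(value ?? "", 10);
  return Number.isNaN(n) ? fallback : n;
}

function loadConfig(env: Record<string, string | undefined>): ProxyConfig {
  const token = env["MCPR_TOKEN"];
  if (!token) throw new Error("MCPR_TOKEN is required");
  const port = Number.parseInt(env["HTTP_PORT"] ?? "", 10);
  return {
    token,
    httpPort: Number.isNaN(port) ? undefined : port,
    httpHost: env["HTTP_HOST"] ?? "127.0.0.1",
    pollIntervalMs: intOr(env["POLL_INTERVAL_MS"], 15000),
    discoverLimit: intOr(env["DISCOVER_LIMIT"], 10),
  };
}
```

With only MCPR_TOKEN set, this yields stdio mode with the documented defaults; adding HTTP_PORT=3456 switches to the HTTP setup used in the client configs above.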
| Tool | Description |
|---|---|
| discover_tools | Hybrid semantic + keyword search: finds relevant tools by natural language query |
| execute_tool | Execute any MCP tool by exact name with arguments |
| batch_execute | Execute multiple MCP tools in parallel in a single call |
| refresh_tools | Force re-index all tools from MCP Router immediately |
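The intended flow is search first, then execute. The sketch below builds the two MCP `tools/call` JSON-RPC requests an agent might send. The envelope follows the MCP protocol, but the argument names (`query`, `limit`, `tool_name`, `arguments`) and the `url` parameter of the target tool are assumptions for illustration; check the schemas the proxy actually advertises.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Wrap a tool name and its arguments in an MCP tools/call request.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Step 1: semantic search. `query` and `limit` are assumed argument names.
const discoverRequest = toolCall("discover_tools", {
  query: "take a screenshot of a web page",
  limit: 5,
});

// Step 2: call the tool the search returned, by exact name.
// `tool_name`, `arguments`, and `url` are assumed names as well.
const executeRequest = toolCall("execute_tool", {
  tool_name: "browser_screenshot",
  arguments: { url: "https://example.com" },
});

console.log(JSON.stringify(discoverRequest));
console.log(JSON.stringify(executeRequest));
```

In practice an MCP client library sends these for you; the raw messages are shown only to make the two-step pattern concrete.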
npm run build # Compile TypeScript → dist/
npm run setup # Register auto-start + launch tray (platform-detected)
npm run update # Build + restart tray (platform-detected)
After changing source code:
npm run update
This rebuilds everything and restarts the tray (which restarts the proxy).
GET http://127.0.0.1:3456/health
{
"status": "ok",
"routerConnected": true,
"tools": 151,
"indexedAt": "2026-02-17T15:51:21.620Z",
"sessions": { "streamable": 1, "sse": 0 }
}
Status is "ok" when MCP Router is connected and tools are indexed. "disconnected" means the proxy is up but MCP Router is unreachable (it will auto-reconnect).
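A monitoring script could interpret this payload roughly as follows. The `HealthResponse` fields mirror the sample above; the `Readiness` labels and the `classifyHealth` function are hypothetical names for this sketch, and treating zero indexed tools as "not-indexed" is an assumption.

```typescript
interface HealthResponse {
  status: string;
  routerConnected: boolean;
  tools: number;
  indexedAt: string;
  sessions: { streamable: number; sse: number };
}

type Readiness = "ready" | "router-disconnected" | "not-indexed";

// Map a parsed /health body onto a simple readiness label.
function classifyHealth(h: HealthResponse): Readiness {
  if (!h.routerConnected || h.status === "disconnected") {
    return "router-disconnected";
  }
  if (h.status !== "ok" || h.tools === 0) {
    return "not-indexed";
  }
  return "ready";
}

const sampleHealth: HealthResponse = JSON.parse(`{
  "status": "ok",
  "routerConnected": true,
  "tools": 151,
  "indexedAt": "2026-02-17T15:51:21.620Z",
  "sessions": { "streamable": 1, "sse": 0 }
}`);

console.log(classifyHealth(sampleHealth));
```

Feed it the parsed body of GET http://127.0.0.1:3456/health; a "router-disconnected" result usually needs no action, since the proxy reconnects on its own.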
src/
index.ts — Main proxy server (HTTP + stdio modes, hybrid vector search)
stdio-bridge.ts — Thin stdio→HTTP forwarder for Claude Desktop
launch-router.ts — Spawns MCP Router CLI with windowsHide:true
tray.ts — Cross-platform system tray (systray2)
dist/ — Compiled output (generated by npm run build)
.env.example — Template for .env (copy and fill in MCPR_TOKEN)
.env — Your config (gitignored, never commit this)
setup.ps1 — Windows: register auto-start + launch tray
setup.sh — macOS/Linux: register auto-start + launch tray
restart-tray.ps1 — Windows: kill + restart tray
restart-tray.sh — macOS/Linux: kill + restart tray
.lancedb/ — LanceDB vector store (auto-generated, gitignored)
.tool-meta.json — Tool fingerprint cache (auto-generated, gitignored)
.model-cache/ — Downloaded embedding model (~150MB, gitignored)
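The fingerprint cache in .tool-meta.json suggests that re-indexing only touches tools that changed. The format of that file is not documented here, so the following is a sketch of one way such change detection could work: `ToolDef`, `fingerprint`, and `diffTools` are hypothetical names, and a real implementation would likely hash the fingerprint string rather than store it whole.

```typescript
interface ToolDef {
  name: string;
  description?: string;
  inputSchema?: unknown;
}

// Fingerprint a tool from the fields that affect its embedding.
// Note: JSON.stringify is sensitive to key order inside inputSchema.
function fingerprint(tool: ToolDef): string {
  return JSON.stringify([tool.name, tool.description ?? "", tool.inputSchema ?? null]);
}

interface ToolDiff {
  added: string[];
  changed: string[];
  removed: string[];
}

// Compare cached fingerprints (name -> fingerprint) with the current tool list.
function diffTools(cached: Record<string, string>, current: ToolDef[]): ToolDiff {
  const added: string[] = [];
  const changed: string[] = [];
  const seen = new Set<string>();
  for (const tool of current) {
    seen.add(tool.name);
    if (!Object.prototype.hasOwnProperty.call(cached, tool.name)) {
      added.push(tool.name);
    } else if (cached[tool.name] !== fingerprint(tool)) {
      changed.push(tool.name);
    }
  }
  const removed = Object.keys(cached).filter((name) => !seen.has(name));
  return { added, changed, removed };
}
```

Only the `added` and `changed` tools would need new embeddings, which keeps a sync cheap when one server out of many changes.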
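MCP Router sends a tools/list_changed notification when servers change → immediate re-index. Polling catches any changes that arrive without a notification.
discover_tools uses hybrid search for best accuracy: dense vector search for meaning, BM25 keyword search for exact terms, and RRF fusion to merge the two rankings.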
This combination handles both vague queries ("something to do with files") and precise queries ("browser_screenshot") accurately at any scale.
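Reciprocal Rank Fusion (RRF), the merging step named in the architecture overview, can be sketched in a few lines. This is a generic version of the technique, not the proxy's code: `rrfFuse` is a hypothetical name, k = 60 is the conventional constant rather than a documented setting, and `read_file` and `list_directory` are made-up tool names.

```typescript
// RRF: score(d) = sum over rankings of 1 / (k + rank(d)), with rank starting at 1.
function rrfFuse(rankings: string[][], k: number = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map((entry) => entry[0]);
}

// Hypothetical result lists from the two retrievers for one query.
const denseRanking = ["read_file", "list_directory", "browser_screenshot"];
const keywordRanking = ["browser_screenshot", "read_file"];

console.log(rrfFuse([denseRanking, keywordRanking]));
// → [ 'read_file', 'browser_screenshot', 'list_directory' ]
```

Because RRF uses only ranks, it needs no score normalization between the vector and BM25 retrievers, and a tool that appears in both lists outranks one that appears in only one.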
Add this to claude_desktop_config.json and restart Claude Desktop.
{
"mcpServers": {
"mcp-vector-proxy": {
"command": "npx",
"args": []
}
}
}