loading…
Search for a command to run...
loading…
Refines development prompts and stores semantic memory to save tokens in complex Antigravity tasks, enabling token savings and enriched context.
Refines development prompts and stores semantic memory to save tokens in complex Antigravity tasks, enabling token savings and enriched context.
Powered by OpenCode Protocol MCP
This repository contains the OpenCode MCP Server, a high-performance orchestration layer based on the Model Context Protocol (MCP). It is designed to act as a Proactive Architectural Assistant for Antigravity, transforming how AI interacts with your codebase.
If your local AI setup was a company:
The main goal of this MCP is to drastically save tokens and prevent hallucinations by ensuring your main AI model only processes highly refined, contextualized prompts.
The OpenCode MCP Server acts as an orchestration layer between the AI Client and local specialized tools.
graph TD
Client["AI Client (Antigravity/Claude)"] -- "MCP Protocol (Stdio/HTTP)" --> Server["OpenCode MCP Server"]
subgraph "OpenCode Engine"
Server --> Tools["Tools: refine_prompt / learn_context"]
Tools --> Memory["Memory Manager"]
Tools --> Cache[("LRU Cache (5m TTL)")]
Cache --> Docs["Context7 Fetcher"]
end
subgraph "Local Infrastructure"
Memory -- "Store/Search" --> LDB[("LanceDB Vector Store")]
Memory -- "Generate Embeddings" --> OLL["Ollama: nomic-embed-text"]
end
subgraph "External Resources"
Docs -- "HTTPS/SSE" --> C7["Context7 API (Upstash)"]
end
Server -- "Refined Prompt + Local/External Context" --> Client
sequenceDiagram
participant User as User / Developer
participant AG as AI Client (Antigravity)
participant OC as OpenCode MCP Server
participant LDB as LanceDB (Memory)
participant OLL as Ollama (Local AI)
participant Cache as LRU Cache
participant C7 as Context7 (API)
User->>AG: "How do I fix the auth bug?"
Note over AG: Rule 1: Notify User & Refine
AG->>User: "Refining with OpenCode for precision..."
AG->>OC: refine_prompt("How do I fix the auth bug?")
OC->>OLL: Rewrite prompt & infer technologies (e.g., llama3)
OLL-->>OC: JSON { refinedPrompt, technologies: ["react"] }
OC->>OLL: Generate embedding for refined prompt (nomic-embed-text)
OLL-->>OC: Vector representation
OC->>LDB: Vector search for top-relevant context
LDB-->>OC: Snippets: "Auth uses JWT" (Returns Confidence Score)
opt If Confidence < Threshold & Tech Inferred
OC->>Cache: Check for cached docs (e.g., "react")
alt Cache Hit
Cache-->>OC: Return cached docs
else Cache Miss
OC->>C7: Fetch documentation (Context7)
C7-->>OC: Latest API reference & examples
OC->>Cache: Store docs in LRU Cache (5m TTL)
end
end
Note over OC: Merge Refined Prompt + Local Memory + External Docs
OC-->>AG: "SUPER PROMPT + <semantic_memory> + <external_documentation>"
Note over AG: Generate high-precision answer
AG->>User: Technical solution with codebase context
The solution is built using a modern and efficient stack designed for high performance and local privacy:
nomic-embed-text to generate high-quality vector embeddings locally, ensuring your technical data never leaves your machine.Before starting, you need to set up the development environment. We recommend using NVM (Node Version Manager) to manage Node.js versions on Windows.
nvm-setup.exe installer from nvm-windows.nvm install 22
nvm use 22
Ollama is heavily used by OpenCode MCP for two distinct local tasks:
Embeddings: Generating vectors for semantic memory (nomic-embed-text).
Local Refinement: Rewriting vague prompts and inferring technologies dynamically (llama3 by default).
Open PowerShell as Administrator and run:
winget install ollama
After installation, restart the terminal and download the required models:
# Model for Vector Embeddings (Required)
ollama pull nomic-embed-text
# Model for Prompt Refinement & Inference (Recommended: llama3, qwen2.5:0.5b, etc.)
ollama pull llama3
Check if the tools are ready in PowerShell:
node -v # Should return v22.x.x or higher
ollama --version
Follow the steps below to configure the OpenCode MCP Server using PowerShell:
Clone the repository:
git clone <repository-url>
cd open-code-as-mcp
Install dependencies:
npm install
Build the project:
npm run build
To integrate this MCP server with Antigravity, you must choose between Local mode (running on the same machine) or Remote mode (running on a server/cloud).
Use this option if the server is on the same machine as the client.
Memory will be shared across all projects and stored in the server folder.
{
"mcpServers": {
"opencode": {
"command": "node",
"args": ["D:/IA/MCP/open-code-as-mcp/build/index.js"]
}
}
}
For each project to have its own isolated memory inside the project's .mcp_memory folder.
[!IMPORTANT] Always use absolute paths in the
MCP_MEMORY_PATHenvironment variable when configuring the server in a global MCP config (like Claude Desktop). This ensures the server finds the correct folder regardless of the current working directory.
{
"mcpServers": {
"opencode": {
"command": "node",
"args": ["D:/IA/MCP/open-code-as-mcp/build/index.js"],
"env": {
"MCP_MEMORY_PATH": "D:/IA/MCP/open-code-as-mcp/.mcp_memory/vectors"
}
}
}
}
Note: Be sure to add .mcp_memory/ to your .gitignore if you don't want to version the database.
[!TIP] Ensure Ollama is running and you have downloaded the model with
ollama pull nomic-embed-text.
To enable real-time documentation retrieval, you can add your Context7 API key (get it at context7.com). OpenCode uses a local model via Ollama to intelligently rewrite your prompt and infer which technologies are mentioned before fetching their official documentation.
{
"mcpServers": {
"opencode": {
"command": "node",
"args": ["D:/IA/MCP/open-code-as-mcp/build/index.js"],
"env": {
"CONTEXT7_API_KEY": "your_api_key_here",
"ENABLE_CONTEXT7": "true",
"USE_HYBRID": "true",
"LOCAL_CONFIDENCE_THRESHOLD": "0.7",
"MCP_INFERENCE_MODEL": "llama3"
}
}
}
}
| Variable | Default | Description |
|---|---|---|
USE_HYBRID |
true |
When enabled, only calls Context7 if local memory confidence is below the threshold. |
LOCAL_CONFIDENCE_THRESHOLD |
0.7 |
Value between 0-1. Higher values force more frequent external documentation lookups. |
[!TIP] Ensure you have pulled the inference model configured in
MCP_INFERENCE_MODEL(e.g.,ollama pull llama3) to allow the local refinement step to work properly.
Use this option if the server is running remotely. The server uses the modern Streamable HTTP transport, which is more robust and efficient.
{
"mcpServers": {
"opencode": {
"url": "http://your-remote-server:3000/mcp"
}
}
}
Note: The server also maintains backward compatibility for legacy clients at http://your-remote-server:3000/sse.
To ensure Antigravity consistently follows best practices, the Global Rules are stored in two key locations:
GEMINI.md file, located in your user profile: %USERPROFILE%\.gemini\GEMINI.md (a copy is available in this repo as GEMINI.md)..cursorrules file in the root of this repository.You can reference the global rules path by setting an environment variable in your terminal or system configuration:
$env:ANTIGRAVITY_RULES_PATH = "$HOME\.gemini\GEMINI.md"
To ensure Antigravity uses this MCP correctly, configure the following rules in your System Prompt:
opencode:refine_prompt.opencode:learn_context. Briefly inform the user that this knowledge is being persisted in OpenCode's semantic memory.[!TIP] You can find the raw version of these rules in the .cursorrules or GEMINI.md file for easy copying into your System Prompt.
The OpenCode MCP provides the following tools:
refine_promptRefines a development prompt to make it clearer and more efficient, injecting targeted context via XML tags.
prompt: (string) The original prompt that needs refinement.categoryFilter: (string, optional) Optional category to filter memories (e.g., 'architecture', 'style') to increase precision and reduce token usage.learn_contextMemorizes important information (preference, technical rule, context) for future use in semantic memory.
information: (string) The information to be remembered.category: (string, optional) Information category (e.g., 'preference', 'architecture', 'style').search_memoryDirectly queries the semantic memory without refining a prompt.
query: (string) The search query.category: (string, optional) Filter results by category.limit: (number, optional) Number of results to return.index_codebasePerforms a recursive scan of the project to build a structural map in memory.
path: (string, optional) Root path to scan.You can visualize your memory health and stats using the local dashboard:
node dashboard.cjs
The server supports remote access via SSE (Server-Sent Events). To run in remote mode in PowerShell, use:
$env:MCP_MODE="sse"; $env:PORT="3000"; npm start
To run the server in development mode with hot-reload in PowerShell:
npm run dev
You can test the server locally by running in PowerShell:
node build/test-mcp.js
A technical analysis was performed to measure the efficiency of semantic retrieval vs. full-context injection.
| Metric | Traditional (Full Context) | MCP (Semantic Retrieval) | Efficiency Gain |
|---|---|---|---|
| Characters Sent | ~8,000 | ~950 | ~88% Savings |
| Tokens (Est. 1:4) | ~2,000 | ~238 | ~88% Savings |
| Response Accuracy | Medium (Noise risk) | High (Exact context) | Qualitative Boost |
| Metric | Without Context7 (Local Only) | With Context7 Enabled | Impact |
|---|---|---|---|
| Tokens (Est.) | ~285 tokens | ~1,712 tokens | +1,427 tokens |
| Context Quality | Limited to local codebase memory | High (Injected official Node.js/Express docs) | Drastic Contextual Boost |
| Risk of Hallucination | High (Model relies on generic training data) | Zero (Grounded by official <external_documentation>) |
Precise Answers |
| Metric | Context7 Always On | Hybrid Mode (Local First) | Benefit |
|---|---|---|---|
| Avg. Tokens (Mixed Workload) | ~1,800 tokens | ~550 tokens | ~70% Savings |
| Avg. Latency | ~450ms (Network heavy) | ~120ms (Local first) | ~73% Faster |
| Documentation Quality | Maximum (Always fresh) | Optimized (Local patterns preferred) | Reduced "Context Bloat" |
Conclusion: The Hybrid Routing strategy is the "Golden Ratio" of MCP performance. It ensures that when you ask a project-specific question, you aren't paying the token/latency tax of external lookups, while still providing a safety net for framework-level queries. Combined with the LRU Cache, it makes the OpenCode MCP one of the most cost-efficient orchestrators available.
To maximize efficiency, the server actively implements:
architecture or style), significantly reducing noise and allowing the result limit to be tightened.<semantic_memory> and <context_item>). This aligns with how modern LLMs best parse context, eliminating attention dilution.Выполни в терминале:
claude mcp add opencode-mcp-server -- npx Да, OpenCode Server MCP бесплатный — установка в один клик через Unyly без оплаты.
Нет, OpenCode Server работает без API-ключей и переменных окружения.
Self-hosted: сервер запускается локально на твоей машине командой из раздела установки.
Открой OpenCode Server на unyly.org, выбери вкладку своего клиента (Claude Desktop, Claude Code, Cursor) и нажми Install — конфиг сгенерируется автоматически, без правки JSON.
Не уверен что выбрать?
Найди свой стек за 60 секунд
Автор?
Embed-бейдж для README
Похожее
Все в категории development