MCP server that wraps the Grok CLI to enable code review, adversarial testing, and chat with xAI's Grok model, integrating into any MCP host as a peer reviewer, adversary, and consultant.
Bring xAI's Grok Build CLI into any MCP host as a peer reviewer, adversary, and consultant — alongside whatever main model you're already running.
grok-build-mcp is a small Model Context Protocol server that wraps the grok CLI. Your existing agent (Claude Code, Cursor, Cline, OpenClaw, …) can call into Grok for second-opinion code review, adversarial testing, or extended chat — without leaving its session.
Traditional Chinese version: README.zh-TW.md
Four tools, all stateless, all stdout-only:
| Tool | Use it for |
|---|---|
| `grok_chat` | One-shot prompt → Grok's reply |
| `grok_review` | Pass a unified diff (or auto-grab `git diff main...HEAD`) and get a per-dimension code review |
| `grok_consult` | Replay a message history for multi-turn chat; the caller owns the thread |
| `grok_challenge` | Adversarial: ask Grok to find every bug, race, edge case, and security hole |
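Install the Grok CLI and authenticate:

```bash
curl -fsSL https://x.ai/cli/install.sh | bash
grok # first run handles auth
```

Install the server:

```bash
npm install -g grok-build-mcp
# or use npx, no install needed
npx grok-build-mcp
```

Register it with Claude Code:

```bash
claude mcp add grok npx -- grok-build-mcp
```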
Or edit `~/.claude.json` directly:

```json
{
  "mcpServers": {
    "grok": {
      "command": "npx",
      "args": ["-y", "grok-build-mcp"]
    }
  }
}
```
For Cursor, create `.cursor/mcp.json` (project) or `~/.cursor/mcp.json` (global):

```json
{
  "mcpServers": {
    "grok": {
      "command": "npx",
      "args": ["-y", "grok-build-mcp"]
    }
  }
}
```
For Cline, go to Settings → Cline → MCP Servers:

```json
{
  "grok": {
    "command": "npx",
    "args": ["-y", "grok-build-mcp"]
  }
}
```
`grok-build-mcp` speaks plain stdio MCP. Point any client at `npx -y grok-build-mcp` and it works.
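For readers wiring up a client by hand, here is a minimal sketch of what a single tool call looks like on the wire. It assumes the standard MCP stdio transport (newline-delimited JSON-RPC 2.0 messages) and the standard `tools/call` method; `frameToolCall` and `ToolCallRequest` are hypothetical names, not part of this package. A real client also has to complete the MCP `initialize` handshake first, which MCP client libraries normally handle.

```typescript
// Shape of a JSON-RPC 2.0 request for the MCP `tools/call` method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: serialize one tool call as a single line.
// JSON.stringify escapes newlines inside strings, so the only raw
// newline in the result is the trailing message delimiter.
function frameToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): string {
  const request: ToolCallRequest = {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: name, arguments: args },
  };
  return JSON.stringify(request) + "\n";
}

const line = frameToolCall(1, "grok_chat", {
  prompt: "Explain consistent hashing in two sentences.",
});
console.log(line.trim());
```

Writing that line to the server's stdin and reading one line back from its stdout is the whole exchange for a stateless tool like `grok_chat`.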
`grok_chat`

```json
{ "prompt": "Explain consistent hashing in two sentences." }
```

Optional: `model` overrides the default Grok model; `timeout` (seconds) extends the per-call limit for long grok-4 reasoning. All four tools accept `timeout`.
`grok_review`

```json
{ "base_ref": "main", "focus": "security" }
```

If `diff` is omitted, the server runs `git diff <base_ref>...HEAD` in `cwd` (defaults to your host's working directory). It returns a Markdown review with a verdict, per-dimension scores (correctness / readability / architecture / security / performance), and concrete fix-it items.
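To make the fallback concrete, here is a small sketch of the argument list behind `git diff <base_ref>...HEAD`. The helper name `gitDiffArgs`, the `"main"` default (taken from the tool table), and the check against refs that start with `-` are assumptions for illustration, not the server's actual code. The three-dot form compares `HEAD` against its merge base with the base ref, so only changes made on your branch show up.

```typescript
// Hypothetical helper: build the argv for `git diff <base_ref>...HEAD`.
function gitDiffArgs(baseRef: string = "main"): string[] {
  // Assumed precaution: refuse empty refs and refs that git could
  // mistake for a command-line option.
  if (baseRef.length === 0 || baseRef.charAt(0) === "-") {
    throw new Error("invalid base ref: " + JSON.stringify(baseRef));
  }
  // Three dots: diff HEAD against the merge base of baseRef and HEAD.
  return ["diff", baseRef + "...HEAD"];
}

console.log(gitDiffArgs("develop").join(" ")); // prints "diff develop...HEAD"
```

Passing the arguments as an array to a process spawner, rather than interpolating them into a shell string, avoids quoting problems with unusual branch names.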
`grok_consult`

```json
{
  "messages": [
    { "role": "system", "content": "You are a senior backend engineer." },
    { "role": "user", "content": "How would you cache this query?" },
    { "role": "assistant", "content": "Two options..." },
    { "role": "user", "content": "What's the failure mode of option 2?" }
  ]
}
```
The server is stateless — the caller passes the full thread each time. Most MCP hosts handle this naturally.
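If your host does not manage history for you, the caller-side bookkeeping is small. The sketch below keeps the thread in the caller and rebuilds the full `messages` array for every call; `ChatMessage`, `appendMessage`, and `consultArguments` are hypothetical names, and only the `messages` / `role` / `content` shape comes from the example above.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Return a new thread with one message appended; the input is not mutated.
function appendMessage(
  thread: readonly ChatMessage[],
  role: Role,
  content: string
): ChatMessage[] {
  return [...thread, { role: role, content: content }];
}

// Arguments for one grok_consult call: always the whole thread so far.
function consultArguments(
  thread: readonly ChatMessage[]
): { messages: ChatMessage[] } {
  return { messages: [...thread] };
}

let thread: ChatMessage[] = [
  { role: "system", content: "You are a senior backend engineer." },
];
thread = appendMessage(thread, "user", "How would you cache this query?");
// Send consultArguments(thread) to grok_consult here, then record the reply:
thread = appendMessage(thread, "assistant", "Two options...");
thread = appendMessage(thread, "user", "What's the failure mode of option 2?");
console.log(consultArguments(thread).messages.length); // prints 4
```

Because nothing is stored server-side, dropping or editing earlier messages before a call is entirely up to the caller.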
`grok_challenge`

```json
{
  "code": "function transfer(from, to, amount) { from.balance -= amount; to.balance += amount; }",
  "context": "Node.js, called concurrently from HTTP handlers"
}
```
Returns severity-ranked issues (Critical / High / Medium / Low) with concrete reproductions and patches.
| Env var | Default | Purpose |
|---|---|---|
| `GROK_MCP_BIN` | `grok` | Path to the `grok` binary |
| `GROK_MCP_TIMEOUT` | `300000` | Default per-call timeout in milliseconds |
Authentication and model defaults live in the Grok CLI itself (~/.grok/config.toml).
grok-4 is a reasoning model and long prompts routinely take longer than two minutes. The server's default per-call limit is 300s (5 min). You can change it three ways:
- Pass `timeout` (seconds) to any tool: `{ "prompt": "...", "timeout": 600 }`.
- Set `GROK_MCP_TIMEOUT` (milliseconds) in the MCP server's env.
- Raise the host's own limits where it has them: `MCP_TIMEOUT` (server startup) and `MCP_TOOL_TIMEOUT` (per tool call), both in milliseconds.

On timeout, the error includes any partial output Grok produced before the deadline, so you don't lose a near-complete answer.
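The unit mismatch is easy to trip over: the per-call `timeout` is in seconds, while `GROK_MCP_TIMEOUT` is in milliseconds. Here is a minimal sketch of how the two could combine. The precedence (per-call value first, then the env var, then the 300000 ms default) is an assumption, and `resolveTimeoutMs` is a hypothetical name rather than the server's actual function.

```typescript
const DEFAULT_TIMEOUT_MS = 300000; // documented default: 300 s

// Hypothetical resolver. Assumed precedence: per-call `timeout` (seconds),
// then GROK_MCP_TIMEOUT (milliseconds), then the default.
function resolveTimeoutMs(callTimeoutSec?: number, envValue?: string): number {
  if (
    callTimeoutSec !== undefined &&
    isFinite(callTimeoutSec) &&
    callTimeoutSec > 0
  ) {
    return Math.round(callTimeoutSec * 1000); // seconds to milliseconds
  }
  if (envValue !== undefined && envValue.trim() !== "") {
    const fromEnv = Number(envValue);
    if (isFinite(fromEnv) && fromEnv > 0) {
      return fromEnv; // already in milliseconds
    }
  }
  return DEFAULT_TIMEOUT_MS;
}

console.log(resolveTimeoutMs(600)); // prints 600000
console.log(resolveTimeoutMs(undefined, "120000")); // prints 120000
console.log(resolveTimeoutMs()); // prints 300000
```

Whatever value the server ends up with, a host-side limit shorter than it will still cut the call off first, so the two need to be raised together.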
- `grok_consult` can take a `conversation_id`
- progress notifications

```bash
git clone https://github.com/howardpen9/grok-mcp.git
cd grok-mcp
npm install
npm test
npm run build
```
MIT