Adversarial multi-model reasoning verification for AI agents. Claude, Grok, and DeepSeek challenge each decision — returns ALLOW or HOLD with JWKS-signed attestation. x402-gated on Base.
MCP server for ThoughtProof — verify AI reasoning with adversarial multi-model consensus.
3–4 LLMs (Grok, Gemini, DeepSeek, Sonnet) independently evaluate every claim. A dedicated red-team model critiques their verdicts. A synthesizer (Sonnet) weighs everything and returns ALLOW, BLOCK, or UNCERTAIN with confidence score and objections.
{
  "mcpServers": {
    "thoughtproof": {
      "command": "npx",
      "args": ["-y", "thoughtproof-mcp"],
      "env": {
        "THOUGHTPROOF_API_KEY": "tp_op_your_key_here"
      }
    }
  }
}
Works with Claude Desktop, Cursor, Windsurf, Cline, and any MCP-compatible client.
`verify_claim`
Verify any claim or AI-generated reasoning before acting on it.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `claim` | string | (required) | The text to verify |
| `stakeLevel` | low / medium / high / critical | medium | Risk level; higher stakes trigger deeper verification |
| `domain` | financial / medical / legal / code / general | general | Domain context for specialized verification |
| `speed` | fast / standard / deep | standard | Verification depth |
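As a rough illustration of the parameter table above, the sketch below models the `verify_claim` arguments as TypeScript types and fills in the documented defaults. The type names and the `withDefaults` helper are hypothetical and are not taken from the package source; only the parameter names, allowed values, and defaults come from the table.

```typescript
// Hypothetical types mirroring the verify_claim parameter table.
type StakeLevel = "low" | "medium" | "high" | "critical";
type Domain = "financial" | "medical" | "legal" | "code" | "general";
type Speed = "fast" | "standard" | "deep";

interface VerifyClaimArgs {
  claim: string; // required
  stakeLevel?: StakeLevel;
  domain?: Domain;
  speed?: Speed;
}

// Apply the documented defaults and reject an empty claim.
// (Rejecting whitespace-only claims is an assumption of this sketch.)
function withDefaults(args: VerifyClaimArgs): Required<VerifyClaimArgs> {
  if (args.claim.trim() === "") {
    throw new Error("claim is required");
  }
  return {
    claim: args.claim,
    stakeLevel: args.stakeLevel ?? "medium",
    domain: args.domain ?? "general",
    speed: args.speed ?? "standard",
  };
}
```

For example, `withDefaults({ claim: "The transfer is within policy", stakeLevel: "high" })` keeps the explicit stake level and fills in `general` and `standard` for the other two fields.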
`check_agent_score`
Look up an agent's composite trust score on the ERC-8004 registry.
| Parameter | Type | Description |
|---|---|---|
| `agentId` | string | Agent ID to look up |
| `domain` | string | Optional domain filter |
In Claude Desktop or Cursor, just ask:
"Verify the claim: GPT-5 achieves 95% accuracy on MMLU-Pro"
The tool returns:
⚠️ UNCERTAIN (42% confidence)
Claim: "GPT-5 achieves 95% accuracy on MMLU-Pro"
Objections:
- Insufficient public benchmark data to confirm
- Historical accuracy claims have been overstated
- MMLU-Pro methodology has known ceiling effects
⚡ 3.2s | Adversarial Multi-Model Consensus
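An agent that consumes this text may want to gate its next action on the verdict. The following is a minimal sketch that assumes the output keeps the shape shown in the example above (a headline with the verdict and a confidence percentage, followed by `- ` bullet objections). `parseResult`, `shouldProceed`, and the 80% threshold are illustrative choices and not part of the package.

```typescript
type Verdict = "ALLOW" | "BLOCK" | "UNCERTAIN";

interface ParsedResult {
  verdict: Verdict;
  confidence: number; // percentage, 0-100
  objections: string[];
}

// Extract verdict, confidence, and objections from the textual tool output.
// Returns null when the headline does not match the assumed format.
function parseResult(text: string): ParsedResult | null {
  const head = /(ALLOW|BLOCK|UNCERTAIN) \((\d+)% confidence\)/.exec(text);
  if (head === null) return null;
  const objections = text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith("- "))
    .map((line) => line.slice(2));
  return {
    verdict: head[1] as Verdict,
    confidence: Number(head[2]),
    objections,
  };
}

// Proceed only on ALLOW with enough confidence; the threshold is a hypothetical policy.
function shouldProceed(result: ParsedResult | null, minConfidence = 80): boolean {
  return (
    result !== null &&
    result.verdict === "ALLOW" &&
    result.confidence >= minConfidence
  );
}
```

Treating an unparseable response the same as a non-ALLOW verdict keeps the gate fail-closed, which seems the safer default for higher stake levels.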
Your AI Agent
│
▼
┌──────────────────┐
│ thoughtproof-mcp │ ← MCP Server (this package)
└──────────────────┘
│
▼
┌──────────────────┐
│ ThoughtProof API │ ← api.thoughtproof.ai (RV)
└──────────────────┘
│
▼
┌───────────────────────────────────────────┐
│ Stage 1: Independent Evaluation │
│ 3–4 LLMs (Grok, Gemini, DeepSeek, │
│ Sonnet) each examine the claim │
│ │
│ Stage 2: Red-Team Critique │
│ 1 dedicated model challenges all │
│ initial verdicts │
│ │
│ Stage 3: Synthesis │
│ Sonnet weighs verdicts + critique │
│ → final decision │
└───────────────────────────────────────────┘
│
▼
ALLOW / BLOCK / UNCERTAIN
+ confidence % + objections
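The control flow of the three stages in the diagram can be sketched with stub functions. This is not the ThoughtProof implementation: the real stages call LLMs asynchronously, and the actual synthesizer is a model (Sonnet), whereas the rule below (a unanimous verdict wins, anything else is UNCERTAIN, confidence is the mean) is only a placeholder so the example runs. All type and function names are hypothetical.

```typescript
type Verdict = "ALLOW" | "BLOCK" | "UNCERTAIN";

interface Opinion {
  model: string;
  verdict: Verdict;
  confidence: number; // percentage, 0-100
}

interface Decision {
  verdict: Verdict;
  confidence: number;
  objections: string[];
}

// Stand-ins for model calls; real evaluators and critics would be async.
type Evaluator = (claim: string) => Opinion;
type Critic = (claim: string, opinions: Opinion[]) => string[];

function runPipeline(claim: string, evaluators: Evaluator[], critic: Critic): Decision {
  // Stage 1: each evaluator examines the claim independently.
  const opinions = evaluators.map((evaluate) => evaluate(claim));

  // Stage 2: a dedicated red-team critic challenges all initial verdicts.
  const objections = critic(claim, opinions);

  // Stage 3: placeholder synthesis (the real synthesizer is an LLM).
  const [first] = opinions;
  if (first === undefined) {
    return { verdict: "UNCERTAIN", confidence: 0, objections };
  }
  const unanimous = opinions.every((o) => o.verdict === first.verdict);
  const mean = opinions.reduce((sum, o) => sum + o.confidence, 0) / opinions.length;
  return {
    verdict: unanimous ? first.verdict : "UNCERTAIN",
    confidence: Math.round(mean),
    objections,
  };
}
```

Keeping the critic separate from the evaluators mirrors the diagram: the objections are produced after, and with access to, all Stage 1 verdicts.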
| Speed | Models | Cost per verification |
|---|---|---|
| fast | 2 | $0.008 |
| standard | 4 | $0.02 |
| deep | 5+ | $0.08 |
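To budget a workload from the prices above, a small estimator can help. The sketch below stores the table's per-verification prices in thousandths of a dollar so the sums stay in integers; the function names are illustrative, and it assumes the listed prices apply uniformly with no volume discounts.

```typescript
// Prices from the table, in thousandths of a dollar (integers avoid float drift).
const COST_MILLS = { fast: 8, standard: 20, deep: 80 } as const;
type Speed = keyof typeof COST_MILLS;

// Sum the cost of a mix of verifications, e.g. { fast: 1000, deep: 50 }.
function estimateCostMills(counts: Partial<Record<Speed, number>>): number {
  let total = 0;
  for (const speed of Object.keys(COST_MILLS) as Speed[]) {
    total += (counts[speed] ?? 0) * COST_MILLS[speed];
  }
  return total;
}

// Render thousandths of a dollar as a dollar string with three decimals.
function formatUsd(mills: number): string {
  return `$${(mills / 1000).toFixed(3)}`;
}
```

For instance, 1,000 fast, 500 standard, and 50 deep verifications come to 8,000 + 10,000 + 4,000 = 22,000 thousandths, so `formatUsd(estimateCostMills({ fast: 1000, standard: 500, deep: 50 }))` gives `$22.000`.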
Payment: API key (operator account) or x402 micropayment (USDC on Base).
Get an operator API key at thoughtproof.ai. Without a key, verifications use x402 micropayments automatically.
| Environment Variable | Default | Description |
|---|---|---|
| `THOUGHTPROOF_API_KEY` | (none) | Operator API key |
| `THOUGHTPROOF_BASE_URL` | https://api.thoughtproof.ai | API base URL |
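The configuration rules described here (default base URL, and x402 micropayments when no API key is set) can be expressed as a small resolver. This is a sketch of the documented behavior and not the server's actual code; `resolveConfig`, `ClientConfig`, and the `Payment` type are hypothetical names, and treating a blank value as unset is an assumption.

```typescript
const DEFAULT_BASE_URL = "https://api.thoughtproof.ai";

type Payment = { mode: "api-key"; apiKey: string } | { mode: "x402" };

interface ClientConfig {
  baseUrl: string;
  payment: Payment;
}

// Resolve settings from an environment-like map (e.g. process.env in Node).
function resolveConfig(env: Record<string, string | undefined>): ClientConfig {
  const apiKey = env["THOUGHTPROOF_API_KEY"]?.trim();
  const baseUrl = env["THOUGHTPROOF_BASE_URL"]?.trim();
  return {
    // Fall back to the documented default when the variable is unset or blank.
    baseUrl: baseUrl ? baseUrl : DEFAULT_BASE_URL,
    // Without an operator key, verifications are paid via x402 micropayments.
    payment: apiKey ? { mode: "api-key", apiKey } : { mode: "x402" },
  };
}
```

Passing the environment in as a parameter, instead of reading a global, keeps the function easy to test with plain objects.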
git clone https://github.com/ThoughtProof/thoughtproof-mcp.git
cd thoughtproof-mcp
npm install
npm run build
npm test
npm run dev # Run with tsx (hot reload)
npm run inspect # Test with MCP Inspector
MIT — ThoughtProof