Compare AI inference pricing across 9 providers in real time. Routing recommendations, spend tracking, and budget alerts for AI agents.
The compute price oracle for AI agents.
Auto-configure Cursor and Claude Desktop in one command:

```shell
npx volthq-mcp-server --setup
```

The setup command detects installed clients and merges the config without overwriting your existing MCP servers.
Cursor — add to `.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "volthq": {
      "command": "npx",
      "args": ["-y", "volthq-mcp-server"]
    }
  }
}
```
Claude Desktop — add to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "volthq": {
      "command": "npx",
      "args": ["-y", "volthq-mcp-server"]
    }
  }
}
```
| Tool | Description |
|---|---|
| `volt_check_price` | Compare pricing across providers for a model |
| `volt_recommend_route` | Get the optimal provider recommendation with a savings estimate |
| `volt_get_spend` | Spending summary by provider and model (today / 7d / 30d) |
| `volt_get_savings` | Actual spend vs. optimized spend comparison |
| `volt_set_budget_alert` | Set daily, weekly, or monthly budget threshold alerts |
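On the wire, each of these tools is invoked through the standard MCP `tools/call` request. A minimal sketch of the JSON-RPC payload a client would send for the price-check tool (the `arguments` shape mirrors the example call below; exact parameter names beyond `model` are not documented here):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "volt_check_price",
    "arguments": { "model": "llama-70b" }
  }
}
```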
```text
> volt_check_price { "model": "llama-70b" }

Price comparison for "llama-70b" — 14 offerings found
────────────────────────────────────────────────────────────
1. DeepInfra — Llama-70B
   Input: $0.20/M tokens | Output: $0.27/M tokens | Avg: $0.24/M
   Quality: 88% | Region: global
2. Hyperbolic — Llama-70B (FP8) on H100-SXM
   Input: $0.40/M tokens | Output: $0.40/M tokens | Avg: $0.40/M
   Quality: 85% | Region: global
3. Hyperbolic — Llama-70B (BF16) on H100-SXM
   Input: $0.55/M tokens | Output: $0.55/M tokens | Avg: $0.55/M
   Quality: 88% | Region: global
4. Groq — Llama-70B
   Input: $0.59/M tokens | Output: $0.79/M tokens | Avg: $0.69/M
   Quality: 88% | Region: global
5. Together AI — Llama-70B
   Input: $0.88/M tokens | Output: $0.88/M tokens | Avg: $0.88/M
   Quality: 88% | Region: global
6. Fireworks AI — Llama-70B
   Input: $0.90/M tokens | Output: $0.90/M tokens | Avg: $0.90/M
   Quality: 88% | Region: global
7. Akash — Llama-70B (FP8) on H100-SXM
   Input: $3.49/M tokens | Output: $8.72/M tokens | Avg: $6.11/M
   Quality: 85% | Region: global
8. Akash — Llama-70B (FP8) on A100-80GB
   Input: $5.24/M tokens | Output: $13.11/M tokens | Avg: $9.18/M
   Quality: 85% | Region: global

The cheapest offering costs 97% less than the most expensive.
```
DeepInfra at $0.24/M, Hyperbolic at $0.40/M, Groq at $0.69/M, Fireworks AI at $0.90/M — all vs GPT-4o at $6.25/M.
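The numbers above are easy to reproduce. A short sketch, assuming the "Avg" column is the simple mean of the input and output per-million-token rates (consistent with every row in the example, e.g. Groq: (0.59 + 0.79) / 2 = 0.69) and the savings figure is cheapest vs. most expensive blended rate:

```python
# Per-million-token (input, output) rates from the example output above.
offers = {
    "DeepInfra": (0.20, 0.27),
    "Hyperbolic FP8": (0.40, 0.40),
    "Groq": (0.59, 0.79),
    "Akash A100-80GB": (5.24, 13.11),
}

def blended(rates):
    """Assumed blend: simple mean of input and output $/M rates."""
    inp, out = rates
    return (inp + out) / 2

avgs = {name: blended(r) for name, r in offers.items()}
cheapest, priciest = min(avgs.values()), max(avgs.values())
savings_pct = round((1 - cheapest / priciest) * 100)

print(round(avgs["DeepInfra"], 3), savings_pct)  # → 0.235 97
```

Real-world blends weight input and output by actual token mix, so your effective rate depends on workload; the simple mean matches the table's displayed averages.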
Volt collects anonymous usage metadata by default to improve routing recommendations. This includes: provider name, model name, tool response time, and success/failure status.
What is never collected: prompts, outputs, API keys, token counts, or any user-identifiable content. IPs are hashed and truncated server-side.
To opt out, set the environment variable:

```shell
VOLT_OBSERVATIONS=false
```
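The opt-out can also be applied per client through the standard MCP `env` field in the server entry. A sketch, reusing the `volthq` config from the setup section:

```json
{
  "mcpServers": {
    "volthq": {
      "command": "npx",
      "args": ["-y", "volthq-mcp-server"],
      "env": { "VOLT_OBSERVATIONS": "false" }
    }
  }
}
```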
MIT