MCP server for Mistral AI providing OCR, TTS, and STT capabilities via Voxtral.
Mistral AI MCP server + CLI. OCR documents, TTS text-to-speech, STT speech-to-text via Voxtral.
npm install -g mistral-ai-mcp
Requires Node.js 18+.
mistral-ai ocr <file-or-url> # Extract text from documents/images
mistral-ai tts <text> # Generate speech from text
mistral-ai stt <audio> # Transcribe audio to text
mistral-ai config ... # Manage configuration
# Local PDF
mistral-ai ocr ./document.pdf > output.md
# From URL
mistral-ai ocr https://arxiv.org/pdf/2301.00001.pdf > paper.md
# Tables
mistral-ai ocr ./document.pdf --table-format html
# Preset voice
mistral-ai tts "Hello, world!" --voice-id alice
# Voice cloning via reference audio
mistral-ai tts "Hello from me!" --ref-audio ./my-voice.wav
# Output format
mistral-ai tts "Hello!" --voice-id bob --format wav > output.wav
# Basic transcription
mistral-ai stt ./audio.mp3
# Realtime mode (low latency)
mistral-ai stt ./audio.mp3 --realtime
# Speaker diarization
mistral-ai stt ./meeting.mp3 --diarize
# Specific language
mistral-ai stt ./audio.mp3 --language en
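When calling the CLI from a script, it can help to assemble the argument list in one place. The sketch below builds arguments for the `stt` subcommand using only the flags shown above (`--realtime`, `--diarize`, `--language`). The `SttOptions` type and `buildSttArgs` helper are hypothetical names, and whether the flags can be combined in a single call is an assumption.

```typescript
// Hypothetical helper: the option names are illustrative, the flags come from the examples above.
interface SttOptions {
  realtime?: boolean;
  diarize?: boolean;
  language?: string;
}

function buildSttArgs(audioPath: string, options: SttOptions = {}): string[] {
  const args: string[] = ['stt', audioPath];
  if (options.realtime) args.push('--realtime');
  if (options.diarize) args.push('--diarize');
  // Assumption: --language takes the language code as a separate argument, as in `--language en`.
  if (options.language) args.push('--language', options.language);
  return args;
}

const sttArgs = buildSttArgs('./meeting.mp3', { diarize: true, language: 'en' });
console.log(['mistral-ai', ...sttArgs].join(' '));
// prints: mistral-ai stt ./meeting.mp3 --diarize --language en
```

The returned array can be passed as-is to Node's `child_process.execFile('mistral-ai', sttArgs, callback)`, which avoids shell quoting issues with file names that contain spaces.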
| Tool | Input Formats | Output Formats |
|---|---|---|
| OCR | PDF, DOCX, PPTX, XLSX, PNG, JPEG, AVIF | Markdown + YAML |
| TTS | Text | MP3, WAV, PCM, FLAC, Opus |
| STT | MP3, WAV, FLAC, OGG, WebM | Text + JSON |
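The input formats in the table can be used to route a file to the right tool before calling it. This is a minimal sketch: `toolForFile` is a hypothetical helper, the extension lists are derived from the table, and treating `.jpg` as JPEG is an assumption.

```typescript
// Extensions derived from the "Input Formats" column; `jpg` is an assumed alias for JPEG.
const OCR_EXTENSIONS: string[] = ['pdf', 'docx', 'pptx', 'xlsx', 'png', 'jpeg', 'jpg', 'avif'];
const STT_EXTENSIONS: string[] = ['mp3', 'wav', 'flac', 'ogg', 'webm'];

// Returns the tool that accepts the file, or undefined when the extension is not listed.
function toolForFile(fileName: string): 'ocr' | 'stt' | undefined {
  const dot = fileName.lastIndexOf('.');
  if (dot < 0) return undefined;
  const ext = fileName.slice(dot + 1).toLowerCase();
  if (OCR_EXTENSIONS.indexOf(ext) !== -1) return 'ocr';
  if (STT_EXTENSIONS.indexOf(ext) !== -1) return 'stt';
  return undefined;
}

console.log(toolForFile('report.PDF'));   // prints: ocr
console.log(toolForFile('meeting.webm')); // prints: stt
console.log(toolForFile('notes.txt'));    // prints: undefined
```

The check only looks at the file name; the server may still reject a file whose content does not match its extension.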
mistral-ai config api_key <your-key>
Or via environment:
export MISTRAL_API_KEY=your-key
Config file location:
~/.mistral-ai/config.json (Linux/macOS)
%USERPROFILE%\.mistral-ai\config.json (Windows)
Override with the MISTRAL_AI_CONFIG_DIR environment variable.
mistral-ai config show
mistral-ai config api_key <key> # Set API key
mistral-ai config base_url <url> # API endpoint (default: https://api.mistral.ai/v1)
mistral-ai config model <model> # Default OCR model
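The lookup of the config file can be sketched as follows. This is not the package's actual code: `resolveConfigPath` is a hypothetical helper, and it assumes that `MISTRAL_AI_CONFIG_DIR` names the directory that contains `config.json`.

```typescript
// Hypothetical helper: resolves the config file path from an environment map and a home directory.
function resolveConfigPath(
  env: Record<string, string | undefined>,
  homeDir: string,
  separator: '/' | '\\' = '/',
): string {
  const override = env['MISTRAL_AI_CONFIG_DIR'];
  // Assumption: the override replaces the whole `.mistral-ai` directory, not just its parent.
  const dir = override && override.length > 0 ? override : `${homeDir}${separator}.mistral-ai`;
  return `${dir}${separator}config.json`;
}

console.log(resolveConfigPath({}, '/home/ada'));
// prints: /home/ada/.mistral-ai/config.json
console.log(resolveConfigPath({ MISTRAL_AI_CONFIG_DIR: '/etc/mistral' }, '/home/ada'));
// prints: /etc/mistral/config.json
```

In a Node.js script the arguments would typically be `process.env` and `os.homedir()`; they are passed in explicitly here so the function stays easy to test.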
Start server with stdio transport:
mistral-ai-mcp
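With the stdio transport, an MCP client writes JSON-RPC 2.0 messages to the server's standard input, one message per line. The sketch below frames a `tools/call` request by hand to show what travels over the pipe; `frameToolCall` is a hypothetical helper, and a real client must complete the `initialize` handshake first, which the SDK handles for you.

```typescript
// Shape of an MCP `tools/call` request as a JSON-RPC 2.0 message.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

// Serializes one request as a single newline-terminated line.
// JSON.stringify escapes newlines inside strings, so the message itself never spans lines.
function frameToolCall(id: number, toolName: string, args: Record<string, unknown>): string {
  const request: ToolCallRequest = {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: toolName, arguments: args },
  };
  return JSON.stringify(request) + '\n';
}

console.log(frameToolCall(1, 'tts_speech', { text: 'Hello!', voice_id: 'alice' }));
// prints: {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"tts_speech","arguments":{"text":"Hello!","voice_id":"alice"}}}
```

In practice the SDK client does this framing internally, so hand-written messages are mainly useful for debugging a server from the command line.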
| Tool | Description |
|---|---|
| ocr_pdf | Extract text from PDF sources |
| tts_speech | Generate speech from text |
| stt_transcribe | Transcribe audio to text |
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';

// Spawn the server as a child process and talk to it over stdio
const transport = new StdioClientTransport({
  command: 'node',
  args: ['node_modules/.bin/mistral-ai-mcp'],
  env: { MISTRAL_API_KEY: 'your-key' },
});
const client = new Client({ name: 'test', version: '1.0.0' }, { capabilities: {} });
await client.connect(transport);

// OCR (the tool table above lists this tool as ocr_pdf; confirm the exact name with client.listTools())
const ocrResult = await client.callTool({
  name: 'ocr_pdf_url',
  arguments: { pdf_url: 'https://example.com/doc.pdf' },
});

// TTS
const ttsResult = await client.callTool({
  name: 'tts_speech',
  arguments: { text: 'Hello!', voice_id: 'alice' },
});

// STT
const sttResult = await client.callTool({
  name: 'stt_transcribe',
  arguments: { audio_source: './audio.mp3' },
});
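An MCP tool result carries its payload in a `content` array and may set `isError` when the call failed. The sketch below collects the text items from such a result. `textFromResult` is a hypothetical helper, the sample result is made up for illustration, and which content types this server actually returns (for example audio for TTS) is not documented here.

```typescript
// Minimal structural types for the parts of an MCP tool result used below.
interface ContentItem {
  type: string;
  text?: string;
}
interface ToolResult {
  content?: ContentItem[];
  isError?: boolean;
}

// Joins all text items; throws when the server flagged the call as failed.
function textFromResult(result: ToolResult): string {
  const texts: string[] = [];
  for (const item of result.content ?? []) {
    if (item.type === 'text' && typeof item.text === 'string') texts.push(item.text);
  }
  const joined = texts.join('\n');
  if (result.isError) throw new Error(joined || 'tool call failed');
  return joined;
}

// Made-up sample result, not real server output.
const sample: ToolResult = {
  content: [
    { type: 'text', text: '# Title' },
    { type: 'text', text: 'Body text' },
  ],
};
console.log(textFromResult(sample));
// prints:
// # Title
// Body text
```

With the client above, the helper could be applied as `textFromResult(ocrResult as ToolResult)`.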
npm install
npm run dev # Run with tsx (watch mode)
npm run build # Compile TypeScript
npm run clean # Remove dist/
npm run lint # ESLint check
npm run format # Prettier check
npm run format:write # Prettier fix
npm run typecheck # Type check only
npm run test # Run tests
Husky hooks enforce code quality:
npm run lint && npm run format
npm run typecheck && npm run test
GitHub Actions workflows:
.github/workflows/ci.yml: Lint, build, typecheck on push/PR
.github/workflows/test.yml: Run tests on push/PR
License: ISC