MCP server for the Supertone TTS API. Generate natural speech, browse and preview the voice catalog, predict synthesis cost, and create cloned voices — directly from Claude Desktop, Cursor, or any MCP-compatible client. Supports Korean, English, Japanese, and 20+ other languages, with speed, pitch, and emotion-style control.
supertone-inc/supertone-mcp MCP server
Synthesis
- `text_to_speech`: Convert text (≤300 chars) to audio. Output as files, MCP resources, or both.
- `predict_duration`: Estimate audio length (and credit cost) without synthesizing.

Voice discovery (preset)
- `search_voice`: Filter the catalog by language, gender, age, use_case, style, model, name, or description.
- `get_voice`: Full detail for one voice.
- `preview_voice`: Sample audio URLs for a voice (filterable by language/style/model).
- `get_credit_balance`: Check remaining credits.

Custom voice cloning
- `clone_voice`: Create a cloned voice from a local WAV/MP3 (≤3MB).
- `search_custom_voice`: List/filter cloned voices.
- `edit_custom_voice`: Update name and/or description.
- `delete_custom_voice`: Permanently delete (irreversible).

Supports Korean, English, Japanese, and 20+ other languages. Speed (0.5x–2.0x), pitch shift (-24 to +24 semitones), and emotion style are adjustable.
Breaking change in v0.2:
`list_voices` was removed and replaced by `search_voice`. To reproduce the old behavior, call `search_voice` with no arguments.
# Using uvx (recommended)
uvx supertone-mcp
# Using pip
pip install supertone-mcp
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "supertone-tts": {
      "command": "uvx",
      "args": ["supertone-mcp"],
      "env": {
        "SUPERTONE_API_KEY": "your-api-key-here"
      }
    }
  }
}
Add to your Cursor MCP settings (same JSON shape as above).
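
If the config file already lists other servers, the entry needs to be merged in rather than pasted over them. The following is a minimal Python sketch of that merge, assuming the config is plain JSON with a top-level `mcpServers` object. The helper name `add_supertone_server` is hypothetical and not part of supertone-mcp.

```python
import json


def add_supertone_server(config: dict, api_key: str) -> dict:
    """Return a copy of `config` with the supertone-tts entry added."""
    # Deep copy via a JSON round-trip so the caller's dict is left untouched.
    merged = json.loads(json.dumps(config))
    servers = merged.setdefault("mcpServers", {})
    servers["supertone-tts"] = {
        "command": "uvx",
        "args": ["supertone-mcp"],
        "env": {"SUPERTONE_API_KEY": api_key},
    }
    return merged


# Hypothetical existing config that already has one other server.
existing = {"mcpServers": {"other-server": {"command": "other"}}}
updated = add_supertone_server(existing, "your-api-key-here")
print(json.dumps(updated, indent=2))
```

Writing the result back to disk is left out on purpose, since the config file location differs per client and platform.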
| Variable | Required | Default | Description |
|---|---|---|---|
| `SUPERTONE_API_KEY` | Yes | - | Your Supertone API key |
| `SUPERTONE_MCP_VOICE_ID` | No | preset voice (Aiden, multilingual) | Default `voice_id` for `text_to_speech` / `predict_duration` |
| `SUPERTONE_OUTPUT_DIR` | No | `~/supertone-tts-output/` | Directory where audio files are saved |
| `SUPERTONE_MCP_OUTPUT_MODE` | No | `files` | One of `files`, `resources`, `both`. Controls how `text_to_speech` returns audio (see below) |
| `SUPERTONE_MCP_AUTOPLAY` | No | `true` | Auto-play generated audio on macOS via `afplay` (enabled by default). Set to `false`, `0`, or `no` to disable |
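
The defaults in the table can be read as parsing rules. Below is a sketch of how a wrapper script might mirror them before launching the server. It is not the server's own code: `Settings` and `load_settings` are hypothetical names, and matching the autoplay flag case-insensitively is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

FALSY = {"false", "0", "no"}
OUTPUT_MODES = {"files", "resources", "both"}


@dataclass
class Settings:
    api_key: str
    voice_id: Optional[str]  # None means "use the server's preset voice"
    output_dir: str
    output_mode: str
    autoplay: bool


def load_settings(env: Mapping[str, str]) -> Settings:
    """Apply the documented defaults to a mapping of environment variables."""
    api_key = env.get("SUPERTONE_API_KEY", "")
    if not api_key:
        raise ValueError("SUPERTONE_API_KEY is required")
    mode = env.get("SUPERTONE_MCP_OUTPUT_MODE", "files")
    if mode not in OUTPUT_MODES:
        raise ValueError(f"unknown output mode: {mode!r}")
    autoplay_raw = env.get("SUPERTONE_MCP_AUTOPLAY", "true")
    return Settings(
        api_key=api_key,
        voice_id=env.get("SUPERTONE_MCP_VOICE_ID"),
        output_dir=os.path.expanduser(
            env.get("SUPERTONE_OUTPUT_DIR", "~/supertone-tts-output/")
        ),
        output_mode=mode,
        # Assumption: the flag is compared case-insensitively.
        autoplay=autoplay_raw.strip().lower() not in FALSY,
    )


settings = load_settings(
    {"SUPERTONE_API_KEY": "your-api-key-here", "SUPERTONE_MCP_AUTOPLAY": "no"}
)
print(settings.output_mode, settings.autoplay)
```

Passing a plain mapping instead of reading `os.environ` directly keeps the function easy to test. In real use, call `load_settings(os.environ)`.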
Output modes (text_to_speech)

| Mode | Returns | Use when |
|---|---|---|
| `files` (default) | Plain text with the saved file path + metadata | You want the file on disk |
| `resources` | MCP AudioContent + TextContent (no file written) | The client renders audio inline (e.g., Claude.ai chat) |
| `both` | File on disk and AudioContent/TextContent | You want both: preview inline, keep the file |
Natural language phrasing — the MCP client routes these to the right tool automatically.
Synthesis
- "Read this aloud: Hello, how are you today?"
- "Read '안녕하세요' slowly, in Korean"

Estimate before synthesizing
- "About how many seconds of audio would this paragraph produce?" → calls `predict_duration`

Browse / pick a voice
- "Find me a female Korean voice for narration" → calls `search_voice(language="ko", gender="female", use_case="narration")`
- "Let's hear a sample of the first voice in that list" → calls `preview_voice(voice_id=...)` and returns sample URLs

Check credits
- "How many credits do I have left?" → calls `get_credit_balance`

Clone a voice from a local file
- "Make a clone from this file: ~/recordings/sample.wav, and name it MyVoice" → calls `clone_voice(name="MyVoice", audio_path="~/recordings/sample.wav")`

Manage cloned voices
- "Show me the list of custom voices I've created" → `search_custom_voice`
- "Rename MyVoice to NarratorA" → `edit_custom_voice`
- "Delete MyVoice" → `delete_custom_voice` (prompts for confirmation; irreversible)
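
Behind each prompt, the client sends an MCP `tools/call` request to the server. The sketch below builds the JSON-RPC message for the `search_voice` example above, following the `tools/call` shape from the MCP specification. `build_tool_call` is a hypothetical helper, and the session setup a real client performs first (the `initialize` handshake) is omitted.

```python
import json
from typing import Any, Dict


def build_tool_call(
    name: str, arguments: Dict[str, Any], request_id: int = 1
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request that invokes one MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call(
    "search_voice",
    {"language": "ko", "gender": "female", "use_case": "narration"},
)
print(json.dumps(request, indent=2))
```

In normal use you never write this by hand, because Claude Desktop or Cursor produces it from the prompt. It is mainly useful for debugging or for scripting the server from a custom client.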
text_to_speech

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `text` | string | Yes | - | Text to convert (≤300 chars; longer text is auto-chunked) |
| `voice_id` | string | No | env or preset | Voice identifier (browse via `search_voice`) |
| `language` | string | No | `ko` | Language code (`ko`, `en`, `ja`, …) |
| `output_format` | string | No | `mp3` | `mp3` or `wav` |
| `model` | string | No | `sona_speech_1` | TTS model |
| `speed` | float | No | `1.0` | 0.5–2.0 |
| `pitch_shift` | int | No | `0` | -24 to +24 semitones |
| `style` | string | No | - | Emotion style (varies by voice) |
predict_duration

Same parameter schema as `text_to_speech`, but without auto-chunking: the 300-character limit is a hard limit. Returns a message such as `Predicted duration: 2.34s (credit usage is proportional to duration).`
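
Because `predict_duration` rejects text over 300 characters while `text_to_speech` chunks it, a caller who wants an estimate for long text has to split it first. The sketch below shows one way to do that and to check the documented `speed`, `pitch_shift`, and `output_format` ranges up front. The server's actual chunking strategy is not documented here, so splitting on sentence ends is an assumption, and `chunk_text` and `validate_params` are hypothetical helpers.

```python
import re
from typing import List

MAX_CHARS = 300


def chunk_text(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into pieces of at most `limit` chars, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def validate_params(
    speed: float = 1.0, pitch_shift: int = 0, output_format: str = "mp3"
) -> None:
    """Raise ValueError if a value falls outside the documented range."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if not -24 <= pitch_shift <= 24:
        raise ValueError("pitch_shift must be between -24 and +24 semitones")
    if output_format not in ("mp3", "wav"):
        raise ValueError("output_format must be 'mp3' or 'wav'")


long_text = "Hello there. " * 40
pieces = chunk_text(long_text)
validate_params(speed=1.2, pitch_shift=-2)
print(len(pieces), [len(p) for p in pieces])
```

Each piece can then be passed to `predict_duration` separately and the results summed for a rough total.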
search_voice

All parameters are optional. With no filters, the full catalog is returned. With any filter, the first response line is `Filters applied: ...`.

| Parameter | Type | Description |
|---|---|---|
| `language` | string | e.g., `ko`, `en`, `ja` |
| `gender` | string | e.g., `male`, `female` |
| `age` | string | e.g., `young_adult`, `child` |
| `use_case` | string | e.g., `narration`, `advertisement` |
| `style` | string | e.g., `neutral`, `happy` |
| `model` | string | e.g., `sona_speech_1` |
| `name` | string | partial match |
| `description` | string | partial match |
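
To make the filter semantics concrete, here is a sketch that applies the same rules to an in-memory list. It is illustrative only. The catalog entries are invented, real voices may carry several languages or styles per entry, and treating the non-partial fields as exact, case-sensitive matches is an assumption. `search_voices` is a hypothetical local function, not the server's implementation.

```python
from typing import Dict, List

EXACT_FIELDS = ("language", "gender", "age", "use_case", "style", "model")
PARTIAL_FIELDS = ("name", "description")


def search_voices(catalog: List[Dict[str, str]], **filters: str) -> List[Dict[str, str]]:
    """Return voices matching every given filter; no filters returns everything."""
    active = {key: value for key, value in filters.items() if value}
    unknown = set(active) - set(EXACT_FIELDS) - set(PARTIAL_FIELDS)
    if unknown:
        raise ValueError(f"unknown filters: {sorted(unknown)}")
    results = []
    for voice in catalog:
        exact_ok = all(
            voice.get(field) == value
            for field, value in active.items()
            if field in EXACT_FIELDS
        )
        # name and description match on a case-insensitive substring.
        partial_ok = all(
            value.lower() in voice.get(field, "").lower()
            for field, value in active.items()
            if field in PARTIAL_FIELDS
        )
        if exact_ok and partial_ok:
            results.append(voice)
    return results


# Invented sample data, not real catalog entries.
catalog = [
    {"name": "Example Narrator", "language": "ko", "gender": "female",
     "use_case": "narration", "description": "calm storyteller"},
    {"name": "Example Host", "language": "en", "gender": "male",
     "use_case": "advertisement", "description": "upbeat host"},
    {"name": "Example Reader", "language": "ko", "gender": "male",
     "use_case": "narration", "description": "warm reader"},
]

matches = search_voices(catalog, language="ko", gender="female", use_case="narration")
print([voice["name"] for voice in matches])
```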
get_voice / preview_voice

| Tool | Required | Optional |
|---|---|---|
| `get_voice` | `voice_id` | - |
| `preview_voice` | `voice_id` | `language`, `style`, `model` (filter samples) |
clone_voice

| Parameter | Type | Required | Description |
|---|---|---|---|
| `name` | string | Yes | Display name (non-empty) |
| `audio_path` | string | Yes | Local WAV or MP3 path (≤3MB). Supports `~` expansion |
| `description` | string | No | Optional note |
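
The constraints in this table can be checked locally before spending a tool call. The sketch below is a pre-flight check under stated assumptions: "3MB" is read as 3 × 1024 × 1024 bytes, the format is judged by file extension only, and `check_clone_input` is a hypothetical helper rather than anything shipped with supertone-mcp.

```python
import os
import tempfile
from pathlib import Path

MAX_BYTES = 3 * 1024 * 1024  # assumption: "3MB" interpreted as 3 MiB
ALLOWED_SUFFIXES = {".wav", ".mp3"}


def check_clone_input(name: str, audio_path: str) -> Path:
    """Validate clone_voice inputs and return the expanded file path."""
    if not name.strip():
        raise ValueError("name must be non-empty")
    path = Path(os.path.expanduser(audio_path))  # handles a leading ~
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError("audio file must be .wav or .mp3")
    if not path.is_file():
        raise FileNotFoundError(str(path))
    size = path.stat().st_size
    if size > MAX_BYTES:
        raise ValueError(f"audio file is {size} bytes; limit is {MAX_BYTES}")
    return path


# Demo with a throwaway file standing in for a real recording.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.wav"
    sample.write_bytes(b"\0" * 1024)
    checked = check_clone_input("MyVoice", str(sample))
    print(checked.name)
```

The check does not inspect the audio itself, so a file that passes here can still be rejected by the API.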
| Tool | Required | Optional |
|---|---|---|
| `search_custom_voice` | - | `name`, `description` (partial match) |
| `edit_custom_voice` | `voice_id` | `name`, `description` (at least one required) |
| `delete_custom_voice` | `voice_id` | - (IRREVERSIBLE) |
# Clone and install
git clone https://github.com/supertone-inc/supertone-mcp.git
cd supertone-mcp
uv sync
# Run tests
uv run pytest -q
# Run with coverage
uv run pytest --cov=src --cov-report=term-missing
MIT