loading…
Search for a command to run...
loading…
Transcribe any video URL to text with one command, supporting 1000+ sites via yt-dlp and multiple ASR providers.
Transcribe any video URL to text with one command, supporting 1000+ sites via yt-dlp and multiple ASR providers.
Transcribe any video URL to text with one command.
An MCP Server + Agent Skill for AI coding assistants, and a standalone CLI tool. Give your AI agent the ability to transcribe any video — just paste a link.
Supports Bilibili, YouTube, TikTok/Douyin, Twitter/X, Vimeo, and 1000+ sites via yt-dlp.
pip install vidscribe[mcp]
Add to your MCP client config (e.g. claude_desktop_config.json or Cursor MCP settings):
{
"mcpServers": {
"vidscribe": {
"command": "vidscribe-mcp"
}
}
}
Then just ask your AI: "Transcribe this video: https://..."
Copy the skill into your personal skills directory:
# Cursor
cp -r agent-skill ~/.cursor/skills/vidscribe
# Windsurf
cp -r agent-skill ~/.windsurf/skills/vidscribe
# Codex
cp -r agent-skill ~/.codex/skills/vidscribe
Then just tell your AI agent: "Transcribe this video: https://..." — it will know how to use vidscribe automatically.
pip install vidscribe
# Set up any one provider (see below)
export OPENAI_API_KEY="sk-..."
# Transcribe!
vidscribe "https://www.youtube.com/watch?v=dQw4w9WgXcQ" -o transcript.txt
# Core (with cloud providers that need no extra deps)
pip install vidscribe
# With MCP server support
pip install vidscribe[mcp] # MCP Server
# With specific provider support
pip install vidscribe[openai] # OpenAI Whisper API
pip install vidscribe[deepgram] # Deepgram
pip install vidscribe[aliyun] # Aliyun DashScope
pip install vidscribe[local] # Local faster-whisper (offline)
pip install vidscribe[all] # Everything (MCP + all providers)
System dependencies:
brew install ffmpeg / apt install ffmpeg)| Provider | Best For | Price | Speed | Setup |
|---|---|---|---|---|
| Volcengine | Chinese content | ~$0.11/hr | Fast | Guide |
| OpenAI | Multilingual | $0.36/hr | Fast | Guide |
| Aliyun | Chinese content | ~$0.17/hr | Fast | Guide |
| Deepgram | Speed & cost | $0.26/hr | Fastest | Guide |
| Local | Privacy / offline | Free | Slow (CPU) | Guide |
Cheapest cloud option for Chinese content. Powered by Doubao (豆包) ASR.
export VOLC_APP_KEY="your_app_id"
export VOLC_ACCESS_KEY="your_access_token"
Get credentials: Volcengine Console
Best multilingual quality. Works globally.
export OPENAI_API_KEY="sk-..."
Get API key: OpenAI Platform
Good for Chinese. Free trial: 3 months, 2 hours/day.
export DASHSCOPE_API_KEY="sk-..."
Get API key: Bailian Console
Fastest transcription. $200 free credit on signup, no credit card needed.
export DEEPGRAM_API_KEY="..."
Get API key: Deepgram Console
Free and offline. No API key needed. Runs on your CPU (slower but private).
pip install vidscribe[local]
vidscribe "https://..." -p local
# Auto-detect provider from environment variables
vidscribe "https://www.bilibili.com/video/BV1xxx"
# Specify provider explicitly
vidscribe "https://youtu.be/xxx" -p openai
# Save to file
vidscribe "https://youtu.be/xxx" -o transcript.txt
# Language hint (helps accuracy for non-Chinese/English)
vidscribe "https://youtu.be/xxx" -l ja-JP
# Keep the downloaded audio file
vidscribe "https://youtu.be/xxx" -o transcript.txt --keep-audio
# Use as Python module
python -m vidscribe "https://..."
Bilibili, YouTube, TikTok, Douyin, Twitter/X, Vimeo, Dailymotion, Twitch, Instagram, Facebook, Reddit, NicoNico, and 1000+ more.
Chinese, English, Japanese, Korean, French, German, Spanish, Russian, Arabic, and 50+ more (varies by provider).
Video URL --> yt-dlp (download audio) --> ASR Provider --> Text
|
[cloud providers]
upload to temp host
|
[local provider]
process locally
For a typical 15-minute video:
| Provider | Cost | Notes |
|---|---|---|
| Volcengine | Cheapest cloud | |
| OpenAI | ~$0.09 | Best quality |
| Aliyun | Free trial available | |
| Deepgram | ~$0.06 | $200 free credit |
| Local | Free | Requires CPU time (~20 min) |
You can set environment variables in your shell profile (~/.zshrc, ~/.bashrc) or use a .env file:
cp .env.example .env
# Edit .env with your keys
vidscribe auto-detects providers in this priority order:
VOLC_APP_KEY is set)OPENAI_API_KEY is set)DASHSCOPE_API_KEY is set)DEEPGRAM_API_KEY is set)faster-whisper is installed)Override with -p <provider> flag.
vidscribe can run as an MCP (Model Context Protocol) server, exposing video transcription as a tool that any MCP-compatible AI client can call directly.
What is MCP? Model Context Protocol is an open standard for connecting AI assistants to external tools. It lets AI models call your tools natively — no shell commands, no copy-paste.
Install & run:
pip install vidscribe[mcp]
Configure your MCP client:
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
"mcpServers": {
"vidscribe": {
"command": "vidscribe-mcp"
}
}
}
Add to your Cursor MCP settings (.cursor/mcp.json):
{
"mcpServers": {
"vidscribe": {
"command": "vidscribe-mcp"
}
}
}
Exposed MCP tools:
| Tool | Description |
|---|---|
transcribe |
Transcribe a video URL to text. Params: url, provider, lang, output_path |
list_available_providers |
List all ASR providers and whether they are configured |
Environment variables: The MCP server reads the same env vars as the CLI (VOLC_APP_KEY, OPENAI_API_KEY, etc.). Set them in your shell profile before starting the MCP server.
vidscribe ships with a ready-to-use agent-skill/SKILL.md that works with any AI coding assistant that supports Agent Skills (Cursor, Windsurf, Codex, etc.).
What is an Agent Skill? Agent Skills are reusable capabilities you can add to AI coding assistants. Once installed, the AI automatically knows when and how to use the tool — no manual prompting needed.
Install:
# Clone the repo
git clone https://github.com/XFWang522/vidscribe.git
# Copy the skill to your assistant
cp -r vidscribe/agent-skill ~/.cursor/skills/vidscribe
How it works:
vidscribe with the right parametersWorks with any video platform — Bilibili, YouTube, TikTok, Twitter/X, and 1000+ more.
| MCP Server | Agent Skill | |
|---|---|---|
| Protocol | Standard MCP (JSON-RPC over stdio) | Markdown file read by AI |
| Works with | Claude Desktop, Cursor, any MCP client | Cursor, Windsurf, Codex |
| Setup | pip install + config JSON |
Copy a folder |
| Tool discovery | Automatic via MCP protocol | AI reads SKILL.md |
| Best for | Claude Desktop users; standardized tool integration | Cursor/Windsurf users who prefer skills |
You can use both simultaneously — they don't conflict.
Run in your terminal:
claude mcp add vidscribe -- npx Security
Low riskAutomated heuristic from public metadata — not a security guarantee.