Enables AI-powered video editing in IDEs through a pipeline-driven workflow, including proxy ingestion, motion graphics generation, and high-fidelity rendering.
The Omni-Video Studio MCP is an enterprise-grade, autonomous Model Context Protocol (MCP) server that empowers any LLM-enabled IDE (Cursor, Claude Code, Antigravity) to act as a professional video editor.
It evolves simple transcription-based editing into a deterministic, token-efficient, and pipeline-driven workflow, featuring agent-native motion graphics (Hyperframes), visual metadata proxies, and high-fidelity final renders.
Raw footage is converted into `takes_packed.md` (an audio mapping) and a Visual Scene Graph. The agent edits against these text proxies, cutting token costs and accelerating reasoning.

Prerequisites:

- Python 3.10+
- `ffmpeg` (must be installed on your system path)
- `uv` (recommended for dependency management)

```bash
# Clone the repository
git clone https://github.com/your-org/omni-video-mcp.git
cd omni-video-mcp

# Install dependencies
uv venv
source .venv/bin/activate
uv pip install -e .

# Install Playwright browsers (for Hyperframes)
playwright install chromium
```
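Before wiring the server into an IDE, it can help to confirm that the toolchain is actually visible to Python. A minimal sketch (the `check_prerequisites` helper is illustrative, not part of the repository):

```python
import shutil
import sys

def check_prerequisites() -> dict:
    """Report whether the toolchain this server expects is available."""
    return {
        # Python 3.10+ is required
        "python_ok": sys.version_info >= (3, 10),
        # ffmpeg and uv must be discoverable on the system PATH
        "ffmpeg": shutil.which("ffmpeg") is not None,
        "uv": shutil.which("uv") is not None,
    }

if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Running the script prints one line per requirement, so a missing `ffmpeg` shows up before the first ingest fails.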
Add the server to your IDE's MCP settings file (e.g., ~/.gemini/antigravity/mcp_config.json, ~/.cursor/mcp.json, or Claude Desktop config):
```json
{
  "mcpServers": {
    "omni-video-mcp": {
      "command": "uv",
      "args": [
        "run",
        "/path/to/omni-video-mcp/server.py"
      ],
      "env": {
        "ELEVENLABS_API_KEY": "your_api_key_here"
      }
    }
  }
}
```
Note: The ELEVENLABS_API_KEY is currently required for high-fidelity word-level transcription mapping during ingestion.
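Because a missing `ELEVENLABS_API_KEY` would otherwise only surface mid-ingestion, a fail-fast check keeps the error obvious. A hedged sketch (the `require_api_key` helper is hypothetical, not part of the server):

```python
import os

def require_api_key(name: str = "ELEVENLABS_API_KEY") -> str:
    """Return the named API key from the environment, or fail with a clear message."""
    key = os.environ.get(name, "").strip()
    if not key:
        # Fail before any ffmpeg work starts, not midway through ingestion
        raise RuntimeError(
            f"{name} is not set; ingestion needs it for word-level transcription."
        )
    return key
```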
When the agent uses this MCP server, it follows a 4-phase architecture:
1. **Ingest (`omni_video_ingest`):** The agent scans your raw `.mp4` / `.mov` files, extracting a packed markdown transcript and an initial Visual Scene Graph.
2. **Preview (`omni_video_preview`):** The agent uses the transcript to construct an EDL (Edit Decision List) of the best takes. Ambiguous cuts can be visually verified by generating filmstrip PNGs via the preview tool.
3. **Generate VFX (`omni_video_generate_vfx`):** The agent generates HTML/CSS motion graphics (lower thirds, b-roll layouts), and the server renders them deterministically into transparent `.webm` videos via Hyperframes.
4. **Render (`omni_video_render`):** The agent passes the EDL, VFX timestamps, and render settings to the server, which builds a complex FFmpeg graph to concatenate the footage, grade it, restore the audio, and export a final master.

Contributions are welcome! If you're adding new render-pipeline capabilities (such as auto-tracking or local Whisper fallbacks), please open a PR. Ensure that any new Python dependencies are added to `pyproject.toml` using `uv add <package>`.
MIT License