Enables users to generate AI-powered music videos by analyzing visual content to compose matching soundtracks using Google's Lyria model. The server automatically merges audio and media into playable video artifacts that can be rendered inline within MCP-compatible chatbots.
An MCP (Model Context Protocol) server that generates AI-powered music videos. Give it an image or video and it will analyze the visual content, compose a matching soundtrack using Google's Lyria 3 model, merge everything with FFmpeg, and return a playable video artifact.
Source Media (image/video URL)
→ Gemini Vision analyzes the visual content (if no prompt given)
→ Lyria 3 generates a 30-second AI music track
→ FFmpeg merges audio + media into a single .mp4
→ Uploads to Google Cloud Storage
→ Returns an HTML artifact with an inline video player
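The FFmpeg merge step in the pipeline above could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: the function name `build_ffmpeg_command`, the image extension set, and the choice of codecs are all hypothetical. It only builds the argument list; nothing is executed.

```python
from pathlib import PurePosixPath

# Extensions treated as still images; this set is an illustrative assumption.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def build_ffmpeg_command(media_path: str, audio_path: str, output_path: str) -> list[str]:
    """Return an ffmpeg argument list that combines audio with an image or video."""
    is_image = PurePosixPath(media_path).suffix.lower() in IMAGE_EXTENSIONS
    if is_image:
        # Loop the still image as the video stream; -shortest stops at the end of the audio.
        return [
            "ffmpeg", "-y",
            "-loop", "1", "-i", media_path,
            "-i", audio_path,
            "-c:v", "libx264", "-tune", "stillimage", "-pix_fmt", "yuv420p",
            "-c:a", "aac",
            "-shortest",
            output_path,
        ]
    # Copy the source video stream unchanged and use the generated track as its audio.
    return [
        "ffmpeg", "-y",
        "-i", media_path,
        "-i", audio_path,
        "-map", "0:v:0", "-map", "1:a:0",
        "-c:v", "copy", "-c:a", "aac",
        "-shortest",
        output_path,
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)`. Building the list separately keeps the command easy to inspect and avoids shell quoting issues.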
FFmpeg installed and available on your PATH:
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
Models used: Lyria (lyria-002) + Gemini (gemini-2.0-flash-001)
gcloud auth application-default login
Clone and install:
git clone https://github.com/joshndala/music-media-mcp.git
cd music-media-mcp
python -m venv .venv
source .venv/bin/activate
pip install -e .
Configure environment:
cp .env.example .env
# Edit .env with your GCP project ID and GCS bucket name
Set up GCS CORS (required for video playback in chatbot artifacts):
# Create cors.json
echo '[{"origin":["*"],"method":["GET"],"responseHeader":["Content-Type","Content-Length","Range"],"maxAgeSeconds":3600}]' > cors.json
gsutil cors set cors.json gs://YOUR_BUCKET_NAME
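The wildcard origin in the CORS policy above allows any site to fetch the videos. If you know which origin your MCP client renders artifacts from, the same policy can be generated with a narrower origin list. The sketch below is a minimal illustration; `build_cors_policy` is a hypothetical helper and `https://example.com` is a placeholder origin.

```python
import json


def build_cors_policy(origins, max_age_seconds=3600):
    """Return a GCS CORS policy allowing GET requests from the given origins."""
    return [
        {
            "origin": list(origins),
            "method": ["GET"],
            # Same response headers as the policy in the setup step.
            "responseHeader": ["Content-Type", "Content-Length", "Range"],
            "maxAgeSeconds": max_age_seconds,
        }
    ]


def write_cors_file(path, origins):
    """Write the policy to a JSON file suitable for `gsutil cors set`."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_cors_policy(origins), handle, indent=2)
```

Calling `write_cors_file("cors.json", ["https://example.com"])` produces a file that can be applied with the same `gsutil cors set` command.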
# stdio transport (for Claude Desktop and other MCP desktop clients)
python server.py
# SSE transport (for web-based MCP clients)
python server.py --transport sse --port 8000
# Test with MCP Inspector
npx @modelcontextprotocol/inspector
# Then connect to http://localhost:8000/sse
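The two transports above are selected with command-line flags. A minimal sketch of how such flags might be parsed follows; it is not taken from `server.py`. The `stdio` default matches the plain `python server.py` invocation, while the default port of 8000 is an assumption.

```python
import argparse


def parse_args(argv=None):
    """Parse the transport options shown in the run commands."""
    parser = argparse.ArgumentParser(description="Illustrative CLI for an MCP server")
    parser.add_argument(
        "--transport",
        choices=["stdio", "sse"],
        default="stdio",  # plain `python server.py` uses stdio
        help="stdio for desktop clients, sse for web-based clients",
    )
    parser.add_argument(
        "--port",
        type=int,
        default=8000,  # assumed default; only relevant for the sse transport
        help="port for the SSE endpoint",
    )
    return parser.parse_args(argv)
```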
# Build the container
gcloud builds submit \
--tag us-central1-docker.pkg.dev/YOUR_PROJECT/YOUR_REPO/music-media-server \
--project YOUR_PROJECT
# Deploy
gcloud run deploy music-media-server \
--image us-central1-docker.pkg.dev/YOUR_PROJECT/YOUR_REPO/music-media-server \
--region us-central1 \
--platform managed \
--allow-unauthenticated \
--set-env-vars "GCP_PROJECT_ID=YOUR_PROJECT,GCS_BUCKET_NAME=YOUR_BUCKET,GCP_LOCATION=us-central1" \
--memory 2Gi \
--timeout 300 \
--project YOUR_PROJECT
Your SSE endpoint will be at: https://YOUR_SERVICE_URL/sse
Claude Desktop (claude_desktop_config.json):
{
"mcpServers": {
"music-media": {
"command": "/path/to/.venv/bin/python",
"args": ["/path/to/server.py", "--transport", "stdio"],
"env": {
"GCP_PROJECT_ID": "your-project-id",
"GCS_BUCKET_NAME": "your-bucket-name",
"GCP_LOCATION": "us-central1"
}
}
}
}
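If `claude_desktop_config.json` already lists other servers, the `music-media` entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that; `add_server_entry` is a hypothetical helper and the paths are the same placeholders used in the config above.

```python
import json


def add_server_entry(config, venv_python, server_script, env):
    """Return a copy of config with a music-media entry under mcpServers."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))  # copy so the input is not mutated
    servers["music-media"] = {
        "command": venv_python,
        "args": [server_script, "--transport", "stdio"],
        "env": dict(env),
    }
    updated["mcpServers"] = servers
    return updated


if __name__ == "__main__":
    merged = add_server_entry(
        {"mcpServers": {}},
        "/path/to/.venv/bin/python",
        "/path/to/server.py",
        {
            "GCP_PROJECT_ID": "your-project-id",
            "GCS_BUCKET_NAME": "your-bucket-name",
            "GCP_LOCATION": "us-central1",
        },
    )
    print(json.dumps(merged, indent=2))
```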
Point your MCP client to your deployed Cloud Run URL:
https://your-service-url.run.app/sse
generate_and_merge_media
| Parameter | Type | Required | Description |
|---|---|---|---|
| source_media_url | string | ✅ | Direct URL to a source image or video |
| music_prompt | string | ❌ | Music style description (auto-generated if omitted) |
Returns: A complete HTML document with an inline video player.
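As an illustration of what such a return value might contain, the sketch below builds a standalone HTML page around a video URL. It is not the server's actual template; `build_video_artifact` and its default title are hypothetical.

```python
from html import escape


def build_video_artifact(video_url: str, title: str = "Generated music video") -> str:
    """Return a standalone HTML page that plays the video at video_url."""
    # Escape both values so they are safe inside attributes and text nodes.
    safe_url = escape(video_url, quote=True)
    safe_title = escape(title)
    lines = [
        "<!DOCTYPE html>",
        '<html lang="en">',
        "<head>",
        '  <meta charset="utf-8">',
        f"  <title>{safe_title}</title>",
        "</head>",
        "<body>",
        # `controls` shows the native player UI; `playsinline` avoids forced fullscreen on mobile.
        f'  <video controls playsinline src="{safe_url}" style="max-width: 100%"></video>',
        "</body>",
        "</html>",
    ]
    return "\n".join(lines)
```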
Example prompts:
"Upbeat electronic dance music with synth arpeggios"
"Calm ambient piano piece evoking a misty morning"
"Cinematic orchestral score with soaring strings"
| Variable | Required | Default | Description |
|---|---|---|---|
| GCP_PROJECT_ID | ✅ | - | Google Cloud project ID |
| GCS_BUCKET_NAME | ✅ | - | GCS bucket for video uploads |
| GCP_LOCATION | ❌ | us-central1 | Vertex AI region |
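The environment variables above can be read and validated at startup. The following is a minimal sketch, assuming a hypothetical `Settings` dataclass and `load_settings` helper; the variable names and the `us-central1` default come from the table, everything else is illustrative.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    project_id: str
    bucket_name: str
    location: str = "us-central1"


def load_settings(env=None) -> Settings:
    """Read the three variables, failing early if a required one is missing."""
    env = os.environ if env is None else env
    required = ("GCP_PROJECT_ID", "GCS_BUCKET_NAME")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing required environment variables: " + ", ".join(missing))
    return Settings(
        project_id=env["GCP_PROJECT_ID"],
        bucket_name=env["GCS_BUCKET_NAME"],
        # Fall back to the documented default when GCP_LOCATION is unset or empty.
        location=env.get("GCP_LOCATION") or "us-central1",
    )
```

Accepting a mapping argument keeps the function easy to test without touching the real process environment.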
MIT
Add this to claude_desktop_config.json and restart Claude Desktop.
{
"mcpServers": {
"music-media-mcp-server": {
"command": "npx",
"args": []
}
}
}