An MCP server that enables analysis of local media files, including video, audio, and documents, by uploading them to EnriProxy for server-side extraction. It supports various complex file formats and provides structured model analysis for media types often unsupported by standard MCP clients.
EnriVision is a Model Context Protocol (MCP) server over stdio that uploads local media to EnriProxy and returns server-side extraction + model analysis.
This is useful for media types that many MCP clients cannot read reliably (videos, audio, scanned PDFs, HEIC/AVIF, large files), while keeping the MCP server itself lightweight.
Requirements:
- Node.js >= 22 (recommended: Node 24 LTS)
- EnriProxy endpoints used:
  - POST /v1/uploads
  - HEAD /v1/uploads/:id
  - PATCH /v1/uploads/:id
  - POST /v1/vision/analyze
# Global install
npm install -g @bedolla/enrivision
# Or run without installing
npx -y @bedolla/enrivision@latest --help
# Or build from source (inside a checkout of the repository)
npm install
npm run typecheck
npm run build
EnriVision runs as an MCP server over stdio. Your MCP host is responsible for launching the process.
Example: global install
{
  "EnriVision": {
    "type": "stdio",
    "command": "enrivision",
    "args": [],
    "env": {
      "ENRIPROXY_URL": "http://127.0.0.1:8787",
      "ENRIPROXY_API_KEY": "YOUR_ENRIPROXY_API_KEY",
      "ENRIVISION_DEFAULT_LANGUAGE": "es"
    }
  }
}
Example: no install (always uses whatever npm currently tags as latest)
{
  "EnriVision": {
    "type": "stdio",
    "command": "npx",
    "args": ["-y", "@bedolla/enrivision@latest"],
    "env": {
      "ENRIPROXY_URL": "http://127.0.0.1:8787",
      "ENRIPROXY_API_KEY": "YOUR_ENRIPROXY_API_KEY",
      "ENRIVISION_DEFAULT_LANGUAGE": "es"
    }
  }
}
Example: local build
{
  "EnriVision": {
    "type": "stdio",
    "command": "node",
    "args": ["C:\\Users\\Administrator\\Projects\\EnriVision\\dist\\index.js"],
    "env": {
      "ENRIPROXY_URL": "http://127.0.0.1:8787",
      "ENRIPROXY_API_KEY": "YOUR_ENRIPROXY_API_KEY",
      "ENRIVISION_DEFAULT_LANGUAGE": "es"
    }
  }
}
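The env block in the examples above carries the server's configuration. As a rough sketch of how those values might be resolved, the snippet below applies the defaults this README documents (ENRIPROXY_URL falls back to http://127.0.0.1:8787, ENRIVISION_TIMEOUT_MS to 1800000, and ENRIPROXY_API_KEY is required). The names resolveConfig and EnriVisionConfig are hypothetical and are not taken from the package's source; the positive-number check on the timeout is also an assumption.

```typescript
// Hypothetical config shape; field names are illustrative, not EnriVision internals.
interface EnriVisionConfig {
  url: string;
  apiKey: string;
  timeoutMs: number;
  defaultLanguage: string | undefined;
}

type Env = Record<string, string | undefined>;

// Resolves configuration from an env-like record (for example process.env).
function resolveConfig(env: Env): EnriVisionConfig {
  const apiKey = (env["ENRIPROXY_API_KEY"] ?? "").trim();
  if (apiKey === "") {
    throw new Error("ENRIPROXY_API_KEY is required");
  }

  // Env values are always strings, so the timeout has to be parsed.
  const rawTimeout = (env["ENRIVISION_TIMEOUT_MS"] ?? "").trim() || "1800000";
  const timeoutMs = Number(rawTimeout);
  if (!Number.isFinite(timeoutMs) || timeoutMs <= 0) {
    // Assumption: a non-positive or non-numeric timeout is treated as an error.
    throw new Error(`ENRIVISION_TIMEOUT_MS must be a positive number, got "${rawTimeout}"`);
  }

  const language = (env["ENRIVISION_DEFAULT_LANGUAGE"] ?? "").trim();
  return {
    url: (env["ENRIPROXY_URL"] ?? "").trim() || "http://127.0.0.1:8787",
    apiKey,
    timeoutMs,
    defaultLanguage: language === "" ? undefined : language,
  };
}
```

Passing the environment in as a plain record keeps the function easy to test; in a Node process it would typically be called as resolveConfig(process.env).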
EnriVision is configured via environment variables:
- ENRIPROXY_URL (string, optional, default: http://127.0.0.1:8787)
- ENRIPROXY_API_KEY (string, required)
- ENRIVISION_TIMEOUT_MS (string, optional, default: 1800000)
- ENRIVISION_DEFAULT_LANGUAGE (string, optional): response language used when a tool call omits language.
EnriVision exposes this MCP tool:
- analyze_media
General notes:
- Inputs are passed as a JSON object (the tool call arguments).
- Either path or paths is required.
- The tool does not accept server_url/api_key overrides (these are configured via env vars).
analyze_media
Inputs:
- path (string, optional): absolute local file path.
- paths (string[], optional): absolute local image paths (useful for UI screenshot sets).
- context (string, optional): high-level hint (examples: ui, diagram, chart, error, code, meeting, tutorial, photo).
- question (string, optional): what you want to extract/answer.
- language (string, optional): preferred response language (ISO 639-1; e.g., es, en). If omitted, uses ENRIVISION_DEFAULT_LANGUAGE when set.
- analysis_mode (string, optional): auto | single | multipass.
- max_frames (number, optional): single-pass video frames (1..20).
- transcribe (boolean, optional): enable/disable transcription (videos).
- transcription_language (string, optional): whisper hint (auto, es, en, ...).
Video targeting:
- video.clip_start_seconds (number, optional)
- video.clip_duration_seconds (number, optional)
Multipass tuning (advanced; used only for analysis_mode: multipass):
- video.segment_seconds (number, optional)
- video.max_segments (number, optional)
- video.max_frames_per_segment (number, optional)
- document.max_pages_total (number, optional)
- document.pages_per_batch (number, optional)
- document.max_images_per_batch (number, optional)
- document.scanned_text_threshold_chars (number, optional)
- audio.timestamps (boolean, optional)
- audio.segment_seconds (number, optional)
- audio.max_segments (number, optional)
- images.max_images_total (number, optional)
- images.images_per_batch (number, optional)
- images.max_dimension (number, optional)
Output:
- analysis (string): model-produced analysis.
- media_type (string): detected media type (video, audio, image, document, image_set).
- extraction (object): safe metadata summary (internal routing details are stripped).
Example arguments object:
{
  "path": "C:\\path\\to\\video.mp4",
  "question": "What are the key steps demonstrated?",
  "analysis_mode": "auto",
  "transcribe": true,
  "language": "es"
}
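For callers that build this object in code, a small pre-flight check can catch mistakes before the tool call is sent. The sketch below covers only the top-level inputs and the constraints stated in this README (path or paths required, analysis_mode one of auto | single | multipass, max_frames in 1..20). The names AnalyzeMediaArgs and checkAnalyzeMediaArgs are hypothetical, and the video.*, document.*, audio.* and images.* options are left out because this README does not say how they are nested.

```typescript
type AnalysisMode = "auto" | "single" | "multipass";

// Hypothetical type covering only the top-level inputs listed in this README.
interface AnalyzeMediaArgs {
  path?: string;
  paths?: string[];
  context?: string;
  question?: string;
  language?: string;
  analysis_mode?: AnalysisMode;
  max_frames?: number;
  transcribe?: boolean;
  transcription_language?: string;
}

const ANALYSIS_MODES: readonly string[] = ["auto", "single", "multipass"];

// Returns a list of problems; an empty list means the documented checks pass.
function checkAnalyzeMediaArgs(args: AnalyzeMediaArgs): string[] {
  const problems: string[] = [];

  const hasPath = typeof args.path === "string" && args.path.length > 0;
  const hasPaths = Array.isArray(args.paths) && args.paths.length > 0;
  if (!hasPath && !hasPaths) {
    problems.push("path or paths is required");
  }

  // Runtime guard for values that arrive from untyped JSON.
  if (args.analysis_mode !== undefined && !ANALYSIS_MODES.includes(args.analysis_mode)) {
    problems.push("analysis_mode must be auto, single, or multipass");
  }

  if (args.max_frames !== undefined && (args.max_frames < 1 || args.max_frames > 20)) {
    problems.push("max_frames must be in 1..20");
  }

  return problems;
}
```

Returning a list of problems rather than throwing on the first one lets a caller report every issue at once; the server remains the final authority on what it accepts.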
Many MCP clients include a built-in Read(...) tool that can ingest local files and attach them to the model request.
This is convenient, but the set of supported formats is limited and can change across client versions.
If the file you need to analyze is not reliably supported by your client (for example .avif, .heic, .svg, videos,
audio, or Office documents), prefer EnriVision MCP so the client can upload bytes and EnriProxy can do extraction reliably.
EnriProxy determines media type using content-type and extension allow-lists.
Videos:
.mp4, .mov, .avi, .mkv, .webm, .m4v, .wmv, .flv, .3gp, .3g2, .ts, .mts, .m2ts, .mpeg, .mpg, .gif
Audio:
.mp3, .mp1, .mp2, .mpa, .mpga, .wav, .aiff, .aif, .aifc, .caf, .flac, .m4a, .m4b, .m4r, .aac, .ogg, .oga, .wma, .opus, .weba, .mka
Images:
.png, .apng, .jpg, .jpeg, .gif, .webp, .avif, .heic, .heif, .tiff, .tif, .bmp, .svg, .ico
Documents:
.pdf, .docx, .pptx, .xlsx, .jsonl
Add this to claude_desktop_config.json and restart Claude Desktop.
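{
  "mcpServers": {
    "enrivision": {
      "command": "npx",
      "args": ["-y", "@bedolla/enrivision@latest"],
      "env": {
        "ENRIPROXY_URL": "http://127.0.0.1:8787",
        "ENRIPROXY_API_KEY": "YOUR_ENRIPROXY_API_KEY"
      }
    }
  }
}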
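Returning to the supported-format lists earlier in this README: a client can run a quick local check to see whether a file extension appears in any of them before calling analyze_media. The sketch below copies those lists and is only an illustration; mediaKindsForPath is a hypothetical helper, and EnriProxy's real detection also considers content-type, which this check ignores. Because .gif is listed under both Videos and Images, the function returns every matching category instead of guessing a priority.

```typescript
type MediaKind = "video" | "audio" | "image" | "document";

// Extension lists copied from the supported-format section of this README.
const EXTENSIONS: Record<MediaKind, readonly string[]> = {
  video: [".mp4", ".mov", ".avi", ".mkv", ".webm", ".m4v", ".wmv", ".flv", ".3gp",
    ".3g2", ".ts", ".mts", ".m2ts", ".mpeg", ".mpg", ".gif"],
  audio: [".mp3", ".mp1", ".mp2", ".mpa", ".mpga", ".wav", ".aiff", ".aif", ".aifc",
    ".caf", ".flac", ".m4a", ".m4b", ".m4r", ".aac", ".ogg", ".oga", ".wma", ".opus",
    ".weba", ".mka"],
  image: [".png", ".apng", ".jpg", ".jpeg", ".gif", ".webp", ".avif", ".heic", ".heif",
    ".tiff", ".tif", ".bmp", ".svg", ".ico"],
  document: [".pdf", ".docx", ".pptx", ".xlsx", ".jsonl"],
};

const KINDS: readonly MediaKind[] = ["video", "audio", "image", "document"];

// Returns every category whose list contains the file's extension
// (case-insensitive); an empty array means the extension is not listed.
function mediaKindsForPath(filePath: string): MediaKind[] {
  // Accept both Windows and POSIX separators when isolating the file name.
  const base = filePath.split(/[\\/]/).pop() ?? "";
  const dot = base.lastIndexOf(".");
  if (dot <= 0) {
    return []; // no extension, or a dotfile such as ".bashrc"
  }
  const ext = base.slice(dot).toLowerCase();
  return KINDS.filter((kind) => EXTENSIONS[kind].includes(ext));
}
```

An empty result does not prove the server would reject the file, since the server-side content-type check is outside what this sketch can see.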