Enables querying MIMIC-IV medical data using natural language through MCP clients, with support for local DuckDB and cloud BigQuery backends.
Query MIMIC-IV medical data using natural language through MCP clients
Transform medical data analysis with AI! Ask questions about MIMIC-IV data in plain English and get instant insights. Choose between local demo data (free) and the full cloud dataset (BigQuery).
📺 Prefer video tutorials? Check out step-by-step video guides covering setup, PhysioNet configuration, and more.
We use uvx to run the MCP server. Install uv from the official installer, then verify with uv --version.
macOS:
brew install uv
Linux (or macOS without Homebrew):
curl -LsSf https://astral.sh/uv/install.sh | sh
# macOS - enable for GUI apps like Claude Desktop:
sudo ln -s $(which uv) $(which uvx) /usr/local/bin/
Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
Verify installation:
uv --version
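If a GUI client cannot start the server even though uv --version works in a terminal, it can help to confirm where uv and uvx actually resolve. The following is a minimal sketch using only the Python standard library; the helper name find_tools is hypothetical and not part of m3 or uv.

```python
import shutil


def find_tools(names=("uv", "uvx")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

Note that this only inspects the PATH of the process that runs it. GUI apps on macOS may see a different PATH, which is what the symlink step above addresses.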
Skip this if you are using the DuckDB demo database.
Install Google Cloud SDK:
brew install google-cloud-sdk
Authenticate:
gcloud auth application-default login
Opens your browser - choose the Google account with BigQuery access to MIMIC-IV.
Supported clients: Claude Desktop, Cursor, Goose, and more.
DuckDB (Demo or Full Dataset)
To create an m3 directory and navigate into it, run:
If you want to use the full dataset, download it manually from PhysioNet and place it into
The demo dataset (16MB raw download size) downloads automatically on first query. The full dataset (10.6GB raw download size) must be downloaded manually.
BigQuery (Full Dataset)
Requires GCP credentials and PhysioNet access. Paste this into your client config JSON file:
That's it! Restart your MCP client and ask:
| Feature | DuckDB (Demo) | DuckDB (Full) | BigQuery (Full) |
|---|---|---|---|
| Cost | Free | Free | BigQuery usage fees |
| Setup | Zero config | Manual Download | GCP credentials required |
| Data Size | 100 patients, 275 admissions | 365k patients, 546k admissions | 365k patients, 546k admissions |
| Speed | Fast (local) | Fast (local) | Network latency |
| Use Case | Learning, development | Research (local) | Research, production |
Already have Docker or prefer pip? Here are other ways to run m3:
DuckDB (Local):
BigQuery:
MCP config (same for both):
{
  "mcpServers": {
    "m3": {
      "command": "docker",
      "args": ["exec", "-i", "m3-server", "python", "-m", "m3.mcp_server"]
    }
  }
}
Stop: docker stop m3-server && docker rm m3-server
pip install m3-mcp
💡 CLI commands: Run m3 --help to see all available options.
Useful CLI commands:
- m3 init mimic-iv-demo - Download demo database
- m3 config - Generate MCP configuration interactively
- m3 config claude --backend bigquery --project-id YOUR_PROJECT_ID - Quick BigQuery setup
Example MCP config:
{
  "mcpServers": {
    "m3": {
      "command": "m3-mcp-server",
      "env": {
        "M3_BACKEND": "duckdb"
      }
    }
  }
}
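Many MCP clients keep all servers in one JSON file, so pasting the example over an existing file would drop other servers. The sketch below shows one way to merge the m3 entry from the example config into an existing file while leaving other keys intact. The helper names are hypothetical, and the location of the client config file is client-specific and left to the caller.

```python
import json
from pathlib import Path

# The m3 entry from the example MCP config above.
M3_ENTRY = {"command": "m3-mcp-server", "env": {"M3_BACKEND": "duckdb"}}


def merge_m3_server(config: dict, entry: dict = M3_ENTRY) -> dict:
    """Return a copy of config with the m3 server added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["m3"] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read a client config file (if it exists), merge the m3 entry, write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(merge_m3_server(existing), indent=2) + "\n")
```

Calling update_config_file(Path("my_config.json")) would create or update that file. Back up a real client config before overwriting it.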
For contributors:
git clone https://github.com/rafiattrach/m3.git && cd m3
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e ".[dev]"
pre-commit install
MCP config:
{
  "mcpServers": {
    "m3": {
      "command": "/path/to/m3/.venv/bin/python",
      "args": ["-m", "m3.mcp_server"],
      "cwd": "/path/to/m3",
      "env": {
        "M3_BACKEND": "duckdb"
      }
    }
  }
}
UV (Recommended)
Assuming you have UV installed.
Step 1: Clone and Navigate
# Clone the repository
git clone https://github.com/rafiattrach/m3.git
cd m3
Step 2: Create UV Virtual Environment
# Create virtual environment
uv venv
Step 3: Install M3
uv sync
# Remember to prefix subsequent commands with `uv run` so they use the `uv` virtual environment
After installation, choose your data source:
Perfect for learning and development - completely free!
Initialize demo dataset:
m3 init mimic-iv-demo
Setup MCP Client:
m3 config
Alternative: For Claude Desktop specifically:
m3 config claude --backend duckdb --db-path /Users/you/path/to/m3_data/databases/mimic_iv_demo.duckdb
Restart your MCP client and ask:
Run the entire MIMIC-IV dataset locally with DuckDB views over Parquet.
Acquire CSVs (requires PhysioNet credentials):
- /Users/you/path/to/m3/m3_data/raw_files/mimic-iv-full/hosp/
- /Users/you/path/to/m3/m3_data/raw_files/mimic-iv-full/icu/
m3 init's auto-download function currently only supports the demo dataset. Use your browser or wget to obtain the full dataset.
Initialize full dataset:
m3 init mimic-iv-full
export M3_CONVERT_MAX_WORKERS=6 # number of parallel files (default=4)
export M3_DUCKDB_MEM=4GB # DuckDB memory limit per worker (default=3GB)
export M3_DUCKDB_THREADS=4 # DuckDB threads per worker (default=2)
Pay attention to your system specifications, especially whether you have enough memory.
Select dataset and verify:
m3 use full # optional, since init already sets the active dataset to full
m3 status
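To judge whether the M3_CONVERT_MAX_WORKERS and M3_DUCKDB_MEM values above fit your machine, a rough budget is workers multiplied by the per-worker memory limit. The sketch below does that arithmetic. The function names are hypothetical, and the result is only an estimate of the configured limits, since actual memory use during conversion may differ.

```python
import re

# Conversion factors to gigabytes for the size suffixes handled here.
_UNITS_IN_GB = {"MB": 1 / 1024, "GB": 1.0, "TB": 1024.0}


def parse_gb(value: str) -> float:
    """Parse a size string such as '4GB' or '512MB' into gigabytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(MB|GB|TB)\s*", value, re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    return float(match.group(1)) * _UNITS_IN_GB[match.group(2).upper()]


def peak_memory_gb(workers: int, mem_per_worker: str) -> float:
    """Estimate the combined memory limit across all parallel workers."""
    return workers * parse_gb(mem_per_worker)


print(peak_memory_gb(4, "3GB"))  # defaults listed above: 12.0
print(peak_memory_gb(6, "4GB"))  # example values above: 24.0
```

With the example values, the configured limits add up to 24 GB, so a machine with 16 GB of RAM would likely need fewer workers or a lower per-worker limit.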
Configure MCP client (uses the full local DB):
m3 config
# or
m3 config claude --backend duckdb --db-path /Users/you/path/to/m3/m3_data/databases/mimic_iv_full.duckdb
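Because the full dataset is downloaded by hand, a quick check of the raw file layout before running m3 init mimic-iv-full can save a failed run. This is a minimal sketch, assuming the hosp/ and icu/ folders described above should each contain at least one file whose name includes ".csv" (for example .csv or .csv.gz). The helper name and the file pattern are assumptions, not something m3 itself defines.

```python
from pathlib import Path

# Subdirectories named in the full-dataset instructions above.
EXPECTED_SUBDIRS = ("hosp", "icu")


def missing_raw_dirs(raw_root: Path) -> list[str]:
    """Return expected subdirectories that are absent or hold no CSV-like files."""
    missing = []
    for name in EXPECTED_SUBDIRS:
        subdir = raw_root / name
        if not subdir.is_dir() or not any(subdir.glob("*.csv*")):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Adjust to your own checkout; this relative path is only an example.
    root = Path("m3_data/raw_files/mimic-iv-full")
    problems = missing_raw_dirs(root)
    print("Layout looks OK" if not problems else f"Missing or empty: {problems}")
```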
For researchers needing complete MIMIC-IV data
Install Google Cloud CLI:
macOS (with Homebrew):
brew install google-cloud-sdk
Windows: Download from https://cloud.google.com/sdk/docs/install
Linux:
curl https://sdk.cloud.google.com | bash
Authenticate:
gcloud auth application-default login
This will open your browser - choose the Google account that has access to your BigQuery project with MIMIC-IV data.
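After logging in, you can confirm that Application Default Credentials were written to disk. The sketch below checks the GOOGLE_APPLICATION_CREDENTIALS environment variable and the well-known gcloud credentials file. The helper names are hypothetical, and the check assumes the default gcloud config directory (a custom CLOUDSDK_CONFIG is not handled).

```python
import os
from pathlib import Path

ADC_FILENAME = "application_default_credentials.json"


def adc_candidates(env=None, home=None):
    """List the paths where Application Default Credentials are usually found."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    candidates = []
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        candidates.append(Path(explicit))
    appdata = env.get("APPDATA")
    if os.name == "nt" and appdata:
        # Windows stores the gcloud config under %APPDATA%\gcloud
        candidates.append(Path(appdata) / "gcloud" / ADC_FILENAME)
    else:
        # Linux and macOS use ~/.config/gcloud
        candidates.append(home / ".config" / "gcloud" / ADC_FILENAME)
    return candidates


def find_adc_file(env=None, home=None):
    """Return the first existing credentials file, or None if none is found."""
    for path in adc_candidates(env, home):
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_adc_file()
    print(f"Credentials file: {found}" if found else "No credentials file found")
```

Finding the file only shows that a login happened. It does not prove the account has BigQuery access to MIMIC-IV.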
Setup MCP Client for BigQuery:
m3 config
Alternative: For Claude Desktop specifically:
m3 config claude --backend bigquery --project-id YOUR_PROJECT_ID
Test BigQuery Access - Restart your MCP client and ask:
Use the get_race_distribution function to show me the top 5 races in MIMIC-IV admissions.
Need to configure other MCP clients or customize settings? Use these commands:
m3 config
Generates configuration for any MCP client with step-by-step guidance.
# Quick universal config with defaults
m3 config --quick
# Universal config with custom DuckDB database
m3 config --quick --backend duckdb --db-path /path/to/database.duckdb
# Save config to file for other MCP clients
m3 config --output my_config.json
For production deployments requiring secure access to medical data:
# Enable OAuth2 with Claude Desktop
m3 config claude --enable-oauth2 \
--oauth2-issuer https://your-auth-provider.com \
--oauth2-audience m3-api \
--oauth2-scopes "read:mimic-data"
# Or configure interactively
m3 config # Choose OAuth2 option during setup
Supported OAuth2 Providers:
Key Benefits:
📖 Complete OAuth2 Setup Guide: See docs/OAUTH2_AUTHENTICATION.md for detailed configuration, troubleshooting, and production deployment guidelines.
When your MCP client processes questions, it uses these tools automatically:
Try asking your MCP client these questions:
Demographics & Statistics:
- Prompt: What is the race distribution in MIMIC-IV admissions?
- Prompt: Show me patient demographics for ICU stays
- Prompt: How many total admissions are in the database?
Clinical Data:
- Prompt: Find lab results for patient X
- Prompt: What lab tests are most commonly ordered?
- Prompt: Show me recent ICU admissions
Data Exploration:
- Prompt: What tables are available in the database?
- Prompt: What tools do you have for MIMIC-IV data?
- Prompt: Can you please call all your tools in a logical sequence?
Local "Parquet not found" or view errors:
Rerun the m3 init command for your chosen dataset.
MCP client server not starting:
"Missing OAuth2 access token" errors:
# Set your access token
export M3_OAUTH2_TOKEN="Bearer your-access-token-here"
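A malformed token value is an easy mistake to make. The sketch below checks the shape of M3_OAUTH2_TOKEN, assuming the value must carry the "Bearer " prefix exactly as in the export shown above. The function name is hypothetical, and this only inspects the string. It does not validate the token with your provider.

```python
import os

PLACEHOLDER = "your-access-token-here"


def check_oauth2_token(value):
    """Return a list of problems with an M3_OAUTH2_TOKEN value (empty if none)."""
    if not value:
        return ["M3_OAUTH2_TOKEN is not set or is empty"]
    if not value.startswith("Bearer "):
        return ["value does not start with 'Bearer '"]
    token = value[len("Bearer "):].strip()
    if not token:
        return ["no token follows the 'Bearer ' prefix"]
    if token == PLACEHOLDER:
        return ["token is still the placeholder from the documentation"]
    return []


if __name__ == "__main__":
    problems = check_oauth2_token(os.environ.get("M3_OAUTH2_TOKEN"))
    print("Token format looks OK" if not problems else "; ".join(problems))
```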
"OAuth2 authentication failed" errors:
Rate limit exceeded:
🔧 OAuth2 Troubleshooting: See OAUTH2_AUTHENTICATION.md for detailed OAuth2 troubleshooting and configuration guides.
"Access Denied" errors:
gcloud auth list
"Dataset not found" errors:
physionet-data project
Authentication issues:
# Re-authenticate
gcloud auth application-default login
# Check current authentication
gcloud auth list
See "Local Development" section above for setup instructions.
pytest # All tests (includes OAuth2 and BigQuery mocks)
pytest tests/test_mcp_server.py -v # MCP server tests
pytest tests/test_oauth2_auth.py -v # OAuth2 authentication tests
# Set environment variables
export M3_BACKEND=bigquery
export M3_PROJECT_ID=your-project-id
export GOOGLE_CLOUD_PROJECT=your-project-id
# Optional: Test with OAuth2 authentication
export M3_OAUTH2_ENABLED=true
export M3_OAUTH2_ISSUER_URL=https://your-provider.com
export M3_OAUTH2_AUDIENCE=m3-api
export M3_OAUTH2_TOKEN="Bearer your-test-token"
# Test MCP server
m3-mcp-server
mimic-iv-full (Download CLI)
Deploy M3 on Kubernetes using Docker images with the pre-loaded MIMIC-IV demo database:
# Build and push Docker image
make all # Will prompt for Docker registry/username
# Or specify registry directly
make all DOCKER_REGISTRY=your-username DOCKER=podman
The container uses StreamableHTTP transport on port 3000 with path /sse. Configure your MCP client to connect to the service endpoint (e.g., http://m3.kagent.svc.cluster.local:3000/sse for intra-cluster access).
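Before pointing an MCP client at the service, it can be useful to confirm that the host and port are reachable from where the client runs. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the check only tests TCP connectivity, not the MCP protocol itself.

```python
import socket
from urllib.parse import urlsplit


def endpoint_parts(url):
    """Split an endpoint URL into (host, port, path), defaulting the port by scheme."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port, parts.path or "/"


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Example endpoint from the text above; it resolves only inside the cluster.
    host, port, path = endpoint_parts("http://m3.kagent.svc.cluster.local:3000/sse")
    status = "reachable" if is_port_open(host, port) else "not reachable"
    print(f"{host}:{port}{path} is {status}")
```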
Helm charts for deploying M3 are available in a separate repository.
We welcome contributions! Please:
If you use M3 in your research, please cite:
@article{attrach2025conversational,
title={Conversational LLMs Simplify Secure Clinical Data Access, Understanding, and Analysis},
author={Attrach, Rafi Al and Moreira, Pedro and Fani, Rajna and Umeton, Renato and Celi, Leo Anthony},
journal={arXiv preprint arXiv:2507.01053},
year={2025}
}
You can also use the "Cite this repository" button at the top of the GitHub page for other formats.
M3 has been forked and adapted by the community:
Built with ❤️ for the medical AI community
Need help? Open an issue on GitHub or check our troubleshooting guide above.