A comprehensive Model Context Protocol (MCP) server for Grafana, Prometheus, Kafka UI, and Datadog. Features a secure "Bring Your Own Key" (BYOK) architecture w
A comprehensive Model Context Protocol (MCP) server for Grafana, Prometheus, Kafka UI, and Datadog. Features a secure "Bring Your Own Key" (BYOK) architecture where credentials stay local. Provides tools for metrics querying, dashboard inspection, Kafka lag monitoring, and unified health checks.
Query your observability stack from Claude Code, Codex, or any MCP client and no data leaves your machine.
npm version · License: MIT · Awesome MCP Servers · Node ≥ 18 · alimuratkuslu/byok-observability-mcp
Bring Your Own Keys — credentials stay in env vars on your machine. No clone, no build, runs via npx.
Partial setup — configure only the backends you use. Tools for unconfigured backends are never exposed.
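This "only expose configured backends" behavior can be sketched as a check on required environment variables. A minimal sketch with illustrative names — the real registration logic lives in the package:

```typescript
// Which env vars must be present before a backend's tools are exposed.
// Variable names match the README; the gating function is illustrative.
const REQUIRED_ENV: Record<string, string[]> = {
  grafana: ["GRAFANA_URL", "GRAFANA_TOKEN"],
  prometheus: ["PROMETHEUS_URL"],
  kafka: ["KAFKA_UI_URL"],
  datadog: ["DD_API_KEY", "DD_APP_KEY"],
};

// A backend is enabled only when every required variable is set and non-empty.
function enabledBackends(env: Record<string, string | undefined>): string[] {
  return Object.entries(REQUIRED_ENV)
    .filter(([, vars]) => vars.every((v) => !!env[v]))
    .map(([name]) => name);
}
```

With only PROMETHEUS_URL set, only the Prometheus tools would be registered; a Grafana URL without its token enables nothing.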
Quick Start · Tools · Credentials · Configuration · Scheduled Reports · Examples · Security · Development
🤖 Claude Code / Codex CLI
│
▼
⚡ byok-observability-mcp
(Local npx process)
│
🔒 env vars never leave your machine
│
▼
┌─────────────────────────────────┐
│ 📊 Grafana 🔥 Prometheus │
│ 🛶 Kafka UI 🐶 Datadog │
└─────────────────────────────────┘
Run once, answer a few questions, get a ready-made .mcp.json:
npx byok-observability-mcp --init
The wizard writes a ready-made .mcp.json to your project root or ~/.claude/ — your choice. Then just start Claude Code:
claude
[!TIP] That's it. No clone, no build, no env file. Works in under 60 seconds.
Create .mcp.json in your project root. Include only the backends you need.
{
"mcpServers": {
"observability-mcp": {
"command": "npx",
"args": ["-y", "byok-observability-mcp"],
"env": {
"GRAFANA_URL": "https://grafana.mycompany.internal",
"GRAFANA_TOKEN": "glsa_...",
"PROMETHEUS_URL": "https://prometheus.mycompany.internal",
"KAFKA_UI_URL": "https://kafka-ui.mycompany.internal",
"DD_API_KEY": "your-datadog-api-key",
"DD_APP_KEY": "your-datadog-app-key"
}
}
}
}
Credentials in git? Use the ${VAR} approach instead — see Configuration → Method B.
Start Claude Code:
claude
Claude Code reads .mcp.json automatically. No claude mcp add, no build step.
Verify by asking Claude:
What observability tools do you have available?
| Client | Configuration |
|---|---|
| Claude Code | .mcp.json in project root (recommended) or claude mcp add CLI |
| OpenAI Codex CLI | .mcp.json in project root — same format as Claude Code |
Both clients read .mcp.json automatically. The Quick Start above works for either.
# Same .mcp.json as above works out of the box
codex
Or add via CLI:
codex mcp add --transport stdio observability-mcp -- npx -y byok-observability-mcp
Always available. Checks connectivity across all configured backends.
| Tool | Description |
|---|---|
| obs_health_check | Unified Health Check. Runs a parallel check on all backends and returns a status table. |
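A unified check like this can be approximated with Promise.allSettled: run every probe concurrently and report ok/error per backend instead of failing the whole check on the first error. A sketch — the probe functions are stand-ins, not the package's real clients:

```typescript
type Probe = () => Promise<void>;

// Run all probes in parallel; a rejected probe marks its backend "error"
// without aborting the others.
async function healthCheck(
  probes: Record<string, Probe>
): Promise<Record<string, "ok" | "error">> {
  const names = Object.keys(probes);
  const results = await Promise.allSettled(names.map((n) => probes[n]()));
  const out: Record<string, "ok" | "error"> = {};
  names.forEach((n, i) => {
    out[n] = results[i].status === "fulfilled" ? "ok" : "error";
  });
  return out;
}
```

Promise.allSettled (unlike Promise.all) never rejects, which is what lets one unreachable backend show up as a single "error" row in the status table.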
Enabled when GRAFANA_URL + GRAFANA_TOKEN are set.
| Tool | Description |
|---|---|
| grafana_health | Check connectivity, version, and database status |
| grafana_list_datasources | List all datasources (name, type, UID) |
| grafana_query_metrics | Run a PromQL expression via a Grafana datasource |
| grafana_list_dashboards | Search and list dashboards by name or tag |
| grafana_get_dashboard | Get panels and metadata for a dashboard by UID |
| grafana_list_alerts | List active alerts from Alertmanager (firing/pending) |
| grafana_get_alert_rules | List all configured alert rules across all folders |
Enabled when PROMETHEUS_URL is set.
| Tool | Description |
|---|---|
| prometheus_health | Check connectivity |
| prometheus_query | Instant PromQL query — current value of a metric |
| prometheus_query_range | Range PromQL query — metric values over time |
| prometheus_list_metrics | List all available metric names |
| prometheus_metric_metadata | Get help text and type for a specific metric |
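Under the hood these tools map onto Prometheus's HTTP API: an instant query hits /api/v1/query, while a range query hits /api/v1/query_range with start, end, and step parameters. A sketch of the URL construction (base URL and expression are caller-supplied; this is not the package's actual client code):

```typescript
// Instant query: current value of a PromQL expression.
function instantQueryUrl(base: string, promql: string): string {
  const u = new URL("/api/v1/query", base);
  u.searchParams.set("query", promql);
  return u.toString();
}

// Range query: values over [startSec, endSec] sampled every `step`.
function rangeQueryUrl(
  base: string,
  promql: string,
  startSec: number,
  endSec: number,
  step: string // e.g. "30s"
): string {
  const u = new URL("/api/v1/query_range", base);
  u.searchParams.set("query", promql);
  u.searchParams.set("start", String(startSec));
  u.searchParams.set("end", String(endSec));
  u.searchParams.set("step", step);
  return u.toString();
}
```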
Enabled when KAFKA_UI_URL is set.
| Tool | Description |
|---|---|
| kafka_list_clusters | List configured Kafka clusters and their status |
| kafka_list_topics | List topics in a cluster |
| kafka_describe_topic | Get partition count, replication factor, and config |
| kafka_list_consumer_groups | List consumer groups and their state |
| kafka_consumer_group_lag | Get per-partition lag for a consumer group |
| kafka_broker_health | Broker count and disk usage per broker |
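Per-partition lag, as reported by kafka_consumer_group_lag, is simply the gap between the partition's log-end offset and the group's committed offset. A minimal sketch of the arithmetic (field names are illustrative, not Kafka UI's exact API shape):

```typescript
interface PartitionOffsets {
  partition: number;
  endOffset: number;       // latest offset in the partition (log end)
  committedOffset: number; // the consumer group's committed offset
}

// Lag per partition, floored at 0 (a group can never be "ahead").
function partitionLag(p: PartitionOffsets): number {
  return Math.max(0, p.endOffset - p.committedOffset);
}

// Total lag for a group is the sum over all partitions it consumes.
function totalLag(parts: PartitionOffsets[]): number {
  return parts.reduce((sum, p) => sum + partitionLag(p), 0);
}
```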
Enabled when both DD_API_KEY and DD_APP_KEY are set. Proxies the official Datadog MCP server.
Default toolsets: core, apm, alerting. Set DD_TOOLSETS=all to load everything.
| Toolset | Covers |
|---|---|
| core | Metrics, dashboards, monitors, infrastructure |
| apm | APM services, traces, service map |
| alerting | Monitors, downtimes, alerts |
| logs | Log search and analytics |
| incidents | Incident management |
| ddsql | SQL-style metric queries |
| security | Cloud security posture |
| synthetics | Synthetic test results |
| networks | Network performance monitoring |
| dbm | Database monitoring |
| software-delivery | CI/CD pipelines |
| llm-obs | LLM observability |
| cases | Case management |
| feature-flags | Feature flag tracking |
In Grafana, create a service account with the Viewer role, then Create a token. Copy the token (it starts with glsa_) — you won't see it again.
GRAFANA_URL=https://grafana.mycompany.internal
GRAFANA_TOKEN=glsa_xxxxxxxxxxxxxxxx
If your Grafana uses a self-signed certificate:
GRAFANA_VERIFY_SSL=false
If Prometheus has no authentication:
PROMETHEUS_URL=https://prometheus.mycompany.internal
If Prometheus uses basic auth:
PROMETHEUS_URL=https://prometheus.mycompany.internal
PROMETHEUS_USERNAME=your-username
PROMETHEUS_PASSWORD=your-password
If Kafka UI has no authentication:
KAFKA_UI_URL=https://kafka-ui.mycompany.internal
If Kafka UI requires a login:
KAFKA_UI_URL=https://kafka-ui.mycompany.internal
KAFKA_UI_USERNAME=admin
KAFKA_UI_PASSWORD=your-password
API key: Datadog → Organization Settings → API Keys → New Key
Application key: Datadog → Organization Settings → Application Keys → New Key
DD_SITE — match your Datadog login URL:
| Login URL | DD_SITE |
|---|---|
| app.datadoghq.com | datadoghq.com (default) |
| app.us3.datadoghq.com | us3.datadoghq.com |
| app.us5.datadoghq.com | us5.datadoghq.com |
| app.datadoghq.eu | datadoghq.eu |
| app.ap1.datadoghq.com | ap1.datadoghq.com |
DD_API_KEY=your-api-key
DD_APP_KEY=your-application-key
DD_SITE=datadoghq.com
DD_TOOLSETS=core,apm,alerting
Method A — .mcp.json (simplest): put credentials directly in .mcp.json. Works everywhere, no extra steps.
Add .mcp.json to your .gitignore if the repo is shared.
Method B — environment placeholders: use ${VAR} placeholders in .mcp.json and put real values in .env.
.mcp.json (safe to commit — contains no secrets):
{
"mcpServers": {
"observability-mcp": {
"command": "npx",
"args": ["-y", "byok-observability-mcp"],
"env": {
"GRAFANA_URL": "${GRAFANA_URL}",
"GRAFANA_TOKEN": "${GRAFANA_TOKEN}",
"PROMETHEUS_URL": "${PROMETHEUS_URL}",
"KAFKA_UI_URL": "${KAFKA_UI_URL}",
"DD_API_KEY": "${DD_API_KEY}",
"DD_APP_KEY": "${DD_APP_KEY}"
}
}
}
}
.env (add to .gitignore):
GRAFANA_URL=https://grafana.mycompany.internal
GRAFANA_TOKEN=glsa_...
Start Claude with the env loaded:
set -a && source .env && set +a && claude
A ready-made helper script is included:
./scripts/run-claude-with-env.sh
A template .mcp.json with all variables is available as .mcp.json.example.
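The ${VAR} substitution the client applies to .mcp.json can be mimicked with a small string replace — which also shows why the env must be loaded before the client starts. A sketch, not the client's actual implementation:

```typescript
// Replace ${VAR} placeholders with values from an env map.
// Unset variables are left untouched so missing config stays visible.
function expandPlaceholders(
  text: string,
  env: Record<string, string | undefined>
): string {
  return text.replace(/\$\{([A-Z0-9_]+)\}/g, (match, name: string) =>
    env[name] !== undefined ? env[name]! : match
  );
}
```

If GRAFANA_TOKEN is not in the environment when the client reads the file, the literal "${GRAFANA_TOKEN}" would be passed through — a quick way to spot a forgotten `source .env`.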
Add to ~/.claude.json:
{
"mcpServers": {
"observability-mcp": {
"command": "npx",
"args": ["-y", "byok-observability-mcp"],
"env": {
"GRAFANA_URL": "https://grafana.mycompany.internal",
"GRAFANA_TOKEN": "glsa_..."
}
}
}
}
| Variable | Backend | Required | Description |
|---|---|---|---|
| GRAFANA_URL | Grafana | ✅ | Base URL of your Grafana instance |
| GRAFANA_TOKEN | Grafana | ✅ | Service account token (Viewer role) |
| GRAFANA_VERIFY_SSL | Grafana | | Set to false to skip TLS verification |
| PROMETHEUS_URL | Prometheus | ✅ | Base URL of your Prometheus instance |
| PROMETHEUS_USERNAME | Prometheus | | Basic auth username |
| PROMETHEUS_PASSWORD | Prometheus | | Basic auth password |
| KAFKA_UI_URL | Kafka UI | ✅ | Base URL of your Kafka UI instance |
| KAFKA_UI_USERNAME | Kafka UI | | Login username |
| KAFKA_UI_PASSWORD | Kafka UI | | Login password |
| DD_API_KEY | Datadog | ✅ | Datadog API key |
| DD_APP_KEY | Datadog | ✅ | Datadog Application key |
| DD_SITE | Datadog | | Datadog site (default: datadoghq.com) |
| DD_TOOLSETS | Datadog | | Tool groups to load (default: core,apm,alerting) |
| SLACK_WEBHOOK_URL | Reports | ✅* | Slack Incoming Webhook URL for scheduled reports |
| REPORT_BACKENDS | Reports | | Comma-separated backends to include in reports (default: all configured) |

\* Required only for scheduled reports (--report mode).
Send an automated observability digest to Slack on a schedule — no Claude or Codex instance needs to be running.
cron / launchd
│ fires every N minutes
▼
npx byok-observability-mcp --report
│
│ reads env vars, connects directly to backends
▼
Grafana · Prometheus · Kafka UI
│
│ categorizes findings → P0 / P1 / P2 / P3
▼
Slack Incoming Webhook → #your-channel
The command collects data, categorizes every finding by severity, formats a Slack message, sends it, and exits. It is completely stateless.
| Level | Meaning | Examples |
|---|---|---|
| 🔴 P0 — CRITICAL | Service down or unreachable | Grafana alert firing (critical), Kafka cluster offline, backend unreachable |
| 🟠 P1 — HIGH | Degraded, action needed soon | Grafana alert firing (non-critical), Kafka consumer lag > 10 000 |
| 🟡 P2 — MEDIUM | Warning, monitor closely | Grafana alert pending, Kafka consumer lag > 1 000 |
| 🟢 P3 — INFO | Informational, all normal | Healthy backends, silenced alerts |
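The lag thresholds in the table (P1 above 10 000, P2 above 1 000) amount to a simple classifier. An illustrative sketch — the package's actual report logic may weigh more signals:

```typescript
type Severity = "P0" | "P1" | "P2" | "P3";

// Map a consumer-group lag to a report severity, per the thresholds above.
// (P0 is reserved for unreachable backends, not lag.)
function lagSeverity(lag: number): Severity {
  if (lag > 10_000) return "P1"; // degraded, action needed soon
  if (lag > 1_000) return "P2";  // warning, monitor closely
  return "P3";                   // informational, all normal
}
```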
Step 1 — Get a Slack Incoming Webhook URL
Step 2 — Set environment variables
export SLACK_WEBHOOK_URL=https://hooks.slack.com/services/XXX/YYY/ZZZ
# Optional: restrict which backends are included (default: all configured)
export REPORT_BACKENDS=grafana,prometheus,kafka
Step 3 — Run a one-off report to verify
npx byok-observability-mcp --report
You should see a message in your Slack channel within seconds.
Step 4 — Schedule with cron
Open your crontab:
crontab -e
Add a line. Examples:
# Every hour at minute 0
0 * * * * SLACK_WEBHOOK_URL=https://hooks.slack.com/... GRAFANA_URL=... GRAFANA_TOKEN=... npx byok-observability-mcp --report >> /tmp/obs-report.log 2>&1
# Every 30 minutes
*/30 * * * * SLACK_WEBHOOK_URL=https://hooks.slack.com/... npx byok-observability-mcp --report >> /tmp/obs-report.log 2>&1
[!TIP] Put all env vars in a .env file and source it inside the cron command to keep the crontab clean: 0 * * * * bash -c 'source /path/to/.env && npx byok-observability-mcp --report' >> /tmp/obs-report.log 2>&1
Alternative: macOS launchd (runs on login, survives reboots)
Create ~/Library/LaunchAgents/com.observability-mcp.report.plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.observability-mcp.report</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/bin/npx</string>
<string>byok-observability-mcp</string>
<string>--report</string>
</array>
<key>EnvironmentVariables</key>
<dict>
<key>SLACK_WEBHOOK_URL</key>
<string>https://hooks.slack.com/services/XXX/YYY/ZZZ</string>
<key>GRAFANA_URL</key>
<string>https://grafana.mycompany.internal</string>
<key>GRAFANA_TOKEN</key>
<string>glsa_...</string>
</dict>
<key>StartInterval</key>
<integer>3600</integer>
<key>StandardOutPath</key>
<string>/tmp/obs-report.log</string>
<key>StandardErrorPath</key>
<string>/tmp/obs-report.log</string>
</dict>
</plist>
Load it:
launchctl load ~/Library/LaunchAgents/com.observability-mcp.report.plist
To stop: launchctl unload ~/Library/LaunchAgents/com.observability-mcp.report.plist
| Backend | Try asking Claude... |
|---|---|
| Grafana | "List all datasources and tell me which ones are Prometheus type." |
| Grafana | "Search for dashboards related to 'kubernetes' — list names and UIDs." |
| Grafana | "Query http_requests_total rate over the last hour via the default Prometheus datasource." |
| Prometheus | "What is the current value of the up metric? Which targets are down?" |
| Prometheus | "Show CPU usage (node_cpu_seconds_total rate) over the past hour, by instance." |
| Prometheus | "List all available metrics that start with http_." |
| Kafka UI | "List all Kafka clusters. Are there any with offline brokers?" |
| Kafka UI | "Describe the topic 'orders' in cluster 'production' — partitions and replication factor?" |
| Kafka UI | "Check consumer lag for group 'order-processor'. Which partitions have the highest lag?" |
| Datadog | "List all Datadog monitors currently in Alert state." |
| Datadog | "Show APM service performance for the past hour. Which services have the highest error rate?" |
| Datadog | "Query aws.ec2.cpuutilization for the last 30 minutes. Which hosts are above 80%?" |
| Goal | Try asking Claude... |
|---|---|
| Health | "Run a health check on all systems." |
| Alerts | "Are there any firing alerts in Grafana right now?" |
| Triage | "Show me the alert rules for the 'Production' folder." |
Check the health of all configured observability backends and give me a summary.
I'm seeing high error rates. Check Prometheus for http_requests_total with status=500,
then look for related Datadog monitors that might be alerting.
[!NOTE] All tools are read-only. No write operations are performed on any backend.
[!IMPORTANT] Credentials are read from environment variables and never logged or sent to Anthropic. Tokens are redacted in all error messages.
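Redacting tokens from error messages, as claimed above, comes down to masking every known secret value before anything is printed. A sketch of the idea (the package's actual redaction may differ):

```typescript
// Replace every occurrence of each known secret with a fixed mask.
// Empty strings are skipped so they can't blank out the whole message.
function redact(message: string, secrets: string[]): string {
  return secrets
    .filter((s) => s.length > 0)
    .reduce((msg, s) => msg.split(s).join("***REDACTED***"), message);
}
```

Calling this on every error string before it reaches the model or a log file keeps tokens like glsa_… out of transcripts even when a backend echoes them back in a 401 body.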
Least-privilege recommendations:
| Backend | Recommended role |
|---|---|
| Grafana | Service account with Viewer role |
| Prometheus | Network-level read-only access |
| Kafka UI | Read-only UI user |
| Datadog | API key + Application key with read scopes |
git clone https://github.com/alimuratkuslu/byok-observability-mcp
cd byok-observability-mcp
npm install
npm run dev # run with tsx (no build step)
npm run build # compile to dist/
npm run typecheck # TypeScript check without emitting
| Backend | Tested version |
|---|---|
| Grafana | v9.x, v10.x, v11.x |
| Prometheus | v2.x |
| Kafka UI | provectus/kafka-ui:v0.7.2 |
MIT