An MCP server that surfaces scheduled-job state and detects silent failures (exit 0 but no useful output) for cron, systemd timers, and OpenClaw schedulers, enabling AI agents to query job health and overdue status directly.
MCP server for catching cron silent failures — when scheduled jobs exit 0 with empty output, when retry storms run away, when action budgets leak. Surfaces overdue jobs, length anomalies, and silent-fail patterns to any Claude or MCP-aware agent. Works with system cron, systemd timers, OpenClaw cron logs, and any JSONL run-log out of the box. Keywords: AI agent monitoring, cron health, scheduled-task observability, production AI ops.
Status: v1.0.1 · Tests: 74 passing · License: MIT
Real silent failures from production AI deployments in the last 30 days:
• The scheduler records the run as finished (ended_reason: run_once_fired), but the cloud container never reaches prompt execution. This silently affected the operator's routines for at least 28 days before they noticed the output files weren't updating.
• claude-sonnet-4-6 returns empty assistant turns in a tight loop (stop_reason: null, output_tokens: 8) for ~20 minutes. The workflow step then exits as success with no artifacts produced — the GitHub Actions API can't distinguish "completed cleanly" from "returned empty for 20 minutes burning Claude Max budget."

These all map to one underlying problem: exit-code monitoring lies. The job returned 0; the data is broken anyway. Any team running scheduled jobs has hit at least one of these:

• A report that "succeeds" but arrives empty (<no rows> in the body). Traditional monitoring sees a green checkmark; the data is broken anyway.
• Debugging from journalctl output that rotated last week.

silentwatch-mcp exposes that visibility as MCP tools your AI agent can query directly. No metrics pipeline, no separate dashboard, no SaaS subscription.
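The gap between "exit 0" and "useful output" is easy to demonstrate. A minimal sketch (not silentwatch-mcp's code, just an illustration of the failure mode):

```python
import subprocess
import sys

def naive_check(cmd: list[str]) -> bool:
    """What exit-code monitoring amounts to: return code only."""
    return subprocess.run(cmd, capture_output=True).returncode == 0

def output_aware_check(cmd: list[str]) -> bool:
    """Also require non-empty stdout before calling the run healthy."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0 and result.stdout.strip() != ""

# A job that exits 0 but produces nothing useful:
silent_failure = [sys.executable, "-c", "pass"]

print(naive_check(silent_failure))         # True  — looks healthy
print(output_aware_check(silent_failure))  # False — flagged
```

Both monitors agree on a hard crash; they only disagree on the silent case, which is exactly the case that hides for weeks.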
> claude: which of my cron jobs have silent failures in the last 24 hours?
[MCP tool: find_silent_failures]
3 jobs flagged:
• web-search-refresh — ran 12× successfully but output empty in 8 (66% silent fail rate)
• daily-summary — ran 1× successfully (24× expected); output normal
• audit-snapshot — last success 5 days ago, all subsequent runs returned exit 0 with empty body
Three things existing tools (Cronitor, Healthchecks.io, Datadog, Prometheus) don't do:
1. Output checks, not just exit codes. Most monitors treat exit 0 = success. We check the output against configurable rules: empty output, length anomaly vs historical median, error keywords in stdout despite exit 0, duration anomaly. The job that "ran successfully" but returned nothing useful — that's the failure mode that hides for weeks. We catch it.
2. Scheduler-agnostic backends. OpenClaw JSONL run logs, system cron (/etc/crontab + /etc/cron.d/* + per-user crontab -l), and systemd timers (systemctl list-timers + journalctl) — all four backends (including mock) ship in v0.3, so you can run silentwatch-mcp against whatever scheduler you have. No vendor lock-in.
3. Priced for self-hosters. Built for the SMB self-hoster running a $40 VPS where Datadog is overkill and a "$0/mo open-source MCP" is the right price point — but the silent-failure detection is just as valuable on enterprise infra.
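The four rules above can be sketched as a pure function over one run record. Names and thresholds here are illustrative, not silentwatch-mcp's actual API or defaults:

```python
from statistics import median

# Illustrative keyword list; the real rule set is configurable.
ERROR_KEYWORDS = ("error", "traceback", "exception", "failed")

def silent_fail_reasons(output: str, duration_s: float,
                        history_lens: list[int],
                        history_durs: list[float]) -> list[str]:
    """Apply the four rules: empty output, length anomaly vs the
    historical median, error keywords despite exit 0, duration anomaly."""
    reasons = []
    if not output.strip():
        reasons.append("empty-output")
    elif history_lens:
        med = median(history_lens)
        if med > 0 and len(output) < 0.2 * med:   # illustrative threshold
            reasons.append("length-anomaly")
    low = output.lower()
    if any(kw in low for kw in ERROR_KEYWORDS):
        reasons.append("error-keyword")
    if history_durs:
        med_d = median(history_durs)
        if med_d > 0 and not (0.1 * med_d <= duration_s <= 10 * med_d):
            reasons.append("duration-anomaly")
    return reasons

print(silent_fail_reasons("", 1.0, [900, 950, 880], [1.1, 0.9]))
# ['empty-output']
```

An empty reasons list means the run looks healthy; any non-empty list is a silent-fail flag even though the exit code was 0.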
The server registers these MCP tools (full spec in SPEC.md):
| Tool | What it does |
|---|---|
| `list_jobs` | Enumerate all known cron jobs with last-run summary |
| `get_job_status(job_id)` | Detailed status for one job: last run, last success, success rate over window |
| `get_job_runs(job_id, limit)` | Recent run history with timing + status + output snippet |
| `find_overdue_jobs` | Jobs whose schedule says they should have run but haven't |
| `find_silent_failures(window_hours)` | Jobs that ran "successfully" but output looks suspicious |
| `tail_job_logs(job_id, lines)` | Recent log output for one job |
Resources:
• cron://jobs — list of all jobs (manifest)
• cron://job/{id} — individual job manifest + recent runs
• cron://run/{id} — individual run instance with full output

Prompts:

• diagnose-overdue — diagnostic prompt template for an overdue job
• summarize-cron-health — daily digest of cron activity + anomalies

v0.3 beta — all 4 backends shipped + real overdue detection via cron-schedule parsing (croniter). Mock, OpenClaw JSONL, crontab, and systemd backends are all production-ready. 74 tests passing. v1.0 is now polish: PyPI release + GitHub Actions CI + MCP registry submissions.
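Overdue detection boils down to comparing the last observed run against the schedule. silentwatch-mcp derives expected fire times from the cron expression via croniter; the dependency-free sketch below hard-codes the expected interval instead, so treat it as the shape of the rule, not the implementation:

```python
from datetime import datetime, timedelta

def is_overdue(last_run: datetime, expected_interval: timedelta,
               now: datetime, grace: timedelta = timedelta(minutes=5)) -> bool:
    """A job is overdue when more than one interval (plus a grace
    period) has elapsed since its last observed run."""
    return now - last_run > expected_interval + grace

now = datetime(2026, 5, 2, 12, 0)
hourly = timedelta(hours=1)
print(is_overdue(datetime(2026, 5, 2, 11, 30), hourly, now))  # False: ran 30 min ago
print(is_overdue(datetime(2026, 5, 2, 9, 0), hourly, now))    # True: 3 h silent
```

With croniter, `expected_interval` is replaced by the last scheduled fire time (`croniter(expr, now).get_prev(datetime)`), which handles irregular schedules correctly.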
```
# silentwatch-mcp is not yet on PyPI; install from source for now:
pip install -e .

# once the v1.0 PyPI release lands:
pip install silentwatch-mcp
```
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
```json
{
  "mcpServers": {
    "silentwatch": {
      "command": "python",
      "args": ["-m", "silentwatch_mcp"],
      "env": {
        "SILENTWATCH_BACKEND": "mock"
      }
    }
  }
}
```
Backends (all four shipped as of v0.3):
• SILENTWATCH_BACKEND=mock — returns sample data (default for development)
• SILENTWATCH_BACKEND=openclaw-jsonl — parses OpenClaw's native cron run JSONL files (set SILENTWATCH_OPENCLAW_LOGS to the directory, default ~/.openclaw/cron-runs/); richest data — full run history + silent-fail detection
• SILENTWATCH_BACKEND=crontab — parses /etc/crontab + /etc/cron.d/* + user crontabs (crontab -l); last run inferred from /var/log/syslog or /var/log/cron (set SILENTWATCH_SYSLOG to override)
• SILENTWATCH_BACKEND=systemd — parses systemctl list-timers --all --output=json + journalctl -u <unit> for run history; lifts OnCalendar= into the schedule field

All non-mock backends gracefully return empty results on platforms or hosts where the underlying tooling isn't present, so the configuration is safe to leave in place across environments.
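The systemd backend's normalization step can be sketched like this. The sample mimics the shape of `systemctl list-timers --all --output=json` output; the exact key names and microsecond-epoch timestamps are an assumption here, not verified against every systemd version:

```python
import json
from datetime import datetime, timezone

# Sample shaped like `systemctl list-timers --all --output=json` output.
# Keys and µs-since-epoch timestamps are an assumption for illustration.
SAMPLE = json.dumps([
    {"unit": "backup.timer", "activates": "backup.service",
     "next": 1767312000000000, "last": 1767225600000000},
    {"unit": "certbot.timer", "activates": "certbot.service",
     "next": None, "last": None},
])

def parse_timers(raw: str) -> list[dict]:
    """Normalize raw timer entries into job records with datetimes."""
    def ts(us):  # µs since epoch -> aware datetime, or None if never fired
        return None if us is None else datetime.fromtimestamp(us / 1e6, tz=timezone.utc)
    return [
        {"job_id": t["unit"], "runs_service": t["activates"],
         "next_run": ts(t.get("next")), "last_run": ts(t.get("last"))}
        for t in json.loads(raw)
    ]

for job in parse_timers(SAMPLE):
    print(job["job_id"], job["last_run"])
```

Timers that have never fired (`last` is null) normalize to `last_run=None`, which the overdue check can then treat as "never ran" rather than crashing on a missing timestamp.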
The server registers as silentwatch. Test:
Show me all my cron jobs and their last-run status.
| Version | Scope | Status |
|---|---|---|
| v0.1 | Protocol wiring, mock backend, all 6 tools registered with stub data, tests pass | ✅ Complete |
| v0.2 | OpenClaw JSONL backend implemented (real cron run parsing, malformed-line handling, silent-fail enrichment) | ✅ Complete (2026-05-02) |
| v0.3 | Crontab + systemd backends; cron-schedule parsing for real overdue detection (croniter); 35 new tests | ✅ Complete (2026-05-02) |
| v1.0 | Polish: PyPI release, GitHub Actions CI, MCP registry submissions (Glama + PulseMCP), refined silent-fail rule configuration | ⏳ Phase 1 ship target (W3, May 18) |
| v1.x | Additional backends (Cowork scheduler, Claude Code background tasks, generic JSON config), webhook emitter for alerts | ⏳ Phase 2+ |
silentwatch-mcp ships with 4 backends (mock, OpenClaw JSONL, crontab, systemd). If your scheduler is something else — AWS EventBridge, GCP Cloud Scheduler, Hangfire, Sidekiq, Temporal, Apache Airflow, Prefect, Dagster, or a custom job runner — and you want the same silent-failure-detection MCP visibility surface for it, that's a Custom MCP Build engagement.
| Tier | Scope | Investment | Timeline |
|---|---|---|---|
| Simple | Single backend adapter for an existing scheduler with documented API (e.g., GCP Cloud Scheduler) | $8,000–$10,000 | 1–2 weeks |
| Standard | Custom backend + custom silent-fail rules + integration with your existing alerting (PagerDuty, Slack, etc.) | $15,000–$20,000 | 2–4 weeks |
| Complex | Multi-backend (federated cron across regions / clusters / tenants) + RBAC + audit-log integration + on-call workflow | $25,000–$35,000 | 4–8 weeks |
To engage: email [email protected] with the subject Custom MCP Build inquiry.

This server is also part of the AI Production Discipline Framework, the methodology underlying the production AI audits I run.
If you're running production AI and want an outside practitioner to score readiness, find the failure patterns that are already present, and write the corrective-action plan, that engagement is what this MCP server feeds into. The standalone audit service:
| Tier | Scope | Investment | Timeline |
|---|---|---|---|
| Audit Lite | One system, top-5 findings, written report | $1,500 | 1 week |
| Audit Standard | Full audit, all 14 patterns, 5 Cs findings, 90-day follow-up | $3,000 | 2–3 weeks |
| Audit + Workshop | Standard audit + 2-day team workshop + first monthly audit included | $7,500 | 3–4 weeks |
Same email channel: [email protected] with subject AI audit inquiry.
PRs welcome. The structure is intentionally flat to make custom backends easy to add — see src/silentwatch_mcp/backends/ for existing examples.
To add a new backend:
1. Implement a CronBackend subclass in backends/<your_backend>.py
2. Provide list_jobs, get_job_runs, tail_logs
3. Register it in backends/__init__.py
4. Add tests in tests/test_backend_<your_backend>.py

Bug reports + feature requests: open a GitHub issue.
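The steps above can be sketched as a skeleton. The real CronBackend interface lives in src/silentwatch_mcp/backends/; the base-class stand-in and the in-memory backend below are hypothetical shapes, not the repo's actual signatures:

```python
# Illustrative skeleton only; check src/silentwatch_mcp/backends/ for
# the real base-class interface before writing a backend.

class CronBackend:
    """Stand-in for the base class: the three methods a backend provides."""
    def list_jobs(self): raise NotImplementedError
    def get_job_runs(self, job_id, limit=20): raise NotImplementedError
    def tail_logs(self, job_id, lines=50): raise NotImplementedError

class MyBackend(CronBackend):
    """Hypothetical backend reading from an in-memory dict."""
    def __init__(self, data):
        self.data = data  # {job_id: {"runs": [...], "log": [...]}}

    def list_jobs(self):
        return sorted(self.data)

    def get_job_runs(self, job_id, limit=20):
        return self.data[job_id]["runs"][-limit:]

    def tail_logs(self, job_id, lines=50):
        return self.data[job_id]["log"][-lines:]

backend = MyBackend({"nightly-backup": {"runs": [{"exit": 0, "output": ""}],
                                        "log": ["started", "done"]}})
print(backend.list_jobs())  # ['nightly-backup']
```

A real backend would fill these methods with scheduler-specific parsing (as the crontab and systemd backends do) and then be registered in backends/__init__.py so the SILENTWATCH_BACKEND env var can select it.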
MIT — see LICENSE.
silentwatch-mcp also ships alongside four sibling servers (openclaw-health-mcp, openclaw-cost-tracker-mcp, openclaw-skill-vetter-mcp, openclaw-upgrade-orchestrator-mcp) in one curated bundle with a decision tree, day-one drill, and Custom MCP Build CTA. $99, or $49 with code LAUNCH50 for the first 30 days.

Built by Temur Khan, independent practitioner on production AI systems. Contact: [email protected]