Vet ClawHub skills before installing them; detects prompt-injection, exfiltration, and other security issues, outputting a risk score with per-finding evidence.
MCP server for security-vetting third-party AI agent extensions before installation — Claude skills, ClawHub plugins, agent tool packs, any code-shaped artifact that runs in your agent environment with your API keys. 41 detection rules across prompt-injection patterns, hardcoded exfiltration channels (Discord/Slack/Telegram webhooks, SSH-key reads, AWS-creds reads), dangerous dynamic execution (`eval`, `exec`, `subprocess` with `shell=True`, `pickle.loads`), manifest/permission drift, and known typosquat dependencies. Outputs a 0-100 risk score + BLOCK/REVIEW/CAUTION/CLEAN bucket + per-finding evidence. Native ClawHub manifest support; the rule engine generalizes to any code-shaped extension via Custom MCP Build adapters. Keywords: AI agent security, plugin vetting, supply-chain security, prompt injection detection, MCP static analysis.
Status: v1.0.0 · License: MIT · MCP · PyPI
Third-party AI agent extensions — Claude skills, ClawHub plugins, MCP servers themselves, agent tool packs, npm-distributed agent code — are code that runs inside your environment with your API keys, your filesystem access, and your network egress. The supply-chain attack surface is now broadly recognized, and the same shape of attack works against any third-party extension a user installs into their AI agent runtime: Claude skills, MCP servers, browser-extension agents, npm-distributed agent code. The defensive question every operator faces before clicking install: "is this safe to run with my API keys?"
This MCP server runs a battery of static-analysis scanners against any skill's directory and produces a single VetReport that an operator can act on:
```
> claude: vet the data-extractor skill before I install it.

[MCP tool: vet_skill]

Skill 'data-extractor': BLOCK — do not install.
Risk score: 100/100. Findings: 1 critical, 4 high, 1 info.

Critical:
  EXFIL.WEBHOOK_DISCORD (extract.py:5) —
    Hardcoded Discord webhook URL: 'https://discord.com/api/webhooks/...'
    Recommendation: Refuse install unless explicitly justified.

High:
  AST.OS_SYSTEM (extract.py:14) — os.system('curl ... | bash')
  EXFIL.ENV_DUMP (extract.py:9) — dumps full os.environ
  MANIFEST.WILDCARD_PERMISSION — network.http: *
  ...

Vet result for data-extractor: REFUSE INSTALL.

> claude: any flagged skills currently installed?

[MCP tool: flagged_skills_report]

2 skills flagged at REVIEW or BLOCK:
  - data-extractor       BLOCK   risk_score=100  1 CRITICAL  EXFIL.WEBHOOK_DISCORD
  - markdown-formatter   REVIEW  risk_score=35   1 HIGH      AST.EVAL_CALL on user input
```
Three things existing tools (manual code review, generic SAST, ClawHub trust scores) don't do:
1. **Skill-aware scanning.** Generic SAST tools don't know what an OpenClaw skill manifest looks like. They miss the most common malware shape: a "calculator" skill that requests `network.http: *`. The vetter cross-checks declared purpose against requested permissions (see the sketch after this list).
2. **Risk score the operator can paste into a ticket.** Not "high cyclomatic complexity" but `BLOCK — Discord webhook at extract.py:5`. Each finding has rule_id, file:line, evidence, and a specific recommendation.
3. **Built for review-before-install, not after-the-fact audit.** Run it from inside Claude on a skill you're about to add. Get a verdict in seconds. Refuse the install if it's BLOCK; sandbox-test if REVIEW; install if CLEAN.
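To make the purpose-vs-permission cross-check concrete, here is a toy sketch of the idea behind MANIFEST.PURPOSE_NETWORK_DRIFT and MANIFEST.WILDCARD_PERMISSION. The keyword heuristic and function are invented for illustration (assuming the `permissions:` list-of-mappings shape shown in the skill format section below); the shipped rules are more involved.

```python
# Toy version of the manifest cross-check; illustrative only, not the shipped rule code.
NETWORK_HINTS = ("api", "http", "fetch", "download", "webhook", "lookup", "weather")

def manifest_drift_findings(manifest: dict) -> list[str]:
    """Flag wildcard network grants and network permissions the declared purpose never hints at."""
    findings: list[str] = []
    purpose = f"{manifest.get('purpose', '')} {manifest.get('description', '')}".lower()
    for entry in manifest.get("permissions", []):
        for perm, value in entry.items():
            if not perm.startswith("network."):
                continue
            if str(value).strip() == "*":
                findings.append(f"MANIFEST.WILDCARD_PERMISSION: {perm}: *")
            if not any(hint in purpose for hint in NETWORK_HINTS):
                findings.append(f"MANIFEST.PURPOSE_NETWORK_DRIFT: {perm} requested, but purpose never mentions network use")
    return findings

# A "calculator" skill requesting {"network.http": "*"} trips both checks.
```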
Built for the production-AI operator who has been bitten (or doesn't want to be) by ClawHavoc-style supply-chain attacks.
| Tool | What it returns |
|---|---|
| `vet_skill` | Full VetReport for one skill: risk_score, risk_level, sorted findings, summary |
| `vet_skill_directory` | Aggregate report across every skill in the directory + per-bucket counts |
| `installed_skills_overview` | Lightweight: just bucket counts + flagged skill IDs |
| `flagged_skills_report` | Just REVIEW + BLOCK skills with their findings |
| `scan_for_prompt_injection` | Focused: only prompt-injection findings on one skill |
| `scan_for_exfiltration` | Focused: only exfiltration findings on one skill |
| `list_detection_rules` | Catalog of every rule the server applies (transparency) |
Resources:

- `skill-vetter://overview` — installed-skills risk overview
- `skill-vetter://flagged` — currently-flagged skills
- `skill-vetter://rules` — detection rules catalog

Prompts:

- `pre-install-skill-check` — vet a specific skill before installation
- `weekly-skill-audit` — compose a 200-word weekly audit of all installed skills

Install from PyPI: `pip install openclaw-skill-vetter-mcp`
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
```json
{
  "mcpServers": {
    "openclaw-skill-vetter": {
      "command": "python",
      "args": ["-m", "openclaw_skill_vetter_mcp"],
      "env": {
        "OPENCLAW_SKILL_VETTER_BACKEND": "mock"
      }
    }
  }
}
```
| Backend | Status | Description |
|---|---|---|
| `mock` | ✅ v1.0 | 6 demo skills with deliberate findings spanning all severities — for protocol verification and README/CLI demos |
| `openclaw-skills-dir` | ✅ v1.0 | Reads `~/.openclaw/skills/` (override via `OPENCLAW_SKILLS_DIR`); each subdirectory is parsed as one skill |
| `clawhub-fetch` | ⏳ v1.1 | Fetches a candidate skill from the ClawHub registry directly for vet-before-install workflows |
Each skill directory contains a skill.yaml (or skill.json):
```yaml
id: weather-fetch
name: Weather Fetch
version: 1.0.0
author: [email protected]
description: Fetches current weather for a city using OpenWeatherMap.
purpose: Live weather data lookup
runtime: python3.11
entry_point: main.py
permissions:
  - network.http: api.openweathermap.org
dependencies:
  - requests>=2.31
  - pydantic>=2.0
signature: ed25519:abcd1234efgh5678
```
Plus the actual code files (*.py, *.js, *.ts, *.sh, *.rb, *.go, *.rs) and any prompt files (*.prompt, *.md, *.txt).
If your OpenClaw deployment uses a different on-disk shape, see the Custom MCP Build section below.
Four scanner modules cover the v1.0 ruleset:
- Manifest — MANIFEST.MISSING, MANIFEST.PURPOSE_NETWORK_DRIFT, MANIFEST.WILDCARD_PERMISSION, MANIFEST.BROAD_FILESYSTEM_WRITE, MANIFEST.EMPTY_DESCRIPTION, MANIFEST.NO_AUTHOR, MANIFEST.UNSIGNED
- Static patterns (text regex over code + prompts) —
  - Prompt injection: PROMPT_INJ.IGNORE_PRIOR, PROMPT_INJ.ROLE_OVERRIDE, PROMPT_INJ.EXTRACT_SYSTEM, PROMPT_INJ.JAILBREAK_DAN, PROMPT_INJ.NEW_USER_MARKER
  - Exfiltration: EXFIL.WEBHOOK_DISCORD, EXFIL.WEBHOOK_SLACK, EXFIL.WEBHOOK_TELEGRAM, EXFIL.PASTEBIN_LITERAL, EXFIL.SSH_KEY_READ, EXFIL.AWS_CREDS_READ, EXFIL.ENV_DUMP, EXFIL.SUBPROCESS_CURL
  - Dynamic execution: DYN_EXEC.SHELL_TRUE, DYN_EXEC.OS_SYSTEM, DYN_EXEC.EVAL_LITERAL, DYN_EXEC.EXEC_LITERAL, DYN_EXEC.PICKLE_LOADS, DYN_EXEC.DYNAMIC_IMPORT
  - Obfuscation: OBFUSCATION.LARGE_BASE64, OBFUSCATION.LARGE_HEX
- Python AST (catches what regex misses) — AST.EVAL_CALL, AST.EXEC_CALL, AST.COMPILE_CALL, AST.OS_SYSTEM, AST.OS_POPEN, AST.OS_EXECV, AST.SUBPROCESS_RUN_SHELL_TRUE, AST.SUBPROCESS_POPEN_SHELL_TRUE, AST.DYNAMIC_IMPORT
- Dependencies — DEP.TYPOSQUAT, DEP.HOMOGLYPH, DEP.UNTRUSTED_GIT_SOURCE, DEP.LOCAL_PATH
Use list_detection_rules to query the live catalog.
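To make the static-pattern category concrete, here is a toy illustration in the spirit of EXFIL.WEBHOOK_DISCORD. The regex and the function are simplified stand-ins written for this README, not the package's actual rule implementation.

```python
import re
from pathlib import Path

# Simplified stand-in for a rule like EXFIL.WEBHOOK_DISCORD (illustrative, not the shipped pattern).
DISCORD_WEBHOOK = re.compile(r"https://discord(?:app)?\.com/api/webhooks/\S+")

def find_discord_webhooks(path: Path) -> list[tuple[int, str]]:
    """Return (line_number, evidence) pairs for hardcoded Discord webhook URLs in one file."""
    hits: list[tuple[int, str]] = []
    for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), start=1):
        match = DISCORD_WEBHOOK.search(line)
        if match:
            hits.append((lineno, match.group(0)))
    return hits

# In the demo above this kind of check fires at extract.py:5 with the webhook URL as evidence.
```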
Each finding contributes by severity:
| Severity | Weight |
|---|---|
| CRITICAL | 40 |
| HIGH | 15 |
| MEDIUM | 5 |
| LOW | 1 |
| INFO | 0 |
Final risk_score = min(sum, 100). Bucketing (first match wins):
| Bucket | Trigger |
|---|---|
| BLOCK | ≥1 CRITICAL or score ≥ 80 |
| REVIEW | ≥1 HIGH or score ≥ 50 |
| CAUTION | ≥1 MEDIUM or score ≥ 20 |
| CLEAN | no findings or only INFO |
Conservative-by-design: false positives are OK, missed criticals are not. If your operator workflow disagrees with a specific rule, you can filter by category on the client side, or fork + customize.
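The scoring and bucketing rules are simple enough to reproduce by hand; the sketch below mirrors the two tables above, with findings represented as plain dicts for illustration (these names are not the package's internal API).

```python
# Mirrors the documented scoring tables; illustrative only, not the package's internal API.
SEVERITY_WEIGHTS = {"CRITICAL": 40, "HIGH": 15, "MEDIUM": 5, "LOW": 1, "INFO": 0}

def risk_score(findings: list[dict]) -> int:
    """Sum severity weights across findings, capped at 100."""
    return min(sum(SEVERITY_WEIGHTS[f["severity"]] for f in findings), 100)

def risk_level(findings: list[dict]) -> str:
    """Bucket the skill; first matching rule wins, anything left over is CLEAN."""
    score = risk_score(findings)
    severities = {f["severity"] for f in findings}
    if "CRITICAL" in severities or score >= 80:
        return "BLOCK"
    if "HIGH" in severities or score >= 50:
        return "REVIEW"
    if "MEDIUM" in severities or score >= 20:
        return "CAUTION"
    return "CLEAN"

# The data-extractor demo above: 1 CRITICAL (40) + 4 HIGH (60) + 1 INFO (0) = 100 -> BLOCK.
```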
| Version | Scope | Status |
|---|---|---|
| v1.0 | mock + openclaw-skills-dir backends, 7 tools / 3 resources / 2 prompts, 4 scanner modules with 41 detection rules, GitHub Actions CI matrix, PyPI Trusted Publishing | ✅ |
| v1.1 | clawhub-fetch backend (vet a skill from ClawHub before install); CVE-DB lookup for dependencies; signature verification against ClawHub publisher keys | ⏳ |
| v1.2 | Sandbox-execution scanner (run skill in isolated process, observe network attempts); whitelist/allowlist per-operator | ⏳ |
| v1.x | Custom rule packs; integration with existing SAST tools; per-rule severity overrides | ⏳ |
If your AI deployment doesn't use the OpenClaw skill format — different agent harness, custom skill schema, monolithic skill files, internal-registry distribution — and you want the same vet-before-install discipline, that's a Custom MCP Build engagement.
| Tier | Scope | Investment | Timeline |
|---|---|---|---|
| Simple | Single backend adapter for your existing skill format | $8,000–$12,000 | 1–2 weeks |
| Standard | Custom backend + custom rule pack tuned to your ecosystem + CI integration | $15,000–$25,000 | 2–4 weeks |
| Complex | Multi-format ingestion + sandbox-execution + signed-publisher allowlist + rule-tuning workshop | $30,000–$45,000 | 4–8 weeks |
To engage: email [email protected] with the subject "Custom MCP Build inquiry — skill vetting".

This server is part of a production-AI infrastructure MCP suite — companion to silentwatch-mcp, openclaw-health-mcp, and openclaw-cost-tracker-mcp. Install all four for full operational visibility.
If you're running production AI and want an outside practitioner to score readiness, find the failure patterns already present (ClawHavoc-style skill malware being one of the most damaging), and write the corrective-action plan:
| Tier | Scope | Investment | Timeline |
|---|---|---|---|
| Audit Lite | One system, top-5 findings, written report | $1,500 | 1 week |
| Audit Standard | Full audit, all 14 patterns, 5 Cs findings, 90-day follow-up | $3,000 | 2–3 weeks |
| Audit + Workshop | Standard audit + 2-day team workshop + first monthly audit included | $7,500 | 3–4 weeks |
Same email channel: [email protected] with subject AI audit inquiry.
PRs welcome. Scanners are pluggable — see src/openclaw_skill_vetter_mcp/scanners/ for the contract.
To add a new scanner:
1. Create `scanners/<your_scanner>.py` exporting `SCANNER_NAME: str` and `def scan(skill: Skill) -> list[Finding]` (see the sketch below)
2. Export `def all_rules() -> list[tuple[...]]` for the rules catalog
3. Register the module in `analysis.vet_skill` (the orchestrator iterates over a fixed tuple of scanner modules)
4. Add tests in `tests/test_scanners.py`
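A minimal sketch of a scanner module shaped to that contract. The import path, `Skill` attributes, and `Finding` constructor arguments below are assumptions made for illustration; match them to the package's actual models before opening a PR.

```python
# scanners/fixme_marker.py — illustrative scanner skeleton only.
# NOTE: the import path, Skill attributes, and Finding fields are assumptions;
# align them with the real models under src/openclaw_skill_vetter_mcp/ before submitting.
from openclaw_skill_vetter_mcp.models import Finding, Skill  # assumed import path

SCANNER_NAME = "fixme_marker"

_RULES = [("STYLE.FIXME_MARKER", "LOW", "Code ships with FIXME markers left in place")]

def all_rules() -> list[tuple[str, str, str]]:
    """Expose this scanner's rules so they appear in the list_detection_rules catalog."""
    return list(_RULES)

def scan(skill: Skill) -> list[Finding]:
    """Emit one LOW finding per FIXME marker found in the skill's code files."""
    findings: list[Finding] = []
    for code_file in skill.code_files:  # assumed attribute: parsed code files with .path / .text
        for lineno, line in enumerate(code_file.text.splitlines(), start=1):
            if "FIXME" in line:
                findings.append(Finding(
                    rule_id="STYLE.FIXME_MARKER",
                    severity="LOW",
                    file=code_file.path,
                    line=lineno,
                    evidence=line.strip(),
                    recommendation="Resolve leftover FIXME markers before publishing the skill.",
                ))
    return findings
```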
To add a new backend:

1. Implement `SkillBackend` in `backends/<your_backend>.py`
2. Provide `get_skills`, `get_skill_by_id`, `get_directory`
3. Register it in `backends/__init__.py`
4. Add tests in `tests/test_backend_<your_backend>.py`

Bug reports + feature requests: open a GitHub issue. False-positive reports: include the skill snippet that fired the wrong rule and we'll tune.
MIT — see LICENSE.
Launch code LAUNCH50 for the first 30 days.

Built by Temur Khan, an independent practitioner on production AI systems. Contact: [email protected]