A Model Context Protocol server that enables AI systems to crawl and scrape the live web using Crawl4AI and headless Chromium. It provides tools for structured data extraction, deep site traversal, and session-aware workflows with LLM-optimized outputs like markdown.

A Model Context Protocol server for web crawling powered by Crawl4AI
MCP-Crawl4AI is a Model Context Protocol server that gives AI systems access to the live web. Built on FastMCP v3 and Crawl4AI, it exposes 4 tools, 2 resources, and 3 prompts through the standardized MCP interface, backed by a lifespan-managed headless Chromium browser.
Only two runtime dependencies: `fastmcp` and `crawl4ai`.
> [!TIP]
> Full documentation site →

```bash
pip install mcp-crawl4ai
mcp-crawl4ai --setup  # one-time: installs Playwright browsers
```
```bash
uv add mcp-crawl4ai
mcp-crawl4ai --setup  # one-time: installs Playwright browsers
```
```bash
docker build -t mcp-crawl4ai .
docker run -p 8000:8000 mcp-crawl4ai
```
The Docker image includes Playwright browsers — no separate setup needed.
```bash
git clone https://github.com/wyattowalsh/mcp-crawl4ai.git
cd mcp-crawl4ai
uv sync --group dev
mcp-crawl4ai --setup
```
> [!NOTE]
> The server auto-detects missing Playwright browsers on first startup and attempts to install them automatically. You can also run `mcp-crawl4ai --setup` or `crawl4ai-setup` manually at any time.
```bash
mcp-crawl4ai                               # stdio transport (default)
mcp-crawl4ai --transport http --port 8000  # HTTP transport
```
> [!NOTE]
> HTTP binds to `127.0.0.1` by default (private/local only); for external exposure, set `--host` explicitly and use a reverse proxy for TLS/auth.
Add to your Claude Desktop MCP settings (`claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "crawl4ai": {
      "command": "mcp-crawl4ai",
      "args": ["--transport", "stdio"]
    }
  }
}
```
```bash
claude mcp add crawl4ai -- mcp-crawl4ai --transport stdio
```
```bash
npx @modelcontextprotocol/inspector uv --directory . run mcp-crawl4ai
```
The canonical surface now exposes 4 tools:

- `scrape`: Scrape one URL or a bounded list of URLs (up to 20) with a single canonical envelope response. Inputs are `targets` (`str | list[str]`) and optional grouped `options` covering extraction (`schema`, `extraction_mode`), runtime controls, diagnostics, session settings, render settings, and artifact capture. Every response uses the envelope fields `schema_version`, `tool`, `ok`, `data`/`items`, `meta`, `warnings`, `error`.
- `crawl`: Crawl with canonical traversal controls: `options.traversal.mode="list"` for bounded list traversal, or `options.traversal.mode="deep"` for recursive BFS/DFS traversal from a single seed.
- `close_session`: Close a stateful session created via `options.session.session_id`.
- `get_artifact`: Retrieve artifact metadata/content captured during `scrape` or `crawl` when `options.conversion.capture_artifacts` is enabled.
Prefer `scrape`/`crawl` with minimal options (`runtime`, `conversion.output_format`, `traversal.mode="list"`). This keeps behavior predictable and low-risk for most agent workflows. Consult `config://server` for `settings.defaults`, `settings.limits`, `settings.policies`, and `settings.capabilities` to understand active constraints and feature gates before enabling advanced options.

| URI | MIME Type | Description |
|---|---|---|
| `config://server` | `application/json` | Current server configuration: name, version, tool list, browser config |
| `crawl4ai://version` | `application/json` | Server and dependency version information (server, crawl4ai, fastmcp) |
| Prompt | Parameters | Description |
|---|---|---|
| `summarize_page` | `url`, `focus` (default: `"key points"`) | Crawl a page and summarize its content with the specified focus |
| `build_extraction_schema` | `url`, `data_type` | Inspect a page and build a CSS extraction schema for `scrape` |
| `compare_pages` | `url1`, `url2` | Crawl two pages and produce a structured comparison |
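A prompt like `summarize_page` ultimately renders to a message list for the client. The sketch below is a stdlib-only illustration of that shape, assuming MCP's plain user-message convention; the wording and structure are hypothetical, not the server's actual prompt implementation:

```python
def summarize_page_prompt(url: str, focus: str = "key points") -> list[dict]:
    """Hypothetical sketch: render the summarize_page prompt as a message list."""
    return [
        {
            "role": "user",
            "content": f"Crawl {url} and summarize its content, focusing on {focus}.",
        }
    ]

messages = summarize_page_prompt("https://example.com")
```

The default `focus="key points"` mirrors the parameter default in the table above.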
```mermaid
graph TD
    A[MCP Client] -->|stdio / HTTP| B[FastMCP v3]
    B --> C[Tool Router]
    C --> D[scrape]
    C --> E[crawl]
    C --> F[close_session]
    C --> G[get_artifact]
    D & E & F & G --> N[AsyncWebCrawler Singleton]
    N --> O[Headless Chromium]
    B --> P[Resources]
    B --> Q[Prompts]
    style B fill:#4B8BBE,color:#fff
    style N fill:#FF6B35,color:#fff
    style O fill:#2496ED,color:#fff
```
The server uses a single-module architecture:
- A lifespan-managed `AsyncWebCrawler` singleton starts a headless Chromium browser once at server startup, shares it across all tool invocations, and shuts it down cleanly on exit
- Functions decorated with `@mcp.tool()` define the canonical tool surface
- Functions decorated with `@mcp.resource()` return JSON
- Functions decorated with `@mcp.prompt` return structured Message lists

> [!IMPORTANT]
> There are no intermediate manager classes or custom HTTP clients. The server delegates all crawling to crawl4ai's `AsyncWebCrawler` and all protocol handling to FastMCP. Only two runtime dependencies.
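The lifespan pattern can be sketched with a stdlib-only async context manager. `Crawler` here is a dummy stand-in for crawl4ai's `AsyncWebCrawler`; the real server wires this into FastMCP's lifespan hook rather than calling it directly:

```python
import asyncio
from contextlib import asynccontextmanager

class Crawler:
    """Dummy stand-in for crawl4ai's AsyncWebCrawler."""
    def __init__(self) -> None:
        self.started = False

    async def start(self) -> None:
        self.started = True    # real impl: launch headless Chromium

    async def close(self) -> None:
        self.started = False   # real impl: shut the browser down

@asynccontextmanager
async def lifespan():
    crawler = Crawler()
    await crawler.start()      # one browser for the server's lifetime
    try:
        yield crawler          # every tool invocation shares this instance
    finally:
        await crawler.close()  # clean shutdown on exit

async def demo() -> tuple[bool, bool]:
    async with lifespan() as crawler:
        during = crawler.started
    return during, crawler.started

during, after = asyncio.run(demo())
```

Sharing one browser this way avoids the cost of launching Chromium per tool call while still guaranteeing cleanup on shutdown.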
| Flag | Default | Description |
|---|---|---|
| `--transport` | `stdio` | Transport protocol: `stdio` or `http` |
| `--host` | `127.0.0.1` | Host to bind (HTTP transport only) |
| `--port` | `8000` | Port to bind (HTTP transport only) |
| `--setup` | — | Install Playwright browsers and exit |
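The flag surface above can be sketched with `argparse`. This is a hypothetical reconstruction of the CLI for illustration, not the project's actual parser:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Hypothetical sketch of the CLI flags in the table above."""
    p = argparse.ArgumentParser(prog="mcp-crawl4ai")
    p.add_argument("--transport", choices=["stdio", "http"], default="stdio",
                   help="Transport protocol")
    p.add_argument("--host", default="127.0.0.1",
                   help="Host to bind (HTTP transport only)")
    p.add_argument("--port", type=int, default=8000,
                   help="Port to bind (HTTP transport only)")
    p.add_argument("--setup", action="store_true",
                   help="Install Playwright browsers and exit")
    return p

args = build_parser().parse_args(["--transport", "http", "--port", "9000"])
```

Note that `--host` keeps its safe loopback default even when the transport is switched to HTTP.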
No environment variables are required. The server uses sensible defaults for all configuration. Crawl4AI's own environment variables (e.g., `CRAWL4AI_VERBOSE`) are respected if set.
```bash
# Run all tests
uv run pytest

# Coverage gate (>=90%)
uv run pytest --cov=mcp_crawl4ai --cov-report=term-missing

# Smoke tests only
uv run pytest -m smoke

# Unit tests only
uv run pytest -m unit

# Integration workflow tests
uv run pytest -m integration

# End-to-end workflow tests
uv run pytest -m e2e

# Manual live test (requires browser)
uv run python tests/manual/test_live.py
```
> [!NOTE]
> All automated tests run in-memory using `fastmcp.Client(mcp)`; no browser or network required. The test suite mocks `AsyncWebCrawler` for fast, deterministic execution.
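The mocking approach can be sketched with the stdlib alone. The tool body and result shape below are hypothetical, illustrating the pattern rather than the project's real test suite:

```python
import asyncio
from unittest.mock import AsyncMock

# Hypothetical tool body: given any crawler, return a canonical envelope.
async def scrape_tool(crawler, url: str) -> dict:
    result = await crawler.arun(url)
    return {"tool": "scrape", "ok": True, "data": {"markdown": result.markdown}}

# Mocked crawler: no browser, no network, fully deterministic.
mock_crawler = AsyncMock()
mock_crawler.arun.return_value.markdown = "# Mocked page"

envelope = asyncio.run(scrape_tool(mock_crawler, "https://example.com"))
mock_crawler.arun.assert_awaited_once_with("https://example.com")
```

Because `AsyncMock` records awaited calls, tests can assert on both the envelope contents and the exact crawler invocation.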
See the Contributing Guide for details on setting up your development environment, coding standards, and the pull request process.
This project is licensed under the MIT License. See the LICENSE file for details.
MCP-Crawl4AI — Connecting AI to the Live Web