Context-aware web fetching for LLMs, providing 7 tools to check page size, fetch with truncation, extract code/sections/links/tables, and paginate large documents.
Context-aware web fetching for LLMs. Prevents context window flooding by checking page size before fetching and providing surgical extraction tools.
Standard web fetch tools dump entire pages into the context window, often:
Smart WebFetch provides 7 tools for intelligent web fetching:
| Tool | Purpose |
|---|---|
| web_preflight | Check page size before fetching |
| web_smart_fetch | Fetch with automatic truncation |
| web_fetch_code | Extract only code blocks |
| web_fetch_section | Fetch specific heading/section |
| web_fetch_chunked | Paginated fetching for large docs |
| web_fetch_links | Extract all links from a page |
| web_fetch_tables | Extract tables as markdown |
# Install from PyPI
pip install smart-webfetch-mcp
# Or with uvx (recommended for MCP)
uvx smart-webfetch-mcp
claude mcp add --transport stdio smart-webfetch -- uvx smart-webfetch-mcp
Add to your opencode.json:
{
  "mcp": {
    "smart-webfetch": {
      "type": "local",
      "command": ["uvx", "smart-webfetch-mcp"],
      "enabled": true
    }
  }
}
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "smart-webfetch": {
      "command": "uvx",
      "args": ["smart-webfetch-mcp"]
    }
  }
}
Use web_preflight to check https://docs.python.org/3/library/asyncio.html
Response:
{
  "url": "https://docs.python.org/3/library/asyncio.html",
  "estimated_tokens": 45000,
  "safe_for_context": false,
  "recommendation": "Very large page (~45,000 tokens). Use web_fetch_section or web_fetch_chunked."
}
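A client can use the preflight response to decide which tool to call next. The following is a minimal sketch, assuming the response has been parsed into a dict with the fields shown above. `choose_tool` is a hypothetical helper and the routing rule is an assumption, not part of the package.

```python
from typing import Any


def choose_tool(preflight: dict[str, Any]) -> str:
    """Pick a follow-up tool name from a parsed web_preflight response.

    Hypothetical routing rule: small pages are fetched directly,
    large pages are paginated.
    """
    if preflight.get("safe_for_context", False):
        return "web_smart_fetch"
    return "web_fetch_chunked"


# The example response from above, as a Python dict.
response = {
    "url": "https://docs.python.org/3/library/asyncio.html",
    "estimated_tokens": 45000,
    "safe_for_context": False,
    "recommendation": (
        "Very large page (~45,000 tokens). "
        "Use web_fetch_section or web_fetch_chunked."
    ),
}

next_tool = choose_tool(response)
```

When the heading of interest is already known, web_fetch_section may be a better follow-up than chunking, as the recommendation text suggests.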
Use web_smart_fetch on https://example.com/docs with max_tokens=4000
Use web_fetch_code on https://docs.python.org/3/library/asyncio-task.html
Use web_fetch_section on https://docs.python.org/3/library/asyncio.html
with heading="Running an asyncio Program"
Use web_fetch_chunked on https://large-docs.com/api with chunk=0, chunk_size=4000
Then continue with chunk=1, chunk=2, etc.
web_preflight
Check page metadata before fetching.
Parameters:
- url (required): URL to check
Returns:
- estimated_tokens: Approximate token count
- content_type: MIME type
- is_html: Whether content is HTML
- title: Page title (if HTML)
- safe_for_context: Boolean (true if under 8000 tokens)
- recommendation: Human-readable advice

web_smart_fetch
Fetch with automatic truncation for large pages.
Parameters:
- url (required): URL to fetch
- max_tokens (optional, default 8000): Maximum tokens to return
- strategy (optional, default "auto"): "auto" finds natural break points, "truncate" hard cuts
Returns: Markdown content with metadata header
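The difference between the two strategies can be illustrated with a short sketch. It assumes a rough estimate of four characters per token and treats a blank line as a "natural break point". Both are assumptions for illustration, and `estimate_tokens` and `truncate` are hypothetical names, not the package's implementation.

```python
CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not the package's estimator


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return len(text) // CHARS_PER_TOKEN


def truncate(text: str, max_tokens: int = 8000, strategy: str = "auto") -> str:
    """Cut text down to roughly max_tokens.

    "truncate" cuts at the character budget; "auto" backs up to the
    last paragraph break inside the budget, if there is one.
    """
    budget = max_tokens * CHARS_PER_TOKEN
    if len(text) <= budget:
        return text
    cut = text[:budget]
    if strategy == "auto":
        last_break = cut.rfind("\n\n")
        if last_break > 0:
            cut = cut[:last_break]
    return cut


sample = "aaaa\n\nbbbb\n\ncccc"
auto_result = truncate(sample, max_tokens=2, strategy="auto")
hard_result = truncate(sample, max_tokens=2, strategy="truncate")
```

With a budget of 2 tokens (8 characters), the hard cut ends in the middle of the second paragraph, while the "auto" variant stops after the first one.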
web_fetch_code
Extract only code blocks from a page.
Parameters:
- url (required): URL to extract code from
Returns: Code blocks with language annotations and context
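As an illustration of code-only extraction, here is a minimal sketch that assumes the page has already been converted to Markdown with fenced code blocks. `extract_code_blocks` is a hypothetical helper. The package may work on HTML directly and also returns surrounding context, which this sketch omits.

```python
import re

# Opening fence with optional language tag, lazy body, closing fence.
# `{3} is used instead of a literal fence so this snippet can itself be fenced.
FENCE_RE = re.compile(
    r"^`{3}([\w+-]*)[ \t]*\n(.*?)^`{3}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_code_blocks(markdown: str) -> list[tuple[str, str]]:
    """Return (language, code) pairs; untagged blocks are labeled "text"."""
    return [
        (match.group(1) or "text", match.group(2))
        for match in FENCE_RE.finditer(markdown)
    ]


fence = "`" * 3
page = f"Intro\n{fence}python\nprint(1)\n{fence}\nmore prose\n{fence}\nls\n{fence}\n"
blocks = extract_code_blocks(page)
```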
web_fetch_section
Fetch content under a specific heading.
Parameters:
- url (required): URL to fetch from
- heading (required): Heading text to find (case-insensitive)
Returns: Section content or list of available sections if not found
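The lookup and its fallback can be sketched as follows, assuming Markdown input with `#`-style headings and an exact, case-insensitive match on the heading text. `fetch_section` here is a hypothetical local function, and the wording of the fallback message is an assumption.

```python
import re

# One to six "#" characters, then the heading text on the same line.
HEADING_RE = re.compile(r"^(#{1,6})[ \t]+(.*?)[ \t]*$", re.MULTILINE)


def fetch_section(markdown: str, heading: str) -> str:
    """Return the section under `heading`, or a list of available sections."""
    headings = list(HEADING_RE.finditer(markdown))
    for i, match in enumerate(headings):
        if match.group(2).lower() != heading.lower():
            continue
        level = len(match.group(1))
        end = len(markdown)
        # The section runs until the next heading of the same or a higher level.
        for later in headings[i + 1:]:
            if len(later.group(1)) <= level:
                end = later.start()
                break
        return markdown[match.start():end].strip()
    available = "\n".join(f"- {m.group(2)}" for m in headings)
    return f"Section not found. Available sections:\n{available}"


doc = (
    "# Title\nintro\n"
    "## Running an asyncio Program\nbody\n"
    "### Sub\nmore\n"
    "## Other\nx\n"
)
section = fetch_section(doc, "running an asyncio program")
missing = fetch_section(doc, "nope")
```

Subsections stay with their parent section. The sketch does not skip `#` lines inside code blocks, which a real implementation would need to handle.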
web_fetch_chunked
Fetch large documents in chunks.
Parameters:
- url (required): URL to fetch
- chunk (optional, default 0): Chunk index (0-based)
- chunk_size (optional, default 4000): Tokens per chunk
Returns: Chunk content with navigation metadata
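Zero-based pagination with navigation metadata can be sketched like this. The four-characters-per-token estimate and the metadata field names (`total_chunks`, `has_next`) are assumptions for illustration. The actual response format may differ.

```python
import math

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not the package's estimator


def get_chunk(text: str, chunk: int = 0, chunk_size: int = 4000) -> dict:
    """Return one chunk of text plus hypothetical navigation metadata."""
    size = chunk_size * CHARS_PER_TOKEN
    total = max(1, math.ceil(len(text) / size))
    if not 0 <= chunk < total:
        raise IndexError(f"chunk {chunk} out of range (0..{total - 1})")
    start = chunk * size
    return {
        "chunk": chunk,
        "total_chunks": total,
        "has_next": chunk + 1 < total,
        "content": text[start:start + size],
    }


# Walk a 10-character document in 1-token (4-character) chunks.
document = "x" * 10
pages = []
index = 0
while True:
    page = get_chunk(document, chunk=index, chunk_size=1)
    pages.append(page["content"])
    if not page["has_next"]:
        break
    index += 1
```

The loop mirrors the "continue with chunk=1, chunk=2" usage pattern: keep requesting the next index until the metadata reports no further chunks.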
web_fetch_links
Extract all links from a page.
Parameters:
- url (required): URL to extract links from
- filter_pattern (optional): Regex to filter link URLs
- external_only (optional, default false): Only return external links
Returns: Markdown list of links with text and URL
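The two filters can be sketched with the standard library, assuming the links have already been extracted as (text, href) pairs. `filter_links` is a hypothetical helper. Treating "external" as "different host than the page" and applying the regex with `search` rather than a full match are both assumptions.

```python
import re
from urllib.parse import urljoin, urlparse


def filter_links(links, base_url, filter_pattern=None, external_only=False):
    """Render (text, href) pairs as a Markdown list, applying both filters."""
    base_host = urlparse(base_url).netloc
    pattern = re.compile(filter_pattern) if filter_pattern else None
    lines = []
    for text, href in links:
        url = urljoin(base_url, href)  # resolve relative links
        if external_only and urlparse(url).netloc == base_host:
            continue
        if pattern and not pattern.search(url):
            continue
        lines.append(f"- [{text}]({url})")
    return "\n".join(lines)


links = [
    ("Home", "/index.html"),
    ("PyPI", "https://pypi.org/project/x/"),
    ("Docs", "https://example.com/docs"),
]
base = "https://example.com/page"
external = filter_links(links, base, external_only=True)
docs_only = filter_links(links, base, filter_pattern=r"/docs")
```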
web_fetch_tables
Extract tables from a page as markdown.
Parameters:
- url (required): URL to extract tables from
- table_index (optional): Specific table index (0-based), returns all if not specified
Returns: Markdown formatted tables
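A minimal sketch of the rendering step, assuming each table has already been parsed into a list of rows with the header row first. `table_to_markdown` and `select_tables` are hypothetical helpers that illustrate the `table_index` behavior described above.

```python
def table_to_markdown(rows: list[list[str]]) -> str:
    """Render rows (header first) as a Markdown table."""
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "|" + "|".join("---" for _ in header) + "|",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


def select_tables(tables, table_index=None):
    """Render one table by 0-based index, or all tables when index is None."""
    chosen = tables if table_index is None else [tables[table_index]]
    return "\n\n".join(table_to_markdown(table) for table in chosen)


tables = [
    [["Tool", "Purpose"], ["web_preflight", "Check page size before fetching"]],
    [["A"], ["1"]],
]
second_only = select_tables(tables, table_index=1)
everything = select_tables(tables)
```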
# Clone and install dev dependencies
git clone https://github.com/mathisto/smart-webfetch-mcp
cd smart-webfetch-mcp
pip install -e ".[dev]"
# Run tests
pytest
# Format code
ruff format .
ruff check --fix .
MIT