Extracts clean web content for RAG and provides Q&A about web pages.
An MCP (Model Context Protocol) server for web content analysis and summarization. Built with FastMCP, it scrapes web pages, extracts their main content as markdown, and answers questions about that content using an OpenAI model.
url_to_markdown - Extract and summarize web pages to markdown
web_content_qna - AI-powered Q&A about web content
pip install web-analyzer-mcp
git clone https://github.com/kimdonghwi94/web-analyzer-mcp.git
cd web-analyzer-mcp
pip install -e .
# Clone and setup
git clone https://github.com/kimdonghwi94/web-analyzer-mcp.git
cd web-analyzer-mcp
# Install dependencies (both Node.js and Python)
npm install
npm run install
# Build the project
npm run build
# Test with MCP Inspector
npm test
# Start development server
npm run dev
Create a .env file or set environment variables:
OPENAI_API_KEY=your_openai_api_key_here
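Before starting the server it can help to confirm that the variables are visible to Python. The following is a minimal sketch, not part of web-analyzer-mcp: `load_openai_settings` and `DEFAULT_MODEL` are hypothetical names, and the fallback to `gpt-3.5-turbo` mirrors the default this README states for `OPENAI_MODEL`. It reads only the process environment; loading a `.env` file is assumed to be handled elsewhere.

```python
import os

# Default model name as stated in this README; the constant itself is hypothetical.
DEFAULT_MODEL = "gpt-3.5-turbo"


def load_openai_settings(env=None):
    """Return (api_key, model) from an environment mapping.

    Hypothetical helper for illustration; not part of web-analyzer-mcp.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    # OPENAI_MODEL is optional, so an empty or missing value falls back.
    model = env.get("OPENAI_MODEL", "").strip() or DEFAULT_MODEL
    return api_key, model


if __name__ == "__main__":
    try:
        _, model = load_openai_settings()
        print(f"OPENAI_API_KEY is set; model: {model}")
    except RuntimeError as exc:
        print(f"Configuration problem: {exc}")
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.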
Add to your Claude Desktop configuration file:
Windows: %APPDATA%/Claude/claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "web-analyzer": {
      "command": "python",
      "args": ["-m", "web_analyzer_mcp.server"],
      "env": {
        "OPENAI_API_KEY": "your_openai_api_key_here",
        "OPENAI_MODEL": "gpt-3.5-turbo"
      }
    }
  }
}
Note: OPENAI_MODEL is optional - defaults to gpt-3.5-turbo if not specified
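Editing the file by hand can accidentally overwrite other servers already registered there. As a rough sketch of a safer approach, the script below merges the `web-analyzer` entry into an existing config instead of replacing it. The helper names `claude_config_path` and `add_web_analyzer` are hypothetical; the paths and the entry itself are the ones listed above.

```python
import json
import os
import sys
from pathlib import Path

CONFIG_NAME = "claude_desktop_config.json"


def claude_config_path(platform=sys.platform, env=None):
    """Return the Claude Desktop config path listed above for a platform."""
    env = os.environ if env is None else env
    if platform.startswith("win"):
        return Path(env.get("APPDATA", "")) / "Claude" / CONFIG_NAME
    if platform == "darwin":
        return Path.home() / "Library" / "Application Support" / "Claude" / CONFIG_NAME
    return Path.home() / ".config" / "Claude" / CONFIG_NAME


def add_web_analyzer(config_path, api_key, model=None):
    """Merge the web-analyzer entry into the config, keeping other servers."""
    config_path = Path(config_path)
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    server_env = {"OPENAI_API_KEY": api_key}
    if model:  # optional; omitted so the server can apply its own default
        server_env["OPENAI_MODEL"] = model
    config.setdefault("mcpServers", {})["web-analyzer"] = {
        "command": "python",
        "args": ["-m", "web_analyzer_mcp.server"],
        "env": server_env,
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Only prints the location; call add_web_analyzer() explicitly to write.
    print(claude_config_path())
```

A call such as `add_web_analyzer(claude_config_path(), "your_openai_api_key_here")` would write the entry; restart Claude Desktop afterwards so it rereads the file.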
Add to your Cursor settings (File > Preferences > Settings > Extensions > MCP):
{
  "mcp.servers": {
    "web-analyzer": {
      "command": "python",
      "args": ["-m", "web_analyzer_mcp.server"],
      "env": {
        "OPENAI_API_KEY": "your_openai_api_key_here",
        "OPENAI_MODEL": "gpt-4"
      }
    }
  }
}
Note: OPENAI_MODEL is optional - defaults to gpt-3.5-turbo if not specified
Add to your VS Code settings.json:
{
  "claude-code.mcpServers": {
    "web-analyzer": {
      "command": "python",
      "args": ["-m", "web_analyzer_mcp.server"],
      "cwd": "${workspaceFolder}/web-analyzer-mcp",
      "env": {
        "OPENAI_API_KEY": "your_openai_api_key_here",
        "OPENAI_MODEL": "gpt-4-turbo"
      }
    }
  }
}
Note: OPENAI_MODEL is optional - defaults to gpt-3.5-turbo if not specified
Create a run configuration in PyCharm:
Run > Edit Configurations
Script path: /path/to/web_analyzer_mcp/server.py
Environment variables:
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-4o
Working directory: /path/to/web-analyzer-mcp
Note: OPENAI_MODEL is optional - defaults to gpt-3.5-turbo if not specified
Or use the external tool configuration:
<tool name="Web Analyzer MCP" description="Start Web Analyzer MCP Server" showInMainMenu="false" showInEditor="false" showInProject="false" showInSearchPopup="false">
  <exec>
    <option name="COMMAND" value="python" />
    <option name="PARAMETERS" value="-m web_analyzer_mcp.server" />
    <option name="WORKING_DIRECTORY" value="$ProjectFileDir$" />
  </exec>
</tool>
# Extract clean markdown from a web page
result = url_to_markdown("https://example.com/article")
print(result)
# Ask questions about web page content
answer = web_content_qna(
    url="https://example.com/documentation",
    question="What are the main features of this product?"
)
print(answer)
url_to_markdown
Converts web pages to clean markdown format with essential content extraction.
Parameters:
url (string): The web page URL to analyze
Returns: Clean markdown content with structured data preservation
web_content_qna
Answers questions about web page content using intelligent content analysis.
Parameters:
url (string): The web page URL to analyze
question (string): Question about the page content
Returns: AI-generated answer based on page content
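MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below only builds the request messages for the two tools, using the parameter names documented above. It does not start the server or perform the `initialize` handshake that a real client needs first, and `tool_call_request` is a hypothetical helper name.

```python
import itertools
import json

# JSON-RPC request ids must be unique per session; a simple counter suffices here.
_ids = itertools.count(1)


def tool_call_request(name, arguments):
    """Build an MCP `tools/call` JSON-RPC 2.0 request as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


markdown_request = tool_call_request(
    "url_to_markdown", {"url": "https://example.com/article"}
)
qna_request = tool_call_request(
    "web_content_qna",
    {
        "url": "https://example.com/documentation",
        "question": "What are the main features of this product?",
    },
)

print(json.dumps(markdown_request, indent=2))
print(json.dumps(qna_request, indent=2))
```

In practice an MCP client library or a host such as Claude Desktop constructs and sends these messages, so the shape is mainly useful when debugging traffic in the MCP Inspector.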
web-analyzer-mcp/
├── web_analyzer_mcp/ # Main Python package
│ ├── __init__.py # Package initialization
│ ├── server.py # FastMCP server with tools
│ ├── web_extractor.py # Web content extraction engine
│ └── rag_processor.py # RAG-based Q&A processor
├── scripts/ # Build and utility scripts
│ └── build.js # Node.js build script
├── README.md # English documentation
├── README.ko.md # Korean documentation
├── package.json # npm configuration and scripts
├── pyproject.toml # Python package configuration
├── .env.example # Environment variables template
└── dist-info.json # Build information (generated)
# Clone repository
git clone https://github.com/kimdonghwi94/web-analyzer-mcp.git
cd web-analyzer-mcp
# Setup environment
npm install # Install Node.js dependencies
npm run install # Install Python dependencies
# Development commands
npm run build # Full build with validation
npm run dev # Start development server
npm test # Test with MCP Inspector
npm run lint # Code formatting and linting
npm run typecheck # Type checking
npm run clean # Clean build artifacts
# Setup Python environment
pip install -e ".[dev]" # Quoted so shells such as zsh do not expand the brackets
# Development commands
python -m web_analyzer_mcp.server # Start server
python -m pytest tests/ # Run tests (if available)
python -m black web_analyzer_mcp/ # Format code
python -m mypy web_analyzer_mcp/ # Type checking
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push the branch (git push origin feature/amazing-feature)
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ for the MCP community
Add this to claude_desktop_config.json and restart Claude Desktop.
{
  "mcpServers": {
    "kimdonghwi94-web-analyzer-mcp": {
      "command": "npx",
      "args": []
    }
  }
}