Extracts clean text content and metadata from website URLs using Beautiful Soup through a standardized interface for AI agents. It supports both Bearer token authentication and a blockchain-based pay-per-use protocol for flexible access.
This is an MCP (Model Context Protocol) server that provides Bearer-token-authenticated access to the Website Scraper MCP API. It enables AI agents and LLMs to interact with Website Scraper MCP through standardized tools.
This server provides the following tools:
- `example_tool`: Placeholder tool (to be implemented)

Note: Replace `example_tool` with actual Website Scraper MCP API tools based on the documentation.
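The server's description says the real tools extract clean text from websites with Beautiful Soup. As a rough illustration of that kind of extraction (not this server's actual code, and using only the standard library's `html.parser` in place of Beautiful Soup), a visible-text extractor might look like:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document as a single string."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)
```

For example, `extract_text("<p>Hello <b>world</b></p>")` returns `"Hello world"`, while script and style bodies are dropped.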
1. Clone this repository:

```bash
git clone https://github.com/Traia-IO/website-scraper-mcp-mcp-server.git
cd website-scraper-mcp-mcp-server
```

2. Set your API key:

```bash
export WEB_SCRAPPING_API_KEY="your-api-key-here"
```

3. Run with Docker:

```bash
./run_local_docker.sh
```
1. Create a `.env` file with your configuration:

```bash
WEB_SCRAPPING_API_KEY=your-api-key-here
SERVER_ADDRESS=0x1234567890123456789012345678901234567890
MCP_OPERATOR_PRIVATE_KEY=0x1234567890abcdef...
MCP_OPERATOR_ADDRESS=0x9876543210fedcba...
D402_TESTING_MODE=false
PORT=8000
```

2. Start the server:

```bash
docker-compose up
```
1. Install dependencies using uv:

```bash
uv pip install -e .
```

2. Run the server:

```bash
WEB_SCRAPPING_API_KEY="your-api-key-here" uv run python -m server
```
## Usage
### Health Check
Test if the server is running:
```bash
python mcp_health_check.py
```
Connect from Python using the `traia_iatp` adapter. Note that tool calls are awaited, so they must run inside an async function:

```python
import asyncio

from traia_iatp.mcp.traia_mcp_adapter import create_mcp_adapter_with_auth


async def main():
    # Connect with authentication
    with create_mcp_adapter_with_auth(
        url="http://localhost:8000/mcp/",
        api_key="your-api-key"
    ) as tools:
        # Use the tools
        for tool in tools:
            print(f"Available tool: {tool.name}")
            # Example usage
            result = await tool.example_tool(query="test")
            print(result)


asyncio.run(main())
```
This server supports two modes of operation:
### Mode 1: Client API Key (Free)

Clients with their own Website Scraper MCP API key can use the server for free:
```bash
# Request with client's API key
curl -X POST http://localhost:8000/mcp \
  -H "Authorization: Bearer CLIENT_WEB_SCRAPPING_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"method":"tools/call","params":{"name":"example_tool","arguments":{"query":"test"}}}'
```
Flow: the client sends its own API key in the `Authorization` header, the server forwards requests upstream with that key, and no payment is required.
### Mode 2: Pay-Per-Use (HTTP 402)

Clients without an API key can pay per use via the HTTP 402 protocol:
```bash
# Request with payment proof (x402/d402 protocol)
curl -X POST http://localhost:8000/mcp \
  -H "X-PAYMENT: <base64_encoded_x402_payment>" \
  -H "Content-Type: application/json" \
  -d '{"method":"tools/call","params":{"name":"example_tool","arguments":{"query":"test"}}}'
```
Flow: the client's first request is rejected with HTTP 402 along with the payment requirements; the client then pays and retries with a base64-encoded payment proof in the `X-PAYMENT` header.
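The exact payment schema is defined by the `traia_iatp.d402` module and is not reproduced here, but the header mechanics are plain base64: the client serializes its signed payment object and sends it in `X-PAYMENT`. A minimal sketch, with field names invented for illustration:

```python
import base64
import json


def encode_x402_payment(payment: dict) -> str:
    """Serialize a payment object and base64-encode it for the X-PAYMENT header.

    The schema of `payment` is dictated by the d402 protocol; the fields used
    below are illustrative placeholders, not the real d402 schema.
    """
    return base64.b64encode(json.dumps(payment).encode("utf-8")).decode("ascii")


# Hypothetical payment object; in practice the d402 client builds and signs it.
header_value = encode_x402_payment({
    "scheme": "exact",
    "network": "base",
    "payload": {"signature": "0x..."},
})

headers = {
    "X-PAYMENT": header_value,
    "Content-Type": "application/json",
}
```

The server decodes the header, verifies the payment (via the facilitator in production), and only then executes the tool call.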
This server uses the `traia_iatp.d402` module for payment verification:

```bash
# Required
WEB_SCRAPPING_API_KEY=your_internal_website-scraper-mcp_api_key  # Server's API key (for payment mode)
SERVER_ADDRESS=0x1234567890123456789012345678901234567890  # Server's payment address

# Required for Settlement (Production)
MCP_OPERATOR_PRIVATE_KEY=0x1234...  # Private key for signing settlement attestations
MCP_OPERATOR_ADDRESS=0x5678...  # Operator's public address (for verification)

# Optional
D402_FACILITATOR_URL=https://facilitator.d402.net  # Facilitator service URL
D402_FACILITATOR_API_KEY=your_key  # For private facilitator
D402_TESTING_MODE=false  # Set to 'true' for local testing without settlement
```
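Since `D402_TESTING_MODE` arrives as a string from the environment, the server presumably normalizes it before use. A sketch of such a helper (ours, not necessarily how `traia_iatp.d402` implements it):

```python
import os


def d402_testing_mode() -> bool:
    """True only when D402_TESTING_MODE is set to 'true' (case-insensitive).

    Unset, empty, or any other value means settlement stays enabled.
    """
    return os.environ.get("D402_TESTING_MODE", "false").strip().lower() == "true"
```

With this rule, `D402_TESTING_MODE=TRUE` enables testing mode, while an unset variable or `D402_TESTING_MODE=false` leaves settlement on.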
Operator Keys: the operator private key signs settlement attestations, and the matching public address is what the facilitator uses to verify them.
Note on Per-Endpoint Configuration: Each endpoint's payment requirements (token address, network, price) are embedded in the tool code. They come from the endpoint configuration when the server is generated.
Client Decision:
Server Response:
Business Model:
To add new tools, edit `server.py` and:

1. Add new `@mcp.tool()` decorated functions
2. Update `deployment_params.json` with the tool names in the `capabilities` array

The `deployment_params.json` file contains the deployment configuration for this MCP server:
```json
{
  "github_url": "https://github.com/Traia-IO/website-scraper-mcp-mcp-server",
  "mcp_server": {
    "name": "website-scraper-mcp-mcp",
    "description": "Website content scraper using beautiful soup - extract clean text content from any website url with metadata support and custom headers",
    "server_type": "streamable-http",
    "requires_api_key": true,
    "api_key_header": "Authorization",
    "capabilities": [
      // List all implemented tool names here
      "example_tool"
    ]
  },
  "deployment_method": "cloud_run",
  "gcp_project_id": "traia-mcp-servers",
  "gcp_region": "us-central1",
  "tags": ["website scraper mcp", "api"],
  "ref": "main"
}
```
Important: Always update the capabilities array when you add or remove tools!
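Because forgetting to update `capabilities` is an easy mistake, a small sanity check can catch drift between the server and its deployment config. This sketch hard-codes the registered tool names for illustration rather than importing the real `server.py`:

```python
import json

# For illustration, the registered tools are hard-coded; in practice you would
# collect them from the @mcp.tool() functions defined in server.py.
registered_tools = {"example_tool"}

# Inline stand-in for reading deployment_params.json from disk.
deployment_params = json.loads("""
{
  "mcp_server": {"capabilities": ["example_tool"]}
}
""")

declared = set(deployment_params["mcp_server"]["capabilities"])
missing = registered_tools - declared   # implemented but not declared
stale = declared - registered_tools     # declared but not implemented
assert not missing and not stale, f"capabilities out of sync: {missing or stale}"
```

Running a check like this in CI keeps the `capabilities` array honest as tools are added or removed.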
This server is designed to be deployed on Google Cloud Run. The deployment will:
- Expose the `/mcp` endpoint for client connections

Environment variables:

- `PORT`: Server port (default: 8000)
- `STAGE`: Environment stage (default: MAINNET, options: MAINNET, TESTNET)
- `LOG_LEVEL`: Logging level (default: INFO)
- `WEB_SCRAPPING_API_KEY`: Your Website Scraper MCP API key (required)

To debug, inspect the container logs with `docker logs <container-id>`.

Add the following to `claude_desktop_config.json` and restart Claude Desktop:
```json
{
  "mcpServers": {
    "website-scraper-mcp-server": {
      "command": "npx",
      "args": []
    }
  }
}
```