Web scraping, crawling, and search API. Extract content in Markdown/JSON, batch process 10k URLs, and get AI-powered answers with citations.
A Model Context Protocol (MCP) server implementation that integrates with Olostep for web scraping, content extraction, and search capabilities. To set up the Olostep MCP Server, you need an API key, which you can get by signing up on the Olostep website.
There are multiple ways to connect to the Olostep MCP Server. Choose the one that best fits your workflow.
The simplest way — no local installation required. Connect directly to our hosted MCP server:
https://mcp.olostep.com/mcp
Authentication is done via a Bearer token in the Authorization header using your Olostep API key. See the Client Setup section below for configuration examples.
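For programmatic access outside of an editor, the hosted endpoint can also be reached with a standard MCP client. The sketch below is a minimal, hedged example using the MCP TypeScript SDK (@modelcontextprotocol/sdk); the package, import paths, and requestInit option are assumptions about that SDK, while the URL and Bearer header format come from this section.

```typescript
// Minimal sketch: connect to the hosted Olostep MCP endpoint over streamable HTTP.
// Assumes the @modelcontextprotocol/sdk client API; adjust imports to the SDK version you use.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

const transport = new StreamableHTTPClientTransport(
  new URL("https://mcp.olostep.com/mcp"),
  {
    // Bearer authentication with your Olostep API key, as described above.
    requestInit: {
      headers: { Authorization: `Bearer ${process.env.OLOSTEP_API_KEY}` },
    },
  }
);

const client = new Client({ name: "olostep-example", version: "0.1.0" });
await client.connect(transport);

// List the tools the server exposes (scrape_website, search_web, answers, ...).
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));
```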
Pull and run the official Docker image:
docker pull olostep/mcp-server
docker run -i --rm \
-e OLOSTEP_API_KEY="your-api-key" \
olostep/mcp-server
If you prefer to build the image yourself from source:
git clone https://github.com/olostep/olostep-mcp-server.git
cd olostep-mcp-server
npm install
npm run build
docker build -t olostep/mcp-server:local .
docker run -i --rm -e OLOSTEP_API_KEY="your-api-key" olostep/mcp-server:local
Run without any installation using npx:
env OLOSTEP_API_KEY=your-api-key npx -y olostep-mcp
On Windows (PowerShell):
$env:OLOSTEP_API_KEY = "your-api-key"; npx -y olostep-mcp
On Windows (CMD):
set OLOSTEP_API_KEY=your-api-key && npx -y olostep-mcp
Or install globally:
npm install -g olostep-mcp
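If you are embedding the server in your own agent rather than an editor, the same SDK can spawn the local npx server over stdio. This is again a hedged sketch; the StdioClientTransport class and its options are assumptions about the MCP TypeScript SDK, not something olostep-mcp itself provides.

```typescript
// Minimal sketch: launch the local olostep-mcp server via npx and connect over stdio.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "olostep-mcp"],
  env: { OLOSTEP_API_KEY: process.env.OLOSTEP_API_KEY ?? "" },
});

const client = new Client({ name: "olostep-local-example", version: "0.1.0" });
await client.connect(transport);
```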
The easiest way is to use the remote endpoint. Create or edit .cursor/mcp.json in your project root:
{
"mcpServers": {
"olostep": {
"url": "https://mcp.olostep.com/mcp",
"headers": {
"Authorization": "Bearer YOUR_API_KEY_HERE"
}
}
}
}
Alternative (local): Go to Cursor Settings > Features > MCP Servers, click "+ Add New MCP Server":
Name: olostep
Type: command
Command: env OLOSTEP_API_KEY=your-api-key npx -y olostep-mcp

Add this to your claude_desktop_config.json:
{
"mcpServers": {
"mcp-server-olostep": {
"command": "npx",
"args": ["-y", "olostep-mcp"],
"env": {
"OLOSTEP_API_KEY": "YOUR_API_KEY_HERE"
}
}
}
}
Alternative (Docker):
{
"mcpServers": {
"olostep": {
"command": "docker",
"args": [
"run", "-i", "--rm",
"-e", "OLOSTEP_API_KEY=YOUR_API_KEY_HERE",
"olostep/mcp-server"
]
}
}
}
Or install via the Smithery CLI from your terminal:
npx -y @smithery/cli install @olostep/olostep-mcp-server --client claude
Add the remote endpoint to your Claude Code MCP configuration:
{
"mcpServers": {
"olostep": {
"url": "https://mcp.olostep.com/mcp",
"headers": {
"Authorization": "Bearer YOUR_API_KEY_HERE"
}
}
}
}
Alternative (local):
{
"mcpServers": {
"olostep": {
"command": "npx",
"args": ["-y", "olostep-mcp"],
"env": {
"OLOSTEP_API_KEY": "YOUR_API_KEY_HERE"
}
}
}
}
Add this to your ./codeium/windsurf/model_config.json:
{
"mcpServers": {
"olostep": {
"serverUrl": "https://mcp.olostep.com/mcp",
"headers": {
"Authorization": "Bearer YOUR_API_KEY_HERE"
}
}
}
}
Alternative (local):
{
"mcpServers": {
"mcp-server-olostep": {
"command": "npx",
"args": ["-y", "olostep-mcp"],
"env": {
"OLOSTEP_API_KEY": "YOUR_API_KEY_HERE"
}
}
}
}
Add this to your .vscode/mcp.json:
{
"servers": {
"olostep": {
"type": "http",
"url": "https://mcp.olostep.com/mcp",
"headers": {
"Authorization": "Bearer YOUR_API_KEY_HERE"
}
}
}
}
Alternative (local):
{
"servers": {
"olostep": {
"type": "stdio",
"command": "npx",
"args": ["-y", "olostep-mcp"],
"env": {
"OLOSTEP_API_KEY": "YOUR_API_KEY_HERE"
}
}
}
}
Option 1: One-Click Installation (Recommended)
Option 2: Manual Configuration
Add this to your Metorial MCP server configuration:
{
"olostep": {
"command": "npx",
"args": ["-y", "olostep-mcp"],
"env": {
"OLOSTEP_API_KEY": "YOUR_API_KEY_HERE"
}
}
}
The Olostep tools will then be available in your Metorial AI chats.
Environment variables:
OLOSTEP_API_KEY: Your Olostep API key (required)
ORBIT_KEY: An optional key for using Orbit to route requests.

Scrape Website (scrape_website)
Extract content from a single URL. Supports multiple formats and JavaScript rendering.
{
"name": "scrape_website",
"arguments": {
"url_to_scrape": "https://example.com",
"output_format": "markdown",
"country": "US",
"wait_before_scraping": 1000,
"parser": "@olostep/amazon-product"
}
}
Parameters:
url_to_scrape: The URL of the website you want to scrape (required)
output_format: Choose format (html, markdown, json, or text) - default: markdown
country: Optional country code (e.g., US, GB, CA) for location-specific scraping
wait_before_scraping: Wait time in milliseconds before scraping (0-10000)
parser: Optional parser ID for specialized extraction

Response:
{
"content": [
{
"type": "text",
"text": "{\n \"id\": \"scrp_...\",\n \"url\": \"https://example.com\",\n \"markdown_content\": \"# ...\",\n \"html_content\": null,\n \"json_content\": null,\n \"text_content\": null,\n \"status\": \"succeeded\",\n \"timestamp\": \"2025-11-14T12:34:56Z\",\n \"screenshot_hosted_url\": null,\n \"page_metadata\": { }\n}"
}
]
}
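To make the request and response shapes above concrete, here is how a client could invoke the tool and unpack the JSON string embedded in the text block. This reuses the hypothetical client from the connection sketch earlier and assumes the payload parses as shown in the example response.

```typescript
// `client` is the connected MCP client from the earlier connection sketch.
// Call scrape_website and read the JSON payload from the first text content block.
const result = await client.callTool({
  name: "scrape_website",
  arguments: {
    url_to_scrape: "https://example.com",
    output_format: "markdown",
  },
});

const block = (result.content as Array<{ type: string; text?: string }>)[0];
if (block?.type === "text" && block.text) {
  const scrape = JSON.parse(block.text); // matches the example response above
  console.log(scrape.status, scrape.markdown_content?.slice(0, 200));
}
```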
Search the Web (search_web)
Search the web for a given query and get structured results (non-AI, parser-based).
{
"name": "search_web",
"arguments": {
"query": "your search query",
"country": "US"
}
}
Parameters:
query: Search query (required)
country: Optional country code for localized results (default: US)

Answers (answers)
Search the web and return AI-powered answers in the JSON structure you want, with sources and citations.
{
"name": "answers",
"arguments": {
"task": "Who are the top 5 competitors to Acme Inc. in the EU?",
"json": "Return a list of the top 5 competitors with name and homepage URL"
}
}
Parameters:
task: Question or task to answer using web data (required)
json: Optional JSON schema/object or a short description of the desired output shape

Response fields: answer_id, object, task, result (JSON if provided), sources, created

Batch Scrape URLs (batch_scrape_urls)
Scrape up to 10k URLs at the same time. Perfect for large-scale data extraction.
{
"name": "batch_scrape_urls",
"arguments": {
"urls_to_scrape": [
{"url": "https://example.com/a", "custom_id": "a"},
{"url": "https://example.com/b", "custom_id": "b"}
],
"output_format": "markdown",
"country": "US",
"wait_before_scraping": 500,
"parser": "@olostep/amazon-product"
}
}
Response fields: batch_id, status, total_urls, created_at, formats, country, parser, urls

Create Crawl (create_crawl)
Start an async crawl that autonomously discovers and scrapes entire websites by following links. Returns a crawl_id — the crawl runs in the background and does not return content in this response. You must then call get_crawl_results with the crawl_id to poll status and retrieve the scraped pages (same two-step pattern as batch_scrape_urls + get_batch_results).
{
"name": "create_crawl",
"arguments": {
"start_url": "https://example.com/docs",
"max_pages": 25,
"follow_links": true,
"output_format": "markdown",
"country": "US",
"parser": "@olostep/doc-parser"
}
}
Response fields: crawl_id, object, status, start_url, max_pages, follow_links, created, formats, country, parser

Pair this call with get_crawl_results — do not pass a crawl_id to get_batch_results (crawls and batches are separate resources).
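To illustrate that two-step pattern, the hedged sketch below starts a crawl and then polls get_crawl_results until it reports completed. It reuses the hypothetical client from the earlier sketches; the parseToolJson helper is illustrative only, and the 10-second delay mirrors the retry hint described later under get_crawl_results.

```typescript
// `client` is the connected MCP client from the earlier connection sketch.
// Hypothetical helper: tool results carry a JSON string in their first text block.
function parseToolJson(result: any): any {
  const block = result?.content?.[0];
  return block?.type === "text" && block.text ? JSON.parse(block.text) : null;
}

// Step 1: kick off the crawl. Only a crawl_id comes back, no page content yet.
const started = parseToolJson(
  await client.callTool({
    name: "create_crawl",
    arguments: { start_url: "https://example.com/docs", max_pages: 25 },
  })
);

// Step 2: poll get_crawl_results with that crawl_id until the crawl completes.
let crawl;
do {
  await new Promise((resolve) => setTimeout(resolve, 10_000)); // retry roughly every 10s
  crawl = parseToolJson(
    await client.callTool({
      name: "get_crawl_results",
      arguments: { crawl_id: started.crawl_id, formats: ["markdown"] },
    })
  );
} while (crawl?.status !== "completed");

console.log(`Crawled ${crawl.pages_returned} pages; has_more: ${crawl.has_more}`);
```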
Create Map (create_map)
Get all URLs on a website for discovery and analysis.
{
"name": "create_map",
"arguments": {
"website_url": "https://example.com",
"search_query": "blog",
"top_n": 200,
"include_url_patterns": ["/blog/**"],
"exclude_url_patterns": ["/admin/**"]
}
}
Response fields: map_id, object, url, total_urls, urls, search_query, top_n

Get Webpage Content (get_webpage_content)
Retrieves webpage content in clean markdown format with support for JavaScript rendering.
{
"name": "get_webpage_content",
"arguments": {
"url_to_scrape": "https://example.com",
"wait_before_scraping": 1000,
"country": "US"
}
}
Parameters:
url_to_scrape: The URL of the webpage to scrape (required)
wait_before_scraping: Time to wait in milliseconds before starting the scrape (default: 0)
country: Residential country to load the request from (e.g., US, CA, GB) (optional)

Response:
{
"content": [
{
"type": "text",
"text": "# Example Website\n\nThis is the markdown content of the webpage..."
}
]
}
Get Website URLs (get_website_urls)
Search and retrieve relevant URLs from a website, sorted by relevance to your query.
{
"name": "get_website_urls",
"arguments": {
"url": "https://example.com",
"search_query": "your search term"
}
}
Parameters:
url: The URL of the website to map (required)
search_query: The search query to sort URLs by (required)

Response:
{
"content": [
{
"type": "text",
"text": "Found 42 URLs matching your query:\n\nhttps://example.com/page1\nhttps://example.com/page2\n..."
}
]
}
Get Batch Results (get_batch_results)
Retrieve the results of a previously submitted batch scrape job using its batch_id.
{
"name": "get_batch_results",
"arguments": {
"batch_id": "batch_abc123"
}
}
Parameters:
batch_id: The batch ID returned from batch_scrape_urls (required)

Response fields: batch_id, status (processing or completed), total_urls, completed_urls, items (array of scraped results per URL with url, custom_id, markdown_content, html_content, json_content, text_content, status, page_metadata)

Get Crawl Results (get_crawl_results)
Retrieve the status and scraped pages for an async crawl started with create_crawl. This is the required companion to create_crawl — create_crawl only kicks off the job and returns a crawl_id; this tool is how you actually fetch the discovered pages and their content.
{
"name": "get_crawl_results",
"arguments": {
"crawl_id": "crawl_abc123",
"formats": ["markdown"],
"items_limit": 20,
"cursor": 0
}
}
Parameters:
crawl_id: The crawl ID returned from create_crawl (required)
formats: Array of formats to retrieve per page — markdown, html, json, text (default: ["markdown"])
items_limit: Max pages to retrieve content for, 1–100 (default: 20)
cursor: Pagination cursor into the list of discovered pages (default: 0)
search_query: Optional filter to rank/select pages by relevance to a query

While the crawl is still running, the response includes crawl_id, status (in_progress), pages_completed, pages_total, and a message prompting you to call again in ~10 seconds.
Once the crawl has finished, the response includes crawl_id, status (completed), pages_returned, next_cursor, has_more, and a pages array where each entry has url, custom_id, and the requested content fields (markdown_content, html_content, json_content, text_content).

The server provides robust error handling.
Example error response:
{
"isError": true,
"content": [
{
"type": "text",
"text": "Olostep API Error: 401 Unauthorized. Details: {\"error\":\"Invalid API key\"}"
}
]
}
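Callers can branch on the isError flag before parsing a payload. A minimal check, with the same assumed client as in the sketches above, might look like this:

```typescript
// `client` is the connected MCP client from the earlier connection sketch.
// Detect tool-level failures before trying to parse the response payload.
const res = await client.callTool({
  name: "scrape_website",
  arguments: { url_to_scrape: "https://example.com" },
});

if (res.isError) {
  // The error text mirrors the example above, e.g. "Olostep API Error: 401 Unauthorized ...".
  const message = (res.content as Array<{ type: string; text?: string }>)[0]?.text;
  throw new Error(message ?? "Olostep MCP tool call failed");
}
```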
The MCP server is available as a Docker image:
[olostep/mcp-server](https://hub.docker.com/r/olostep/mcp-server)
mcp/olostep (coming soon - enhanced security with signatures & SBOMs)
ghcr.io/olostep/olostep-mcp-server

The Olostep MCP Server is being added to Docker Desktop's official MCP Toolkit.
Status: Submission in progress to Docker MCP Registry
Supported platforms: linux/amd64, linux/arm64

# Clone the repository
git clone https://github.com/olostep/olostep-mcp-server.git
cd olostep-mcp-server
# Build the image
npm install
npm run build
docker build -t olostep/mcp-server .
# Run locally
docker run -i --rm -e OLOSTEP_API_KEY="your-key" olostep/mcp-server
ISC License