loading…
Search for a command to run...
loading…
MCP server that captures webpage screenshots, with viewport or full-page options and base64 PNG output.
MCP server that captures webpage screenshots, with viewport or full-page options and base64 PNG output.
npm version
GitHub stars
License: MIT

mcp-page-capture is a Model Context Protocol (MCP) server that orchestrates headless Chromium via Puppeteer to capture pixel-perfect screenshots of arbitrary URLs. It is optimized for Copilot/MCP-enabled environments and can be embedded into automated workflows or run as a standalone developer tool.
viewport, wait, fill, click, scroll, screenshottarget for elements, for for waiting, to for scrolling, device for viewportnpm start, npm run dev, or as a long-lived MCP sidecarcaptureScreenshot and extractDom tools.url, steps, headers, validateSee LLM Quick Reference for the 6 primary step types and common patterns.
For advanced features, see Advanced Steps.
| Step | Purpose | Key Parameters | Example |
|---|---|---|---|
viewport |
Set device | device, width, height |
{ "type": "viewport", "device": "mobile" } |
wait |
Wait for element/time | for OR duration, timeout |
{ "type": "wait", "for": ".loaded" } |
fill |
Fill form field | target, value, submit |
{ "type": "fill", "target": "#email", "value": "[email protected]" } |
click |
Click element | target, waitFor |
{ "type": "click", "target": "button", "waitFor": ".result" } |
scroll |
Scroll page | to, y |
{ "type": "scroll", "to": "#footer" } |
screenshot |
Capture (auto-added) | fullPage, element |
{ "type": "screenshot", "fullPage": true } |
Step Order: Auto-fixed! viewport auto-moves to first, screenshot auto-added at end.
High-level patterns that auto-expand to multiple steps:
// Login pattern
{
"type": "login",
"email": { "selector": "#email", "value": "[email protected]" },
"password": { "selector": "#password", "value": "secret" },
"submit": "button[type=submit]",
"successIndicator": ".dashboard"
}
// Search pattern
{
"type": "search",
"input": "#search-box",
"query": "MCP protocol",
"resultsIndicator": ".search-results"
}
git clone https://github.com/chasesaurabh/mcp-page-capture.git
npm install
# Run without installing globally
npx mcp-page-capture
# Or add it to your toolchain
npm install -g mcp-page-capture
mcp-page-capture
npm install
npm run build
npm start
For hot reload while iterating locally, run npm run dev.
If you need those guarantees, build and run via:
docker build -t mcp-page-capture .
docker run --rm -it mcp-page-capture
Otherwise you can keep using the standard npm scripts locally.
{
"mcpServers": {
"page-capture": {
"command": "node",
"args": ["dist/cli.js"]
}
}
}
Note: You may need to use the full path to dist/cli.js or node depending on your working directory and Node.js module resolution configuration.
If you want to embed the server inside another Node.js process, import the helpers exposed by the package:
import { startMcpPageCaptureServer } from "mcp-page-capture";
await startMcpPageCaptureServer();
// Optionally pass a custom Transport implementation if you don't want stdio.
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com"
}
}
Note: A screenshot is automatically captured at the end if no explicit screenshot step is provided.
The fill step auto-detects field types and handles text inputs, selects, checkboxes, and radio buttons.
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com",
"steps": [
{ "type": "fill", "target": "#search", "value": "MCP protocol", "submit": true },
{ "type": "wait", "for": ".search-results" },
{ "type": "screenshot" }
]
}
}
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com/login",
"steps": [
{ "type": "wait", "for": "#login-form" },
{ "type": "fill", "target": "#email", "value": "[email protected]" },
{ "type": "fill", "target": "#password", "value": "secretpassword" },
{ "type": "fill", "target": "#remember-me", "value": "true" },
{ "type": "click", "target": "button[type=submit]", "waitFor": ".dashboard" },
{ "type": "screenshot" }
]
}
}
{
"tool": "captureScreenshot",
"params": {
"url": "https://docs.modelcontextprotocol.io",
"steps": [
{ "type": "screenshot", "fullPage": true }
]
}
}
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com/dashboard",
"headers": {
"authorization": "Bearer dev-token"
},
"steps": [
{
"type": "cookie",
"action": "set",
"name": "session",
"value": "abc123",
"path": "/secure"
},
{ "type": "screenshot" }
]
}
}
{
"tool": "extractDom",
"params": {
"url": "https://docs.modelcontextprotocol.io",
"selector": "main article"
}
}
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com",
"steps": [
{ "type": "viewport", "device": "ipad-pro" },
{ "type": "scroll", "to": "#main-content" },
{ "type": "screenshot" }
]
}
}
These are the only steps exposed to LLMs. They cover 95%+ of use cases:
| Step | Purpose | Parameters |
|---|---|---|
viewport |
Set device/screen size | device, width, height |
wait |
Wait for element/time | for OR duration, timeout |
fill |
Fill form field | target, value, submit |
click |
Click element | target, waitFor |
scroll |
Scroll page | to (selector), y (pixels) |
screenshot |
Capture (auto-added) | fullPage, element |
Use validate: true to check steps before execution:
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com",
"steps": [
{ "type": "fill", "target": "#email", "value": "[email protected]" },
{ "type": "click", "target": "button" }
],
"validate": true
}
}
Returns validation analysis including:
waitFor to click)These work at runtime but are not exposed in the LLM schema. Use the 6 primary steps instead:
| Deprecated | Use Instead |
|---|---|
quickFill |
fill with submit: true |
fillForm |
Multiple fill steps |
waitForSelector |
wait with for parameter |
delay |
wait with duration parameter |
fullPage |
screenshot with fullPage: true |
These are for power users only and are not documented in the tool schema:
type, hover, cookie, storage, evaluate, keypress, focus, blur, clear, upload, submit
For backward compatibility, these parameters work at runtime but are not exposed in the LLM schema:
| Legacy Parameter | Canonical | Notes |
|---|---|---|
selector |
target |
Use target for element selectors |
awaitElement |
for |
Use for in wait steps |
scrollTo |
to |
Use to in scroll steps |
preset |
device |
Use device in viewport steps |
captureElement |
element |
Use element in screenshot steps |
waitAfter |
wait |
Use wait in click steps |
Deprecation warnings are logged when legacy parameters are used.
{
"content": [
{
"type": "text",
"text": "mcp-page-capture screenshot\nURL: https://example.com\nCaptured: 2025-12-13T08:30:12.713Z\nFull page: false\nViewport: 1280x720\nDocument: 1280x2000\nScroll position: (0, 0)\nSize: 45.2 KB\nSteps executed: 5"
},
{
"type": "image",
"mimeType": "image/png",
"data": "iVBORw0KGgoAAAANSUhEUgA..."
}
]
}
## Supported options
### `captureScreenshot`
- `url` (string, required): Fully-qualified URL to capture
- `headers` (object, optional): Key/value map of HTTP headers to send with the initial page navigation
- `cookies` (array, optional): List of cookies to set before navigation. Each cookie supports `name`, `value`, and optional `url`, `domain`, `path`, `secure`, `httpOnly`, `sameSite`, and `expires` (Unix timestamp, seconds)
- `viewport` (object, optional): Viewport configuration
- `preset` (string, optional): Use a predefined viewport preset (see Viewport Presets section)
- `width` (number, optional): Custom viewport width
- `height` (number, optional): Custom viewport height
- `deviceScaleFactor` (number, optional): Device scale factor (e.g., 2 for Retina)
- `isMobile` (boolean, optional): Whether to emulate mobile device
- `hasTouch` (boolean, optional): Whether to enable touch events
- `userAgent` (string, optional): Custom user agent string
- `retryPolicy` (object, optional): Retry configuration for transient failures
- `maxRetries` (number, optional, default 3): Maximum number of retry attempts
- `initialDelayMs` (number, optional, default 1000): Initial delay between retries
- `maxDelayMs` (number, optional, default 10000): Maximum delay between retries
- `backoffMultiplier` (number, optional, default 2): Exponential backoff multiplier
- `storageTarget` (string, optional): Storage backend name for saving captures
### `extractDom`
- `url` (string, required): Fully-qualified URL to inspect
- `selector` (string, optional): CSS selector to scope extraction to a specific element. Defaults to the entire document
- `headers` (object, optional): Key/value map of HTTP headers sent before navigation
- `cookies` (array, optional): Same cookie structure as `captureScreenshot`, applied before navigation
- `viewport` (object, optional): Same viewport configuration as `captureScreenshot`
- `retryPolicy` (object, optional): Same retry configuration as `captureScreenshot`
- `storageTarget` (string, optional): Storage backend name for saving DOM data
## Action Steps for captureScreenshot
The `captureScreenshot` tool supports a comprehensive `steps` array that allows you to perform various web interactions before capturing the screenshot. Each step is executed in sequence, allowing for complex automation scenarios.
### Fill Form (`fillForm`) - Recommended for Form Interactions
The `fillForm` step is the easiest and most LLM-friendly way to interact with forms. It auto-detects field types and handles multiple fields in a single step.
```json
{
"type": "fillForm",
"fields": [
{ "selector": "#email", "value": "[email protected]" },
{ "selector": "#password", "value": "secretpassword" },
{ "selector": "#country", "value": "us" },
{ "selector": "#newsletter", "value": "true" },
{ "selector": "#plan", "value": "premium", "type": "radio" }
],
"formSelector": "#signup-form",
"submit": true,
"submitSelector": "#submit-btn",
"waitForNavigation": true
}
Each field in the fields array supports:
selector (required): CSS selector for the form fieldvalue (required): Value to set. For checkboxes use "true" or "false". For selects/radios use the value attribute.type (optional): Field type hint (text, select, checkbox, radio, textarea, password, email, number, tel, url, date, file). Auto-detected if not specified.matchByText (optional): For select fields, match by visible text instead of value attributedelay (optional): Delay between keystrokes in ms (for text inputs)formSelector (optional): CSS selector for the form container (for scoping field selectors)submit (optional): Whether to submit the form after filling (default: false)submitSelector (optional): Selector for submit button. If not specified, uses form.submit() or looks for [type="submit"]waitForNavigation (optional): Whether to wait for navigation after submit (default: true)text)Type text into input fields:
{
"type": "text",
"selector": "#username",
"value": "[email protected]",
"clearFirst": true, // Clear existing text first (default: true)
"delay": 100, // Delay between keystrokes in ms (0-1000)
"pressEnter": false // Press Enter after typing (default: false)
}
select)Select an option from a dropdown:
{
"type": "select",
"selector": "#country",
"value": "us" // OR "text": "United States" OR "index": 0
}
radio)Select a radio button:
{
"type": "radio",
"selector": "input[type='radio']",
"value": "option1", // Value attribute of the radio button
"name": "preference" // Name attribute to identify the radio group
}
checkbox)Check or uncheck a checkbox:
{
"type": "checkbox",
"selector": "#agree-terms",
"checked": true // true to check, false to uncheck
}
click)Click on elements:
{
"type": "click",
"target": "button.submit",
"button": "left", // "left", "right", or "middle" (default: left)
"clickCount": 1, // 1=single, 2=double, 3=triple (default: 1)
"waitForNavigation": false, // Wait for page navigation (default: false)
"waitForSelector": ".modal-content" // Wait for element to appear after click
}
hover)Hover over elements:
{
"type": "hover",
"selector": ".dropdown-trigger",
"duration": 1000 // How long to maintain hover in ms (0-10000)
}
upload)Upload files:
{
"type": "upload",
"selector": "input[type='file']",
"filePaths": ["/path/to/file1.pdf", "/path/to/file2.jpg"]
}
submit)Submit forms:
{
"type": "submit",
"selector": "#contact-form", // Form element or submit button
"waitForNavigation": true // Wait for page navigation (default: true)
}
scroll)Scroll the page:
{
"type": "scroll",
"scrollTo": "#section-2", // Scroll to element (takes precedence)
"x": 0, // OR horizontal scroll position in pixels
"y": 500, // OR vertical scroll position in pixels
"behavior": "smooth" // "auto" or "smooth" (default: auto)
}
keypress)Press keyboard keys:
{
"type": "keypress",
"key": "Enter", // Key to press (e.g., "Enter", "Tab", "Escape", "ArrowDown")
"modifiers": ["Control", "Shift"], // Optional modifiers
"selector": "#search-box" // Optional element to focus first
}
waitForSelector) - DEPRECATEDUse wait step instead:
{ "type": "wait", "for": ".loading-complete", "timeout": 10000 }
delay) - DEPRECATEDUse wait step with duration instead:
{ "type": "wait", "duration": 2000 }
focus)Focus an element:
{
"type": "focus",
"selector": "#search-input"
}
blur)Blur (unfocus) an element:
{
"type": "blur",
"selector": "#search-input"
}
clear)Clear input field contents:
{
"type": "clear",
"selector": "#search-input"
}
evaluate)Execute custom JavaScript:
{
"type": "evaluate",
"script": "document.title = 'New Title'; return document.title;",
"selector": "#element" // Optional element to pass to the script
}
screenshot)Capture screenshot at any point:
{
"type": "screenshot",
"fullPage": true, // Capture entire page (optional)
"captureElement": ".specific-element" // Capture specific element (optional)
}
cookie)Set or delete browser cookies:
{
"type": "cookie",
"action": "set",
"name": "session_id",
"value": "abc123",
"domain": ".example.com",
"path": "/",
"secure": true
}
Supported actions: set (add/update cookie), delete (remove cookie).
storage)Manage localStorage/sessionStorage:
{
"type": "storage",
"storageType": "localStorage",
"action": "set",
"key": "user_preferences",
"value": "{\"theme\":\"dark\"}"
}
Supported actions: set (add/update), delete (remove key), clear (remove all items).
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com/signup",
"steps": [
{ "type": "waitForSelector", "awaitElement": "#signup-form" },
{ "type": "text", "selector": "#email", "value": "[email protected]" },
{ "type": "text", "selector": "#password", "value": "SecurePass123!" },
{ "type": "select", "selector": "#country", "text": "United States" },
{ "type": "radio", "selector": "input[name='plan']", "value": "premium" },
{ "type": "checkbox", "selector": "#newsletter", "checked": true },
{ "type": "checkbox", "selector": "#terms", "checked": true },
{ "type": "hover", "selector": ".tooltip-trigger", "duration": 500 },
{ "type": "scroll", "y": 200 },
{ "type": "click", "target": "button[type='submit']", "waitForNavigation": true },
{ "type": "delay", "duration": 2000 },
{ "type": "screenshot", "fullPage": true }
]
}
}
If your capture fails, use these guidelines:
| Error | Solution |
|---|---|
| "element not found" | Check CSS selector, add waitForSelector before click/fillForm |
| "navigation timeout" | Increase retryPolicy.maxRetries or add delay step |
| "page not loaded" | Add waitForSelector or delay before screenshot |
| "click failed" | Ensure element is visible, add scroll to bring it into view |
The following viewport presets are available:
desktop-fhd: 1920x1080 Full HDdesktop-hd: 1280x720 HDdesktop-4k: 3840x2160 4Kmacbook-pro-16: MacBook Pro 16-inch Retinaipad-pro: iPad Pro 12.9-inchipad-pro-landscape: iPad Pro 12.9-inch (landscape)ipad: iPad 10.2-inchsurface-pro: Microsoft Surface Proiphone-14-pro-max: iPhone 14 Pro Maxiphone-14-pro: iPhone 14 Proiphone-se: iPhone SE (3rd generation)pixel-7-pro: Google Pixel 7 Progalaxy-s23-ultra: Samsung Galaxy S23 Ultra{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com",
"viewport": {
"preset": "iphone-14-pro"
}
}
}
The tools automatically retry on transient failures with exponential backoff. Default retryable conditions:
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com",
"retryPolicy": {
"maxRetries": 5,
"initialDelayMs": 2000,
"backoffMultiplier": 1.5
}
}
}
The server emits structured telemetry events that can be consumed for monitoring and observability:
tool.invoked: Tool execution startedtool.completed: Tool execution succeededtool.failed: Tool execution failednavigation.started: Page navigation initiatednavigation.completed: Page navigation succeedednavigation.failed: Page navigation failedretry.attempt: Retry attempt startedretry.succeeded: Retry succeededbrowser.launched: Puppeteer browser startedbrowser.closed: Puppeteer browser closedscreenshot.captured: Screenshot takendom.extracted: DOM content extractedYou can configure telemetry hooks programmatically:
import { getGlobalTelemetry } from "mcp-page-capture";
const telemetry = getGlobalTelemetry();
// Configure HTTP sink for centralized collection
telemetry.configureHttpSink({
url: "https://telemetry.example.com/events",
headers: { "X-API-Key": "your-api-key" },
batchSize: 100,
flushIntervalMs: 5000,
});
// Register custom hooks
telemetry.registerHook({
name: "custom-logger",
enabled: true,
handler: async (event) => {
console.log(`[${event.type}]`, event.data);
},
});
Captures can be automatically saved to configurable storage backends:
import { registerStorageTarget, LocalStorageTarget } from "mcp-page-capture";
const localStorage = new LocalStorageTarget("/path/to/captures");
registerStorageTarget("local", localStorage);
import { registerStorageTarget, S3StorageTarget } from "mcp-page-capture";
const s3Storage = new S3StorageTarget({
bucket: "my-captures",
prefix: "screenshots/",
region: "us-west-2",
});
registerStorageTarget("s3", s3Storage);
import { registerStorageTarget, MemoryStorageTarget } from "mcp-page-capture";
const memoryStorage = new MemoryStorageTarget();
registerStorageTarget("memory", memoryStorage);
Then use the storage in tool invocations:
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com",
"storageTarget": "s3"
}
}
feat:, fix:, chore:) for automatic versioningmain branch<username>/mcp-page-captureghcr.io/<org>/mcp-page-captureConfigure these secrets in your GitHub repository settings:
NPM_TOKEN: npm access token with publish permissionsDOCKER_USERNAME: Docker Hub usernameDOCKER_PASSWORD: Docker Hub access tokenThe GITHUB_TOKEN is provided automatically by GitHub Actions.
Read CONTRIBUTING.md, open an issue describing the change, and submit a PR that includes npm run build output plus updated docs/tests.
Maintained by Saurabh Chase (@chasesaurabh). Reach out via issues or discussions for roadmap coordination.
Released under the MIT License.
Add this to claude_desktop_config.json and restart Claude Desktop.
{
"mcpServers": {
"chasesaurabh-mcp-page-capture": {
"command": "npx",
"args": []
}
}
}