mcp-page-capture is a Model Context Protocol (MCP) server that orchestrates headless Chromium via Puppeteer to capture pixel-perfect screenshots of arbitrary URLs. It is optimized for Copilot/MCP-enabled environments and can be embedded into automated workflows or run as a standalone developer tool. - ๐ธ High-fidelity screenshots powered by Puppeteer and headless Chromium - โ๏ธ LLM-optimized schema
Add this skill
npx mdskills install chasesaurabh/mcp-page-captureFeature-rich MCP server with excellent LLM-optimized schema and comprehensive browser automation capabilities

mcp-page-capture is a Model Context Protocol (MCP) server that orchestrates headless Chromium via Puppeteer to capture pixel-perfect screenshots of arbitrary URLs. It is optimized for Copilot/MCP-enabled environments and can be embedded into automated workflows or run as a standalone developer tool.
viewport, wait, fill, click, scroll, screenshottarget for elements, for for waiting, to for scrolling, device for viewportnpm start, npm run dev, or as a long-lived MCP sidecarcaptureScreenshot and extractDom tools.url, steps, headers, validateSee LLM Quick Reference for the 6 primary step types and common patterns.
For advanced features, see Advanced Steps.
| Step | Purpose | Key Parameters | Example |
|---|---|---|---|
viewport | Set device | device, width, height | { "type": "viewport", "device": "mobile" } |
wait | Wait for element/time | for OR duration, timeout | { "type": "wait", "for": ".loaded" } |
fill | Fill form field | target, value, submit | { "type": "fill", "target": "#email", "value": "a@b.com" } |
click | Click element | target, waitFor | { "type": "click", "target": "button", "waitFor": ".result" } |
scroll | Scroll page | to, y | { "type": "scroll", "to": "#footer" } |
screenshot | Capture (auto-added) | fullPage, element | { "type": "screenshot", "fullPage": true } |
Step Order: Auto-fixed! viewport auto-moves to first, screenshot auto-added at end.
High-level patterns that auto-expand to multiple steps:
// Login pattern
{
"type": "login",
"email": { "selector": "#email", "value": "user@example.com" },
"password": { "selector": "#password", "value": "secret" },
"submit": "button[type=submit]",
"successIndicator": ".dashboard"
}
// Search pattern
{
"type": "search",
"input": "#search-box",
"query": "MCP protocol",
"resultsIndicator": ".search-results"
}
git clone https://github.com/chasesaurabh/mcp-page-capture.git
npm install
# Run without installing globally
npx mcp-page-capture
# Or add it to your toolchain
npm install -g mcp-page-capture
mcp-page-capture
npm install
npm run build
npm start
For hot reload while iterating locally, run npm run dev.
If you need those guarantees, build and run via:
docker build -t mcp-page-capture .
docker run --rm -it mcp-page-capture
Otherwise you can keep using the standard npm scripts locally.
{
"mcpServers": {
"page-capture": {
"command": "node",
"args": ["dist/cli.js"]
}
}
}
Note: You may need to use the full path to dist/cli.js or node depending on your working directory and Node.js module resolution configuration.
If you want to embed the server inside another Node.js process, import the helpers exposed by the package:
import { startMcpPageCaptureServer } from "mcp-page-capture";
await startMcpPageCaptureServer();
// Optionally pass a custom Transport implementation if you don't want stdio.
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com"
}
}
Note: A screenshot is automatically captured at the end if no explicit screenshot step is provided.
The fill step auto-detects field types and handles text inputs, selects, checkboxes, and radio buttons.
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com",
"steps": [
{ "type": "fill", "target": "#search", "value": "MCP protocol", "submit": true },
{ "type": "wait", "for": ".search-results" },
{ "type": "screenshot" }
]
}
}
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com/login",
"steps": [
{ "type": "wait", "for": "#login-form" },
{ "type": "fill", "target": "#email", "value": "user@example.com" },
{ "type": "fill", "target": "#password", "value": "secretpassword" },
{ "type": "fill", "target": "#remember-me", "value": "true" },
{ "type": "click", "target": "button[type=submit]", "waitFor": ".dashboard" },
{ "type": "screenshot" }
]
}
}
{
"tool": "captureScreenshot",
"params": {
"url": "https://docs.modelcontextprotocol.io",
"steps": [
{ "type": "screenshot", "fullPage": true }
]
}
}
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com/dashboard",
"headers": {
"authorization": "Bearer dev-token"
},
"steps": [
{
"type": "cookie",
"action": "set",
"name": "session",
"value": "abc123",
"path": "/secure"
},
{ "type": "screenshot" }
]
}
}
{
"tool": "extractDom",
"params": {
"url": "https://docs.modelcontextprotocol.io",
"selector": "main article"
}
}
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com",
"steps": [
{ "type": "viewport", "device": "ipad-pro" },
{ "type": "scroll", "to": "#main-content" },
{ "type": "screenshot" }
]
}
}
These are the only steps exposed to LLMs. They cover 95%+ of use cases:
| Step | Purpose | Parameters |
|---|---|---|
viewport | Set device/screen size | device, width, height |
wait | Wait for element/time | for OR duration, timeout |
fill | Fill form field | target, value, submit |
click | Click element | target, waitFor |
scroll | Scroll page | to (selector), y (pixels) |
screenshot | Capture (auto-added) | fullPage, element |
Use validate: true to check steps before execution:
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com",
"steps": [
{ "type": "fill", "target": "#email", "value": "test@example.com" },
{ "type": "click", "target": "button" }
],
"validate": true
}
}
Returns validation analysis including:
waitFor to click)These work at runtime but are not exposed in the LLM schema. Use the 6 primary steps instead:
| Deprecated | Use Instead |
|---|---|
quickFill | fill with submit: true |
fillForm | Multiple fill steps |
waitForSelector | wait with for parameter |
delay | wait with duration parameter |
fullPage | screenshot with fullPage: true |
These are for power users only and are not documented in the tool schema:
type, hover, cookie, storage, evaluate, keypress, focus, blur, clear, upload, submit
For backward compatibility, these parameters work at runtime but are not exposed in the LLM schema:
| Legacy Parameter | Canonical | Notes |
|---|---|---|
selector | target | Use target for element selectors |
awaitElement | for | Use for in wait steps |
scrollTo | to | Use to in scroll steps |
preset | device | Use device in viewport steps |
captureElement | element | Use element in screenshot steps |
waitAfter | wait | Use wait in click steps |
Deprecation warnings are logged when legacy parameters are used.
{
"content": [
{
"type": "text",
"text": "mcp-page-capture screenshot\nURL: https://example.com\nCaptured: 2025-12-13T08:30:12.713Z\nFull page: false\nViewport: 1280x720\nDocument: 1280x2000\nScroll position: (0, 0)\nSize: 45.2 KB\nSteps executed: 5"
},
{
"type": "image",
"mimeType": "image/png",
"data": "iVBORw0KGgoAAAANSUhEUgA..."
}
]
}
## Supported options
### `captureScreenshot`
- `url` (string, required): Fully-qualified URL to capture
- `headers` (object, optional): Key/value map of HTTP headers to send with the initial page navigation
- `cookies` (array, optional): List of cookies to set before navigation. Each cookie supports `name`, `value`, and optional `url`, `domain`, `path`, `secure`, `httpOnly`, `sameSite`, and `expires` (Unix timestamp, seconds)
- `viewport` (object, optional): Viewport configuration
- `preset` (string, optional): Use a predefined viewport preset (see Viewport Presets section)
- `width` (number, optional): Custom viewport width
- `height` (number, optional): Custom viewport height
- `deviceScaleFactor` (number, optional): Device scale factor (e.g., 2 for Retina)
- `isMobile` (boolean, optional): Whether to emulate mobile device
- `hasTouch` (boolean, optional): Whether to enable touch events
- `userAgent` (string, optional): Custom user agent string
- `retryPolicy` (object, optional): Retry configuration for transient failures
- `maxRetries` (number, optional, default 3): Maximum number of retry attempts
- `initialDelayMs` (number, optional, default 1000): Initial delay between retries
- `maxDelayMs` (number, optional, default 10000): Maximum delay between retries
- `backoffMultiplier` (number, optional, default 2): Exponential backoff multiplier
- `storageTarget` (string, optional): Storage backend name for saving captures
### `extractDom`
- `url` (string, required): Fully-qualified URL to inspect
- `selector` (string, optional): CSS selector to scope extraction to a specific element. Defaults to the entire document
- `headers` (object, optional): Key/value map of HTTP headers sent before navigation
- `cookies` (array, optional): Same cookie structure as `captureScreenshot`, applied before navigation
- `viewport` (object, optional): Same viewport configuration as `captureScreenshot`
- `retryPolicy` (object, optional): Same retry configuration as `captureScreenshot`
- `storageTarget` (string, optional): Storage backend name for saving DOM data
## Action Steps for captureScreenshot
The `captureScreenshot` tool supports a comprehensive `steps` array that allows you to perform various web interactions before capturing the screenshot. Each step is executed in sequence, allowing for complex automation scenarios.
### Fill Form (`fillForm`) - Recommended for Form Interactions
The `fillForm` step is the easiest and most LLM-friendly way to interact with forms. It auto-detects field types and handles multiple fields in a single step.
```json
{
"type": "fillForm",
"fields": [
{ "selector": "#email", "value": "user@example.com" },
{ "selector": "#password", "value": "secretpassword" },
{ "selector": "#country", "value": "us" },
{ "selector": "#newsletter", "value": "true" },
{ "selector": "#plan", "value": "premium", "type": "radio" }
],
"formSelector": "#signup-form",
"submit": true,
"submitSelector": "#submit-btn",
"waitForNavigation": true
}
Each field in the fields array supports:
selector (required): CSS selector for the form fieldvalue (required): Value to set. For checkboxes use "true" or "false". For selects/radios use the value attribute.type (optional): Field type hint (text, select, checkbox, radio, textarea, password, email, number, tel, url, date, file). Auto-detected if not specified.matchByText (optional): For select fields, match by visible text instead of value attributedelay (optional): Delay between keystrokes in ms (for text inputs)formSelector (optional): CSS selector for the form container (for scoping field selectors)submit (optional): Whether to submit the form after filling (default: false)submitSelector (optional): Selector for submit button. If not specified, uses form.submit() or looks for [type="submit"]waitForNavigation (optional): Whether to wait for navigation after submit (default: true)text)Type text into input fields:
{
"type": "text",
"selector": "#username",
"value": "john.doe@example.com",
"clearFirst": true, // Clear existing text first (default: true)
"delay": 100, // Delay between keystrokes in ms (0-1000)
"pressEnter": false // Press Enter after typing (default: false)
}
select)Select an option from a dropdown:
{
"type": "select",
"selector": "#country",
"value": "us" // OR "text": "United States" OR "index": 0
}
radio)Select a radio button:
{
"type": "radio",
"selector": "input[type='radio']",
"value": "option1", // Value attribute of the radio button
"name": "preference" // Name attribute to identify the radio group
}
checkbox)Check or uncheck a checkbox:
{
"type": "checkbox",
"selector": "#agree-terms",
"checked": true // true to check, false to uncheck
}
click)Click on elements:
{
"type": "click",
"target": "button.submit",
"button": "left", // "left", "right", or "middle" (default: left)
"clickCount": 1, // 1=single, 2=double, 3=triple (default: 1)
"waitForNavigation": false, // Wait for page navigation (default: false)
"waitForSelector": ".modal-content" // Wait for element to appear after click
}
hover)Hover over elements:
{
"type": "hover",
"selector": ".dropdown-trigger",
"duration": 1000 // How long to maintain hover in ms (0-10000)
}
upload)Upload files:
{
"type": "upload",
"selector": "input[type='file']",
"filePaths": ["/path/to/file1.pdf", "/path/to/file2.jpg"]
}
submit)Submit forms:
{
"type": "submit",
"selector": "#contact-form", // Form element or submit button
"waitForNavigation": true // Wait for page navigation (default: true)
}
scroll)Scroll the page:
{
"type": "scroll",
"scrollTo": "#section-2", // Scroll to element (takes precedence)
"x": 0, // OR horizontal scroll position in pixels
"y": 500, // OR vertical scroll position in pixels
"behavior": "smooth" // "auto" or "smooth" (default: auto)
}
keypress)Press keyboard keys:
{
"type": "keypress",
"key": "Enter", // Key to press (e.g., "Enter", "Tab", "Escape", "ArrowDown")
"modifiers": ["Control", "Shift"], // Optional modifiers
"selector": "#search-box" // Optional element to focus first
}
waitForSelector) - DEPRECATEDUse wait step instead:
{ "type": "wait", "for": ".loading-complete", "timeout": 10000 }
delay) - DEPRECATEDUse wait step with duration instead:
{ "type": "wait", "duration": 2000 }
focus)Focus an element:
{
"type": "focus",
"selector": "#search-input"
}
blur)Blur (unfocus) an element:
{
"type": "blur",
"selector": "#search-input"
}
clear)Clear input field contents:
{
"type": "clear",
"selector": "#search-input"
}
evaluate)Execute custom JavaScript:
{
"type": "evaluate",
"script": "document.title = 'New Title'; return document.title;",
"selector": "#element" // Optional element to pass to the script
}
screenshot)Capture screenshot at any point:
{
"type": "screenshot",
"fullPage": true, // Capture entire page (optional)
"captureElement": ".specific-element" // Capture specific element (optional)
}
cookie)Set or delete browser cookies:
{
"type": "cookie",
"action": "set",
"name": "session_id",
"value": "abc123",
"domain": ".example.com",
"path": "/",
"secure": true
}
Supported actions: set (add/update cookie), delete (remove cookie).
storage)Manage localStorage/sessionStorage:
{
"type": "storage",
"storageType": "localStorage",
"action": "set",
"key": "user_preferences",
"value": "{\"theme\":\"dark\"}"
}
Supported actions: set (add/update), delete (remove key), clear (remove all items).
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com/signup",
"steps": [
{ "type": "waitForSelector", "awaitElement": "#signup-form" },
{ "type": "text", "selector": "#email", "value": "user@example.com" },
{ "type": "text", "selector": "#password", "value": "SecurePass123!" },
{ "type": "select", "selector": "#country", "text": "United States" },
{ "type": "radio", "selector": "input[name='plan']", "value": "premium" },
{ "type": "checkbox", "selector": "#newsletter", "checked": true },
{ "type": "checkbox", "selector": "#terms", "checked": true },
{ "type": "hover", "selector": ".tooltip-trigger", "duration": 500 },
{ "type": "scroll", "y": 200 },
{ "type": "click", "target": "button[type='submit']", "waitForNavigation": true },
{ "type": "delay", "duration": 2000 },
{ "type": "screenshot", "fullPage": true }
]
}
}
If your capture fails, use these guidelines:
| Error | Solution |
|---|---|
| "element not found" | Check CSS selector, add waitForSelector before click/fillForm |
| "navigation timeout" | Increase retryPolicy.maxRetries or add delay step |
| "page not loaded" | Add waitForSelector or delay before screenshot |
| "click failed" | Ensure element is visible, add scroll to bring it into view |
The following viewport presets are available:
desktop-fhd: 1920x1080 Full HDdesktop-hd: 1280x720 HDdesktop-4k: 3840x2160 4Kmacbook-pro-16: MacBook Pro 16-inch Retinaipad-pro: iPad Pro 12.9-inchipad-pro-landscape: iPad Pro 12.9-inch (landscape)ipad: iPad 10.2-inchsurface-pro: Microsoft Surface Proiphone-14-pro-max: iPhone 14 Pro Maxiphone-14-pro: iPhone 14 Proiphone-se: iPhone SE (3rd generation)pixel-7-pro: Google Pixel 7 Progalaxy-s23-ultra: Samsung Galaxy S23 Ultra{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com",
"viewport": {
"preset": "iphone-14-pro"
}
}
}
The tools automatically retry on transient failures with exponential backoff. Default retryable conditions:
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com",
"retryPolicy": {
"maxRetries": 5,
"initialDelayMs": 2000,
"backoffMultiplier": 1.5
}
}
}
The server emits structured telemetry events that can be consumed for monitoring and observability:
tool.invoked: Tool execution startedtool.completed: Tool execution succeededtool.failed: Tool execution failednavigation.started: Page navigation initiatednavigation.completed: Page navigation succeedednavigation.failed: Page navigation failedretry.attempt: Retry attempt startedretry.succeeded: Retry succeededbrowser.launched: Puppeteer browser startedbrowser.closed: Puppeteer browser closedscreenshot.captured: Screenshot takendom.extracted: DOM content extractedYou can configure telemetry hooks programmatically:
import { getGlobalTelemetry } from "mcp-page-capture";
const telemetry = getGlobalTelemetry();
// Configure HTTP sink for centralized collection
telemetry.configureHttpSink({
url: "https://telemetry.example.com/events",
headers: { "X-API-Key": "your-api-key" },
batchSize: 100,
flushIntervalMs: 5000,
});
// Register custom hooks
telemetry.registerHook({
name: "custom-logger",
enabled: true,
handler: async (event) => {
console.log(`[${event.type}]`, event.data);
},
});
Captures can be automatically saved to configurable storage backends:
import { registerStorageTarget, LocalStorageTarget } from "mcp-page-capture";
const localStorage = new LocalStorageTarget("/path/to/captures");
registerStorageTarget("local", localStorage);
import { registerStorageTarget, S3StorageTarget } from "mcp-page-capture";
const s3Storage = new S3StorageTarget({
bucket: "my-captures",
prefix: "screenshots/",
region: "us-west-2",
});
registerStorageTarget("s3", s3Storage);
import { registerStorageTarget, MemoryStorageTarget } from "mcp-page-capture";
const memoryStorage = new MemoryStorageTarget();
registerStorageTarget("memory", memoryStorage);
Then use the storage in tool invocations:
{
"tool": "captureScreenshot",
"params": {
"url": "https://example.com",
"storageTarget": "s3"
}
}
feat:, fix:, chore:) for automatic versioningmain branch/mcp-page-captureghcr.io//mcp-page-captureConfigure these secrets in your GitHub repository settings:
NPM_TOKEN: npm access token with publish permissionsDOCKER_USERNAME: Docker Hub usernameDOCKER_PASSWORD: Docker Hub access tokenThe GITHUB_TOKEN is provided automatically by GitHub Actions.
Read CONTRIBUTING.md, open an issue describing the change, and submit a PR that includes npm run build output plus updated docs/tests.
Maintained by Saurabh Chase (@chasesaurabh). Reach out via issues or discussions for roadmap coordination.
Released under the MIT License.
Install via CLI
npx mdskills install chasesaurabh/mcp-page-captureMCP Page Capture is a free, open-source AI agent skill. mcp-page-capture is a Model Context Protocol (MCP) server that orchestrates headless Chromium via Puppeteer to capture pixel-perfect screenshots of arbitrary URLs. It is optimized for Copilot/MCP-enabled environments and can be embedded into automated workflows or run as a standalone developer tool. - ๐ธ High-fidelity screenshots powered by Puppeteer and headless Chromium - โ๏ธ LLM-optimized schema
Install MCP Page Capture with a single command:
npx mdskills install chasesaurabh/mcp-page-captureThis downloads the skill files into your project and your AI agent picks them up automatically.
MCP Page Capture works with Claude Code, Claude Desktop, Cursor, Vscode Copilot, Windsurf, Continue Dev, Gemini Cli, Amp, Roo Code, Goose. Skills use the open SKILL.md format which is compatible with any AI coding agent that reads markdown instructions.