An official Microsoft browser automation server based on the Model Context Protocol (MCP), providing LLMs with structured web interaction capabilities via accessibility tree snapshots.
Overview
Playwright MCP is a browser automation server built on the Model Context Protocol (MCP). It gives LLMs structured, deterministic web interaction through accessibility tree snapshots. The current version is v0.0.75, maintained under Microsoft's official `microsoft` organization and published on npm as `@playwright/mcp`. Primary languages: TypeScript (~62.2%) and JavaScript (~31.9%). Requires Node.js ≥ 18.
Core Mechanism
- Accessibility Snapshot Driven: Uses Playwright's accessibility tree to generate structured page snapshots for LLM comprehension, rather than pixel screenshots.
- No Vision Model Required: Pure structured data-driven approach, reducing dependency on multimodal vision models.
- Deterministic Tool Calls: Avoids coordinate or semantic ambiguity from screenshot-based methods.
- Lightweight & Fast: Bypasses heavyweight screenshot pipelines for rapid response.
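To make the contrast with screenshot-based control concrete, the sketch below models a page snapshot as a small tree and resolves an element by a stable reference rather than by pixel coordinates. The `SnapshotNode` type, its `ref` field, and `findByRef` are hypothetical names chosen for illustration; this is a simplified stand-in, not the server's actual snapshot format.

```typescript
// Simplified, hypothetical model of one node in an accessibility snapshot.
interface SnapshotNode {
  ref: string;              // stable identifier for the element (assumed field)
  role: string;             // accessibility role, e.g. "button" or "textbox"
  name: string;             // accessible name exposed to assistive technology
  children: SnapshotNode[];
}

// Depth-first search for the node carrying the given reference.
function findByRef(node: SnapshotNode, ref: string): SnapshotNode | undefined {
  if (node.ref === ref) return node;
  for (const child of node.children) {
    const hit = findByRef(child, ref);
    if (hit) return hit;
  }
  return undefined;
}

// A tiny example page: a login form with one field and one button.
const page: SnapshotNode = {
  ref: "e1", role: "form", name: "Login",
  children: [
    { ref: "e2", role: "textbox", name: "Email", children: [] },
    { ref: "e3", role: "button", name: "Sign in", children: [] },
  ],
};

const target = findByRef(page, "e3");
console.log(target ? `${target.role} "${target.name}"` : "not found");
// prints: button "Sign in"
```

Because the lookup key is a reference rather than a screen position, the same request resolves to the same element regardless of viewport size or layout shifts, which is the sense in which such tool calls are deterministic.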
Browser & Device Control
- Supports Chrome, Firefox, WebKit, and MS Edge.
- Device emulation via the `--device` parameter (e.g., "iPhone 15").
- Proxy configuration via `--proxy-server` and `--proxy-bypass`.
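As a rough illustration of how these flags might be combined, the sketch below assembles the argument list for an `npx`-launched server entry. The `BrowserOptions` type and `buildArgs` helper are hypothetical, the proxy address is a placeholder, and only the flags named above are covered.

```typescript
// Hypothetical options object covering only the flags named in this section.
interface BrowserOptions {
  device?: string;       // value for --device, e.g. "iPhone 15"
  proxyServer?: string;  // value for --proxy-server
  proxyBypass?: string;  // value for --proxy-bypass
}

// Build the argument list passed to `npx` in an MCP client config.
function buildArgs(options: BrowserOptions): string[] {
  const args = ["@playwright/mcp@latest"];
  if (options.device) args.push("--device", options.device);
  if (options.proxyServer) args.push("--proxy-server", options.proxyServer);
  if (options.proxyBypass) args.push("--proxy-bypass", options.proxyBypass);
  return args;
}

// Placeholder proxy address, for illustration only.
const serverConfig = {
  mcpServers: {
    playwright: {
      command: "npx",
      args: buildArgs({ device: "iPhone 15", proxyServer: "http://proxy.example:3128" }),
    },
  },
};

console.log(JSON.stringify(serverConfig, null, 2));
```

Passing each flag and its value as separate array elements avoids shell quoting problems with values that contain spaces, such as device names.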
Runtime & Connection Modes
- Multiple Context Modes: Persistent profile and isolated context modes.
- Browser Extension Connection: Connect to already-open browser tabs, reusing authenticated session state.
- Dual Transport Protocols: Default stdio transport; SSE/HTTP standalone endpoint via `--port` for remote connections.
- Containerized Execution: Dockerfile provided for container deployment.
- Code Generation: `--codegen typescript` generates corresponding TypeScript code.
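The difference between the two transports shows up mainly in what the client config contains. The sketch below models both shapes; the `ServerEntry` types and helper functions are hypothetical, while the port `8931` and the `/mcp` path follow the standalone example given in this document.

```typescript
// Two hypothetical shapes for a client-side server entry.
type StdioEntry = { kind: "stdio"; command: string; args: string[] };
type HttpEntry = { kind: "http"; url: string };
type ServerEntry = StdioEntry | HttpEntry;

// Default mode: the client spawns the server and talks to it over stdio.
function stdioEntry(): StdioEntry {
  return { kind: "stdio", command: "npx", args: ["@playwright/mcp@latest"] };
}

// Standalone mode: the server was started separately with --port <port>,
// and the client connects to its HTTP endpoint instead.
function httpEntry(port: number, host = "localhost"): HttpEntry {
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`invalid port: ${port}`);
  }
  return { kind: "http", url: `http://${host}:${port}/mcp` };
}

// Render an entry as the JSON fragment a client config would contain.
function toClientConfig(entry: ServerEntry): Record<string, unknown> {
  return entry.kind === "stdio"
    ? { command: entry.command, args: entry.args }
    : { url: entry.url };
}

console.log(JSON.stringify(toClientConfig(httpEntry(8931))));
// prints: {"url":"http://localhost:8931/mcp"}
```

With stdio the client owns the server process lifetime; with the HTTP endpoint the server outlives any single client, which suits remote or shared deployments.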
Optional Capabilities (via `--caps`)
| Capability | Description |
|---|---|
| `vision` | Screenshot-based coordinate-level interaction |
| `pdf` | PDF file generation |
| `devtools` | Chrome DevTools Protocol integration |
| `network` | Network request interception and control |
| `storage` | Cookie and local storage state management |
| `config` | Runtime dynamic configuration |
| `testing` | Test assertion support |
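One way to guard against typos when enabling capabilities is to validate the requested names against the table above before building the flag. The helper names below are hypothetical, and the sketch assumes `--caps` accepts a comma-separated list of capability names.

```typescript
// Capability names taken from the table above.
const KNOWN_CAPS = ["vision", "pdf", "devtools", "network", "storage", "config", "testing"] as const;
type Capability = (typeof KNOWN_CAPS)[number];

// Type guard: is the string one of the known capability names?
function isCapability(value: string): value is Capability {
  return (KNOWN_CAPS as readonly string[]).indexOf(value) !== -1;
}

// Assumption: --caps takes a comma-separated list of capability names.
function capsArgs(requested: string[]): string[] {
  const unknown = requested.filter((name) => !isCapability(name));
  if (unknown.length > 0) {
    throw new Error(`unknown capability: ${unknown.join(", ")}`);
  }
  // Deduplicate while keeping the caller's order.
  const unique = requested.filter((name, i) => requested.indexOf(name) === i);
  return unique.length > 0 ? ["--caps", unique.join(",")] : [];
}

console.log(capsArgs(["vision", "pdf"]).join(" "));
// prints: --caps vision,pdf
```

Failing early on an unknown name keeps a misspelled capability from being silently ignored at launch.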
Client Ecosystem Compatibility
Broadly compatible with 20+ MCP clients including Claude Desktop, Claude Code, VS Code, Cursor, Windsurf, Goose, Cline, Copilot, and Gemini CLI.
Architecture Highlights
- Control Engine: Core dependency on `playwright` and `playwright-core` (v1.61.0-alpha series) for cross-browser automation.
- Protocol Layer: Built on `@modelcontextprotocol/sdk` (^1.25.2), implementing MCP standard interfaces with stdio and SSE transport layers.
- Tool Grouping: Core automation, Tab management, Browser installation, Configuration/Network/Storage/DevTools, Coordinate-based/PDF generation/Test assertions.
- Source Structure: CLI entry `cli.js`, library export `index.js`/`index.d.ts`, core implementation in `src/`.
- Claude Code Integration: Includes `.claude/skills` directory for specific skill integration optimization.
- Test Suite: Uses `@playwright/test`, test matrix covers Chrome, Firefox, WebKit, and Docker.
- Security Note: Playwright MCP is not a security boundary itself; deploy following MCP Security Best Practices.
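At the protocol layer, an MCP client invokes a server tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request by hand to show the message shape; the `toolCall` helper is hypothetical, and the tool names `browser_navigate` and `browser_click` with their `url` and `ref` arguments are assumptions about the server's tool set rather than something stated in this document.

```typescript
// Minimal JSON-RPC 2.0 request shape used by MCP for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Monotonic request id so responses can be matched to requests.
let nextId = 1;

// Build a tools/call request for a named tool.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Assumption: a navigation tool named "browser_navigate" taking a "url" argument.
const request = toolCall("browser_navigate", { url: "https://example.com" });

// Over the stdio transport, each message is written as a single line of JSON.
console.log(JSON.stringify(request));
```

In practice the SDK constructs and sends these messages, so client code rarely builds them directly; the sketch is only meant to show what travels over the transport.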
Installation
Direct run (recommended):
```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```
Global install:
```shell
npm install -g @playwright/mcp
```
Docker:
```shell
docker build --no-cache -t playwright-mcp-dev:latest .
docker run -it -p 8080:8080 --name playwright-mcp-dev playwright-mcp-dev:latest
```
Standalone HTTP server mode:
```shell
npx @playwright/mcp@latest --port 8931
```
Client config:
```json
{"url": "http://localhost:8931/mcp"}
```
Use Cases
- AI Coding Assistant Browser Integration: Enable Claude Desktop, Cursor, and VS Code Copilot to control browsers directly for testing, scraping, and form filling.
- Exploratory Automation: Long-running autonomous web exploration using persistent browser contexts.
- Self-healing Tests: Reduce test failures caused by UI changes by locating elements through structured accessibility snapshots.
- Automated Data Collection: LLMs extract structured data by understanding page structure.
- Headless/Remote Browser Automation: Run in display-less environments via SSE transport.
- Session Reuse: Connect to open tabs via browser extension, reusing authenticated sessions.