Playwright MCP

Added May 8, 2026
Agent & Tooling
Open Source
TypeScript · Node.js · Model Context Protocol · Playwright · AI Agents · Browser Automation · Agent & Tooling · Model & Inference Framework · Developer Tools & Coding · Automation, Workflow & RPA · Protocol, API & Integration

An official Microsoft browser automation server based on the Model Context Protocol (MCP), providing LLMs with structured web interaction capabilities via accessibility tree snapshots.

Overview

A browser automation server built on the Model Context Protocol (MCP) that gives LLMs structured, deterministic web interaction through accessibility tree snapshots. The current version is v0.0.75, maintained under Microsoft's official microsoft organization and published on npm as @playwright/mcp. Primary languages: TypeScript (~62.2%) and JavaScript (~31.9%). Requires Node.js ≥ 18.

Core Mechanism

  • Accessibility Snapshot Driven: Uses Playwright's accessibility tree to generate structured page snapshots for LLM comprehension, rather than pixel screenshots.
  • No Vision Model Required: Pure structured data-driven approach, reducing dependency on multimodal vision models.
  • Deterministic Tool Calls: Avoids coordinate or semantic ambiguity from screenshot-based methods.
  • Lightweight & Fast: Bypasses heavyweight screenshot pipelines for rapid response.
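To make the snapshot idea concrete, here is a minimal sketch of turning an accessibility-style tree into text that a model can read and refer back to. The `A11yNode` type, the `renderSnapshot` function, and the `[ref=eN]` notation are all hypothetical and chosen for illustration; this is not the server's actual output format.

```typescript
// Illustrative only: a simplified accessibility node, not Playwright's real type.
interface A11yNode {
  role: string;
  name?: string;
  children?: A11yNode[];
}

// Render the tree as indented text, giving each node a short reference id
// (e1, e2, ...) in depth-first order. A model can then name a target by id
// instead of guessing pixel coordinates from a screenshot.
function renderSnapshot(root: A11yNode): string {
  const lines: string[] = [];
  let counter = 0;

  const visit = (node: A11yNode, depth: number): void => {
    counter += 1;
    const label = node.name ? ` "${node.name}"` : "";
    lines.push(`${"  ".repeat(depth)}- ${node.role}${label} [ref=e${counter}]`);
    for (const child of node.children ?? []) {
      visit(child, depth + 1);
    }
  };

  visit(root, 0);
  return lines.join("\n");
}

// Hypothetical page content used only for this example.
const samplePage: A11yNode = {
  role: "main",
  children: [
    { role: "heading", name: "Sign in" },
    { role: "textbox", name: "Email" },
    { role: "button", name: "Continue" },
  ],
};

console.log(renderSnapshot(samplePage));
```

Because every element carries a role, a name, and a stable id, a follow-up action can say "click e4" without any visual interpretation step.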

Browser & Device Control

  • Supports Chrome, Firefox, WebKit, and MS Edge.
  • Device emulation via --device parameter (e.g., "iPhone 15").
  • Proxy configuration via --proxy-server and --proxy-bypass.
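As a rough sketch, the device and proxy flags above could be assembled into an MCP client entry like this. The `LaunchOptions` shape and `buildArgs` helper are hypothetical, and the proxy host names are placeholders; only the flag names and the "iPhone 15" value come from the listing.

```typescript
// Hypothetical options shape for this sketch; only the flag names are from the listing.
interface LaunchOptions {
  device?: string;      // value for --device, e.g. "iPhone 15"
  proxyServer?: string; // value for --proxy-server
  proxyBypass?: string; // value for --proxy-bypass
}

// Build the argument list for `npx`, adding a flag only when its option is set.
function buildArgs(options: LaunchOptions): string[] {
  const args = ["@playwright/mcp@latest"];
  if (options.device) args.push("--device", options.device);
  if (options.proxyServer) args.push("--proxy-server", options.proxyServer);
  if (options.proxyBypass) args.push("--proxy-bypass", options.proxyBypass);
  return args;
}

// Placeholder proxy values; substitute real ones for an actual deployment.
const deviceConfig = {
  mcpServers: {
    playwright: {
      command: "npx",
      args: buildArgs({
        device: "iPhone 15",
        proxyServer: "http://proxy.example:3128",
        proxyBypass: ".internal.example",
      }),
    },
  },
};

console.log(JSON.stringify(deviceConfig, null, 2));
```

Passing each flag and its value as separate array items avoids shell quoting issues with values that contain spaces.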

Runtime & Connection Modes

  • Multiple Context Modes: Persistent profile and isolated context modes.
  • Browser Extension Connection: Connect to already-open browser tabs, reusing authenticated session state.
  • Dual Transport Protocols: Default stdio transport; SSE/HTTP standalone endpoint via --port for remote connections.
  • Containerized Execution: Dockerfile provided for container deployment.
  • Code Generation: --codegen typescript generates corresponding TypeScript code.
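The two transports imply two different client entries: one where the client spawns the server over stdio, and one where it connects to a server already started with --port. A minimal sketch follows; the helper names and types are hypothetical, and the /mcp path assumes the endpoint shape used in this listing's own client config.

```typescript
// Hypothetical types for this sketch.
type StdioEntry = { command: string; args: string[] };
type UrlEntry = { url: string };

// Default stdio transport: the client launches the server process itself.
function stdioEntry(extraArgs: string[] = []): StdioEntry {
  return { command: "npx", args: ["@playwright/mcp@latest", ...extraArgs] };
}

// Standalone mode: the server is already listening, so the client only needs a URL.
function standaloneEntry(port: number, host: string = "localhost"): UrlEntry {
  return { url: `http://${host}:${port}/mcp` };
}

// Command line that would start the standalone server for the entry above.
function standaloneCommand(port: number): string {
  return `npx @playwright/mcp@latest --port ${port}`;
}

console.log(JSON.stringify(stdioEntry(["--codegen", "typescript"])));
console.log(standaloneCommand(8931));
console.log(JSON.stringify(standaloneEntry(8931)));
```

Stdio suits a local editor or desktop client; the standalone endpoint suits a server running on another machine or inside a container.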

Optional Capabilities (via --caps)

Capability   Description
vision       Screenshot-based coordinate-level interaction
pdf          PDF file generation
devtools     Chrome DevTools Protocol integration
network      Network request interception and control
storage      Cookie and local storage state management
config       Runtime dynamic configuration
testing      Test assertion support
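A small sketch of validating capability names before passing them to --caps. The `capsArgs` helper is hypothetical, and it assumes that multiple capabilities are supplied as a single comma-separated value; check the server's own help output before relying on that.

```typescript
// Capability names as listed in the table above.
const KNOWN_CAPS = ["vision", "pdf", "devtools", "network", "storage", "config", "testing"] as const;
type Capability = (typeof KNOWN_CAPS)[number];

// Type guard: true only for names that appear in KNOWN_CAPS.
function isCapability(value: string): value is Capability {
  return (KNOWN_CAPS as readonly string[]).indexOf(value) !== -1;
}

// Assumption: several capabilities are passed as one comma-separated value.
// Rejects unknown names, drops duplicates, and returns [] when nothing is requested.
function capsArgs(requested: string[]): string[] {
  const unknown = requested.filter((cap) => !isCapability(cap));
  if (unknown.length > 0) {
    throw new Error(`Unknown capability: ${unknown.join(", ")}`);
  }
  const unique = Array.from(new Set(requested));
  return unique.length > 0 ? ["--caps", unique.join(",")] : [];
}

console.log(capsArgs(["vision", "pdf"]).join(" "));
```

Failing early on a misspelled capability is cheaper than discovering at runtime that an expected tool was never enabled.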

Client Ecosystem Compatibility

Broadly compatible with 20+ MCP clients including Claude Desktop, Claude Code, VS Code, Cursor, Windsurf, Goose, Cline, Copilot, and Gemini CLI.

Architecture Highlights

  • Control Engine: Core dependency on playwright and playwright-core (v1.61.0-alpha series) for cross-browser automation.
  • Protocol Layer: Built on @modelcontextprotocol/sdk (^1.25.2), implementing MCP standard interfaces with stdio and SSE transport layers.
  • Tool Grouping: Tools fall into core automation, tab management, and browser installation, plus opt-in groups for configuration, network, storage, and DevTools, and for coordinate-based interaction, PDF generation, and test assertions.
  • Source Structure: CLI entry cli.js, library export index.js / index.d.ts, core implementation in src/.
  • Claude Code Integration: Includes a .claude/skills directory that defines skills for Claude Code integration.
  • Test Suite: Uses @playwright/test, test matrix covers Chrome, Firefox, WebKit, and Docker.
  • Security Note: Playwright MCP is not a security boundary itself; deploy following MCP Security Best Practices.

Installation

Direct run (recommended):

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}

Global install: npm install -g @playwright/mcp

Docker:

docker build --no-cache -t playwright-mcp-dev:latest .
docker run -it -p 8080:8080 --name playwright-mcp-dev playwright-mcp-dev:latest

Standalone HTTP server mode:

npx @playwright/mcp@latest --port 8931

Client config: {"url": "http://localhost:8931/mcp"}
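For orientation, MCP messages are JSON-RPC 2.0 requests, so a client talking to the endpoint above would send payloads shaped roughly like the ones built below. This sketch only constructs the messages and sends nothing. The protocol version string, the client name, and the `browser_navigate` tool name with its `url` argument are assumptions for illustration; in practice a client would normally rely on an MCP SDK rather than hand-built requests.

```typescript
// Minimal JSON-RPC 2.0 request shape.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

let nextId = 1;

// Build a request with an auto-incrementing id; params are omitted when not given.
function request(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  const message: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method };
  if (params !== undefined) message.params = params;
  return message;
}

// Endpoint taken from the client config above.
const endpoint = "http://localhost:8931/mcp";

// Handshake: announce the client and the protocol version (assumed value).
const initialize = request("initialize", {
  protocolVersion: "2025-06-18",
  capabilities: {},
  clientInfo: { name: "example-client", version: "0.0.1" },
});

// Ask the server which tools it exposes.
const listTools = request("tools/list");

// Call one tool; the tool name and argument are assumptions for this sketch.
const navigate = request("tools/call", {
  name: "browser_navigate",
  arguments: { url: "https://example.com" },
});

console.log(endpoint);
console.log(JSON.stringify([initialize, listTools, navigate], null, 2));
```

Listing tools before calling one lets a client confirm the actual tool names and argument schemas instead of assuming them.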

Use Cases

  1. AI Coding Assistant Browser Integration: Enable Claude Desktop, Cursor, and VS Code Copilot to control browsers directly for testing, scraping, and form filling.
  2. Exploratory Automation: Long-running autonomous web exploration using persistent browser contexts.
  3. Self-healing Tests: Reduce test failures caused by UI changes by locating elements through structured accessibility snapshots.
  4. Automated Data Collection: LLMs extract structured data by understanding page structure.
  5. Headless/Remote Browser Automation: Run in display-less environments via SSE transport.
  6. Session Reuse: Connect to open tabs via browser extension, reusing authenticated sessions.
