Playwright Fetch MCP Server

A Model Context Protocol server that provides web content fetching capabilities using Playwright for browser automation. This server enables LLMs to retrieve and process JavaScript-rendered content from web pages, converting HTML to markdown for easier consumption.

Author

Created by Wyatt Roersma with assistance from Claude Code.

Key Features

  • Browser Automation: Uses Playwright to render web pages with full JavaScript support
  • Content Extraction: Automatically identifies and extracts main content areas from web pages
  • Markdown Conversion: Converts HTML to clean, readable markdown
  • Pagination Support: Handles large content through pagination
  • Robots.txt Compliance: Respects robots.txt directives for autonomous fetching
  • Proxy Support: Allows routing requests through a proxy server
  • Docker Ready: Available as pre-built Docker images via Docker Hub and GitHub Container Registry
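The robots.txt compliance feature can be illustrated with Python's standard library. This is a minimal sketch of the general technique, not the server's own code: the helper name is_fetch_allowed and the sample rules in EXAMPLE_ROBOTS are hypothetical, and the rules are parsed from a string so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body, used only for illustration.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def is_fetch_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no HTTP request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for candidate in (
        "https://example.com/public/page",
        "https://example.com/private/report",
    ):
        allowed = is_fetch_allowed(EXAMPLE_ROBOTS, "MyCustomAgent/1.0", candidate)
        print(candidate, "allowed" if allowed else "blocked")
```

A real fetcher would first download /robots.txt from the target host; how this server does that, and how it behaves when the file is missing, is not described here.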

Available Tools

  • playwright-fetch - Fetches a URL using Playwright browser automation and extracts its contents as markdown.
    • url (string, required): URL to fetch
    • max_length (integer, optional): Maximum number of characters to return (default: 5000)
    • start_index (integer, optional): Start content from this character index (default: 0)
    • raw (boolean, optional): Get raw content without markdown conversion (default: false)
    • wait_for_js (boolean, optional): Wait for JavaScript to execute (default: true)
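The max_length and start_index parameters suggest character-based pagination. The sketch below shows how a caller might walk through a long document one page at a time. It assumes the server returns a plain slice of the content starting at start_index. The functions get_page and read_all are hypothetical stand-ins and operate on a local string rather than calling the tool.

```python
from typing import Optional, Tuple


def get_page(
    content: str, start_index: int = 0, max_length: int = 5000
) -> Tuple[str, Optional[int]]:
    """Return one chunk of content plus the next start index (None when done)."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if start_index < 0:
        raise ValueError("start_index must not be negative")
    chunk = content[start_index:start_index + max_length]
    next_index = start_index + len(chunk)
    # Signal the end of the document by returning None instead of an index.
    return chunk, (next_index if next_index < len(content) else None)


def read_all(content: str, max_length: int = 5000) -> str:
    """Reassemble the full content by requesting successive pages."""
    parts = []
    start: Optional[int] = 0
    while start is not None:
        chunk, start = get_page(content, start, max_length)
        parts.append(chunk)
    return "".join(parts)
```

With the default max_length of 5000, a 12,000-character page would take three requests, using start_index values of 0, 5000, and 10000.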

Prompts

  • playwright-fetch
    • Fetch a URL using Playwright and extract its contents as markdown
    • Arguments:
      • url (string, required): URL to fetch

Requirements

  • Python 3.13.2 or newer
  • uv package manager
  • Playwright browsers installed

Installation

1. Install with uv (recommended)

uv pip install git+https://github.com/ThreatFlux/playwright-fetch.git
# Install Playwright browsers
uv run playwright install

Alternatively, clone the repository and install:

git clone https://github.com/ThreatFlux/playwright-fetch.git
cd playwright-fetch
uv pip install -e .
# Install Playwright browsers
uv run playwright install

2. Using Docker

You can use our pre-built Docker images from Docker Hub or GitHub Container Registry:

# From Docker Hub
docker pull threatflux/playwright-fetch:latest

# From GitHub Container Registry
docker pull ghcr.io/threatflux/playwright-fetch:latest

Or build it yourself:

docker build -t threatflux/playwright-fetch .

Configuration

Configure for Claude.app

Add to your Claude settings:

Using uvx

"mcpServers": {
  "playwright-fetch": {
    "command": "uvx",
    "args": ["mcp-server-playwright-fetch"]
  }
}

Using Docker

"mcpServers": {
  "playwright-fetch": {
    "command": "docker",
    "args": ["run", "-i", "--rm", "threatflux/playwright-fetch"]
  }
}

Configure for VS Code

For manual installation, add the following JSON block to your User Settings (JSON) file in VS Code.

Using uvx

{
  "mcp": {
    "servers": {
      "playwright-fetch": {
        "command": "uvx",
        "args": ["mcp-server-playwright-fetch"]
      }
    }
  }
}

Using Docker

{
  "mcp": {
    "servers": {
      "playwright-fetch": {
        "command": "docker",
        "args": ["run", "-i", "--rm", "threatflux/playwright-fetch"]
      }
    }
  }
}

Command Line Options

The server supports these command-line options:

  • --user-agent: Custom User-Agent string
  • --ignore-robots-txt: Ignore robots.txt restrictions
  • --proxy-url: Proxy URL to use for requests
  • --headless: Run browser in headless mode (default: True)
  • --wait-until: When to consider navigation succeeded (choices: "load", "domcontentloaded", "networkidle", "commit", default: "networkidle")
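To make the option list concrete, here is a hypothetical argparse parser that mirrors the documented flags and defaults. It is a sketch for illustration only and may not match how the server actually parses its arguments. In particular, the str_to_bool helper is an assumption about how a value such as --headless=false could be interpreted.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Interpret common true/false spellings (an assumed convention)."""
    lowered = value.strip().lower()
    if lowered in {"1", "true", "yes", "on"}:
        return True
    if lowered in {"0", "false", "no", "off"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the same flags and defaults as the list above."""
    parser = argparse.ArgumentParser(prog="mcp-server-playwright-fetch")
    parser.add_argument("--user-agent", default=None,
                        help="Custom User-Agent string")
    parser.add_argument("--ignore-robots-txt", action="store_true",
                        help="Ignore robots.txt restrictions")
    parser.add_argument("--proxy-url", default=None,
                        help="Proxy URL to use for requests")
    parser.add_argument("--headless", type=str_to_bool, default=True,
                        help="Run browser in headless mode")
    parser.add_argument("--wait-until", default="networkidle",
                        choices=["load", "domcontentloaded", "networkidle", "commit"],
                        help="When to consider navigation succeeded")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```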

Example Usage

# Run with default settings
uv run mcp-server-playwright-fetch

# Run with a custom user agent and proxy
uv run mcp-server-playwright-fetch --user-agent="MyCustomAgent/1.0" --proxy-url="http://myproxy:8080"

# Run with visible browser for debugging
uv run mcp-server-playwright-fetch --headless=false

Debugging

You can use the MCP inspector to debug the server:

npx @modelcontextprotocol/inspector uvx mcp-server-playwright-fetch

Differences from Standard Fetch Server

This implementation differs from the standard fetch MCP server in these ways:

  1. Browser Automation: Uses Playwright to render JavaScript-heavy pages
  2. Content Extraction: Attempts to extract main content from common page structures
  3. Wait Options: Configurable page loading strategy (wait for load, DOM content loaded, network idle, or commit)
  4. Visible Browser Option: Can run with a visible browser for debugging
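The browser automation, wait options, and visible browser points can be sketched with Playwright's synchronous Python API. This is an illustrative example under stated assumptions rather than the server's implementation: fetch_rendered_html and WAIT_UNTIL_CHOICES are hypothetical names, Chromium is an arbitrary browser choice, and content extraction and markdown conversion are left out.

```python
# Values accepted by Playwright's page.goto(wait_until=...).
WAIT_UNTIL_CHOICES = ("load", "domcontentloaded", "networkidle", "commit")


def fetch_rendered_html(
    url: str, wait_until: str = "networkidle", headless: bool = True
) -> str:
    """Load url in a real browser and return the HTML after JavaScript has run."""
    if wait_until not in WAIT_UNTIL_CHOICES:
        raise ValueError(f"wait_until must be one of {WAIT_UNTIL_CHOICES}")

    # Imported lazily so the module can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        # headless=False opens a visible window, which helps when debugging.
        browser = playwright.chromium.launch(headless=headless)
        try:
            page = browser.new_page()
            page.goto(url, wait_until=wait_until)
            # content() returns the serialized DOM, including script-inserted nodes.
            return page.content()
        finally:
            browser.close()
```

Calling fetch_rendered_html("https://example.com", headless=False) would open a browser window and return the rendered HTML once the network goes idle. It requires the Playwright package and its browsers to be installed.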

License

This project is licensed under the MIT License. See the LICENSE file for details.