mcp-server-fetch
Fetch MCP Server is a Model Context Protocol server that provides web content fetching capabilities using browser automation, OCR, and multiple extraction methods.
Fetch MCP Server lets large language models (LLMs) retrieve and process content from web pages, including pages that require JavaScript rendering or that use techniques to block simple scraping. It combines browser automation, OCR, and several extraction methods, then applies a scoring system to select the best result among them. Debug logging is available to trace the scoring decisions.
Features
- Browser automation with undetected-chromedriver for dynamic content rendering.
- OCR using pytesseract with layout detection for text extraction from images.
- HTML extraction using requests and BeautifulSoup for static content.
- Document parsing capabilities for formats like PDF, DOCX, and PPTX.
- Scoring system that selects the highest-quality content among the extraction results.
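The listing does not describe how the scoring system works, so the following is only a minimal sketch of the general idea: run several extraction methods, score each result, log the decision, and keep the best one. The `Candidate` class, the `score` heuristic, and the method labels are all hypothetical and are not taken from the server's code.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class Candidate:
    method: str  # illustrative label such as "browser", "html", or "ocr"
    text: str    # text extracted by that method


def score(candidate: Candidate) -> float:
    """Hypothetical heuristic: word count weighted by the share of readable characters."""
    text = candidate.text.strip()
    if not text:
        return 0.0
    readable = sum(ch.isalnum() or ch.isspace() for ch in text)
    return len(text.split()) * (readable / len(text))


def select_best(candidates: list[Candidate]) -> Candidate:
    """Return the highest-scoring candidate, logging each score at debug level."""
    if not candidates:
        raise ValueError("no extraction results to choose from")
    for candidate in candidates:
        logger.debug("method=%s score=%.2f", candidate.method, score(candidate))
    return max(candidates, key=score)
```

With this heuristic, an empty result scores zero and noisy OCR output is penalized for its symbol clutter, so a clean, longer extraction wins. The real server may weigh entirely different signals.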
MCP Tools
- fetch: Fetches a URL from the internet using browser automation and multi-method extraction, including OCR.
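As a rough illustration, an MCP client invokes a tool by sending a JSON-RPC `tools/call` request. The sketch below builds such a message for the `fetch` tool. The argument name `url` is an assumption, since the listing does not document the tool's input schema, and `build_fetch_call` is a hypothetical helper.

```python
import json


def build_fetch_call(url: str, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 tools/call request for the fetch tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "fetch",
            # Assumed argument name; check the server's tool schema.
            "arguments": {"url": url},
        },
    }
    return json.dumps(message)


if __name__ == "__main__":
    print(build_fetch_call("https://example.com"))
```

A real client must complete the MCP `initialize` handshake before sending this request, and in practice an MCP client library or host application handles both steps.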
Usage with Different Platforms
Docker Installation
To install and run mcp-server-fetch using Docker, follow these steps:
1. **Build the Docker image:**
   ```bash
   docker build -t mcp-server-fetch .
   ```
2. **Run the Docker container:**
   ```bash
   docker run --rm -i mcp-server-fetch
   ```
Claude Configuration
Add to your Claude settings:
```json
{
  "mcpServers": {
    "fetch": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "mcp-server-fetch"
      ],
      "disabled": false,
      "alwaysAllow": []
    }
  }
}
```
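If the settings file already lists other MCP servers, the `fetch` entry should be merged in rather than pasted over the whole file. The sketch below shows one way to do that. The helper `add_fetch_server` is hypothetical, and the settings file location is left as a parameter because it depends on the Claude client in use.

```python
import json
from pathlib import Path

# Same entry as in the configuration above.
FETCH_ENTRY = {
    "command": "docker",
    "args": ["run", "--rm", "-i", "mcp-server-fetch"],
    "disabled": False,
    "alwaysAllow": [],
}


def add_fetch_server(settings_path: Path) -> dict:
    """Add or replace the "fetch" entry, leaving other servers untouched."""
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    else:
        settings = {}
    settings.setdefault("mcpServers", {})["fetch"] = FETCH_ENTRY
    settings_path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return settings
```

Back up the settings file before running a script like this, since it rewrites the file in place.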
Related MCP Servers
Fetch
by modelcontextprotocol
Fetch MCP Server is designed to help language models retrieve web content by converting HTML to markdown for easier consumption. It includes features like content truncation, chunk reading, and customizable user-agent settings, making it highly adaptable for various web scraping tasks.