mcp-server-webcrawl
mcp-server-webcrawl is an open-source tool designed to integrate web crawling capabilities with AI language models using the Model Context Protocol. It offers flexible filtering, searching, and compatibility with various web crawlers for efficient web content analysis.
Bridge the gap between your web crawl and AI language models using Model Context Protocol (MCP). With mcp-server-webcrawl, your AI client filters and analyzes web content under your direction or autonomously. The server includes a full-text search interface with boolean support, plus resource filtering by type, HTTP status, and more.
Features
- Claude Desktop ready
- Multi-crawler compatible
- Filter by type, status, and more
- Boolean search support
- Support for Markdown and snippets
- Roll your own website knowledgebase
Supported Crawlers
- WARC
- wget
- InterroBot
- Katana
- SiteOne
Installation
Requires Claude Desktop and Python (>=3.10). Install via pip.
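As a rough sketch of the requirement above, the snippet below checks that the running interpreter is 3.10 or newer and prints the pip command it would use. The PyPI package name `mcp-server-webcrawl` is assumed to match the project name, and the helper names (`meets_requirement`, `pip_install_command`) are illustrative only.

```python
import sys

MIN_PYTHON = (3, 10)             # minimum version stated in the requirements
PACKAGE = "mcp-server-webcrawl"  # assumption: PyPI name matches the project name


def meets_requirement(version=None) -> bool:
    """Return True if the given (or running) Python version is >= 3.10."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def pip_install_command(package: str = PACKAGE) -> list[str]:
    """Build a pip invocation bound to the current interpreter."""
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    if not meets_requirement():
        sys.exit("Python 3.10 or newer is required")
    # Print rather than run, so the command can be reviewed first.
    print(" ".join(pip_install_command()))
```

Invoking pip as `python -m pip` ties the install to the interpreter that passed the version check, which helps when several Python versions are installed side by side.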
Boolean Search Syntax
Supports field-specific searches and complex boolean expressions. Familiarize yourself with the search syntax for efficient querying.
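To make the idea of combining field-specific filters with boolean operators concrete, here is a minimal, self-contained sketch that evaluates such a query over a few in-memory records. It is not the server's implementation and does not reproduce its query grammar; the record fields (`url`, `type`, `status`, `content`) and helper names are assumptions chosen for illustration.

```python
from typing import Callable

Record = dict[str, object]
Predicate = Callable[[Record], bool]


def term(word: str) -> Predicate:
    """Match records whose content contains the word (case-insensitive)."""
    needle = word.lower()
    return lambda rec: needle in str(rec.get("content", "")).lower()


def field(name: str, value: object) -> Predicate:
    """Match records whose named field equals the given value."""
    return lambda rec: rec.get(name) == value


def all_of(*preds: Predicate) -> Predicate:
    """Boolean AND over predicates."""
    return lambda rec: all(p(rec) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Boolean OR over predicates."""
    return lambda rec: any(p(rec) for p in preds)


def negate(pred: Predicate) -> Predicate:
    """Boolean NOT of a predicate."""
    return lambda rec: not pred(rec)


def search(records: list[Record], query: Predicate) -> list[Record]:
    """Return the records that satisfy the query, in original order."""
    return [rec for rec in records if query(rec)]


# Hypothetical crawl results used only for this example.
PAGES: list[Record] = [
    {"url": "https://example.com/", "type": "html", "status": 200,
     "content": "Welcome. Read our privacy policy."},
    {"url": "https://example.com/login", "type": "html", "status": 200,
     "content": "Login form"},
    {"url": "https://example.com/old", "type": "html", "status": 404,
     "content": "Privacy page not found"},
    {"url": "https://example.com/logo.png", "type": "img", "status": 200,
     "content": ""},
]

# (privacy OR login) AND type is html AND status is 200
query = all_of(
    any_of(term("privacy"), term("login")),
    field("type", "html"),
    field("status", 200),
)
hits = search(PAGES, query)
print([rec["url"] for rec in hits])
# prints ['https://example.com/', 'https://example.com/login']
```

Representing each clause as a small predicate keeps grouping explicit: nesting `any_of` inside `all_of` plays the role that parentheses play in a textual query.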