# crawl4ai-mcp-server

Crawl4ai MCP Server provides web crawling capabilities using crawl4ai, with markdown output for LLMs.
## Overview

The Crawl4ai MCP Server facilitates web crawling tasks by leveraging the crawl4ai platform. It outputs crawled data in markdown format, making it suitable for integration with large language models (LLMs). The server is built on Node.js and requires access to a running crawl4ai instance. It can crawl multiple URLs in a single request and returns the content as markdown with proper citations. The server is configurable through environment variables, which set the API URL and an optional authentication token. It also includes error handling for common issues such as invalid URLs, authentication failures, and network connectivity problems.
## Features

- **Web Crawling**: Crawl web pages and retrieve content in markdown format with citations.
- **Markdown Output**: Outputs crawled data in markdown format, suitable for LLM integration.
- **Error Handling**: Includes mechanisms to handle common errors such as invalid URLs and network issues.
- **Authentication Support**: Supports optional authentication for secure access to the crawl4ai API.
- **Development Mode**: Offers a development mode with auto-rebuild for easier testing and debugging.
## MCP Tools

- **crawl_urls**: Crawl web pages and get markdown content with citations. Requires a list of URLs.
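As a sketch of how a client would invoke this tool, an MCP `tools/call` request might look like the following. Note that the argument key `urls` is an assumption inferred from the tool requiring a list of URLs; check the server's tool schema for the exact shape:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "crawl_urls",
    "arguments": {
      "urls": ["https://example.com", "https://example.org"]
    }
  }
}
```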
## Usage with Different Platforms

### Node.js

```bash
git clone https://github.com/Kirill812/crawl4ai-mcp-server.git
cd crawl4ai-mcp-server
npm install
npm run build
```
### Configuration

```json
{
  "mcpServers": {
    "crawl4ai": {
      "command": "node",
      "args": [
        "/path/to/crawl4ai-mcp-server/build/index.js"
      ],
      "env": {
        "CRAWL4AI_API_URL": "http://127.0.0.1:11235",
        "CRAWL4AI_AUTH_TOKEN": "your-auth-token"
      }
    }
  }
}
```
## Frequently Asked Questions
**What should I do if I encounter a timeout error?**

Try reducing the number of URLs per request to avoid timeout errors.
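One way to apply this advice on the client side is to split a large URL list into smaller batches before calling `crawl_urls`. The helper below is a hypothetical sketch, not part of the server:

```typescript
// Hypothetical helper (not part of the server): split a URL list into
// batches of at most `batchSize` URLs so each crawl_urls call stays small.
function chunkUrls(urls: string[], batchSize: number): string[][] {
  const batches: string[][] = [];
  for (let i = 0; i < urls.length; i += batchSize) {
    batches.push(urls.slice(i, i + batchSize));
  }
  return batches;
}

// e.g. chunkUrls(["a", "b", "c", "d", "e"], 2) yields three batches:
// [["a", "b"], ["c", "d"], ["e"]]
```

Each batch can then be sent as a separate `crawl_urls` request, keeping every call small enough to finish before the timeout.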
**How can I ensure my authentication token is valid?**

Verify the token with the crawl4ai API service and ensure it has not expired.
**What happens if a website blocks the crawling request?**

The service automatically retries with different user agents to bypass blocks.
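The retry-with-rotating-user-agents behavior described above can be sketched as a generic loop. This is an illustrative assumption about how such logic could look, not the server's actual implementation; the agent strings and function names are hypothetical:

```typescript
// Illustrative sketch only: try a request with each user agent in turn,
// returning the first successful result and rethrowing if all are blocked.
const USER_AGENTS = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
  "Mozilla/5.0 (X11; Linux x86_64)",
];

async function fetchWithRotation<T>(
  attempt: (userAgent: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown;
  for (const ua of USER_AGENTS) {
    try {
      return await attempt(ua); // first agent that is not blocked wins
    } catch (err) {
      lastError = err; // blocked or failed: rotate to the next agent
    }
  }
  throw lastError; // every agent was blocked
}
```

The `attempt` callback would wrap the actual HTTP request, so the rotation logic stays independent of the fetch library used.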
## Related MCP Servers

- **Fetch** (by modelcontextprotocol): Fetch MCP Server helps language models retrieve web content by converting HTML to markdown for easier consumption. It includes features like content truncation, chunk reading, and customizable user-agent settings, making it adaptable for various web scraping tasks.