mcp-crawl4ai-rag

Crawl4AI RAG MCP Server is a web crawling and retrieval tool for AI agents and coding assistants. It provides web scraping and semantic retrieval by integrating Crawl4AI with Supabase; support for multiple embedding models is a stated goal (see Vision).

Overview

Crawl4AI RAG MCP Server gives AI agents web crawling and Retrieval-Augmented Generation (RAG) capabilities through the Model Context Protocol (MCP). It integrates with Crawl4AI and Supabase so that AI coding assistants can scrape web pages and then query the crawled content.

Features

  • Smart URL Detection: Automatically detects and handles different URL types (regular webpages, sitemaps, text files).
  • Recursive Crawling: Follows internal links to discover further content.
  • Parallel Processing: Crawls multiple pages concurrently.
  • Content Chunking: Splits content at headers so that each chunk covers one section.
  • Vector Search: Performs RAG over the crawled content, optionally filtered by data source.
  • Source Retrieval: Lists the sources available for filtering, to guide RAG queries.
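
The Content Chunking feature can be illustrated with a minimal sketch. It assumes the crawled content is Markdown and that a chunk starts at every header line; the function name chunk_by_headers and the max_chars limit are hypothetical and are not taken from the project's source.

```python
import re


def chunk_by_headers(markdown: str, max_chars: int = 5000) -> list[str]:
    """Split Markdown text into chunks that each begin at a header line.

    Hypothetical helper for illustration; not the server's actual code.
    """
    # Split just before every line that starts with 1-6 '#' and a space.
    sections = re.split(r"(?m)^(?=#{1,6} )", markdown)
    chunks = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        # Oversized sections fall back to fixed-size slices.
        for start in range(0, len(section), max_chars):
            chunks.append(section[start:start + max_chars])
    return chunks


doc = "# Title\nIntro text.\n\n## Setup\nInstall it.\n\n## Usage\nRun it.\n"
for chunk in chunk_by_headers(doc):
    print(repr(chunk))
```

This sketch does not treat fenced code blocks specially, so a line starting with "# " inside a code sample would also start a new chunk; a production chunker would likely need to handle that case.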

Vision

  • Integration with Archon for comprehensive AI coding assistance.
  • Support for multiple embedding models and local use with Ollama.
  • Advanced RAG strategies and optimized performance.

Tools

  1. crawl_single_page: Crawl a single page and store its content.
  2. smart_crawl_url: Crawl a website, choosing the strategy from the type of URL provided.
  3. get_available_sources: List all available data sources.
  4. perform_rag_query: Find relevant content using semantic search, optionally filtered by source.
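
How get_available_sources and perform_rag_query relate can be sketched with a small in-memory example. This is an illustration under stated assumptions, not the server's implementation: the real server stores its vectors in Supabase, whereas here the rows, the two-dimensional embeddings, and the function names list_sources and search_chunks are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def list_sources(rows: list[dict]) -> list[str]:
    """Return the distinct sources present in the store, sorted."""
    return sorted({row["source"] for row in rows})


def search_chunks(rows, query_embedding, source=None, top_k=3):
    """Rank stored chunks by similarity, optionally restricted to one source."""
    candidates = [r for r in rows if source is None or r["source"] == source]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(r["embedding"], query_embedding),
        reverse=True,
    )
    return ranked[:top_k]


# Toy data: made-up sources and embeddings, for illustration only.
rows = [
    {"source": "docs.example.com", "content": "Install guide", "embedding": [1.0, 0.0]},
    {"source": "docs.example.com", "content": "API reference", "embedding": [0.0, 1.0]},
    {"source": "blog.example.org", "content": "Release notes", "embedding": [0.9, 0.1]},
]

print(list_sources(rows))
print([r["content"] for r in search_chunks(rows, [1.0, 0.0])])
print([r["content"] for r in search_chunks(rows, [1.0, 0.0], source="blog.example.org")])
```

An agent would typically list the sources first and then pass one of them as the filter, which narrows the search to content crawled from a single site.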

Installation

Using Docker

  1. Clone the repository.
  2. Build the Docker image.

Using uv directly

  1. Clone the repository.
  2. Install uv and dependencies.

Running the Server

Using Docker

Start the container with docker run, passing it the environment file that holds the server's configuration.

Using Python

Run the server with uv run src/crawl4ai_mcp.py.