mcp-crawl4ai-rag
Crawl4AI RAG MCP Server is a Model Context Protocol (MCP) server that integrates Crawl4AI and Supabase to give AI agents and AI coding assistants web crawling and retrieval-augmented generation (RAG) capabilities.
The Crawl4AI RAG MCP Server enables AI agents to crawl websites, store the content in a vector database, and perform RAG over the crawled content. It includes advanced RAG strategies such as contextual embeddings, hybrid search, agentic RAG, and reranking, plus a knowledge graph for detecting AI hallucinations. The server is designed to be integrated into Archon to create a comprehensive knowledge engine for AI coding assistants. Planned improvements include support for multiple embedding models and enhanced chunking strategies.
Features
- Smart URL Detection: Automatically detects and handles different URL types.
- Recursive Crawling: Follows internal links to discover content.
- Parallel Processing: Efficiently crawls multiple pages simultaneously.
- Content Chunking: Intelligently splits content by headers and size for better processing.
- Vector Search: Performs RAG over crawled content with optional source filtering.
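The content chunking feature above (splitting by headers and size) can be illustrated with a minimal sketch. This is not the server's actual implementation: the function name `chunk_markdown`, the `max_chars` parameter, and the choice to cut oversized sections at paragraph breaks are all assumptions made for illustration.

```python
import re


def chunk_markdown(text: str, max_chars: int = 1000) -> list[str]:
    """Split markdown at header lines, then cap each piece at max_chars.

    Hypothetical helper: the name, default size, and cut policy are
    illustrative assumptions, not taken from the server's code.
    """
    # Zero-width split before every line that starts with 1-6 '#' and a space.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        # Oversized sections are cut at the last paragraph break that fits.
        while len(section) > max_chars:
            cut = section.rfind("\n\n", 0, max_chars)
            if cut <= 0:
                cut = max_chars  # no paragraph break available: hard cut
            chunks.append(section[:cut].strip())
            section = section[cut:].strip()
        if section:
            chunks.append(section)
    return chunks
```

Splitting at headers first keeps each chunk topically coherent, and the size cap keeps chunks within whatever limit the embedding step imposes.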
Tools
- crawl_single_page: Quickly crawl a single web page and store its content in the vector database.
- smart_crawl_url: Intelligently crawl a full website based on the type of URL provided.
- get_available_sources: Get a list of all available sources (domains) in the database.
- perform_rag_query: Search for relevant content using semantic search with optional source filtering.
- search_code_examples: Search specifically for code examples and their summaries from crawled documentation.
- parse_github_repository: Parse a GitHub repository into a Neo4j knowledge graph.
- check_ai_script_hallucinations: Analyze Python scripts for AI hallucinations by validating imports, method calls, and class usage.
- query_knowledge_graph: Explore and query the Neo4j knowledge graph.
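MCP clients invoke tools like the ones listed above by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `perform_rag_query`. The helper `build_tool_call` is hypothetical, and the argument names `query` and `source` are assumptions; the server's real parameter names should be read from its tool schema.

```python
import json
from itertools import count

# Simple incrementing request ids for this illustration.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message.

    Hypothetical helper for illustration; a real client library would
    normally construct and send this message for you.
    """
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# Argument names below are assumed, not confirmed by the listing.
message = build_tool_call(
    "perform_rag_query",
    {"query": "how do I configure hybrid search?", "source": "example.com"},
)
print(message)
```

The same envelope works for any tool in the list; only `name` and `arguments` change.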
Related MCP Servers
- Fetch (by modelcontextprotocol): A Model Context Protocol server that provides web content fetching capabilities, enabling LLMs to retrieve and process content from web pages.
- markdownify-mcp (by zcaceres): Markdownify is a Model Context Protocol (MCP) server that converts various file types and web content to Markdown format.
- deepwiki-mcp (by regenrek): This is an unofficial Deepwiki MCP Server that processes Deepwiki URLs, crawls pages, converts them to Markdown, and returns documents or lists by page.
- mcp-playwright (by executeautomation): A Model Context Protocol server that provides browser automation capabilities using Playwright.
- fetch-mcp (by zcaceres): This MCP server provides functionality to fetch web content in various formats, including HTML, JSON, plain text, and Markdown.
- web-eval-agent (by Operative-Sh): operative.sh's MCP Server is a tool for autonomous debugging of web applications directly from your code editor.
- cursor-talk-to-figma-mcp (by sonnylazuardi): This project implements a Model Context Protocol (MCP) integration between Cursor AI and Figma, allowing Cursor to communicate with Figma for reading designs and modifying them programmatically.