MCPDocSearch

The Documentation Crawler & MCP Server project crawls websites, converts the pages to Markdown documentation, and serves that documentation for semantic search through a Model Context Protocol (MCP) server. It is built to work with MCP clients such as Cursor, which can then manage and query the crawled documentation.

What is the purpose of the cache file in the MCP server?

The cache file stores processed document chunks and embeddings, allowing the server to start up faster by loading from the cache instead of reprocessing the Markdown files.
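The load-or-rebuild idea behind the cache can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `load_or_build`, the use of `pickle`, and the shape of the cached data are all assumptions.

```python
import pickle
from pathlib import Path


def load_or_build(cache_path, build):
    """Return cached data if the cache file exists; otherwise build and save it.

    `build` is a hypothetical callable standing in for the slow step
    (chunking the Markdown files and computing their embeddings).
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Fast path: reuse the previously processed chunks and embeddings.
        with cache_path.open("rb") as f:
            return pickle.load(f)

    # Slow path: process everything once, then persist the result.
    data = build()
    with cache_path.open("wb") as f:
        pickle.dump(data, f)
    return data
```

On the first call the `build` step runs and the result is written to disk; later calls with the same path skip `build` entirely, which is where the faster startup comes from.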

How can I refine my web crawls?

You can refine web crawls using options like --include-pattern, --exclude-pattern, and --max-depth to control which URLs are followed and how deep the crawl goes.
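To make the effect of these three options concrete, here is a rough sketch of the kind of decision a crawler makes for each discovered link. The function `should_follow` is hypothetical, and glob-style matching against the URL path is an assumption; the real crawler may match patterns differently.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def should_follow(url, depth, include=(), exclude=(), max_depth=2):
    """Decide whether a crawler would follow `url` found at link depth `depth`.

    `include` and `exclude` play the role of --include-pattern and
    --exclude-pattern; `max_depth` plays the role of --max-depth.
    """
    # Stop descending once the depth limit is exceeded.
    if depth > max_depth:
        return False

    path = urlparse(url).path

    # An exclude match always wins.
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False

    # If include patterns are given, the path must match at least one.
    if include and not any(fnmatchcase(path, pattern) for pattern in include):
        return False

    return True
```

For example, `should_follow("https://example.com/docs/old/x", 1, include=["/docs/*"], exclude=["/docs/old/*"])` is rejected because the exclude pattern matches, even though the include pattern matches as well.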

What happens if a Markdown file in ./storage/ is modified?

If a Markdown file is modified, the cache is automatically invalidated and regenerated to ensure the server uses the most up-to-date content.
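One common way to detect such a modification is to compare file modification times. The sketch below assumes that approach; the server may instead use hashes or another mechanism, and `cache_is_stale` is a hypothetical helper name.

```python
from pathlib import Path


def cache_is_stale(cache_path, storage_dir):
    """Return True if the cache is missing or older than any Markdown file."""
    cache_path = Path(cache_path)
    if not cache_path.exists():
        return True

    cache_mtime = cache_path.stat().st_mtime
    # Any .md file modified after the cache was written makes the cache stale.
    return any(
        md.stat().st_mtime > cache_mtime
        for md in Path(storage_dir).rglob("*.md")
    )
```

A pure timestamp check like this one does not notice deleted files, so a real implementation would likely also track the set of file names.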

Can the MCP server be used without Cursor?

The MCP server is designed for integration with Cursor, but it may also work with other MCP clients that support the stdio transport.
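For context, the stdio transport exchanges JSON-RPC 2.0 messages as single lines over the server process's standard input and output. The sketch below only illustrates that framing; the helper names are hypothetical, and a real client would normally use an MCP SDK and perform the initialization handshake before sending requests such as `tools/list`.

```python
import json


def encode_message(method, params, msg_id):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": msg_id, "method": method, "params": params}
    # json.dumps escapes newlines inside strings, so the only raw newline
    # is the delimiter appended at the end.
    return json.dumps(message, separators=(",", ":")) + "\n"


def decode_message(line):
    """Parse one line read from the peer back into a message dict."""
    return json.loads(line)
```

A client would write the output of `encode_message` to the server's stdin and pass each line read from its stdout to `decode_message`.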

What are the hardware requirements for running the MCP server?

The server can run on CPU-only systems, but systems with a compatible GPU (CUDA or Apple Silicon/MPS) will process embeddings much faster.
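As an illustration of how an embedding backend might pick between these options, the following sketch assumes PyTorch is the underlying library, which the text above does not confirm. The function `detect_device` is hypothetical and falls back to the CPU when PyTorch is not installed.

```python
def detect_device():
    """Return "cuda", "mps", or "cpu" depending on what is available."""
    try:
        import torch  # assumption: embeddings are computed with PyTorch
    except ImportError:
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"

    # The MPS backend only exists in newer PyTorch builds, so probe defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"
```

Checking CUDA first and MPS second mirrors the order in which the hardware is listed above; on a machine with neither, the function returns `"cpu"` and embedding generation simply takes longer.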