mcp-rag-server
A Model Context Protocol (MCP) server for Retrieval Augmented Generation (RAG). It indexes documents and serves relevant context to Large Language Models over MCP, providing tools for document management and querying, with support for multiple embedding providers.
Features
- Indexing of documents in various formats (`.txt`, `.md`, `.json`).
- Customizable text chunk size.
- Local vector store management using SQLite (see the sketch after this list).
- Support for multiple embedding providers (OpenAI, Ollama, etc.).
- Provides MCP tools and resources for integration.
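The on-disk layout of the vector store is not documented here. As a rough illustration only, a local SQLite vector store can be as simple as one table holding each chunk's text alongside its embedding serialized as a BLOB; the table and column names below are assumptions, not the server's actual schema.

```typescript
// Hypothetical sketch of a minimal SQLite vector store (not the server's actual schema).
// Requires: npm install better-sqlite3
import Database from "better-sqlite3";

const db = new Database("vector-store.db"); // path would come from the vector store setting

// One row per text chunk; the embedding is stored as raw Float32 bytes.
db.exec(`
  CREATE TABLE IF NOT EXISTS chunks (
    id        INTEGER PRIMARY KEY,
    source    TEXT NOT NULL,   -- originating document path
    content   TEXT NOT NULL,   -- the chunk text
    embedding BLOB NOT NULL    -- Float32Array bytes
  )
`);

function insertChunk(source: string, content: string, embedding: Float32Array): void {
  db.prepare("INSERT INTO chunks (source, content, embedding) VALUES (?, ?, ?)")
    .run(source, content, Buffer.from(embedding.buffer));
}
```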
Usage
Configure the environment with the base LLM API URL, embedding model, and vector store path. Run the server via npm or npx, then use MCP tools such as `embedding_documents` or resources such as `rag://documents` to manage and query indexed content.
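As a sketch of how a client might drive the server using the official MCP TypeScript SDK: only the `embedding_documents` tool and the `rag://documents` resource come from this README; the environment variable names and the tool's argument shape are assumptions.

```typescript
// Sketch of launching and querying the server via the MCP TypeScript SDK.
// Env var names and tool arguments are assumptions; check the server's docs.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "mcp-rag-server"],
  env: {
    BASE_LLM_API: "http://localhost:11434/v1", // hypothetical: embedding API base URL
    EMBEDDING_MODEL: "nomic-embed-text",       // hypothetical: embedding model name
    VECTOR_STORE_PATH: "./vector-store.db",    // hypothetical: SQLite file location
  },
});

const client = new Client({ name: "rag-example", version: "1.0.0" });
await client.connect(transport);

// Index a directory of documents (the argument name is a guess).
await client.callTool({ name: "embedding_documents", arguments: { path: "./docs" } });

// List what has been indexed via the documents resource.
const docs = await client.readResource({ uri: "rag://documents" });
console.log(docs.contents);
```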
How RAG Works
- Indexing: splits documents into fixed-size text chunks and queues them for embedding.
- Embedding: sends queued chunks to the configured embedding API and stores the resulting vectors.
- Querying: embeds the query and returns the text chunks whose vectors are nearest to it.
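To make the flow concrete, here is a minimal, self-contained sketch of the three steps. The `embed` function is a stand-in so the example runs without a server; the real server delegates embedding to the configured provider and persists vectors in SQLite rather than in memory.

```typescript
// Minimal in-memory sketch of the index -> embed -> query pipeline.
const documentText =
  "Model Context Protocol servers expose tools and resources to clients. " +
  "Retrieval Augmented Generation retrieves indexed chunks that are relevant " +
  "to a query and supplies them to a language model as extra context.";

// 1. Indexing: split text into fixed-size chunks.
function chunk(text: string, size = 80): string[] {
  const out: string[] = [];
  for (let i = 0; i < text.length; i += size) out.push(text.slice(i, i + size));
  return out;
}

// Stand-in embedding: map a string to a small deterministic unit vector.
// (The real server calls the configured embedding API instead.)
function embed(text: string): number[] {
  const v = new Array(8).fill(0);
  for (let i = 0; i < text.length; i++) v[i % 8] += text.charCodeAt(i);
  const norm = Math.hypot(...v) || 1;
  return v.map((x) => x / norm);
}

// Dot product equals cosine similarity here, since vectors are unit-length.
function cosine(a: number[], b: number[]): number {
  return a.reduce((s, x, i) => s + x * b[i], 0);
}

// 2. Embedding: store each chunk with its vector.
const store = chunk(documentText).map((text) => ({ text, vec: embed(text) }));

// 3. Querying: embed the query and return the k nearest chunks.
function query(q: string, k = 3): string[] {
  const qv = embed(q);
  return [...store]
    .sort((a, b) => cosine(qv, b.vec) - cosine(qv, a.vec))
    .slice(0, k)
    .map((r) => r.text);
}

console.log(query("which chunks are relevant?"));
```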