document-archive-mcp-server
This project is a Model Context Protocol (MCP) server implemented in Node.js and TypeScript. It supports document management and retrieval, providing tools for cross-document search and Q&A functionality. The server can be configured via environment variables and supports stdio and SSE transport methods.
MCP Server
A Model Context Protocol (MCP) server implementation using Node.js and TypeScript.
What is MCP?
The Model Context Protocol (MCP) is an open protocol that standardizes how applications provide context to LLMs (Large Language Models). It allows for seamless communication between AI applications and various data sources and tools.
Features
This server implementation provides:
- Cross-document search tool for retrieving relevant information
- Document management with document listing and metadata resources
- Document Q&A prompt template for answering questions about documents
- Support for both stdio and SSE transport methods
- Comprehensive logging system
- Configuration via environment variables
- Readiness management to ensure all services are initialized before accepting requests
- Graceful shutdown handling for server termination
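The readiness and shutdown features above can be pictured with a small sketch. This is not the server's actual code: the `ReadinessGate` class, the `installShutdownHandler` function, the `SignalSource` interface, and the service names in the usage note are all hypothetical. They only illustrate holding requests until every registered service reports ready, and running cleanup once when a termination signal arrives.

```typescript
// Hypothetical sketch; names and structure are illustrative, not from this project.
class ReadinessGate {
  private pending = new Set<string>();
  private waiters: Array<() => void> = [];

  // Declare a service that must finish initializing before requests are served.
  register(name: string): void {
    this.pending.add(name);
  }

  // Mark a service as initialized; release waiters once nothing is pending.
  markReady(name: string): void {
    this.pending.delete(name);
    if (this.pending.size === 0) {
      for (const release of this.waiters.splice(0)) release();
    }
  }

  isReady(): boolean {
    return this.pending.size === 0;
  }

  // Resolves immediately if ready, otherwise when the last service reports in.
  whenReady(): Promise<void> {
    if (this.isReady()) return Promise.resolve();
    return new Promise<void>((resolve) => {
      this.waiters.push(() => resolve());
    });
  }
}

// Minimal view of an object that can deliver signals.
interface SignalSource {
  once(signal: string, handler: () => void): unknown;
}

// Run the cleanup callback at most once, on whichever signal arrives first.
function installShutdownHandler(source: SignalSource, cleanup: () => Promise<void>): void {
  let shuttingDown = false;
  const handler = (): void => {
    if (shuttingDown) return;
    shuttingDown = true;
    cleanup().catch((err) => console.error("Shutdown cleanup failed:", err));
  };
  source.once("SIGINT", handler);
  source.once("SIGTERM", handler);
}
```

A request handler would `await gate.whenReady()` before doing any work, after startup code has called `gate.register("vectorStore")` and later `gate.markReady("vectorStore")` (the service name is made up). In Node.js the `process` object is the natural signal source to pass in.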
Getting Started
Prerequisites
- Node.js 22+ and npm
Installation
- Clone the repository
- Install dependencies:
npm install
- Build the project:
npm run build
Running the Server
Using stdio (for integration with MCP clients)
npm start
Using SSE (for web-based usage)
npm start -- --transport=sse
With SSE transport, the server listens on port 3000 by default. To change it, set the PORT environment variable or pass the --port argument:
npm start -- --transport=sse --port=8080
Or:
PORT=8080 npm start -- --transport=sse
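The port selection described above could be resolved along these lines. This is a minimal sketch, not the project's code: the `resolvePort` function is hypothetical, and the precedence it uses (the `--port` flag first, then `PORT`, then 3000) is an assumption, since the README does not say which wins when both are set.

```typescript
// Hypothetical helper. Assumed precedence: --port flag, then PORT, then 3000.
function resolvePort(argv: string[], env: Record<string, string | undefined>): number {
  const DEFAULT_PORT = 3000;
  const prefix = "--port=";
  const flag = argv.find((arg) => arg.startsWith(prefix));
  const raw = flag !== undefined ? flag.slice(prefix.length) : env["PORT"];

  // Nothing supplied: fall back to the documented default.
  if (raw === undefined || raw === "") return DEFAULT_PORT;

  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${raw}`);
  }
  return port;
}
```

In a Node.js entry point this would be called as `resolvePort(process.argv.slice(2), process.env)`.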
API
Tools
- getDocumentsMetadata: Retrieves metadata of available documents
  - Parameters:
    - query (string): The user's original query or question
- checkQueryRelevance: Assesses relevant documents and returns instructions for intent analysis
  - Parameters:
    - query (string): The user's original query or question
    - relevantDocIds (string): Comma-separated list of document IDs that appear relevant to the query
- analyzeIntent: Takes intent classification and returns instructions for context retrieval
  - Parameters:
    - query (string): The user's original query or question
    - intentType (string): The classified intent type of the query
- retrieveContext: Retrieves relevant passages from selected documents
  - Parameters:
    - query (string): The user's original query or question
    - intentType (string): The classified intent type of the query
    - relevantDocIds (string): Comma-separated list of document IDs to retrieve context from
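The tool descriptions suggest a chain: list metadata, check relevance, analyze intent, then retrieve context. The sketch below only builds the argument payloads for that sequence, using the name-plus-arguments shape of an MCP tool call. The `planToolCalls` helper and the `ToolCall` interface are hypothetical, and in practice the document IDs and intent type would be chosen by the client after reading each previous result rather than known up front.

```typescript
// A tool call as an MCP client would express it: a tool name plus its arguments.
interface ToolCall {
  name: string;
  arguments: Record<string, string>;
}

// Hypothetical helper: builds the four calls in the order the descriptions imply.
function planToolCalls(query: string, relevantDocIds: string[], intentType: string): ToolCall[] {
  // The tools take document IDs as a single comma-separated string.
  const ids = relevantDocIds.join(",");
  return [
    { name: "getDocumentsMetadata", arguments: { query } },
    { name: "checkQueryRelevance", arguments: { query, relevantDocIds: ids } },
    { name: "analyzeIntent", arguments: { query, intentType } },
    { name: "retrieveContext", arguments: { query, intentType, relevantDocIds: ids } },
  ];
}

// Example values are placeholders; valid intent types are defined by the server.
const calls = planToolCalls("What changed in the retention policy?", ["doc-1", "doc-2"], "example-intent");
```

Each entry in `calls` can then be sent through whichever MCP client is in use.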
Resources
- documents://metadata: Returns a list of all document metadata
- documents://metadata/{id}: Returns metadata for a specific document by ID
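To show how the two resource URIs relate, here is a small sketch that builds and parses them. The `metadataUri` and `parseMetadataUri` helpers are hypothetical, and percent-encoding the ID is an assumption; the README does not say which characters a document ID may contain.

```typescript
const METADATA_ROOT = "documents://metadata";

type ParsedMetadataUri = { kind: "list" } | { kind: "document"; id: string } | null;

// Build the list URI (no id) or the per-document URI (with id).
function metadataUri(id?: string): string {
  return id === undefined ? METADATA_ROOT : `${METADATA_ROOT}/${encodeURIComponent(id)}`;
}

// Classify a URI as the list resource, a single-document resource, or neither.
function parseMetadataUri(uri: string): ParsedMetadataUri {
  if (uri === METADATA_ROOT) return { kind: "list" };
  const prefix = `${METADATA_ROOT}/`;
  if (uri.startsWith(prefix) && uri.length > prefix.length) {
    return { kind: "document", id: decodeURIComponent(uri.slice(prefix.length)) };
  }
  return null;
}
```

For example, `metadataUri("doc-42")` yields `documents://metadata/doc-42`, and parsing that string recovers the ID `doc-42`.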
Prompts
- document_qa: Q&A about document content
  - Parameters:
    - question (string): The user's question about the documents
    - relevantDocIds (string): Comma-separated list of relevant document IDs (from the archive)
    - intentType (string): The intent of the question (select from dropdown)
Configuration
The server can be configured through environment variables:
- SERVER_NAME: Name of the server (default: "document-archive-mcp-server")
- SERVER_VERSION: Server version (default: "1.0.0")
- ENABLE_RESOURCES: Enable/disable resources (default: true)
- ENABLE_TOOLS: Enable/disable tools (default: true)
- ENABLE_PROMPTS: Enable/disable prompts (default: true)
- PORT: Default port for SSE transport (default: 3000)
- TRANSPORT_TYPE: Default transport type (default: "stdio")
- LOG_LEVEL: Log level (default: "info")
- LOG_DIR: Directory for log files (default: "logs")
- LOG_FILE: Main log file name (default: "combined.log")
- ERROR_LOG_FILE: Error log file name (default: "error.log")
- ENABLE_FILE_LOGGING: Enable file logging (default: true in production, false in development)
- DOCUMENT_DIRECTORIES: Comma-separated list of directories to watch for documents
- DOCUMENT_CHUNK_SIZE: Size of document chunks for processing (default: 1000)
- DOCUMENT_CHUNK_OVERLAP: Overlap size for document chunks (default: 200)
- EMBEDDING_MODEL: Embedding model to use (default: 'sentence-transformers/all-MiniLM-L6-v2')
- LANCEDB_PATH: Path to LanceDB database (default: './vectordb')
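As an illustration of how these variables and defaults fit together, the sketch below reads them into a typed object. It is not the server's own loader: the `loadConfig` function and `ServerConfig` interface are hypothetical, booleans are assumed to be the strings "true" or "false", and "production" is assumed to mean `NODE_ENV=production`. `DOCUMENT_DIRECTORIES` has no documented default, so the sketch falls back to an empty list.

```typescript
type Env = Record<string, string | undefined>;

interface ServerConfig {
  serverName: string;
  serverVersion: string;
  enableResources: boolean;
  enableTools: boolean;
  enablePrompts: boolean;
  port: number;
  transportType: string;
  logLevel: string;
  logDir: string;
  logFile: string;
  errorLogFile: string;
  enableFileLogging: boolean;
  documentDirectories: string[];
  documentChunkSize: number;
  documentChunkOverlap: number;
  embeddingModel: string;
  lancedbPath: string;
}

// Hypothetical loader: unset or empty variables fall back to the documented defaults.
function loadConfig(env: Env): ServerConfig {
  const str = (key: string, fallback: string): string => {
    const value = env[key];
    return value === undefined || value === "" ? fallback : value;
  };
  // Assumption: booleans are written as "true" / "false" (case-insensitive).
  const bool = (key: string, fallback: boolean): boolean => {
    const value = env[key];
    return value === undefined || value === "" ? fallback : value.toLowerCase() === "true";
  };
  const int = (key: string, fallback: number): number => {
    const parsed = Number.parseInt(env[key] ?? "", 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };

  return {
    serverName: str("SERVER_NAME", "document-archive-mcp-server"),
    serverVersion: str("SERVER_VERSION", "1.0.0"),
    enableResources: bool("ENABLE_RESOURCES", true),
    enableTools: bool("ENABLE_TOOLS", true),
    enablePrompts: bool("ENABLE_PROMPTS", true),
    port: int("PORT", 3000),
    transportType: str("TRANSPORT_TYPE", "stdio"),
    logLevel: str("LOG_LEVEL", "info"),
    logDir: str("LOG_DIR", "logs"),
    logFile: str("LOG_FILE", "combined.log"),
    errorLogFile: str("ERROR_LOG_FILE", "error.log"),
    // Assumption: "production" is detected through NODE_ENV.
    enableFileLogging: bool("ENABLE_FILE_LOGGING", env["NODE_ENV"] === "production"),
    documentDirectories: str("DOCUMENT_DIRECTORIES", "")
      .split(",")
      .map((dir) => dir.trim())
      .filter((dir) => dir.length > 0),
    documentChunkSize: int("DOCUMENT_CHUNK_SIZE", 1000),
    documentChunkOverlap: int("DOCUMENT_CHUNK_OVERLAP", 200),
    embeddingModel: str("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"),
    lancedbPath: str("LANCEDB_PATH", "./vectordb"),
  };
}
```

Taking the environment as a parameter (for example `loadConfig(process.env)` in Node.js) keeps the function easy to test with plain objects.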
Development
To run the server in development mode with auto-reloading:
npm run dev
For SSE transport in development mode:
npm run dev -- --transport=sse
License
MIT