data-dictionary-mcp
The Data Dictionary MCP is a Model Context Protocol server that transforms database tables into Wikipedia-style data dictionaries. It coordinates AI agents to analyze, describe, and verify database structures, and accepts input formats such as JSON and CSV.
Data Dictionary MCP
A Model Context Protocol (MCP) server that coordinates AI agents to transform database tables into Wikipedia-style data dictionaries.
Overview
The project automates the conversion of database files in several formats into comprehensive, human-readable data dictionaries. AI agents, coordinated through the Model Context Protocol (MCP), analyze the database structures, describe them, and verify the results.
Features
- Multi-Format Support: Process JSON, CSV, and Plain Text files (with more formats planned)
- AI-Powered Analysis: Generate field descriptions and identify relationships
- MCP Integration: Coordinate AI agents using the Model Context Protocol
- Schema Extraction: Extract database schemas from various formats into a unified representation
- Wikipedia-Style Output: Present data dictionaries in a familiar, accessible format
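To make the "unified representation" idea concrete, here is a minimal sketch using only the standard library. It is not the project's actual implementation: `FieldSchema`, `schema_from_csv`, `schema_from_json`, and the coarse type names are hypothetical, and the sketch assumes that inspecting only the first record is good enough.

```python
import csv
import io
import json
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class FieldSchema:
    """One field in the unified representation (hypothetical shape)."""
    name: str
    inferred_type: str


def infer_cell_type(text: str) -> str:
    """Guess a coarse type for a CSV cell, which always arrives as a string."""
    if text == "":
        return "null"
    for type_name, parse in (("integer", int), ("number", float)):
        try:
            parse(text)
            return type_name
        except ValueError:
            continue
    return "string"


def infer_json_type(value: Any) -> str:
    """Map a decoded JSON value to the same coarse type names."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    return "array" if isinstance(value, list) else "object"


def schema_from_csv(text: str) -> List[FieldSchema]:
    """Infer a schema from the header and the first data row of CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    first_row = next(reader, None)
    if first_row is None:
        return [FieldSchema(name, "null") for name in (reader.fieldnames or [])]
    return [
        FieldSchema(name, infer_cell_type(value or ""))
        for name, value in first_row.items()
        if name is not None  # skip surplus cells that have no header
    ]


def schema_from_json(text: str) -> List[FieldSchema]:
    """Infer a schema from the first object in a JSON array of objects."""
    records = json.loads(text)
    if not isinstance(records, list) or not records or not isinstance(records[0], dict):
        raise ValueError("expected a non-empty JSON array of objects")
    return [FieldSchema(key, infer_json_type(value)) for key, value in records[0].items()]
```

With this shape, `schema_from_csv("id,name,price\n1,Widget,9.99\n")` and `schema_from_json('[{"id": 1, "name": "Widget", "price": 9.99}]')` produce the same list of `FieldSchema` objects, which is the point of a unified representation: later stages do not need to know the source format.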
Project Status
This project is in active development. See the Project Roadmap for details.
Getting Started
Prerequisites
- Python 3.9+
- Git
- pip or poetry for dependency management
Installation
- Clone the repository:
  git clone https://github.com/jonahkeegan/data-dictionary-mcp.git
  cd data-dictionary-mcp
- Create a virtual environment:
  python -m venv venv
  source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Run the application:
  python src/main.py
Project Structure
data-dictionary-mcp/
├── docs/ # Documentation
├── src/ # Source code
│ ├── mcp/ # MCP server components
│ ├── analyzers/ # Format analyzers
│ ├── agents/ # Agent coordination
│ └── dictionary/ # Dictionary generation
├── tests/ # Test suite
├── memory-bank/ # Cline memory bank
├── .gitignore
├── .clinerules # Cline rules
├── README.md
└── requirements.txt
Project Roadmap
Milestone 1: MCP Server Foundation and Format Analyzers
- Implement MCP server with basic tool definitions
- Develop format analyzers for JSON, CSV, and Plain Text
- Create schema extraction system
- Implement unit tests for core components
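One way the format analyzers in this milestone might be wired together is a small registry keyed by file extension. The sketch below is an assumption about the design, not code from the repository: `ANALYZERS`, `analyze`, and the three analyzer functions are hypothetical names, and treating `key: value` lines as fields in plain text is only one possible convention.

```python
import csv
import io
import json
from pathlib import Path
from typing import Callable, Dict, List

Analyzer = Callable[[str], List[str]]  # takes file text, returns field names


def analyze_json(text: str) -> List[str]:
    """Collect the keys of every object in a JSON array (or a single object)."""
    data = json.loads(text)
    records = data if isinstance(data, list) else [data]
    names: List[str] = []
    for record in records:
        if isinstance(record, dict):
            for key in record:
                if key not in names:
                    names.append(key)
    return names


def analyze_csv(text: str) -> List[str]:
    """Use the header row as the list of field names."""
    return next(csv.reader(io.StringIO(text)), [])


def analyze_text(text: str) -> List[str]:
    """Treat the left side of 'key: value' lines as field names (assumed convention)."""
    names: List[str] = []
    for line in text.splitlines():
        key, sep, _ = line.partition(":")
        key = key.strip()
        if sep and key and key not in names:
            names.append(key)
    return names


# Hypothetical registry; supporting a new format means adding one entry.
ANALYZERS: Dict[str, Analyzer] = {
    ".json": analyze_json,
    ".csv": analyze_csv,
    ".txt": analyze_text,
}


def analyze(filename: str, text: str) -> List[str]:
    """Dispatch to the analyzer registered for the file's extension."""
    suffix = Path(filename).suffix.lower()
    try:
        analyzer = ANALYZERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported format: {suffix or filename!r}") from None
    return analyzer(text)
```

Keeping each analyzer a plain function of the file text also makes the unit tests in this milestone straightforward, since no files or servers are needed to exercise them.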
Milestone 2: AI Agent Coordination and Field Description
- Implement agent coordination system
- Develop field description generation
- Create task distribution and result aggregation
- Add integration tests
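Task distribution and result aggregation could look roughly like the following sketch, which fans field-description tasks out to a thread pool and gathers the results into one mapping. Everything here is illustrative: `describe_field_stub` stands in for a real AI agent call, and `describe_fields` is a hypothetical name rather than part of the project.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def describe_field_stub(table: str, field: str) -> str:
    """Placeholder for an AI agent call; returns a canned description."""
    return f"The {field} field of the {table} table."


def describe_fields(
    table: str,
    fields: List[str],
    describe: Callable[[str, str], str] = describe_field_stub,
    max_workers: int = 4,
) -> Dict[str, str]:
    """Distribute one description task per field and aggregate the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so zip pairs them correctly.
        descriptions = pool.map(lambda field: describe(table, field), fields)
        return dict(zip(fields, descriptions))
```

A thread pool suits this sketch because agent calls are expected to be I/O-bound; an exception raised by any task surfaces when the results are collected, so a failed description does not go unnoticed.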
Milestone 3: Content Verification and Publishing
- Implement content validation
- Develop Wikipedia-style formatting
- Create export capabilities
- Add end-to-end tests
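If "Wikipedia-style" is taken to mean MediaWiki markup (an assumption; the project may target HTML or Markdown instead), the formatting step might be sketched as below. `render_wikitable` and its `(name, type, description)` tuples are hypothetical, and the sketch does not escape pipe characters inside cell text.

```python
from typing import List, Tuple


def render_wikitable(title: str, fields: List[Tuple[str, str, str]]) -> str:
    """Render (name, type, description) rows as a MediaWiki table under a heading."""
    lines = [
        f"== {title} ==",
        '{| class="wikitable"',
        "! Field !! Type !! Description",
    ]
    for name, type_name, description in fields:
        lines.append("|-")  # row separator
        lines.append(f"| {name} || {type_name} || {description}")
    lines.append("|}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_wikitable("orders", [("id", "integer", "Primary key.")]))
```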
Milestone 4: User Interface and Deployment
- Develop web interface
- Implement search capabilities
- Add user feedback system
- Create deployment infrastructure
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is open source and available under the .