README - Data Dictionary MCP by jonahkeegan

Data Dictionary MCP

A Model Context Protocol (MCP) server that coordinates AI agents to transform database tables into Wikipedia-style data dictionaries.

Overview

The Data Dictionary MCP project automates the conversion of database files (currently JSON, CSV, and plain text) into comprehensive, human-readable data dictionaries. It uses the Model Context Protocol (MCP) to coordinate AI agents that analyze, describe, and verify database structures.

Features

Multi-Format Support: Process JSON, CSV, and Plain Text files (with more formats planned)
AI-Powered Analysis: Generate field descriptions and identify relationships
MCP Integration: Coordinate AI agents using the Model Context Protocol
Schema Extraction: Extract database schemas from various formats into a unified representation
Wikipedia-Style Output: Present data dictionaries in a familiar, accessible format
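
To make the schema extraction feature concrete, here is a minimal sketch of how CSV and JSON inputs might be reduced to one unified representation. The `Field` class, the function names, and the type labels are assumptions made for illustration and are not taken from the project's source. The JSON path assumes an array of flat objects.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass
class Field:
    """One column or key in the unified schema (hypothetical shape)."""
    name: str
    type: str


def infer_type(values):
    """Guess a simple type label from sample values."""
    samples = [v for v in values if v is not None and v != ""]
    if not samples:
        return "unknown"
    if all(isinstance(v, bool) for v in samples):
        return "boolean"
    # Try the narrowest numeric type first, then fall back to string.
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for v in samples:
                cast(str(v))
            return type_name
        except ValueError:
            continue
    return "string"


def extract_csv_schema(text):
    """Build a schema from CSV text whose first row is the header."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    names = reader.fieldnames or []
    return [Field(n, infer_type([r[n] for r in rows])) for n in names]


def extract_json_schema(text):
    """Build a schema from a JSON array of flat objects."""
    records = json.loads(text)
    names = []
    for record in records:
        for key in record:
            if key not in names:
                names.append(key)
    return [Field(n, infer_type([r.get(n) for r in records])) for n in names]
```

Both functions return the same list-of-`Field` shape, so later stages (description, verification, formatting) would not need to know which input format a schema came from.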

Project Status

This project is in active development. See the Project Roadmap section below for the planned milestones.

Getting Started

Prerequisites

Python 3.9+
Git
pip or poetry for dependency management

Installation

Clone the repository:

git clone https://github.com/jonahkeegan/data-dictionary-mcp.git
cd data-dictionary-mcp

Create a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:
```
pip install -r requirements.txt
```
Run the application:
```
python src/main.py
```

Project Structure

data-dictionary-mcp/
├── docs/                  # Documentation
├── src/                   # Source code
│   ├── mcp/               # MCP server components
│   ├── analyzers/         # Format analyzers
│   ├── agents/            # Agent coordination
│   └── dictionary/        # Dictionary generation
├── tests/                 # Test suite
├── memory-bank/           # Cline memory bank
├── .gitignore
├── .clinerules            # Cline rules
├── README.md
└── requirements.txt

Project Roadmap

Milestone 1: MCP Server Foundation and Format Analyzers

Implement MCP server with basic tool definitions
Develop format analyzers for JSON, CSV, and Plain Text
Create schema extraction system
Implement unit tests for core components
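
As an illustration of what a basic tool definition for this milestone might look like, the sketch below describes one tool with the `name`, `description`, and `inputSchema` fields that an MCP server advertises, and returns a result shaped like an MCP tool result. The tool name `extract_schema`, its arguments, and the helper functions are hypothetical. The sketch uses only the standard library rather than an MCP SDK, and the argument check is deliberately simpler than full JSON Schema validation.

```python
import json

# Hypothetical tool descriptor for illustration only.
EXTRACT_SCHEMA_TOOL = {
    "name": "extract_schema",
    "description": "Extract a unified schema from a JSON, CSV, or plain-text file.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "File to analyze"},
            "format": {"type": "string", "enum": ["json", "csv", "text"]},
        },
        "required": ["path", "format"],
    },
}


def validate_arguments(tool, arguments):
    """Check required keys and enum values (not full JSON Schema)."""
    schema = tool["inputSchema"]
    errors = []
    for key in schema["required"]:
        if key not in arguments:
            errors.append(f"missing required argument: {key}")
    for key, value in arguments.items():
        spec = schema["properties"].get(key)
        if spec is None:
            errors.append(f"unknown argument: {key}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"invalid value for {key}: {value!r}")
    return errors


def call_tool(tool, arguments):
    """Return a dict shaped like an MCP tool result (text content plus isError)."""
    errors = validate_arguments(tool, arguments)
    if errors:
        return {"content": [{"type": "text", "text": "; ".join(errors)}], "isError": True}
    # Placeholder: a real handler would dispatch to a format analyzer here.
    summary = {"path": arguments["path"], "format": arguments["format"], "fields": []}
    return {"content": [{"type": "text", "text": json.dumps(summary)}], "isError": False}
```

Keeping the descriptor as plain data makes it straightforward to unit test the argument handling separately from any transport or SDK layer.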

Milestone 2: AI Agent Coordination and Field Description

Implement agent coordination system
Develop field description generation
Create task distribution and result aggregation
Add integration tests
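
Task distribution and result aggregation could be sketched as follows, assuming one task per field. `describe_field` is a stand-in for a real AI agent call and simply returns a templated sentence. Both function names are hypothetical, and a thread pool is just one possible way to fan work out.

```python
from concurrent.futures import ThreadPoolExecutor


def describe_field(field_name):
    """Stand-in for an AI agent call; returns a templated description."""
    readable = field_name.replace("_", " ")
    return f"The {readable} of the record."


def describe_fields(field_names, max_workers=4):
    """Distribute one task per field, then aggregate results by field name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each name with its result.
        descriptions = list(pool.map(describe_field, field_names))
    return dict(zip(field_names, descriptions))
```

For example, `describe_fields(["user_id", "email"])` returns a dictionary keyed by field name, which a later verification step could check entry by entry.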

Milestone 3: Content Verification and Publishing

Implement content validation
Develop Wikipedia-style formatting
Create export capabilities
Add end-to-end tests
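
One possible shape for the validation and formatting steps is sketched below. It assumes each dictionary entry is a dict with `name`, `type`, and `description` keys and renders the result as a MediaWiki-syntax table. The entry shape, the function names, and the specific checks are illustrative assumptions, not the project's actual design.

```python
def validate_entries(entries):
    """Return a list of problems; an empty list means the entries pass."""
    problems = []
    seen = set()
    for entry in entries:
        name = entry.get("name", "")
        if not name:
            problems.append("entry with no field name")
            continue
        if name in seen:
            problems.append(f"duplicate field: {name}")
        seen.add(name)
        if not entry.get("description", "").strip():
            problems.append(f"missing description: {name}")
    return problems


def to_wikitable(table_name, entries):
    """Render entries as a MediaWiki table under a section heading.

    Cell values are inserted as-is, so values containing "|" would
    need escaping in a fuller implementation.
    """
    lines = [
        f"== {table_name} ==",
        '{| class="wikitable"',
        "! Field !! Type !! Description",
    ]
    for entry in entries:
        lines.append("|-")
        lines.append(f"| {entry['name']} || {entry['type']} || {entry['description']}")
    lines.append("|}")
    return "\n".join(lines)
```

Running validation before formatting means an export is only produced once every field has a unique name and a non-empty description.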

Milestone 4: User Interface and Deployment

Develop web interface
Implement search capabilities
Add user feedback system
Create deployment infrastructure

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is open source and available under the .