mcp-server-datahub

The mcp-server-datahub is a server implementation that integrates with DataHub so that AI agents can query metadata and context about a data ecosystem. It supports searching and filtering across entity types and fetching detailed metadata for any entity.

A Model Context Protocol server implementation for DataHub. This enables AI agents to query DataHub for metadata and context about your data ecosystem.

Supports both DataHub Core and DataHub Cloud.

Features

  • Searching across all entity types and using arbitrary filters
  • Fetching metadata for any entity
  • Traversing the lineage graph, both upstream and downstream
  • Listing SQL queries associated with a dataset

Demo

Check out the demo video, done in collaboration with the team at Block.

Usage

  1. Install uv

    # On macOS and Linux.
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
  2. Locate your authentication details

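    For authentication, you'll need the URL of your DataHub instance (DATAHUB_GMS_URL) and an access token for it (DATAHUB_GMS_TOKEN). Both are set in the MCP client configuration in step 3.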

    Alternative: Using ~/.datahubenv for authentication

    You can also use a ~/.datahubenv file to configure your authentication. The easiest way to create this file is to run datahub init and follow the prompts.

    uvx --from acryl-datahub datahub init
    
  3. Configure your MCP client. The exact configuration depends on your agent; see the sections below.
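
Before configuring a client, it can help to confirm that steps 1 and 2 took effect. The following is a minimal Python sketch and not part of mcp-server-datahub: check_prerequisites is a hypothetical helper that only looks for uvx on PATH and for either the two environment variables used in the client configurations or a ~/.datahubenv file. It does not contact DataHub or validate the URL or token.

```python
import os
import shutil
from pathlib import Path


def check_prerequisites(env=None, home=None):
    """Return a list of problems found; an empty list means nothing obvious is missing.

    env and home are injectable for testing. By default the real
    environment and the current user's home directory are used.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    problems = []

    # Step 1: uvx must be discoverable. If env has no PATH key,
    # shutil.which falls back to the PATH of the current process.
    if shutil.which("uvx", path=env.get("PATH")) is None:
        problems.append("uvx not found on PATH (see step 1)")

    # Step 2: either both environment variables or a ~/.datahubenv file.
    has_env = bool(env.get("DATAHUB_GMS_URL")) and bool(env.get("DATAHUB_GMS_TOKEN"))
    has_file = (home / ".datahubenv").is_file()
    if not (has_env or has_file):
        problems.append(
            "no credentials found: set DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN, "
            "or create ~/.datahubenv (see step 2)"
        )
    return problems
```

Most clients read the two variables from their own configuration file rather than from your shell, so a missing-credentials message here is only a hint, not proof of a broken setup.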

Claude Desktop

Run which uvx to find the full path to the uvx command (for example, /Users/hsheth/.local/bin/uvx).

In your claude_desktop_config.json file, add the following:

{
  "mcpServers": {
    "datahub": {
      "command": "<full-path-to-uvx>",
      "args": ["mcp-server-datahub"],
      "env": {
        "DATAHUB_GMS_URL": "<your-datahub-url>",
        "DATAHUB_GMS_TOKEN": "<your-datahub-token>"
      }
    }
  }
}

Cursor

In .cursor/mcp.json, add the following:

{
  "mcpServers": {
    "datahub": {
      "command": "uvx",
      "args": ["mcp-server-datahub"],
      "env": {
        "DATAHUB_GMS_URL": "<your-datahub-url>",
        "DATAHUB_GMS_TOKEN": "<your-datahub-token>"
      }
    }
  }
}

Other MCP Clients

command: uvx
args:
  - mcp-server-datahub
env:
  DATAHUB_GMS_URL: <your-datahub-url>
  DATAHUB_GMS_TOKEN: <your-datahub-token>
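
The three configurations above share the same command, args, and env values, so the JSON variant can be generated rather than typed by hand. This is a sketch under assumptions: build_datahub_entry and render_client_config are hypothetical helper names, not part of mcp-server-datahub, and the output follows the mcpServers layout shown for Claude Desktop and Cursor. Other clients may expect a different wrapper.

```python
import json
import shutil


def build_datahub_entry(gms_url, gms_token, command=None):
    """Build the "datahub" server entry used in the configs above.

    If command is not given, prefer the full path to uvx when it can be
    resolved, and fall back to the bare name otherwise.
    """
    resolved = command or shutil.which("uvx") or "uvx"
    return {
        "command": resolved,
        "args": ["mcp-server-datahub"],
        "env": {
            "DATAHUB_GMS_URL": gms_url,
            "DATAHUB_GMS_TOKEN": gms_token,
        },
    }


def render_client_config(gms_url, gms_token, command=None):
    """Return a JSON document with a single "datahub" MCP server entry."""
    config = {"mcpServers": {"datahub": build_datahub_entry(gms_url, gms_token, command)}}
    return json.dumps(config, indent=2)
```

Calling render_client_config with your own URL and token returns text that can be pasted into claude_desktop_config.json or .cursor/mcp.json. If the file already lists other servers, merge the "datahub" entry into the existing "mcpServers" object instead of overwriting the file.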

Troubleshooting

spawn uvx ENOENT

The full stack trace might look like this:

2025-04-08T19:58:16.593Z [datahub] [error] spawn uvx ENOENT {"stack":"Error: spawn uvx ENOENT\n    at ChildProcess._handle.onexit (node:internal/child_process:285:19)\n    at onErrorNT (node:internal/child_process:483:16)\n    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)"}

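This error means the MCP client could not find uvx on its PATH.

Solution: In your MCP client configuration, replace uvx in the command field with the full path printed by which uvx.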
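
The same fix can be applied to an already-loaded configuration in code. The sketch below is illustrative only: use_full_uvx_path is a hypothetical helper, and it assumes the mcpServers layout shown in the Claude Desktop and Cursor sections. It uses Python's shutil.which, which performs the same PATH lookup as which uvx.

```python
import copy
import shutil


def use_full_uvx_path(config, resolve=shutil.which):
    """Return a copy of config in which bare "uvx" commands use a full path.

    resolve is injectable so the lookup can be replaced in tests.
    """
    full_path = resolve("uvx")
    if full_path is None:
        raise FileNotFoundError("uvx is not on PATH; install uv first")

    patched = copy.deepcopy(config)  # leave the caller's dict untouched
    for server in patched.get("mcpServers", {}).values():
        if server.get("command") == "uvx":
            server["command"] = full_path
    return patched
```

Run it from a terminal where which uvx succeeds. The MCP client itself may have a different PATH, which is why the bare command fails there in the first place.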

Developing

See .