mcp-server-datahub
mcp-server-datahub is a server that integrates with DataHub so that AI agents can query metadata and context about a data ecosystem. It supports searching and filtering across entity types and fetching detailed metadata for individual entities.
A Model Context Protocol server implementation for DataHub. This enables AI agents to query DataHub for metadata and context about your data ecosystem.
Supports both DataHub Core and DataHub Cloud.
Features
- Searching across all entity types and using arbitrary filters
- Fetching metadata for any entity
- Traversing the lineage graph, both upstream and downstream
- Listing SQL queries associated with a dataset
Demo
Check out the demo video, done in collaboration with the team at Block.
Usage
- Install uv
  # On macOS and Linux.
  curl -LsSf https://astral.sh/uv/install.sh | sh
- Locate your authentication details
  For authentication, you'll need the following:
  - The URL of your DataHub instance, e.g. https://tenant.acryl.io/gms
  - A personal access token
  Alternative: Using ~/.datahubenv for authentication
  You can also use a ~/.datahubenv file to configure your authentication. The easiest way to create this file is to run datahub init and follow the prompts.
  uvx --from acryl-datahub datahub init
- Configure your MCP client. See below - this will vary depending on your agent.
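Before configuring a client, it can help to confirm that both authentication values are available. The following is a minimal Python sketch, assuming the values are exported as the environment variables DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN (the names used in the client configurations in this README). The helper name missing_datahub_settings is hypothetical and not part of mcp-server-datahub, and the check does not read ~/.datahubenv.

```python
import os

# Variable names taken from the client configuration examples in this README.
REQUIRED_SETTINGS = ("DATAHUB_GMS_URL", "DATAHUB_GMS_TOKEN")


def missing_datahub_settings(env=None):
    """Return the names of required settings that are unset or blank.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_datahub_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("Both DataHub settings are set.")
```

An empty list means both values are present. It does not mean the URL is reachable or the token is valid.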
Claude Desktop
Run which uvx to find the full path to the uvx command.
In your claude_desktop_config.json file, add the following, replacing <full-path-to-uvx> with that path (e.g. /Users/hsheth/.local/bin/uvx):
{
  "mcpServers": {
    "datahub": {
      "command": "<full-path-to-uvx>",
      "args": ["mcp-server-datahub"],
      "env": {
        "DATAHUB_GMS_URL": "<your-datahub-url>",
        "DATAHUB_GMS_TOKEN": "<your-datahub-token>"
      }
    }
  }
}
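Standard JSON does not allow comments, and a strict parser such as Python's json module rejects them, so it can be worth validating the file after editing it. The following is a minimal sketch. check_datahub_entry is a hypothetical helper, not part of mcp-server-datahub or Claude Desktop, and the sample text stands in for the contents of your own config file.

```python
import json


def check_datahub_entry(config_text):
    """Parse MCP client config text and return the "datahub" server entry.

    Raises json.JSONDecodeError if the text is not strict JSON (for example,
    if it still contains a // comment) and KeyError if the entry is missing.
    """
    config = json.loads(config_text)
    return config["mcpServers"]["datahub"]


if __name__ == "__main__":
    # Sample text for illustration; in practice, read your config file instead.
    sample = """
    {
      "mcpServers": {
        "datahub": {
          "command": "/Users/hsheth/.local/bin/uvx",
          "args": ["mcp-server-datahub"],
          "env": {
            "DATAHUB_GMS_URL": "<your-datahub-url>",
            "DATAHUB_GMS_TOKEN": "<your-datahub-token>"
          }
        }
      }
    }
    """
    entry = check_datahub_entry(sample)
    print(entry["command"], entry["args"])
```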
Cursor
In .cursor/mcp.json, add the following:
{
  "mcpServers": {
    "datahub": {
      "command": "uvx",
      "args": ["mcp-server-datahub"],
      "env": {
        "DATAHUB_GMS_URL": "<your-datahub-url>",
        "DATAHUB_GMS_TOKEN": "<your-datahub-token>"
      }
    }
  }
}
Other MCP Clients
command: uvx
args:
  - mcp-server-datahub
env:
  DATAHUB_GMS_URL: <your-datahub-url>
  DATAHUB_GMS_TOKEN: <your-datahub-token>
Troubleshooting
spawn uvx ENOENT
The full stack trace might look like this:
2025-04-08T19:58:16.593Z [datahub] [error] spawn uvx ENOENT {"stack":"Error: spawn uvx ENOENT\n at ChildProcess._handle.onexit (node:internal/child_process:285:19)\n at onErrorNT (node:internal/child_process:483:16)\n at process.processTicksAndRejections (node:internal/process/task_queues:82:21)"}
Solution: Replace the uvx bit of the command with the output of which uvx.
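The same substitution can be applied programmatically. The following is a minimal sketch, assuming the client config uses the mcpServers layout shown in the examples above. use_absolute_uvx is a hypothetical helper, not part of mcp-server-datahub or any MCP client. It relies on Python's shutil.which, which searches PATH much like the which command.

```python
import json
import shutil


def use_absolute_uvx(config, uvx_path=None):
    """Return a copy of an MCP client config with bare "uvx" commands replaced.

    Hypothetical helper for illustration. If uvx_path is not given, it is
    looked up on PATH with shutil.which, which returns None when not found.
    """
    uvx_path = uvx_path or shutil.which("uvx")
    if uvx_path is None:
        raise FileNotFoundError("uvx not found on PATH; install uv first")
    patched = json.loads(json.dumps(config))  # deep copy of JSON-compatible data
    for server in patched.get("mcpServers", {}).values():
        if server.get("command") == "uvx":
            server["command"] = uvx_path
    return patched


if __name__ == "__main__":
    config = {
        "mcpServers": {
            "datahub": {"command": "uvx", "args": ["mcp-server-datahub"]}
        }
    }
    # An explicit path is passed here so the example runs without uvx installed.
    fixed = use_absolute_uvx(config, uvx_path="/Users/hsheth/.local/bin/uvx")
    print(json.dumps(fixed, indent=2))
```

The input config is left unmodified, so the result can be reviewed before anything is written back to disk.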
Developing
See .