# 🚀 MCP Databricks

*A powerful Databricks integration for AI assistants via the Model Context Protocol*
## 📖 Introduction
MCP Databricks seamlessly connects AI assistants to your Databricks workspaces through the Model Context Protocol (MCP). Built with Python, it provides a rich collection of tools for managing virtually every aspect of your Databricks environment.
With this server, AI assistants like Claude can:
- 🔧 Manage compute resources with precision
- 📊 Execute SQL queries and analyze results
- 📁 Organize and manipulate workspace objects
- ✨ And much more!
## 🔍 Prerequisites
- 🐍 Python 3.11 or higher
- 💻 A Databricks workspace
- 🔑 Databricks Personal Access Token (PAT)
- 📦 Required Python packages (installed during setup)
## 🚀 Quickstart
### 1️⃣ Clone the repository

```bash
git clone https://github.com/leminkhoa/databricks-mcp
cd databricks-mcp
```
### 2️⃣ Configure environment variables

Create a `.env` file in the project root with your Databricks credentials:
```env
# Databricks API Configuration
DATABRICKS_HOST="https://adb-<your workspace uri>.azuredatabricks.net/"
DATABRICKS_TOKEN="dapi_<your_token_here>"

# Server Configuration
SERVER_HOST="0.0.0.0"
SERVER_PORT="8000"
DEBUG="false"
TRANSPORT="stdio"

# Logging Configuration
LOG_LEVEL="INFO"
```
💡 Tip: For reference, you can use the `env.sample` file as a template.
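These variables are read by the server at startup. As a rough sketch of how they are typically consumed (illustrative only — the actual handling in `main.py` may differ), assuming `python-dotenv` and standard environment lookups:

```python
import os

from dotenv import load_dotenv  # provided by the python-dotenv package

# Load the .env file from the project root into the process environment
load_dotenv()

DATABRICKS_HOST = os.getenv("DATABRICKS_HOST")
DATABRICKS_TOKEN = os.getenv("DATABRICKS_TOKEN")
SERVER_PORT = int(os.getenv("SERVER_PORT", "8000"))
TRANSPORT = os.getenv("TRANSPORT", "stdio")

# Fail fast if the Databricks credentials are missing
if not DATABRICKS_HOST or not DATABRICKS_TOKEN:
    raise RuntimeError("DATABRICKS_HOST and DATABRICKS_TOKEN must be set in .env")
```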
### 3️⃣ Choose Your Installation Method

You can use MCP Databricks in two ways:
#### Option A: Docker (Recommended for production)
- Build the Docker image:

```bash
docker build -t databricks-mcp .
```
- Configure in Cursor with the following `mcp.json` entry:
```json
{
  "databricks-mcp-docker": {
    "command": "docker",
    "args": [
      "run",
      "--rm",
      "-i",
      "--name", "databricks-mcp",
      "--env-file", "<path/to/.env>",
      "databricks-mcp"
    ]
  }
}
```
💡 Note: Replace `<path/to/.env>` with the absolute path to your `.env` file.
#### Option B: Local Installation with uv
- Install uv (if not already installed):

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
💡 Note: See the uv installation documentation for alternative installation methods.
- Set up the virtual environment:

```bash
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```
- Install dependencies with uv:

```bash
uv sync
```
- Configure in Cursor with the following `mcp.json` entry:
```json
{
  "databricks-mcp-stdio": {
    "command": "uv",
    "args": [
      "--directory",
      "<repository directory>",
      "run",
      "main.py"
    ]
  }
}
```
💡 Note: Replace `<repository directory>` with the absolute path to your cloned repository.
## 🚀 Usage

### Running the MCP Server
If you are not using Docker or the Cursor integration, start the server directly with:

```bash
python main.py
```

or

```bash
uv run main.py
```
### Connecting to Claude or other MCP clients
This server uses the stdio transport for seamless compatibility with Claude Desktop and other MCP clients. After installing the server, you can immediately connect to it using your preferred MCP client.
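For example, the following sketch uses the official `mcp` Python SDK to launch the server over stdio and list the tools it exposes. The launch command mirrors the Cursor configuration above; adjust `<repository directory>` to your local clone:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch parameters mirror the mcp.json entry above; <repository directory> is a placeholder
server_params = StdioServerParameters(
    command="uv",
    args=["--directory", "<repository directory>", "run", "main.py"],
)

async def main() -> None:
    # Spawn the server as a subprocess and speak MCP over its stdin/stdout
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:
                print(f"{tool.name}: {tool.description}")

asyncio.run(main())
```

Claude Desktop and Cursor perform this launch for you based on their MCP configuration, so the script above is mainly useful as a quick manual check that the server starts and registers its tools.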
## 🧰 Tools and Capabilities
The MCP Databricks server provides a comprehensive toolkit for managing your Databricks environment:
### 💻 Cluster Management

| Tool | Description |
|------|-------------|
| `list_clusters` | List all Databricks clusters in the workspace |
| `create_cluster` | Create a new Databricks cluster with customizable settings |
| `delete_cluster` | Delete a Databricks cluster by ID |
| `start_cluster` | Start a terminated Databricks cluster |
| `list_node_types` | List all available node types for Databricks clusters |
| `list_spark_versions` | List all available Spark versions for Databricks clusters |
| `get_cluster` | Get detailed information about a specific Databricks cluster |
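These tools cover the standard Databricks cluster operations. For orientation (this is not the server's own code), the equivalent calls with the `databricks-sdk` Python package look roughly like this:

```python
from databricks.sdk import WorkspaceClient

# WorkspaceClient picks up DATABRICKS_HOST and DATABRICKS_TOKEN from the environment
w = WorkspaceClient()

# Roughly what list_clusters surfaces: one entry per cluster in the workspace
for cluster in w.clusters.list():
    print(cluster.cluster_id, cluster.cluster_name, cluster.state)

# Roughly what get_cluster surfaces for a single cluster (the ID is a placeholder)
details = w.clusters.get(cluster_id="0123-456789-abcdefgh")
print(details.spark_version, details.node_type_id)
```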
### 📦 Library Management

| Tool | Description |
|------|-------------|
| `install_libraries` | Install libraries (JAR, WHL, PyPI, Maven, CRAN, etc.) on a running cluster |
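For comparison, installing a PyPI package on a running cluster via the `databricks-sdk` package looks like the sketch below; the cluster ID and package name are placeholders, and the other library types (JAR, WHL, Maven, CRAN) are handled analogously:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import Library, PythonPyPiLibrary

w = WorkspaceClient()

# Queue installation of a PyPI package on a running cluster (placeholder values)
w.libraries.install(
    cluster_id="0123-456789-abcdefgh",
    libraries=[Library(pypi=PythonPyPiLibrary(package="pandas==2.2.2"))],
)
```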
### 🖥️ Command Execution

| Tool | Description |
|------|-------------|
| `execute_command` | Execute a command (Python, Scala, SQL) on a running Databricks cluster |
| `create_execution_context` | Create an execution context for interactive command sessions |
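From an MCP client, invoking these tools is a single `call_tool` request. The sketch below reuses the stdio connection pattern from the usage section; the argument names passed to `execute_command` are assumptions about its input schema, so check the `list_tools()` output for the authoritative names:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(
    command="uv",
    args=["--directory", "<repository directory>", "run", "main.py"],
)

async def run_query() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Argument names below are illustrative guesses, not a documented schema
            result = await session.call_tool(
                "execute_command",
                {
                    "cluster_id": "0123-456789-abcdefgh",  # placeholder cluster ID
                    "language": "sql",
                    "command": "SELECT current_date()",
                },
            )
            print(result.content)

asyncio.run(run_query())
```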
### 📊 SQL Warehouse Management

| Tool | Description |
|------|-------------|
| `list_sql_warehouses` | List all SQL warehouses in the workspace |
| `create_sql_warehouse` | Create a new SQL warehouse with configurable size and settings |
### 📁 Workspace Objects

| Tool | Description |
|------|-------------|
| `delete_workspace_object` | Delete an object from the Databricks workspace |
| `get_workspace_object_status` | Get the status of an object in the Databricks workspace |
| `import_workspace_object` | Import an object (notebook, file, etc.) into the workspace |
| `create_workspace_directory` | Create a directory in the Databricks workspace |
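As with the other tool groups, these correspond to the Databricks Workspace API. A rough `databricks-sdk` equivalent of creating a directory and checking an object's status (the path is a placeholder):

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Create a workspace directory, then inspect its status (placeholder path)
w.workspace.mkdirs("/Workspace/Shared/mcp-demo")
status = w.workspace.get_status("/Workspace/Shared/mcp-demo")
print(status.object_type, status.path)
```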