🚀 MCP Databricks

Python 3.11+ · Databricks · MCP Protocol

A powerful Databricks integration for AI assistants via Model Context Protocol

📖 Introduction

MCP Databricks seamlessly connects AI assistants to your Databricks workspaces through the Model Context Protocol (MCP). Built with Python, it provides a rich collection of tools for managing virtually every aspect of your Databricks environment.

With this server, AI assistants like Claude can:

  • 🔧 Manage compute resources with precision
  • 📊 Execute SQL queries and analyze results
  • 📁 Organize and manipulate workspace objects
  • ✨ And much more!

🔍 Prerequisites

  • 🐍 Python 3.11 or higher
  • 💻 A Databricks workspace
  • 🔑 Databricks Personal Access Token (PAT)
  • 📦 Required Python packages (installed during setup)

🚀 Quickstart

1️⃣ Clone the repository

git clone https://github.com/leminkhoa/databricks-mcp
cd databricks-mcp

2️⃣ Configure environment variables

Create a .env file in the project root with your Databricks credentials:

# Databricks API Configuration
DATABRICKS_HOST="https://adb-<your workspace uri>.azuredatabricks.net/"
DATABRICKS_TOKEN="dapi_<your_token_here>"

# Server Configuration
SERVER_HOST="0.0.0.0"
SERVER_PORT="8000"
DEBUG="false"
TRANSPORT="stdio"

# Logging Configuration
LOG_LEVEL="INFO"

💡 Tip: You can use the env.sample file in the repository as a template.
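
For reference, here is roughly how a server like this can consume these variables at startup. This is a minimal sketch assuming the python-dotenv package; the project's actual loading code may differ.

# Minimal sketch of loading .env configuration (assumes python-dotenv is installed;
# not necessarily the project's actual startup code).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]    # required: fail fast if missing
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]  # required
SERVER_PORT = int(os.getenv("SERVER_PORT", "8000"))     # optional, with default
DEBUG = os.getenv("DEBUG", "false").lower() == "true"   # optional boolean flag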

3️⃣ Choose Your Installation Method

You can use MCP Databricks in two ways:

Option A: Docker (Recommended for production)
  1. Build the Docker image:
docker build -t databricks-mcp .
  2. Configure in Cursor with the following mcp.json entry:
{
  "databricks-mcp-docker": {
    "command": "docker",
    "args": [
      "run",
      "--rm",
      "-i",
      "--name", "databricks-mcp",
      "--env-file", "<path/to/.env>",
      "databricks-mcp"
    ]
  }
}

💡 Note: Replace <path/to/.env> with the absolute path to your .env file.

Option B: Local Installation with uv
  1. Install uv (if not already installed):
curl -LsSf https://astral.sh/uv/install.sh | sh

💡 Note: See uv installation documentation for alternative installation methods.

  2. Set up a virtual environment:
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  3. Install dependencies with uv:
uv sync
  4. Configure in Cursor with the following mcp.json entry:
{
  "databricks-mcp-stdio": {
    "command": "uv",
    "args": [
      "--directory",
      "<repository directory>",
      "run",
      "main.py"
    ]
  }
}

💡 Note: Replace <repository directory> with the absolute path to your cloned repository.

🚀 Usage

Running the MCP Server

If you are not using Docker or the Cursor integration, start the server directly with:

python main.py

or

uv run main.py

Connecting to Claude or other MCP clients

This server uses the stdio transport for seamless compatibility with Claude Desktop and other MCP clients. After installing the server, you can immediately connect to it using your preferred MCP client.
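
If you prefer to script against the server rather than use a desktop client, the official mcp Python SDK can spawn it and talk to it over stdio. Below is a minimal sketch; the command and args mirror the uv-based mcp.json entry above, and <repository directory> is the same placeholder you filled in there.

# Minimal sketch: connect to the server over stdio with the official `mcp` SDK
# and list the tools it exposes.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    params = StdioServerParameters(
        command="uv",
        args=["--directory", "<repository directory>", "run", "main.py"],
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()          # MCP handshake
            tools = await session.list_tools()  # discover available tools
            print([tool.name for tool in tools.tools])

asyncio.run(main())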

🧰 Tools and Capabilities

The MCP Databricks server provides a comprehensive toolkit for managing your Databricks environment:

💻 Cluster Management

Tool | Description
list_clusters | List all Databricks clusters in the workspace
create_cluster | Create a new Databricks cluster with customizable settings
delete_cluster | Delete a Databricks cluster by ID
start_cluster | Start a terminated Databricks cluster
list_node_types | List all available node types for Databricks clusters
list_spark_versions | List all available Spark versions for Databricks clusters
get_cluster | Get detailed information about a specific Databricks cluster
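
Once connected (see the client sketch in the Usage section), tools are invoked by name through the MCP session. The sketch below calls two of the cluster tools; the cluster_id argument name follows the Databricks REST API convention but is an assumption here, so verify the exact parameter names against the schemas returned by list_tools.

# Sketch: invoking cluster tools on an initialized ClientSession (see Usage above).
# The "cluster_id" argument name is an assumption; check the tool schema first.
from mcp import ClientSession

async def show_clusters(session: ClientSession) -> None:
    clusters = await session.call_tool("list_clusters", {})
    print(clusters.content)

    detail = await session.call_tool(
        "get_cluster", {"cluster_id": "0101-120000-abcd123"}
    )
    print(detail.content)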

📦 Library Management

Tool | Description
install_libraries | Install libraries (JAR, WHL, PyPI, Maven, CRAN, etc.) on a running cluster

🖥️ Command Execution

Tool | Description
execute_command | Execute a command (Python, Scala, SQL) on a running Databricks cluster
create_execution_context | Create an execution context for interactive command sessions
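
As a sketch, running a SQL statement through execute_command might look like the following. The argument names (cluster_id, language, command) are assumptions modeled on the Databricks Command Execution API, so check the tool schema before relying on them.

# Sketch: running a SQL command on a cluster via the execute_command tool.
# Argument names are hypothetical; verify them via session.list_tools().
from mcp import ClientSession

async def run_sql(session: ClientSession, cluster_id: str) -> None:
    result = await session.call_tool(
        "execute_command",
        {"cluster_id": cluster_id, "language": "sql", "command": "SELECT 1"},
    )
    print(result.content)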

📊 SQL Warehouse Management

Tool | Description
list_sql_warehouses | List all SQL warehouses in the workspace
create_sql_warehouse | Create a new SQL warehouse with configurable size and settings

📁 Workspace Objects

Tool | Description
delete_workspace_object | Delete an object from the Databricks workspace
get_workspace_object_status | Get the status of an object in the Databricks workspace
import_workspace_object | Import an object (notebook, file, etc.) into the workspace
create_workspace_directory | Create a directory in the Databricks workspace
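
As a final sketch, here is how importing a small notebook with import_workspace_object might look. The Databricks Workspace Import API expects base64-encoded content, so the example encodes its source first; the argument names (path, format, language, content) are assumptions modeled on that API and should be verified against the tool schema.

# Sketch: importing a one-line Python notebook into the workspace.
# Argument names mirror the Databricks Workspace Import API but are
# assumptions here; verify them via session.list_tools().
import base64
from mcp import ClientSession

async def import_notebook(session: ClientSession) -> None:
    source = base64.b64encode(b"print('hello from MCP')").decode()
    result = await session.call_tool(
        "import_workspace_object",
        {
            "path": "/Shared/mcp-demo",
            "format": "SOURCE",
            "language": "PYTHON",
            "content": source,
        },
    )
    print(result.content)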