mcp_server_for_databricks

This project provides an MCP server for Databricks that supports data retrieval and management through FastMCP commands. It authenticates with OAuth, so no manual token management is needed, and it supplies metadata about Databricks resources to make working with them more efficient.

MCP Server for Databricks Interaction

Project Aim

This project provides a server application, built on the FastMCP framework, that acts as an interface to a Databricks workspace. Through MCP commands, the server answers queries about Databricks resources such as schemas, tables, and job results. It makes agents more efficient and reduces manual input by handling OAuth authentication, providing metadata for catalogs and schemas, and retrieving sample data for tables.

Main Features

  • OAuth user-to-machine (U2M) authentication without local token storage.
  • Metadata provision for catalog objects using the Databricks SDK.
  • Sample data retrieval for tables using a SQL warehouse.
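
To illustrate the sample data feature, here is a minimal sketch of how a few rows might be fetched through a SQL warehouse. It is not the project's actual implementation: quote_identifier, build_sample_query, and fetch_sample are hypothetical names, the default row limit and the 30 second wait are assumptions, and fetch_sample needs the databricks-sdk package plus working authentication.

```python
def quote_identifier(name):
    """Quote one identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_sample_query(catalog, schema, table, limit=5):
    """Build a SELECT that returns at most `limit` rows from one table."""
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    full_name = ".".join(quote_identifier(p) for p in (catalog, schema, table))
    return f"SELECT * FROM {full_name} LIMIT {limit}"


def fetch_sample(warehouse_id, catalog, schema, table, limit=5):
    """Run the sample query on a SQL warehouse and return (columns, rows)."""
    # Imported lazily so the query builder works without the SDK installed.
    from databricks.sdk import WorkspaceClient

    client = WorkspaceClient()  # picks up credentials from the environment
    response = client.statement_execution.execute_statement(
        warehouse_id=warehouse_id,
        statement=build_sample_query(catalog, schema, table, limit),
        wait_timeout="30s",  # assumed value for this sketch
    )
    # No result is attached if the statement did not finish in time.
    if response.result is None or response.manifest is None:
        return [], []
    columns = [col.name for col in response.manifest.schema.columns]
    return columns, response.result.data_array or []
```

Keeping the query builder separate from the SDK call makes the quoting and limit logic easy to check without a workspace connection.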

Installation and Initialization

Prerequisites

  • Python 3.x
  • uv package manager
  • databricks-cli accessible in PATH
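
The PATH prerequisites above can be verified before installing. The following is a small sketch using only the Python standard library; missing_tools is a hypothetical helper, and it assumes the executables are named uv and databricks (the command that databricks-cli provides).

```python
import shutil


def missing_tools(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names: `uv` and `databricks` (from databricks-cli).
    missing = missing_tools(["uv", "databricks"])
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```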

Installation Steps

  1. Clone the repository and navigate to its directory.
  2. Create and activate a virtual environment, then install the dependencies.
  3. Run the initialization script to create the initial configuration.
  4. Optionally, register the server in Cursor IDE for MCP Server integration.

How it Works

  • Utilizes FastMCP for tool definition and communication.
  • Interacts with Databricks via databricks-sdk for Python.
  • Uses databricks-cli for authentication and metadata retrieval.
  • Configuration is stored in config.yaml.
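
As a rough illustration of tool definition with FastMCP, the sketch below registers a single tool. It assumes the FastMCP class from the official MCP Python SDK (mcp.server.fastmcp); the server name, the list_tables tool, and its placeholder data are hypothetical and make no Databricks calls.

```python
def format_table_list(schema, tables):
    """Render table names as a compact text block for an agent to read."""
    if not tables:
        return f"Schema {schema} contains no tables."
    lines = [f"Tables in {schema}:"]
    lines += [f"- {name}" for name in sorted(tables)]
    return "\n".join(lines)


def build_server():
    """Create a FastMCP server exposing one illustrative tool."""
    # Assumption: FastMCP as shipped in the official MCP Python SDK.
    from mcp.server.fastmcp import FastMCP

    server = FastMCP("databricks-demo")  # hypothetical server name

    @server.tool()
    def list_tables(schema: str) -> str:
        """List the tables in a schema (placeholder data only)."""
        placeholder = {"sales": ["orders", "customers"]}
        return format_table_list(schema, placeholder.get(schema, []))

    return server
```

Calling build_server().run() would start the server on the default stdio transport, which is how an MCP client such as an IDE typically launches it.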

MCP Tools Provided

  • Retrieve schemas and tables within them.
  • Get detailed metadata for tables including samples.
  • Fetch schema metadata including table details.
  • Retrieve job run results, with options for latest or failed runs.
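
The latest-or-failed option for job runs can be pictured as a filter over run records. This is a sketch only: select_runs is a hypothetical helper, the dictionary keys are assumed for illustration, and treating only the FAILED result state as a failure is a simplifying assumption.

```python
def select_runs(runs, only_failed=False, latest_only=False):
    """Filter job runs given as dicts with 'start_time' and 'result_state'."""
    selected = [
        run for run in runs
        if not only_failed or run.get("result_state") == "FAILED"
    ]
    # Newest first, so the latest run is always at index 0.
    selected.sort(key=lambda run: run["start_time"], reverse=True)
    return selected[:1] if latest_only else selected


runs = [
    {"run_id": 1, "start_time": 100, "result_state": "SUCCESS"},
    {"run_id": 2, "start_time": 300, "result_state": "FAILED"},
    {"run_id": 3, "start_time": 200, "result_state": "FAILED"},
]
latest = select_runs(runs, latest_only=True)
failed = select_runs(runs, only_failed=True)
```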

Known Issues

  • The server must be restarted once the JWT expires, which happens after 12 hours.
  • get_table_sample_tool may return excessive context on wide tables.
  • Working with multiple workspaces requires switching the workspace manually.
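
One possible mitigation for the wide-table issue is to cap the number of columns in a sample before returning it to the agent. The sketch below is not part of the project; cap_columns and its default of 20 columns are assumptions chosen for illustration.

```python
def cap_columns(columns, rows, max_columns=20):
    """Keep the first `max_columns` columns and report how many were dropped."""
    kept = columns[:max_columns]
    trimmed = [row[:max_columns] for row in rows]
    dropped = len(columns) - len(kept)
    return kept, trimmed, dropped
```

Returning the dropped count lets the tool tell the agent that the sample was truncated, so it can request specific columns if needed.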