GXtract MCP Server

GXtract is a Model Context Protocol (MCP) server designed to integrate with VS Code and other compatible editors. It provides a suite of tools for interacting with the GroundX platform, enabling you to leverage its powerful document understanding capabilities directly within your development environment.
Table of Contents
- Features
- Architecture
- Prerequisites
- Installing UV
- Quick Start: VS Code Integration
- Available Tools
- Configuration
- Development
- Documentation
- Cache Management
- Dependency Management
- Versioning
- License
Features
- GroundX Integration: Access GroundX functionalities like document search, querying, and semantic object explanation.
- MCP Compliant: Built for use with VS Code's MCP client and other MCP-compatible systems.
- Efficient and Modern: Developed with Python 3.12+ and FastMCP v2 for performance.
- Easy to Configure: Simple setup for VS Code.
- Caching: In-memory cache for GroundX metadata to improve performance and reduce API calls.
Architecture
The high-level system architecture of GXtract illustrates how the components interact:
graph TB
subgraph "Client"
VSC[VS Code / Editor]
end
subgraph "GXtract MCP Server"
MCP[MCP Interface<br>stdio/http]
Server[GXtract Server]
Cache[Metadata Cache]
Tools[Tool Implementations]
end
subgraph "External Services"
GXAPI[GroundX API]
end
VSC -->|MCP Protocol| MCP
MCP --> Server
Server --> Tools
Tools -->|Query| GXAPI
Tools -->|Read/Write| Cache
Cache -.->|Refresh| GXAPI
This diagram shows:
- Client Integration: VS Code communicates with GXtract using the MCP protocol
- Transport Layer: Supports both stdio (for direct VS Code integration) and HTTP transport
- Core Components: Server manages tool registration and requests
- Caching Layer: Maintains metadata to reduce API calls
- Tool Implementation: Provides specialized functions for interacting with GroundX
- API Communication: Secure connection to GroundX platform
For more detailed architecture information, see the full documentation.
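The "Core Components" role described above (a server that registers tools and routes requests to them) can be pictured with a small registry. This is only an illustrative sketch in plain Python: `ToolRegistry`, `register`, `dispatch`, and the placeholder handler are hypothetical names, not GXtract's or FastMCP's actual API, and the returned statistics are dummy values.

```python
from typing import Any, Callable

# A tool handler takes a params dict and returns any JSON-serializable value.
ToolHandler = Callable[[dict[str, Any]], Any]


class ToolRegistry:
    """Hypothetical registry mapping tool names to handler functions."""

    def __init__(self) -> None:
        self._tools: dict[str, ToolHandler] = {}

    def register(self, name: str) -> Callable[[ToolHandler], ToolHandler]:
        """Decorator that stores a handler under the given tool name."""
        def decorator(handler: ToolHandler) -> ToolHandler:
            self._tools[name] = handler
            return handler
        return decorator

    def dispatch(self, name: str, params: dict[str, Any]) -> Any:
        """Route a request to the registered handler, or fail for unknown names."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](params)


registry = ToolRegistry()


@registry.register("cache/getCacheStatistics")
def cache_statistics(params: dict[str, Any]) -> dict[str, int]:
    # Placeholder body: a real handler would read from the metadata cache.
    return {"cached_projects": 0}
```

Calling `registry.dispatch("cache/getCacheStatistics", {})` runs the placeholder handler; an unregistered name raises `KeyError`.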
Prerequisites
- Python 3.12 or higher.
- UV (Python package manager): Version 0.7.6 or higher. You can install it from astral.sh/uv.
- GroundX API Key: You need a valid API key from the GroundX Dashboard.
Installing UV
Before you can use GXtract, you need to install UV (version 0.7.6 or higher), a modern Python package manager written in Rust that offers significant performance improvements over traditional tools.
Quick Installation Methods
Windows (PowerShell 7):
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
macOS and Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh
Alternative Installation Methods
Using pip:
pip install --upgrade uv
Using Homebrew (macOS):
brew install uv
Using pipx (isolated environment):
pipx install uv
After installation, verify that UV is working correctly:
uv --version
This should display version 0.7.6 or higher. For more information about UV, visit the official documentation.
Quick Start: VS Code Integration
- Clone the GXtract Repository:
git clone <repository_url> # Replace <repository_url> with the actual URL
cd gxtract
- Install Dependencies using UV: Open a terminal in the gxtract project directory and run:
uv sync
This command creates a virtual environment (if one doesn't exist or isn't active) and installs all necessary dependencies specified in pyproject.toml and uv.lock.
- Set GroundX API Key: The GXtract server requires your GroundX API key. You need to make this key available as an environment variable named GROUNDX_API_KEY. VS Code will pass this environment variable to the server based on the configuration below. Ensure GROUNDX_API_KEY is set in the environment where VS Code is launched, or configure your shell profile (e.g., .bashrc, .zshrc, PowerShell Profile) to set it.
Option 1: Using Environment Variables (as shown above)
This approach reads the API key from your system environment variables:
"env": {
  "GROUNDX_API_KEY": "${env:GROUNDX_API_KEY}"
}
Option 2: Using VS Code's Secure Inputs
VS Code can prompt for your API key and store it securely. Add this to your settings.json:
"inputs": [
  {
    "type": "promptString",
    "id": "groundx-api-key",
    "description": "GroundX API Key",
    "password": true
  }
]
Then reference it in your server configuration:
"env": {
  "GROUNDX_API_KEY": "${input:groundx-api-key}"
}
With this approach, VS Code will prompt you for the API key the first time it launches the server, then store it securely in your system's credential manager (Windows Credential Manager, macOS Keychain, or similar).
- Configure VS Code settings.json: Open your VS Code settings.json file (Ctrl+Shift+P, then search for "Preferences: Open User Settings (JSON)"). Add or update the mcp.servers configuration:
"mcp": {
  "servers": {
    "gxtract": { // You can name this server entry as you like, e.g. GXtract
      "command": "uv",
      "type": "stdio", // 💡 http is also supported but VS Code only supports stdio currently
      "args": [
        // Adjust the path to your gxtract project directory if it's different
        "--directory",
        "DRIVE:\\path\\to\\your\\gxtract", // Example: C:\\Users\\yourname\\projects\\gxtract
        "--project",
        "DRIVE:\\path\\to\\your\\gxtract", // Example: C:\\Users\\yourname\\projects\\gxtract
        "run",
        "gxtract", // This matches the script name in pyproject.toml
        "--transport",
        "stdio" // 💡 Ensure this matches the "type" above
      ],
      "env": {
        // Option 1: Using environment variables (system-wide)
        "GROUNDX_API_KEY": "${env:GROUNDX_API_KEY}"
        // Option 2: Using secure VS Code input (uncomment to use)
        // "GROUNDX_API_KEY": "${input:groundx-api-key}"
      }
    }
  }
}
If using Option 2 (secure inputs), add this section (settings.json):
// 💡 Only needed for Option 2 (secure inputs)
"inputs": [
  {
    "type": "promptString",
    "id": "groundx-api-key",
    "description": "GroundX API Key",
    "password": true
  }
]
Important:
- Replace "DRIVE:\\path\\to\\your\\gxtract" with the absolute path to the gxtract directory on your system.
- The "command": "uv" assumes uv is in your system's PATH. If not, you might need to provide the full path to the uv executable.
- The server name (the "gxtract" key in the example above) is how the server will appear in VS Code's MCP interface.
- Reload VS Code: After saving settings.json, you might need to reload VS Code (Ctrl+Shift+P, "Developer: Reload Window") for the changes to take effect.
- Using GXtract Tools: Once configured, you can access GXtract's tools through VS Code's MCP features (e.g., via chat @ mentions if your VS Code version supports it, or other MCP integrations).
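Hand-escaping Windows paths in settings.json is easy to get wrong. As a convenience, the server entry from the configuration step can be generated with a short script and pasted in; `json.dumps` takes care of doubling the backslashes. This is a sketch only: `build_mcp_settings` is a hypothetical helper, the example path is a placeholder, and the script emits strict JSON without the explanatory comments.

```python
import json


def build_mcp_settings(project_dir: str, server_name: str = "gxtract") -> dict:
    """Return the "mcp" settings fragment for a stdio GXtract server entry."""
    return {
        "mcp": {
            "servers": {
                server_name: {
                    "command": "uv",
                    "type": "stdio",
                    "args": [
                        "--directory", project_dir,
                        "--project", project_dir,
                        "run", "gxtract",
                        "--transport", "stdio",  # must match "type" above
                    ],
                    # Option 1 from the Quick Start: read the key from the environment.
                    "env": {"GROUNDX_API_KEY": "${env:GROUNDX_API_KEY}"},
                }
            }
        }
    }


# Placeholder path; replace it with the absolute path to your gxtract checkout.
settings = build_mcp_settings(r"C:\Users\yourname\projects\gxtract")
print(json.dumps(settings, indent=2))
```

The printed output wraps the fragment in an extra pair of braces, so copy only the `"mcp": { ... }` member into an existing settings.json.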
Available Tools
GXtract provides the following tools for interacting with GroundX:
- groundx/searchDocuments: Search for documents within your GroundX projects.
- groundx/queryDocument: Ask specific questions about a document in GroundX.
- groundx/explainSemanticObject: Get explanations for diagrams, tables, or other semantic objects within documents.
- cache/refreshMetadataCache: Manually refresh the GroundX metadata cache.
- cache/refreshCachedResources: Manually refresh the GroundX projects and buckets cache.
- cache/getCacheStatistics: Get statistics about the cached metadata.
- cache/listCachedResources: List all currently cached GroundX resources (projects, buckets).
Configuration
The server can be configured via command-line arguments when run directly. When used via VS Code, these are typically set in the args array in settings.json.
- --transport {stdio|http}: Communication transport type (default: http, but stdio is used for VS Code).
- --host TEXT: Host address for HTTP transport (default: 127.0.0.1).
- --port INTEGER: Port for HTTP transport (default: 8080).
- --log-level {DEBUG|INFO|WARNING|ERROR|CRITICAL}: Logging level (default: INFO).
- --log-format {text|json}: Log output format (default: text).
- --disable-cache: Disable the GroundX metadata cache.
- --cache-ttl INTEGER: Cache Time-To-Live in seconds (default: 3600).
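To make the options and their defaults concrete, here is a stand-in parser written with the standard library's argparse. It only mirrors the flags documented above; GXtract's real command-line handling may be implemented with a different library, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser mirroring the documented GXtract options and defaults."""
    parser = argparse.ArgumentParser(prog="gxtract")
    parser.add_argument("--transport", choices=["stdio", "http"], default="http",
                        help="communication transport type")
    parser.add_argument("--host", default="127.0.0.1",
                        help="host address for HTTP transport")
    parser.add_argument("--port", type=int, default=8080,
                        help="port for HTTP transport")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"])
    parser.add_argument("--log-format", choices=["text", "json"], default="text")
    parser.add_argument("--disable-cache", action="store_true",
                        help="disable the GroundX metadata cache")
    parser.add_argument("--cache-ttl", type=int, default=3600,
                        help="cache time-to-live in seconds")
    return parser


# Equivalent to the VS Code setup: stdio transport, everything else default.
args = build_parser().parse_args(["--transport", "stdio"])
print(args.transport, args.cache_ttl, args.disable_cache)
```

Note that the default transport is http, which is why the VS Code configuration passes `--transport stdio` explicitly.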
API Key Security
The GroundX API key is sensitive information that should be handled securely. GXtract supports several approaches to provide this key:
- Environment Variables (recommended for development):
  - Set GROUNDX_API_KEY in your system or shell environment
  - VS Code will pass it to the server using ${env:GROUNDX_API_KEY} in settings.json
- VS Code Secure Storage (recommended for shared workstations):
  - Configure VS Code to prompt for the key and store it securely
  - Uses your system's credential manager (Windows Credential Manager, macOS Keychain)
  - Setup using the inputs section in settings.json as shown in the Quick Start
- Direct Environment Variable in VS Code settings (not recommended):
  - It's possible to set the key directly in settings.json:
    "GROUNDX_API_KEY": "your-api-key-here"
  - This is not recommended as it stores the key in plaintext in your settings.json file
Always ensure your API key is not committed to source control or shared with unauthorized users.
Development
To set up for development:
- Clone the repository.
- Navigate to the gxtract directory.
- Create and activate a virtual environment using uv:
uv venv # Create virtual environment in .venv
  - Activate with Windows PowerShell:
  .\.venv\Scripts\Activate.ps1
  - Activate with Linux/macOS bash/zsh:
  source .venv/bin/activate
- Install main project dependencies into the virtual environment:
uv sync # Install main dependencies from pyproject.toml
Development tools (like Ruff, Pytest, Sphinx, etc.) are managed by Hatch and will be installed automatically into a separate environment when you run Hatch scripts (see below). Alternatively, to explicitly create or ensure the Hatch 'default' development environment is set up:
hatch env create default # Ensure your main .venv is active first
If you need to force a complete refresh of this environment, you can remove it first with 'hatch env remove default' before running 'hatch env create default'.
Run linters/formatters (this will also install them via Hatch if not already present):
uv run lint
uv run format
Documentation
The full documentation for GXtract is available at https://sascharo.github.io/gxtract/.
Building Documentation Locally
If you want to build and view the documentation locally:
- Ensure you have installed all development dependencies:
uv sync
- Build the documentation:
uv run hatch -e default run docs-build
- Serve the documentation locally:
uv run hatch -e default run docs-serve
- Open your browser and navigate to http://127.0.0.1:8000
Building Documentation (Sphinx)
The project documentation is built using Sphinx. The following Hatch scripts are available to manage the documentation:
- Build Documentation:
uv run docs-build
This command generates the HTML documentation in the docs/sphinx/build/html directory.
- Serve Documentation Locally:
uv run docs-serve
This starts a local HTTP server (usually at http://127.0.0.1:8000) to preview the documentation. You can specify a different port if needed, e.g., uv run docs-serve 8081.
- Clean Documentation Build:
uv run docs-clean
This command removes the docs/sphinx/build directory, cleaning out old build artifacts.
Ensure your virtual environment is active before running these commands.
Cache Management
GXtract maintains an in-memory cache of GroundX metadata (projects and buckets) to improve performance and reduce API calls. While this cache is automatically populated during server startup and periodically refreshed, there are situations when you may need to manually refresh the cache.
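The general idea of a time-limited metadata cache can be sketched in a few lines. This is not GXtract's implementation: `MetadataCache` and its `loader` argument are hypothetical, and the sketch refreshes lazily on read, whereas the server described here refreshes periodically.

```python
import time
from typing import Any, Callable


class MetadataCache:
    """Hypothetical in-memory cache that reloads its value after a TTL expires."""

    def __init__(self, loader: Callable[[], Any], ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader          # function that fetches fresh metadata
        self._ttl = ttl_seconds        # mirrors the --cache-ttl default of 3600
        self._clock = clock            # injectable so expiry can be tested
        self._value: Any = None
        self._loaded_at: float | None = None

    def refresh(self) -> Any:
        """Force a reload, as a manual cache refresh would."""
        self._value = self._loader()
        self._loaded_at = self._clock()
        return self._value

    def get(self) -> Any:
        """Return the cached value, reloading it first if it is missing or stale."""
        stale = self._loaded_at is None or self._clock() - self._loaded_at >= self._ttl
        return self.refresh() if stale else self._value
```

With this shape, every read inside the TTL window is served from memory, and only the first read after expiry (or an explicit `refresh()`) calls the loader.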
When to Manually Refresh the Cache
You should manually refresh the cache when:
- You've recently created new projects or buckets in your GroundX account and want them to be immediately available in GXtract.
- You see warnings in the server logs about cache population failures.
- You're experiencing issues with project or bucket lookup when using GXtract tools.
How to Refresh the Cache
Using VS Code's MCP Interface
If your VS Code version supports MCP chat interfaces:
- Open VS Code's chat interface.
- Use the @GXtract mention (or whatever name you assigned to the server in your settings).
- Type a command to refresh the cache:
@GXtract Please refresh the GroundX metadata cache
- The VS Code interface will use the appropriate cache refresh tool.
Using Direct JSON-RPC Requests
If you have access to the server through HTTP (when not using stdio transport), you can make direct requests:
curl -X POST http://127.0.0.1:8080/jsonrpc -H "Content-Type: application/json" -d '{
"jsonrpc": "2.0",
"method": "cache/refreshMetadataCache",
"params": {},
"id": "refresh-req-001"
}'
Troubleshooting Common Cache Issues
Warning: "No projects (groups) found or 'groups' attribute missing in API response"
This warning indicates that:
- Your API key might not have access to any projects, or
- No projects have been created in your GroundX account yet, or
- There might be an issue with the GroundX API or connectivity.
Solution:
- Verify you have correctly set up your GroundX account with at least one project.
- Check that your API key has proper permissions.
- Try refreshing the cache manually after confirming your account setup.
Warning: "GroundX metadata cache population failed. Check logs for details"
This warning appears during server startup if the initial cache population failed.
Solution:
- Check the full server logs for more details about the error.
- Verify your API key is correctly set in the environment.
- Check your internet connection and GroundX API availability.
- Try using the cache/refreshMetadataCache tool to manually populate the cache.
Checking Cache Status
You can check the current status of the cache with:
{
"jsonrpc": "2.0",
"method": "cache/getCacheStatistics",
"params": {},
"id": "stats-req-001"
}
Or list the currently cached resources:
{
"jsonrpc": "2.0",
"method": "cache/listCachedResources",
"params": {},
"id": "list-req-001"
}
Dependency Management
GXtract uses uv for dependency management. Dependencies are specified in pyproject.toml and locked in uv.lock to ensure reproducible installations.
Working with Dependencies
- Installing dependencies: Run uv sync to install all dependencies according to the lockfile.
- Adding a new dependency: Add the dependency to pyproject.toml and run uv lock to update the lockfile (uv add <package> does both in one step). Do not write uv pip compile output to uv.lock: that command produces a requirements-style file, not uv's lockfile format.
- Updating dependencies: After manually changing versions in pyproject.toml, run uv lock --upgrade to update the lockfile with the newest compatible versions.
The uv.lock File
The uv.lock file is committed to the repository to ensure that everyone working on the project uses exactly the same dependency versions. This prevents "works on my machine" problems and ensures consistent behavior across development environments and CI/CD pipelines.
When making changes to dependencies, always commit both the updated pyproject.toml and the uv.lock file.
Versioning
This project adheres to Semantic Versioning (SemVer 2.0.0).
License
This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.