watsonx-mcp-server
This document provides a comprehensive guide to building a Watsonx.ai Chatbot Server using the Model Context Protocol (MCP) in Python.
Building a Watsonx.ai Chatbot Server with MCP in Python
In this in-depth tutorial, you'll learn how to create a professional, production-ready chatbot server powered by IBM Watsonx.ai and exposed via the Model Context Protocol (MCP) Python SDK. By the end, you'll have a reusable MCP service that any MCP-compatible client (e.g., Claude Desktop, custom Python clients) can invoke as a standardized chat tool.
We'll walk through:
- Setting up your environment step-by-step
- Installing dependencies and managing credentials securely
- Writing clean, well-documented Python code
- Exposing Watsonx.ai inference as an MCP tool
- Running, testing, and troubleshooting your server
- Tips for extending and hardening the service
Introduction
IBM Watsonx.ai offers cutting-edge large-language-model (LLM) inference via IBM Cloud, while the Model Context Protocol (MCP) standardizes how applications expose tools, resources, and prompts to LLM clients. By combining the two, you get:
- Modularity: decouple your chatbot logic from client implementations.
- Reusability: any MCP-compatible client can call the same chat endpoint.
- Rapid iteration: built-in development inspector with live reloading.
Whether you're building an internal helpdesk bot or a public chatbot API, this pattern scales and adapts easily.
Prerequisites
Before you begin, ensure you have:
- IBM Cloud Watsonx.ai credentials: an API key, service URL, and project ID
- Python 3.9+ (we recommend 3.11+ for performance and typing improvements)
- pip (Python package installer)
- Virtual environment tool (venv or virtualenv)
- Basic command-line familiarity (Linux/macOS/Windows WSL)
We'll install the following Python packages:
- python-dotenv: loads environment variables from a .env file
- ibm-watsonx-ai: IBM's official Watsonx.ai SDK
- mcp[cli]: MCP Python SDK and CLI tools
Project Structure
Create a new directory for your project. Your final tree will look like this:
watsonx-mcp-server/
├── .env
├── .gitignore
├── requirements.txt
└── server.py
- .env: your secret credentials (never commit to source control!)
- .gitignore: ignores .env, __pycache__, and .venv
- requirements.txt: pinned dependency list
- server.py: the full MCP chatbot server implementation
Environment Setup
- Create & activate a virtual environment:

python3 -m venv .venv
source .venv/bin/activate       # macOS/Linux
.venv\Scripts\activate.bat      # Windows

- Pin and install dependencies. In requirements.txt:

python-dotenv>=0.21.0
ibm-watsonx-ai==1.3.8
mcp[cli]>=1.6.0

Then install:

pip install --upgrade pip
pip install -r requirements.txt

- Secure your credentials. Create a file named .env in the project root and add your Watsonx.ai details:

WATSONX_APIKEY=your-ibm-watsonx-api-key
WATSONX_URL=https://api.your-region.watsonx.ai
PROJECT_ID=your-watsonx-project-id

Add .env (along with .venv/ and __pycache__/) to .gitignore:

.env
.venv/
__pycache__/
Writing the Chatbot Server (server.py)
Open server.py and follow these sections.
Imports & Configuration
# server.py
import os
import logging
from dotenv import load_dotenv
from mcp.server.fastmcp import FastMCP
# IBM Watsonx.ai SDK
from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models import ModelInference
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
- We configure logging for easier debugging.
- We load environment variables early.
# Load .env variables
load_dotenv()
# Fetch credentials
API_KEY = os.getenv("WATSONX_APIKEY")
URL = os.getenv("WATSONX_URL")
PROJECT_ID = os.getenv("PROJECT_ID")
MODEL_ID = os.getenv("MODEL_ID", "ibm/granite-13b-instruct-v2")
Credential Validation & Client Initialization
# Validate env vars
for name, val in [
    ("WATSONX_APIKEY", API_KEY),
    ("WATSONX_URL", URL),
    ("PROJECT_ID", PROJECT_ID),
]:
    if not val:
        raise RuntimeError(f"{name} is not set. Please add it to your .env file.")

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s"
)

# Initialize IBM credentials & client
creds = Credentials(url=URL, api_key=API_KEY)
client = APIClient(credentials=creds, project_id=PROJECT_ID)

# Initialize the inference model
model = ModelInference(
    model_id=MODEL_ID,
    credentials=creds,
    project_id=PROJECT_ID
)
logging.info(
    f"Initialized Watsonx.ai model '{MODEL_ID}' "
    f"for project '{PROJECT_ID}'."
)
Defining the MCP chat Tool
# Create the MCP server instance
mcp = FastMCP("Watsonx Chatbot Server")

@mcp.tool()
def chat(query: str) -> str:
    """
    MCP tool: generate a chatbot response via Watsonx.ai
    :param query: User's input message
    :return: Watsonx.ai generated response
    """
    logging.info("Received chat query: %r", query)
    # Define generation parameters
    params = {
        GenParams.DECODING_METHOD: "greedy",
        GenParams.MAX_NEW_TOKENS: 200,
    }
    # Run the model
    try:
        # Request the full JSON response rather than just a string
        resp = model.generate_text(
            prompt=query,
            params=params,
            raw_response=True
        )
        # Log (not print) the raw response: writing to stdout would corrupt
        # the STDIO transport that MCP uses for protocol messages.
        logging.debug("AI raw response: %s", resp)
        # Extract the generated text from the dict
        text = resp["results"][0]["generated_text"].strip()
        logging.info("Generated response: %r", text)
        return text
    except Exception as e:
        logging.error("Inference error: %s", e, exc_info=True)
        return f"Error generating response: {e}"
Prompts & Resources
Exposing Resources
Resources let you inject dynamic data into your LLM's context. For example, here we expose a personalized greeting:
# Expose a greeting resource that dynamically constructs a personalized greeting.
@mcp.resource("greeting://patient/{name}")
def get_greeting(name: str) -> str:
    """
    Return a medical-style greeting for the given patient name.
    :param name: The patient's name.
    :return: A personalized greeting.
    """
    return f"Hello {name}, I'm your medical assistant. How can I help you today?"
Adding Prompts (Optional)
Prompts let you define reusable templates. Here's a simple medical-chat example that asks the model to assess symptoms:

@mcp.prompt()
def assess_symptoms(symptoms: str) -> str:
    """
    Prompt template for symptom assessment.
    :param symptoms: Description of patient symptoms.
    :return: A prompt asking the LLM to analyze and suggest next steps.
    """
    return (
        "You are a qualified medical assistant. The patient reports the following symptoms:\n"
        f"{symptoms}\n\n"
        "Please provide possible causes, recommended next steps, and when to seek immediate care."
    )
Entry Point & Logging
if __name__ == "__main__":
    # Start the MCP server (blocking call)
    logging.info("Starting MCP server on STDIO transport...")
    mcp.run()
That's it! You now have:
- A working chat tool that correctly parses the raw_response from Watsonx.ai.
- A resource to inject personalized greetings.
- An optional prompt template for medical symptom assessment.
Running & Testing the Server
First, activate your environment:
source .venv/bin/activate
Development Mode with mcp dev
The fastest way to iterate is:
mcp dev server.py
- Live reloading on code changes
- Inspector UI at http://localhost:6274/
- Interactive testing of the chat tool
How to Interact with the Server
Open http://localhost:6274/ in your browser.
When you open the Inspector, you will notice several key sections designed to facilitate server testing.
Go to the top of the MCP Inspector interface, where it shows:
Transport Type: STDIO
Command: uv
Arguments: run --with mcp mcp run server.py
Because our server.py is a standalone script using a plain pip-based virtual environment, correct the configuration in the MCP Inspector to:
Transport Type: STDIO
Command: python
Arguments: server.py
Then click Connect, and the server will launch properly under your Python interpreter.
Next, go to Tools → List Tools, click on chat, and in the query field type:
What is watsonx in IBM?
Prompt via the MCP Inspector
In the Inspector's sidebar, expand Prompts → List Prompts.
You should see assess_symptoms listed.
In the prompt pane, you'll find a form or JSON editor ready to accept arguments.
Supply the symptoms parameter. For example:
persistent dull ache, stiffness, and general back pain
Click Get Prompt.
You will receive:
{
  "messages": [
    {
      "role": "user",
      "content": {
        "type": "text",
        "text": "You are a qualified medical assistant. The patient reports the following symptoms:\npersistent dull ache, stiffness, and general back pain\n\nPlease provide possible causes, recommended next steps, and when to seek immediate care."
      }
    }
  ]
}
Accessing a Resource
Once your server is up and running in the Inspector (using the setup from above), you can invoke any registered resource by its URI:
- Open the resource interaction pane: in the MCP Inspector sidebar, click Resources → Resource Templates, then List Templates, and select get_greeting.
- Enter the parameter: in the name input field, type John.
- Invoke the resource: click Read Resource. The Inspector fills in the URI template and sends it to your @mcp.resource("greeting://patient/{name}") handler.
- View the response: you should see:
{
  "contents": [
    {
      "uri": "greeting://patient/John",
      "mimeType": "text/plain",
      "text": "Hello John, I'm your medical assistant. How can I help you today?"
    }
  ]
}
This confirms your dynamic greeting resource is wired up correctly and returns personalized output on demand.
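You can also read the same resource programmatically. Here is a minimal standalone sketch, assuming server.py sits in the current directory:

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def read_greeting():
    # Spawn server.py over STDIO and read the templated greeting resource.
    params = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(params) as (reader, writer):
        async with ClientSession(reader, writer) as session:
            await session.initialize()
            resp = await session.read_resource("greeting://patient/John")
            # Each content item carries the text returned by the handler.
            print(resp.contents[0].text)

asyncio.run(read_greeting())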
Direct Execution
To run without the Inspector:
python server.py
It will sit silently, waiting for MCP-formatted requests on STDIO.
Example Python Client
Save this as client.py alongside server.py:
# client.py
import asyncio
from mcp import ClientSession
from mcp.client.stdio import stdio_client
from mcp import StdioServerParameters

async def main():
    server_params = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(server_params) as (reader, writer):
        async with ClientSession(reader, writer) as session:
            await session.initialize()
            # Call the chat tool
            user_msg = "Hello, how are you today?"
            response = await session.call_tool("chat", arguments={"query": user_msg})
            print("Bot:", response)

if __name__ == "__main__":
    asyncio.run(main())
Because stdio_client launches server.py as a subprocess, the client starts its own server instance; you don't need the server running in a separate terminal. Simply run:
python client.py
You should see:
Bot: meta=None content=[TextContent(type='text', text='I am good. Thank you.', annotations=None)] isError=False
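As the output above shows, call_tool returns a structured result rather than a plain string. To print only the reply text, you can extract it from the content list; a small sketch that drops into main() in client.py:

# Inside main(), after call_tool:
# response.content is a list of content items; text items expose a .text attribute.
reply = "\n".join(
    item.text for item in response.content if getattr(item, "text", None)
)
print("Bot:", reply)  # -> Bot: I am good. Thank you.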
Building a Watsonx.ai Medical Chatbot with Flask and MCP
Let's now walk through creating a web-based medical chatbot using:
- MCP (Model Context Protocol) to structure our prompt and tool calls
- Watsonx.ai for LLM inference
- Flask for the frontend UI
By the end, you'll have a Flask app that:
- Greets the user by name
- Collects their symptoms
- Generates a diagnosis prompt via MCP
- Calls Watsonx.ai to get medical advice
- Displays the assessment on a styled web page
Project Structure
├── chatbot.py
├── server.py            # Your MCP server implementation
├── static/
│   └── assets/
│       └── watsonx-wallpaper.jpg
└── templates/
    ├── base.html
    ├── home.html
    ├── symptoms.html
    └── diagnosis.html
All front-end templates live under templates/. Static assets (like our wallpaper) go in Flask's static/ directory.
Step 1: Writing chatbot.py
This script initializes a single long-lived MCP client session on import and reuses it across requests.
import os
import atexit
import asyncio

from flask import Flask, render_template, request, redirect, url_for, session
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Flask app setup
app = Flask(__name__)
app.secret_key = os.environ.get("SECRET_KEY", os.urandom(24))

# MCP server parameters
SERVER_PARAMS = StdioServerParameters(command="python", args=["server.py"], env=None)

# Dedicated asyncio loop for MCP
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)

# Globals for client and session contexts
_stdio_ctx = None
_session_ctx = None
SESSION = None

async def _init_session():
    global _stdio_ctx, _session_ctx, SESSION
    _stdio_ctx = stdio_client(SERVER_PARAMS)
    _reader, _writer = await _stdio_ctx.__aenter__()
    _session_ctx = ClientSession(_reader, _writer)
    SESSION = await _session_ctx.__aenter__()
    await SESSION.initialize()

# Initialize once at import
loop.run_until_complete(_init_session())
app.logger.info("MCP client session initialized once.")

async def _close_session():
    if _session_ctx:
        await _session_ctx.__aexit__(None, None, None)
    if _stdio_ctx:
        await _stdio_ctx.__aexit__(None, None, None)

atexit.register(lambda: loop.run_until_complete(_close_session()))
# Helper: fetch greeting text
def fetch_greeting(name: str) -> str:
    resp = loop.run_until_complete(SESSION.read_resource(f"greeting://patient/{name}"))
    contents = getattr(resp, 'contents', None)
    if isinstance(contents, list):
        return "\n".join(c.text for c in contents).strip()
    return str(resp)

# Helper: assess symptoms via chat tool
def assess_symptoms(symptoms: str) -> str:
    prompt_resp = loop.run_until_complete(
        SESSION.get_prompt("assess_symptoms", arguments={"symptoms": symptoms})
    )
    # Extract clean text from prompt_resp.messages
    msgs = getattr(prompt_resp, 'messages', None)
    if msgs:
        lines = []
        for m in msgs:
            txt = m.content.text if hasattr(m.content, 'text') else str(m.content)
            lines.append(txt)
        diagnosis_prompt = "\n".join(lines).strip()
    else:
        diagnosis_prompt = str(prompt_resp)
    tool_resp = loop.run_until_complete(
        SESSION.call_tool("chat", arguments={"query": diagnosis_prompt})
    )
    cont = getattr(tool_resp, 'content', None)
    if isinstance(cont, list):
        return "\n".join(c.text for c in cont).strip()
    return str(cont).strip()
# Flask routes
@app.route("/", methods=["GET", "POST"])
def home():
    if request.method == "POST":
        session['name'] = request.form['name']
        return redirect(url_for('symptoms'))
    return render_template("home.html")

@app.route("/symptoms", methods=["GET", "POST"])
def symptoms():
    name = session.get('name')
    if not name:
        return redirect(url_for('home'))
    if request.method == "POST":
        diag = assess_symptoms(request.form['symptoms'])
        return render_template("diagnosis.html", diagnosis=diag)
    greet = fetch_greeting(name)
    return render_template("symptoms.html", greeting=greet)

if __name__ == "__main__":
    app.run(debug=True)
Explanation:
- We call stdio_client and ClientSession at import time to avoid reconnecting on every HTTP request.
- fetch_greeting reads a resource and extracts only the text.
- assess_symptoms builds a clean prompt, invokes the chat tool, and returns the AI's reply.
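One caveat with this design: Flask's development server may serve requests from multiple threads, and a shared asyncio loop is not thread-safe. A minimal guard, assuming the globals defined above (the run_sync helper is a hypothetical addition, not part of the original code):

import threading

_loop_lock = threading.Lock()

def run_sync(coro):
    # Hypothetical helper: serialize all MCP calls onto the single shared loop.
    with _loop_lock:
        return loop.run_until_complete(coro)

With this in place, fetch_greeting and assess_symptoms would call run_sync(...) wherever they currently call loop.run_until_complete(...).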
Step 2: Base Template (base.html)
A single file to handle the wallpaper and layout for all pages.
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>{% block title %}{% endblock %}</title>
<link href="https://cdn.jsdelivr.net/npm/tailwindcss/dist/tailwind.min.css" rel="stylesheet">
<style>
body {
background-image: url('{{ url_for("static", filename="assets/watsonx-wallpaper.jpg") }}');
background-size: cover;
background-position: center;
background-repeat: no-repeat;
}
</style>
</head>
<body class="min-h-screen flex items-center justify-center p-4">
{% block content %}{% endblock %}
</body>
</html>
Explanation:
- We load the wallpaper as a full-screen background.
- Use Tailwind utility classes to center content.
Step 3: Home Page (home.html)
{% extends "base.html" %}
{% block title %}Medical Chatbot – Welcome{% endblock %}
{% block content %}
<div class="bg-white bg-opacity-80 p-8 rounded shadow-md w-full max-w-md">
<h1 class="text-2xl mb-4 font-bold">Welcome to Your Medical Assistant</h1>
<form method="post">
<label class="block mb-2">Whatâs your name?</label>
<input name="name" required class="w-full p-2 border rounded mb-4" placeholder="e.g. Angela"/>
<button type="submit" class="w-full bg-blue-500 text-white py-2 rounded">Continue</button>
</form>
</div>
{% endblock %}
Explanation:
- Extends base.html to inherit the background.
- Simple form to collect name.
Step 4: Symptoms Page (symptoms.html)
{% extends "base.html" %}
{% block title %}Medical Chatbot – Symptoms{% endblock %}
{% block content %}
<div class="bg-white bg-opacity-80 p-8 rounded shadow-md w-full max-w-md">
<h2 class="text-xl mb-4">{{ greeting }}</h2>
<form method="post">
<label class="block mb-2">Please describe your symptoms:</label>
<textarea name="symptoms" rows="4" required class="w-full p-2 border rounded mb-4"
placeholder="Headache, fever, coughing..."></textarea>
<button type="submit" class="w-full bg-green-500 text-white py-2 rounded">Get Assessment</button>
</form>
</div>
{% endblock %}
Explanation:
- Displays the greeting fetched from MCP.
- Collects symptoms text.
Step 5: Diagnosis Page (diagnosis.html)
{% extends "base.html" %}
{% block title %}Medical Chatbot – Assessment{% endblock %}
{% block content %}
<div class="bg-white bg-opacity-80 p-8 rounded shadow-md w-full max-w-lg whitespace-pre-wrap">
<h2 class="text-xl mb-4 font-bold">Assessment & Recommendations</h2>
{{ diagnosis }}
<div class="mt-6">
<a
href="{{ url_for('home') }}"
class="w-full block bg-blue-500 text-white py-2 rounded text-center"
>
Start Over
</a>
</div>
</div>
{% endblock %}
Explanation:
- Renders the AI's advice with preserved line breaks (whitespace-pre-wrap).
Once you have created the full project, simply run
python chatbot.py
(chatbot.py launches server.py automatically as an MCP subprocess over STDIO). Then open http://127.0.0.1:5000/ in your browser.
- Home Page: type the name of the patient.
- Symptoms Page: give the AI some example symptoms, such as:
Fever, Cough, Sore Throat, Muscle or Body Aches, Headache, Fatigue.
- Diagnosis Page: the AI connects to Watsonx.ai through the MCP server and displays the results.
Troubleshooting & Best Practices
- ".env not loaded": confirm load_dotenv() is called before os.getenv.
- "Connection refused": ensure you're running the server (mcp dev or python server.py) before the client.
- Latency spikes: consider using streaming endpoints or a smaller model.
- Secrets management: for production, use a secure vault (AWS Secrets Manager, IBM Key Protect) instead of a plain .env.
- Logging levels: switch to DEBUG during development (logging.basicConfig(level=logging.DEBUG)).
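On the streaming point: the ibm-watsonx-ai SDK's ModelInference exposes a generate_text_stream method that yields text chunks as they are produced (availability may vary by SDK version). A rough sketch, reusing the model object from server.py:

# Stream tokens as they arrive instead of waiting for the full completion.
params = {GenParams.DECODING_METHOD: "greedy", GenParams.MAX_NEW_TOKENS: 200}
for chunk in model.generate_text_stream(prompt="What is watsonx in IBM?", params=params):
    print(chunk, end="", flush=True)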
Next Steps & Extensions
- Conversation State: store recent dialogue in a resource or persistence layer (see the sketch after this list).
- Prompts & Templates: add @mcp.prompt() methods to standardize greeting flows or FAQs.
- Additional Tools: integrate knowledge-base search, sentiment analysis, or external APIs.
- Deployment: containerize with Docker, deploy to Kubernetes or a serverless platform.
- Authentication: secure MCP transport with TLS or token-based auth.
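As a starting point for conversation state, here is a minimal sketch that could be added to server.py. The chat_with_memory tool, its session_id parameter, and the in-memory _history store are illustrative additions, not part of the original server:

from collections import defaultdict

# In-memory history per session; swap for Redis or a database in production.
_history = defaultdict(list)

@mcp.tool()
def chat_with_memory(session_id: str, query: str) -> str:
    """Like chat(), but prepends recent turns so the model sees context."""
    _history[session_id].append(f"User: {query}")
    prompt = "\n".join(_history[session_id][-10:]) + "\nAssistant:"
    resp = model.generate_text(
        prompt=prompt,
        params={GenParams.DECODING_METHOD: "greedy", GenParams.MAX_NEW_TOKENS: 200},
        raw_response=True,
    )
    text = resp["results"][0]["generated_text"].strip()
    _history[session_id].append(f"Assistant: {text}")
    return text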
Conclusion
Congratulations! You now have a full-featured, production-minded Watsonx.ai chatbot server exposed via MCP. This pattern gives you a clean separation of concerns:
- MCP server: hosts and documents your tools
- Watsonx.ai: handles powerful LLM inference
- Clients: any standard MCP consumer (CLI, web UI, desktop app)
Happy coding!