# 🌦️ MCP Weather Scraper
This project is an experimental implementation of the Model Context Protocol (MCP), using a lightweight LLM via **OpenAI** and FastAPI to fetch and structure real-time weather information from open web sources. The goal is to explore how LLMs can interact with tools and serve as intelligent agents for retrieving and reasoning over unstructured web data.
## 🚀 Features
- ✅ MCP-compliant server with weather scraping via browser search
- ✅ Integration with an OpenAI LLM (e.g., gpt-3.5-turbo)
- ✅ FastAPI server exposes weather info as a callable MCP tool
- ✅ Fast HTML parsing with `selectolax` for performance
- ✅ LLM extracts unstructured web content into a structured schema
- ✅ Streamlit frontend for user interaction
- ✅ Response caching via `functools.lru_cache`
## 🧠 Refresh Before You Dive In

### Top 5 Concepts to Brush Up On for This Repo
| 🧩 Concept | 🔍 What It Is | ⚙️ Why It Matters |
|---|---|---|
| Model Context Protocol (MCP) | A new protocol for tool-calling in LLMs | Powers structured AI-agent communication |
| Uvicorn | Fast ASGI server for Python web apps | Hosts the FastAPI-based MCP server |
| Selectolax | High-speed HTML parser | Efficiently scrapes and extracts weather data |
| `functools.lru_cache` | Built-in Python decorator that caches function calls | Boosts performance by avoiding repeated fetches |
| Token Usage Metrics (OpenAI) | How many tokens an LLM call consumed | Helps track cost and optimize prompt design |
💡 Even if you're familiar with Python and APIs, these tools represent cutting-edge AI stack engineering and are worth a quick look!
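As a quick illustration of the `functools.lru_cache` row above, here is a minimal, self-contained sketch of cached fetching. The function name `fetch_weather_page` and its body are hypothetical stand-ins for the project's real fetcher (in `utils.py`/`server.py`, which actually hits the web):

```python
from functools import lru_cache

# Hypothetical fetcher; the real one performs a network request and
# returns raw HTML. The decorator memoizes results per location.
@lru_cache(maxsize=128)
def fetch_weather_page(location: str) -> str:
    # Simulate an expensive web fetch.
    return f"<html>weather for {location}</html>"

fetch_weather_page("Seattle")   # miss: does the "fetch"
fetch_weather_page("Seattle")   # hit: served from the cache
print(fetch_weather_page.cache_info())
```

Note that `lru_cache` keys on the exact arguments, so `"Seattle"` and `"seattle"` would be cached separately; normalizing the location string before the call keeps the cache effective.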
## 📊 Token Usage & Performance Metrics
The Streamlit UI now includes:
- ⏱️ **Response Time**: Time taken to fetch and process weather info
- 🧠 **Prompt Tokens**: Tokens used in the LLM prompt
- 💬 **Completion Tokens**: Tokens generated in the LLM response
- 🔢 **Total Tokens**: Total token count per request, useful for cost tracking

These are displayed in a clean visual layout under each result card.
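The token counts come from the `usage` object that OpenAI chat-completion responses carry (`prompt_tokens`, `completion_tokens`, `total_tokens`). A minimal sketch of assembling the metrics above, with `summarize_metrics` as a hypothetical helper name and a plain dict standing in for the API's usage object:

```python
def summarize_metrics(usage: dict, started: float, finished: float) -> dict:
    """Collect the per-request metrics shown under each result card.

    `usage` mirrors the shape of an OpenAI completion's usage object.
    """
    return {
        "response_time_s": round(finished - started, 2),
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        # Fall back to summing if total_tokens is absent.
        "total_tokens": usage.get(
            "total_tokens",
            usage["prompt_tokens"] + usage["completion_tokens"],
        ),
    }

print(summarize_metrics({"prompt_tokens": 120, "completion_tokens": 40}, 0.0, 1.5))
```

In the app, `started`/`finished` would come from `time.perf_counter()` around the request, and the result feeds the Streamlit layout.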
## 🖥️ Streamlit App Preview

![Streamlit app preview](assets/streamlit_screenshot.png)
## Requirements
- Python 3.9 or higher
- Dependencies listed in `requirements.txt`
## 🛠️ Setup
- Clone the repo

  ```shell
  git clone https://github.com/your-username/mcp_weather_scraper.git
  cd mcp_weather_scraper
  ```

- Create and activate a virtual environment

  ```shell
  python -m venv .venv
  .venv\Scripts\activate        # On Windows
  # source .venv/bin/activate   # On macOS/Linux
  ```

- Install dependencies

  ```shell
  pip install -r requirements.txt
  ```

- Set environment variables

  Create a `.env` file in the root directory and add your OpenAI API key:

  ```shell
  OPENAI_API_KEY=your_openai_api_key
  ```
- Run the server

  ```shell
  uvicorn server:app --reload
  ```

  The server will be available at http://localhost:8000. You can access the API documentation at:

  - Swagger UI: http://localhost:8000/docs
  - ReDoc: http://localhost:8000/redoc
- Make a request

  ```shell
  curl -X POST http://localhost:8000/weather -H "Content-Type: application/json" -d '{"location": "Seattle"}'
  ```

  or

  ```shell
  python client.py
  ```

  The script sends a POST request with the following payload:

  ```json
  { "location": "Seattle" }
  ```

  The server will respond with weather data in JSON format, such as:

  ```json
  {
    "location": "Seattle",
    "temperature": "15°C",
    "humidity": "80%",
    "air_quality": "Good",
    "condition": "Cloudy"
  }
  ```
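The response shape above is validated by the Pydantic schemas in `data_models.py`. As a dependency-free illustration of that shape, here is a dataclass sketch (field names inferred from the sample response; the real models are Pydantic, not dataclasses):

```python
import json
from dataclasses import dataclass

# Illustrative shape only; the project's actual schemas live in data_models.py.
@dataclass
class WeatherReport:
    location: str
    temperature: str
    humidity: str
    air_quality: str
    condition: str

raw = ('{"location": "Seattle", "temperature": "15°C", "humidity": "80%", '
       '"air_quality": "Good", "condition": "Cloudy"}')
report = WeatherReport(**json.loads(raw))
print(report.condition)  # Cloudy
```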
## 📦 Folder Structure

```
mcp_weather_scraper/
│
├── assets/
│   └── streamlit_screenshot.png
├── server.py          # MCP-compatible tool server
├── client.py          # MCP client that interacts with model + tools
├── data_models.py     # Pydantic schemas for request/response
├── utils.py           # HTML cleaning, scraping, etc.
├── requirements.txt
└── .env
```
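The `.env` file in the tree above holds the `OPENAI_API_KEY` from the setup steps. Projects like this typically load it with python-dotenv's `load_dotenv()`; for illustration, a minimal stdlib-only loader (the function name `load_env_file` is hypothetical) looks like:

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Minimal .env loader sketch; the project may well use python-dotenv instead."""
    if not os.path.exists(path):
        return
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blanks and comments; never overwrite existing env vars.
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())
```

Calling `load_env_file()` at startup makes `os.getenv("OPENAI_API_KEY")` work in `server.py` and `client.py` without exporting the key in the shell.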
## 📄 License
This project is licensed under the MIT License.