mcp-gemini-server

The MCP Gemini Server is an MCP-compatible server that simplifies integrating Google's Gemini models with other large language models (LLMs) and systems. It provides a consistent interface to Gemini's capabilities, including text and image generation, audio transcription, and URL processing. It is designed as a backend service, with tools for generating, analyzing, and interacting with content.

Overview

This project provides a dedicated MCP (Model Context Protocol) server that wraps the @google/genai SDK (v0.10.0). It exposes Google's Gemini model capabilities as standard MCP tools, so other LLMs (such as Claude) or MCP-compatible systems can use Gemini as a backend workhorse. The goal is to simplify integration by offering a consistent, tool-based interface managed through the MCP standard. Supported models include gemini-1.5-pro-latest, gemini-1.5-flash-latest, and gemini-2.5-pro.
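To make the "tool-based interface" concrete, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client sends to invoke a tool through the protocol's `tools/call` method. The tool name `gemini_generateContent` and the argument names `modelName` and `prompt` are assumptions chosen for illustration; check the server's own tool listing for the actual names and schemas.

```typescript
// Shape of a JSON-RPC 2.0 request for the MCP "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

let nextId = 1;

// Build a tools/call request for a named tool with its arguments.
function buildToolCall(
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical tool name and argument names, used only for illustration.
const request = buildToolCall("gemini_generateContent", {
  modelName: "gemini-1.5-flash-latest",
  prompt: "Summarize the Model Context Protocol in one sentence.",
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library would build and send this message over the configured transport; the sketch only shows what travels on the wire.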

Features

  • Core Generation: Standard and streaming text generation with support for system instructions and cached content.
  • Function Calling: Lets Gemini models request execution of client-defined functions.
  • Stateful Chat: Manages conversational context across multiple turns.
  • File Handling: Upload, list, retrieve, and delete files.
  • Caching: Manage cached content to optimize prompts.
  • Image Generation: Generate images from text prompts.
  • Object Detection: Detect objects in images and return bounding box coordinates.
  • Visual Content Understanding: Extract information from visual content.
  • Audio Transcription: Transcribe audio files with optional timestamps and multilingual support.
  • URL Context Processing: Fetch and analyze web content directly from URLs.
  • MCP Client: Connect to and interact with external MCP servers.
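
As an illustration of what the Stateful Chat feature implies, the following sketch keeps per-session conversation history in memory so that each new turn can be sent together with the earlier ones. The class `ChatSessionStore` and its methods are hypothetical and are not taken from this server's code; only the `user` and `model` role names follow Gemini's conventions.

```typescript
type Role = "user" | "model";

interface Turn {
  role: Role;
  text: string;
}

// Hypothetical in-memory store: session ID -> ordered list of turns.
class ChatSessionStore {
  private sessions = new Map<string, Turn[]>();
  private counter = 0;

  // Create a new empty session and return its ID.
  start(): string {
    const id = `session-${++this.counter}`;
    this.sessions.set(id, []);
    return id;
  }

  // Append a turn to an existing session; throws if the ID is unknown.
  append(id: string, role: Role, text: string): void {
    const history = this.sessions.get(id);
    if (!history) throw new Error(`Unknown session: ${id}`);
    history.push({ role, text });
  }

  // Return a copy of the history so callers cannot mutate stored state.
  history(id: string): Turn[] {
    const history = this.sessions.get(id);
    if (!history) throw new Error(`Unknown session: ${id}`);
    return [...history];
  }
}

const store = new ChatSessionStore();
const sessionId = store.start();
store.append(sessionId, "user", "What is MCP?");
store.append(sessionId, "model", "MCP is the Model Context Protocol.");
console.log(store.history(sessionId));
```

A real server would also need session expiry and would forward the accumulated history to the model on each turn; both are omitted here to keep the sketch short.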