Fast-Whisper-MCP-Server
A high-performance speech recognition MCP server based on Faster Whisper, providing efficient audio transcription capabilities.
Features
- Integrated with Faster Whisper for efficient speech recognition
- Batch processing acceleration for improved transcription speed
- Automatic CUDA acceleration (if available)
- Support for multiple model sizes (tiny to large-v3)
- Output formats include VTT subtitles, SRT, and JSON
- Support for batch transcription of audio files in a folder
- Model instance caching to avoid repeated loading
- Dynamic batch size adjustment based on GPU memory
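The last two features can be sketched in a few lines. This is an illustrative outline only, assuming hypothetical helper names and memory thresholds, not the server's actual code:

```python
# Sketch of model-instance caching and memory-based batch sizing.
# get_model/pick_batch_size and the GB thresholds are illustrative assumptions.

_model_cache = {}

def get_model(model_size, device, loader):
    """Return a cached model instance, loading each (size, device) pair only once."""
    key = (model_size, device)
    if key not in _model_cache:
        _model_cache[key] = loader(model_size, device)
    return _model_cache[key]

def pick_batch_size(free_gpu_mem_gb):
    """Choose a batch size from the free GPU memory (thresholds are assumptions)."""
    if free_gpu_mem_gb >= 16:
        return 32
    if free_gpu_mem_gb >= 8:
        return 16
    if free_gpu_mem_gb >= 4:
        return 8
    return 4
```

Caching avoids reloading multi-gigabyte model weights on every request, and picking the batch size at call time lets the same server run on GPUs of different sizes.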
Usage
Starting the Server
On Windows, run the provided batch file; on other platforms, run the Python server script directly.
Configuring Claude Desktop
Edit the Claude Desktop configuration file so that it registers the Whisper MCP server.
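The entry follows Claude Desktop's standard `mcpServers` format; the server name and script path below are placeholders you would replace with your own:

```json
{
  "mcpServers": {
    "fast-whisper": {
      "command": "python",
      "args": ["/path/to/whisper_server.py"]
    }
  }
}
```

After saving the file, restart Claude Desktop so it picks up the new server.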
Available Tools
- get_model_info - Retrieve information about the available Whisper models
- transcribe - Transcribe a single audio file
- batch_transcribe - Batch transcribe audio files in a folder
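As a rough sketch of what batch_transcribe does, the tool can be thought of as walking a folder, filtering for audio files, and handing each one to the single-file transcription path. The extension set and function names here are assumptions, not the server's actual signatures:

```python
# Illustrative outline of folder-level batch transcription.
from pathlib import Path

AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac"}  # assumed supported extensions

def batch_transcribe(folder, transcribe_one):
    """Run transcribe_one on every audio file in folder; return {filename: result}."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in AUDIO_EXTS:
            results[path.name] = transcribe_one(str(path))
    return results
```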
Error Handling
The server handles common failure modes, including missing or unreadable audio files, model loading failures, and GPU memory exhaustion.
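One common pattern for this kind of handling is to validate inputs up front and convert exceptions into structured results rather than letting them propagate to the client. The field names and wrapper below are assumptions for illustration:

```python
# Hedged sketch: validate the audio file, then report any transcription
# failure (e.g. model loading or CUDA OOM) as a structured error result.
import os

def safe_transcribe(audio_path, transcribe):
    if not os.path.isfile(audio_path):
        return {"status": "error", "message": f"audio file not found: {audio_path}"}
    try:
        return {"status": "ok", "text": transcribe(audio_path)}
    except Exception as exc:  # model load failure, GPU OOM, decode error, ...
        return {"status": "error", "message": str(exc)}
```

Returning an error object instead of raising keeps a single bad file from aborting a whole batch run.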