mcp-tts
mcp-tts is a Text-to-Speech MCP Server that integrates various TTS tools into applications, enabling speech synthesis using macOS, ElevenLabs, Google, and OpenAI APIs. It offers a broad range of voices and features like speed control and custom voice instructions for versatile usage.
MCP Server for TTS (Text-to-Speech)
What? 🤔
Adds Text-to-Speech to things like Claude Desktop and Cursor IDE.
It registers four TTS tools:
say_tts
elevenlabs_tts
google_tts
openai_tts
say_tts
Uses the macOS say binary to speak the text with built-in system voices
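As a rough illustration of what say_tts does, here is a minimal Go sketch that shells out to the say binary. The argument layout copies the debug log shown in the Test section (`--rate 200 Hello, world!`). The helper name `buildSayArgs` is hypothetical and is not taken from the mcp-tts source.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
	"strconv"
)

// buildSayArgs is a hypothetical helper (not from the mcp-tts source).
// It assembles arguments in the same order as the debug log in this README:
// [--rate 200 Hello, world!]
func buildSayArgs(text string, rate int) []string {
	return []string{"--rate", strconv.Itoa(rate), text}
}

func main() {
	args := buildSayArgs("Hello, world!", 200)
	fmt.Println(args)

	// The say binary only ships with macOS, so skip playback elsewhere.
	if runtime.GOOS != "darwin" {
		fmt.Println("skipping playback: say is only available on macOS")
		return
	}
	if err := exec.Command("say", args...).Run(); err != nil {
		fmt.Println("say failed:", err)
	}
}
```

Passing the text as its own argument to exec.Command (rather than through a shell string) avoids quoting problems with punctuation in the spoken text.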
elevenlabs_tts
Uses the ElevenLabs text-to-speech API to speak the text with premium AI voices
google_tts
Uses Google's Gemini TTS models to speak the text with 30 high-quality voices. Available voices include:
- Zephyr (Bright), Puck (Upbeat), Charon (Informative)
- Kore (Firm), Fenrir (Excitable), Leda (Youthful)
- Orus (Firm), Aoede (Breezy), Callirhoe (Easy-going)
- Autonoe (Bright), Enceladus (Breathy), Iapetus (Clear)
- And 18 more voices with various characteristics
openai_tts
Uses OpenAI's Text-to-Speech API to speak the text with 7 natural-sounding voices:
- coral (Default, warm and natural)
- alloy (Balanced tone)
- echo (Warm and engaging)
- fable (Expressive and storytelling)
- onyx (Deep and resonant)
- nova (Bright and articulate)
- shimmer (Smooth and pleasant)
Supports three quality models:
- gpt-4o-mini-tts - Default, optimized quality and speed
- tts-1 - Standard quality, faster generation
- tts-1-hd - High definition audio, premium quality
Additional features:
- Speed control from 0.25x to 4.0x (default: 1.0x)
- Custom voice instructions (e.g., "Speak in a cheerful and positive tone") via parameter or
OPENAI_TTS_INSTRUCTIONS
environment variable
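The defaults and limits listed above could be resolved along these lines. This is a sketch under stated assumptions, not the server's actual code: the type `openAIOptions`, the function `resolveOpenAIOptions`, treating a zero speed as "not set", and letting an explicit instructions parameter take precedence over OPENAI_TTS_INSTRUCTIONS are all illustrative choices.

```go
package main

import (
	"fmt"
	"os"
)

// openAIOptions is a hypothetical container for the openai_tts settings
// described in this README.
type openAIOptions struct {
	Voice        string
	Model        string
	Instructions string
	Speed        float64
}

// resolveOpenAIOptions fills in the documented defaults (voice coral,
// model gpt-4o-mini-tts, speed 1.0) and rejects speeds outside 0.25-4.0.
// Assumption: an explicit instructions argument wins over the
// OPENAI_TTS_INSTRUCTIONS environment variable.
func resolveOpenAIOptions(voice, model, instructions string, speed float64,
	getenv func(string) string) (openAIOptions, error) {
	if voice == "" {
		voice = "coral"
	}
	if model == "" {
		model = "gpt-4o-mini-tts"
	}
	if speed == 0 { // assumption: zero means the caller did not set a speed
		speed = 1.0
	}
	if speed < 0.25 || speed > 4.0 {
		return openAIOptions{}, fmt.Errorf("speed %.2f outside 0.25-4.0", speed)
	}
	if instructions == "" {
		instructions = getenv("OPENAI_TTS_INSTRUCTIONS")
	}
	return openAIOptions{Voice: voice, Model: model, Instructions: instructions, Speed: speed}, nil
}

func main() {
	opts, err := resolveOpenAIOptions("nova", "tts-1", "", 1.2, os.Getenv)
	if err != nil {
		fmt.Println("invalid options:", err)
		return
	}
	fmt.Printf("%+v\n", opts)
}
```

Taking the environment lookup as a function argument keeps the helper easy to exercise without touching the real process environment.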
Getting Started
Install
go install github.com/blacktop/mcp-tts@latest
❱ mcp-tts --help
TTS (text-to-speech) MCP Server.
Provides a text-to-speech service using the MacOS 'say' command.
Designed to be used with the MCP protocol.
Usage:
mcp-tts [flags]
Flags:
-h, --help help for mcp-tts
-v, --verbose Enable verbose debug logging
Set Claude Desktop Config
{
  "mcpServers": {
    "say": {
      "command": "mcp-tts",
      "env": {
        "ELEVENLABS_API_KEY": "********",
        "ELEVENLABS_VOICE_ID": "1SM7GgM6IMuvQlz2BwM3",
        "GOOGLE_AI_API_KEY": "********",
        "OPENAI_API_KEY": "********",
        "OPENAI_TTS_INSTRUCTIONS": "Speak in a cheerful and positive tone"
      }
    }
  }
}
Environment Variables
- ELEVENLABS_API_KEY: Your ElevenLabs API key (required for elevenlabs_tts)
- ELEVENLABS_VOICE_ID: ElevenLabs voice ID (optional, defaults to a built-in voice)
- GOOGLE_AI_API_KEY or GEMINI_API_KEY: Your Google AI API key (required for google_tts)
- OPENAI_API_KEY: Your OpenAI API key (required for openai_tts)
- OPENAI_TTS_INSTRUCTIONS: Custom voice instructions for OpenAI TTS (optional, e.g., "Speak in a cheerful and positive tone")
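Before wiring the server into a client, it can help to check which tools have the credentials they need. The following Go sketch is a standalone preflight check built only from the variable names listed above. The function `availableTools` is hypothetical, and whether the server itself performs a similar check at startup is not something this README states.

```go
package main

import (
	"fmt"
	"os"
)

// availableTools is a hypothetical preflight helper. It reports which of
// the four tools have the API key this README lists as required.
// say_tts needs no key (it does need macOS, which is not checked here).
func availableTools(getenv func(string) string) []string {
	tools := []string{"say_tts"}
	if getenv("ELEVENLABS_API_KEY") != "" {
		tools = append(tools, "elevenlabs_tts")
	}
	// Either variable name is accepted for Google, per the list above.
	if getenv("GOOGLE_AI_API_KEY") != "" || getenv("GEMINI_API_KEY") != "" {
		tools = append(tools, "google_tts")
	}
	if getenv("OPENAI_API_KEY") != "" {
		tools = append(tools, "openai_tts")
	}
	return tools
}

func main() {
	fmt.Println("tools with credentials:", availableTools(os.Getenv))
}
```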
Test
Test macOS TTS
❱ cat test/say.json | go run main.go --verbose
2025/03/23 22:41:49 INFO Starting MCP server name="Say TTS Service" version=1.0.0
2025/03/23 22:41:49 DEBU Say tool called request="{Request:{Method:tools/call Params:{Meta:<nil>}} Params:{Name:say_tts Arguments:map[text:Hello, world!] Meta:<nil>}}"
2025/03/23 22:41:49 DEBU Executing say command args="[--rate 200 Hello, world!]"
2025/03/23 22:41:49 INFO Speaking text text="Hello, world!"
{"jsonrpc":"2.0","id":3,"result":{"content":[{"type":"text","text":"Speaking: Hello, world!"}]}}
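The contents of test/say.json are not reproduced here, but the debug log above shows a tools/call request for say_tts with a single text argument, answered under id 3. A request of that shape can be built in Go roughly as follows. The type and function names are hypothetical, and the real test file may contain additional messages (for example an initialize handshake) before this call.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallParams and rpcRequest are hypothetical types that model a
// JSON-RPC 2.0 tools/call request as used by MCP.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// newToolCall encodes a single tools/call request for the given tool.
func newToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	})
}

func main() {
	req, err := newToolCall(3, "say_tts", map[string]any{"text": "Hello, world!"})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	// Print one request per line, suitable for piping to the server's stdin.
	fmt.Println(string(req))
}
```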
Test Google TTS
❱ cat test/google_tts.json | go run main.go --verbose
2025/05/23 18:26:45 INFO Starting MCP server name="Say TTS Service" version=""
2025/05/23 18:26:45 DEBU Google TTS tool called request="{...}"
2025/05/23 18:26:45 DEBU Generating TTS audio model=gemini-2.5-flash-preview-tts voice=Kore text="Hello! This is a test of Google's TTS API. How does it sound?"
2025/05/23 18:26:49 INFO Playing TTS audio via beep speaker bytes=181006
2025/05/23 18:26:53 INFO Speaking via Google TTS text="Hello! This is a test of Google's TTS API. How does it sound?" voice=Kore
{"jsonrpc":"2.0","id":4,"result":{"content":[{"type":"text","text":"Speaking: Hello! This is a test of Google's TTS API. How does it sound? (via Google TTS with voice Kore)"}]}}
Test OpenAI TTS
❱ cat test/openai_tts.json | go run main.go --verbose
2025/05/23 19:15:32 INFO Starting MCP server name="Say TTS Service" version=""
2025/05/23 19:15:32 DEBU OpenAI TTS tool called request="{...}"
2025/05/23 19:15:32 DEBU Generating OpenAI TTS audio model=tts-1 voice=nova speed=1.2 text="Hello! This is a test of OpenAI's text-to-speech API. I'm using the nova voice at 1.2x speed."
2025/05/23 19:15:34 DEBU Decoding MP3 stream from OpenAI
2025/05/23 19:15:34 DEBU Initializing speaker for OpenAI TTS sampleRate=22050
2025/05/23 19:15:36 INFO Speaking text via OpenAI TTS text="Hello! This is a test of OpenAI's text-to-speech API. I'm using the nova voice at 1.2x speed." voice=nova model=tts-1 speed=1.2
{"jsonrpc":"2.0","id":5,"result":{"content":[{"type":"text","text":"Speaking: Hello! This is a test of OpenAI's text-to-speech API. I'm using the nova voice at 1.2x speed. (via OpenAI TTS with voice nova)"}]}}
License
MIT Copyright (c) 2025 blacktop