by mbailey
Conversational Coding with Voice Mode MCP
Voice Mode is a Python project that enables natural voice conversations with AI assistants like Claude and ChatGPT. It facilitates human-like voice interactions through the Model Context Protocol (MCP), allowing users to speak their prompts and hear responses in real-time. It supports various AI coding assistants and offers flexibility with both cloud and local speech-to-text (STT) and text-to-speech (TTS) services.
Quick reference:
- Install: uvx voice-mode, pip install voice-mode, or npm install -g @anthropic-ai/claude-code followed by uvx voice-mode.
- Configure: export OPENAI_API_KEY="your-key". For local services, you can set STT_BASE_URL and TTS_BASE_URL.
- Converse: use the converse tool for voice conversations, e.g. claude converse to start a conversation. You can also use listen_for_speech to convert speech to text.
- Local services: run install_whisper_cpp and install_kokoro_fastapi, or follow the documentation for Whisper.cpp and Kokoro setup.
- Audio logging: set VOICEMODE_SAVE_AUDIO="true" to save all audio input and output.
Voice Mode provides tools to help set up local services:
- install_whisper_cpp: installs Whisper.cpp for local STT.
- install_kokoro_fastapi: installs Kokoro for local TTS.
These tools facilitate the setup of free, private, open-source voice services locally, offering an OpenAI-compatible API interface for seamless switching between cloud and local processing.
Install via:
uv tool install voice-mode
Website: getvoicemode.com
Natural voice conversations for AI assistants. VoiceMode brings human-like voice interactions to Claude Code and other AI code editors through the Model Context Protocol (MCP).
Runs on: Linux • macOS • Windows (WSL) • NixOS | Python: 3.10-3.14
All you need to get started:
# Install VoiceMode MCP python package and dependencies
curl -LsSf https://astral.sh/uv/install.sh | sh
uvx voice-mode-install
# While local voice services can be installed automatically, we recommend
# providing an OpenAI API key as a fallback in case local services are unavailable
export OPENAI_API_KEY=your-openai-key # Optional but recommended
# Add VoiceMode to Claude
claude mcp add --scope user voicemode -- uvx --refresh voice-mode
# Start a voice conversation
claude converse
For manual setup steps, see the Getting Started Guide.
Watch VoiceMode in action with Claude Code in the demo video.
The converse function makes voice interactions natural - it automatically waits for your response by default, creating a real conversation flow.
uv (install with curl -LsSf https://astral.sh/uv/install.sh | sh)
Note on LiveKit: LiveKit integration is optional and requires Python 3.10-3.13 (Python 3.14 support pending upstream dependencies). Install with:
uv tool install voice-mode[livekit]
See the LiveKit Integration Guide for details.
System dependencies:
Ubuntu/Debian:
sudo apt update
sudo apt install -y ffmpeg gcc libasound2-dev libasound2-plugins libportaudio2 portaudio19-dev pulseaudio pulseaudio-utils python3-dev
Note for WSL2 users: WSL2 requires additional audio packages (pulseaudio, libasound2-plugins) for microphone access.
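As a quick sanity check after installing these packages, you can confirm the audio stack and your microphone are visible. These are generic PulseAudio commands, not VoiceMode-specific:
# Confirm PulseAudio is running and list available microphone sources
pactl info
pactl list sources short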
Fedora:
sudo dnf install alsa-lib-devel ffmpeg gcc portaudio portaudio-devel python3-devel
macOS:
# Install Homebrew if not already installed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# Install dependencies
brew install ffmpeg node portaudio
Windows (WSL): follow the Ubuntu/Debian instructions above within WSL.
NixOS: VoiceMode includes a flake.nix with all required dependencies. To enter a development shell with everything available:
nix develop github:mbailey/voicemode
# Using Claude Code (recommended)
claude mcp add --scope user voicemode -- uvx --refresh voice-mode
📖 Looking for detailed setup instructions? Check our comprehensive Getting Started Guide for a step-by-step walkthrough!
Below are quick configuration snippets. For full installation and setup instructions, see the integration guides above.
claude mcp add --scope user voicemode -- uvx --refresh voice-mode
Or with environment variables:
claude mcp add --scope user --env OPENAI_API_KEY=your-openai-key voicemode -- uvx --refresh voice-mode
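To confirm the server was registered, the Claude Code CLI can list configured MCP servers; this assumes your installed version supports the list subcommand:
# Verify that the voicemode entry appears
claude mcp list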
git clone https://github.com/mbailey/voicemode.git
cd voicemode
uv tool install -e .
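To confirm the editable install is registered (a generic uv check, not VoiceMode-specific):
# List uv-managed tools; voice-mode should appear
uv tool list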
1. Install with nix profile (user-wide):
nix profile install github:mbailey/voicemode
2. Add to NixOS configuration (system-wide):
# In /etc/nixos/configuration.nix
environment.systemPackages = [
(builtins.getFlake "github:mbailey/voicemode").packages.${pkgs.system}.default
];
3. Add to home-manager:
# In home-manager configuration
home.packages = [
(builtins.getFlake "github:mbailey/voicemode").packages.${pkgs.system}.default
];
4. Run without installing:
nix run github:mbailey/voicemode
For cloud speech services, the only required configuration is your OpenAI API key:
export OPENAI_API_KEY="your-key"
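To quickly confirm the key is valid, you can hit the OpenAI models endpoint directly. This is a generic OpenAI API check, not a VoiceMode command:
# A successful response with a model list means the key works
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY" | head -c 200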
For privacy-focused or offline usage, VoiceMode supports local speech services: Whisper.cpp for speech-to-text (STT) and Kokoro for text-to-speech (TTS), installable with the install_whisper_cpp and install_kokoro_fastapi tools described above.
These services provide the same API interface as OpenAI, allowing seamless switching between cloud and local processing.
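As a sketch, pointing VoiceMode at local endpoints might look like the following. STT_BASE_URL and TTS_BASE_URL are the variables named earlier on this page; the ports shown are placeholders that depend on how your local Whisper.cpp and Kokoro servers are configured:
# Point STT/TTS at local OpenAI-compatible servers (ports are examples only)
export STT_BASE_URL="http://127.0.0.1:2022/v1"   # Whisper.cpp server
export TTS_BASE_URL="http://127.0.0.1:8880/v1"   # Kokoro FastAPI server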
If something isn't working, first make sure uv is installed (curl -LsSf https://astral.sh/uv/install.sh | sh) and check that OPENAI_API_KEY is set correctly.
To save all audio files (both TTS output and STT input):
export VOICEMODE_SAVE_AUDIO=true
Audio files are saved to: ~/.voicemode/audio/YYYY/MM/ with timestamps in the filename.
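For example, to list this month's saved recordings (using the path layout described above):
# List this month's saved audio, newest first
ls -lt ~/.voicemode/audio/$(date +%Y)/$(date +%m)/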
📚 Read the full documentation at voice-mode.readthedocs.io
MIT - A Failmode Project
mcp-name: com.failmode/voicemode
Example MCP client configuration:
{
"mcpServers": {
"voice-mode": {
"command": "uvx",
"args": [
"voice-mode"
],
"env": {
"OPENAI_API_KEY": "your-openai-key"
}
}
}
}
Or add via the Claude Code CLI:
claude mcp add voice-mode uvx voice-mode