by BeehiveInnovations
Orchestrates Claude Code, Gemini CLI, Codex CLI and a wide range of AI models to provide continuous, context‑rich workflows for code analysis, debugging, planning and pre‑commit validation.
Zen MCP Server enables Claude (or any supported CLI) to act as a central coordinator that dynamically invokes other AI models (Gemini, OpenAI, Grok, Ollama, etc.) for specific sub‑tasks. It maintains conversation continuity across model switches, revives context after resets, and offers guided multi‑step workflows such as multi‑model code reviews, systematic debugging, and architectural planning.
Install with uv (or use the provided shell script):

git clone https://github.com/BeehiveInnovations/zen-mcp-server.git
cd zen-mcp-server
./run-server.sh # auto‑configures env, API keys and starts the MCP server
Alternatively, run with uvx as shown in the README. Set GEMINI_API_KEY, OPENAI_API_KEY, OPENROUTER_API_KEY, GROK_API_KEY, etc., in the generated .env or via your Claude settings JSON. Use DISABLED_TOOLS in .env or your settings file to control which workflow tools are active.

Example prompts:

"Perform a codereview using gemini pro and o3, then generate a fix plan"
"Debug this race condition with max thinking mode and validate with precommit"
"Plan a microservices migration and get consensus from pro and o3"
The server routes each sub‑task to the appropriate model, preserves context, and returns consolidated results. Guided workflow tools (codereview, debug, planner, precommit, consensus, etc.) enforce systematic analysis.

| Scenario | How Zen Helps |
|---|---|
| Multi‑model code review | Runs Claude‑led review, then consults Gemini Pro & O3 for deeper insights, merges feedback into a single actionable list. |
| Complex debugging | Systematic root‑cause analysis with debug, cross‑checks findings with multiple models, then validates fixes via precommit. |
| Architecture planning | Uses planner to break down migrations, gathers consensus from several experts (consensus), produces a roadmap with milestones. |
| Security auditing | Enables secaudit (or runs locally with Ollama) to scan for OWASP Top 10 issues, then consolidates recommendations. |
| Test generation | Activates testgen to produce unit and integration tests, leveraging diverse model perspectives for edge‑case coverage. |
| Documentation generation | Calls docgen to auto‑create API docs, architecture diagrams, and changelogs from codebase analyses. |
Q: Do I need to install every supported model locally? A: No. Zen can route requests to remote APIs (Gemini, OpenAI, OpenRouter, X.AI, etc.) or to locally hosted models via Ollama. Choose whichever fits your privacy and cost requirements.
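For the local route, a minimal .env sketch might look like the following (the endpoint URL and model name are illustrative assumptions; adjust them to your own Ollama setup):

```shell
# Route requests to a locally hosted, OpenAI-compatible endpoint
# (Ollama's default port shown; model name is whatever you pulled locally)
CUSTOM_API_URL=http://localhost:11434/v1
CUSTOM_MODEL_NAME=llama3.2
```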
Q: How does context revival work after Claude’s window resets? A: When Claude’s context expires, another model that still holds the conversation history can be prompted to summarize and feed the essential information back to Claude, effectively “reviving” the thread.
Q: Can I restrict which models are used for a particular workflow?
A: Yes. You can explicitly mention the model in your prompt (e.g., use gemini pro) or set DEFAULT_MODEL / model‑selection rules in the .env configuration.
Q: What if I exceed a model’s token limit? A: Zen automatically splits large payloads, sends them to a model with a larger context window, and reassembles the response, transparently handling the limit.
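The splitting step can be pictured with a small sketch (illustrative only; Zen's real logic counts tokens with a proper tokenizer rather than a chars-per-token approximation):

```python
def split_payload(text: str, max_tokens: int, tokens_per_char: float = 0.25) -> list[str]:
    """Split a large payload into chunks that each fit a model's context window.

    Sketch only: real tokenizers count tokens exactly; here we approximate
    with an assumed chars-per-token ratio.
    """
    max_chars = int(max_tokens / tokens_per_char)
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

# 1000 chars with a 100-token budget (~400 chars per chunk) -> 3 chunks
chunks = split_payload("x" * 1000, max_tokens=100)
```

The chunks are then sent to a larger-context model and the responses reassembled on the way back.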
Q: Is there a way to see which tools are currently enabled?
A: The server reads the DISABLED_TOOLS environment variable. Running zen listtools (or checking logs) will display the active and disabled tool set.
Q: How do I integrate Zen with IDEs like VS Code or Cursor? A: Follow the "Cursor & VS Code Setup" section in the docs. It involves adding the MCP server URL to the extension’s settings so that the IDE routes its prompts through Zen.
Q: Is the project open source and under what license? A: Yes, it is released under the Apache 2.0 license.
AI orchestration for Claude Code - A Model Context Protocol server that gives your CLI of choice (e.g. Claude Code) access to multiple AI models for enhanced code analysis, problem-solving, and collaborative development. Zen works with Claude Code, Gemini CLI, Codex CLI, and IDE clients like Cursor and the Claude Dev extension for VS Code.
True AI collaboration with conversation continuity - Claude stays in control but gets perspectives from the best AI for each subtask. Context carries forward seamlessly across tools and models, enabling complex workflows like: code reviews with multiple models → automated planning → implementation → pre-commit validation.
You're in control. Claude orchestrates the AI team, but you decide the workflow. Craft powerful prompts that bring in Gemini Pro, GPT-5, Flash, or local offline models exactly when needed.
Multi-Model Orchestration - Claude coordinates with Gemini Pro, O3, GPT-5, and 50+ other models to get the best analysis for each task
Context Revival Magic - Even after Claude's context resets, continue conversations seamlessly by having other models "remind" Claude of the discussion
Guided Workflows - Enforces systematic investigation phases that prevent rushed analysis and ensure thorough code examination
Extended Context Windows - Break Claude's limits by delegating to Gemini (1M tokens) or O3 (200K tokens) for massive codebases
True Conversation Continuity - Full context flows across tools and models - Gemini remembers what O3 said 10 steps ago
Model-Specific Strengths - Extended thinking with Gemini Pro, blazing speed with Flash, strong reasoning with O3, privacy with local Ollama
Professional Code Reviews - Multi-pass analysis with severity levels, actionable feedback, and consensus from multiple AI experts
Smart Debugging Assistant - Systematic root cause analysis with hypothesis tracking and confidence levels
Automatic Model Selection - Claude intelligently picks the right model for each subtask (or you can specify)
Vision Capabilities - Analyze screenshots, diagrams, and visual content with vision-enabled models
Local Model Support - Run Llama, Mistral, or other models locally for complete privacy and zero API costs
Bypass MCP Token Limits - Automatically works around MCP's 25K limit for large prompts and responses
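The automatic model selection above can be pictured with a toy routing function (the rules and model names here are illustrative assumptions, not Zen's actual selection logic):

```python
def pick_model(task: str, needs_vision: bool = False, huge_context: bool = False) -> str:
    """Toy sketch of 'auto' model selection: route by task traits.

    Hypothetical illustration only; Zen's real logic is richer.
    """
    if needs_vision or huge_context:
        return "gemini-2.5-pro"   # vision-capable, ~1M-token window
    if task in ("debug", "planner"):
        return "o3"               # strong step-by-step reasoning
    return "flash"                # fast, cheap default

print(pick_model("debug"))        # prints "o3"
```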
The Killer Feature: When Claude's context resets, just ask to "continue with O3" - the other model's response magically revives Claude's understanding without re-ingesting documents!
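Conceptually, the revival flow looks like this (a hypothetical sketch, not Zen's actual API; the server keeps the full thread, so a reset client can be re-seeded from it):

```python
# Hypothetical sketch of context revival: a server-side store keeps every
# turn, so a fresh client (e.g. Claude after a reset) can be re-seeded.
conversation_store: dict[str, list[dict]] = {}

def record_turn(thread_id: str, role: str, content: str) -> None:
    """Append one turn to the server-side conversation thread."""
    conversation_store.setdefault(thread_id, []).append(
        {"role": role, "content": content}
    )

def revive(thread_id: str) -> str:
    """Build a summary prompt that re-seeds a reset client with prior context."""
    turns = conversation_store.get(thread_id, [])
    lines = [f"{t['role']}: {t['content']}" for t in turns]
    return "Earlier in this thread:\n" + "\n".join(lines)

record_turn("t1", "o3", "The race is in the cache invalidation path.")
record_turn("t1", "claude", "Agreed; proposing a lock around the write.")
print(revive("t1"))
```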
"Perform a codereview using gemini pro and o3 and use planner to generate a detailed plan, implement the fixes and do a final precommit check by continuing from the previous codereview"

This triggers a codereview workflow where Claude walks the code looking for all kinds of issues, tracking a confidence level (exploring, low, medium, high, certain) to record how confidently it has been able to find and identify issues. Claude then shares its codereview findings with Gemini Pro and O3, uses the planner workflow to break the work down into simpler steps if a major refactor is required, implements the fixes, and ends with a precommit review. All within a single conversation thread! Gemini Pro in step 11 knows what was recommended by O3 in step 7, and takes that context and review into consideration to aid with its final pre-commit review.
Think of it as Claude Code for Claude Code. This MCP isn't magic. It's just super-glue.
Remember: Claude stays in full control — but YOU call the shots. Zen is designed to have Claude engage other models only when needed — and to follow through with meaningful back-and-forth. You're the one who crafts the powerful prompt that makes Claude bring in Gemini, Flash, O3 — or fly solo. You're the guide. The prompter. The puppeteer.
You are the AI - Actually Intelligent.
For best results, use Claude Code with:
Prerequisites: Python 3.10+, Git, uv installed
1. Get API Keys (choose one or more):
2. Install (choose one):
Option A: Clone and Automatic Setup (recommended)
git clone https://github.com/BeehiveInnovations/zen-mcp-server.git
cd zen-mcp-server
# Handles everything: setup, config, API keys from system environment.
# Auto-configures Claude Desktop, Claude Code, Gemini CLI, Codex CLI
# Enable / disable additional settings in .env
./run-server.sh
Option B: Instant Setup with uvx
// Add to ~/.claude/settings.json or .mcp.json
// Don't forget to add your API keys under env
{
"mcpServers": {
"zen": {
"command": "bash",
"args": ["-c", "for p in $(which uvx 2>/dev/null) $HOME/.local/bin/uvx /opt/homebrew/bin/uvx /usr/local/bin/uvx uvx; do [ -x \"$p\" ] && exec \"$p\" --from git+https://github.com/BeehiveInnovations/zen-mcp-server.git zen-mcp-server; done; echo 'uvx not found' >&2; exit 1"],
"env": {
"PATH": "/usr/local/bin:/usr/bin:/bin:/opt/homebrew/bin:~/.local/bin",
"GEMINI_API_KEY": "your-key-here",
"DISABLED_TOOLS": "analyze,refactor,testgen,secaudit,docgen,tracer",
"DEFAULT_MODEL": "auto"
}
}
}
}
3. Start Using!
"Use zen to analyze this code for security issues with gemini pro"
"Debug this error with o3 and then get flash to suggest optimizations"
"Plan the migration strategy with zen, get consensus from multiple models"
👉 Complete Setup Guide with detailed installation, configuration for Gemini / Codex, and troubleshooting
👉 Cursor & VS Code Setup for IDE integration instructions
Note: Each tool comes with its own multi-step workflow, parameters, and descriptions that consume valuable context window space even when not in use. To optimize performance, some tools are disabled by default. See Tool Configuration below to enable them.
Collaboration & Planning (Enabled by default)
chat - Brainstorm ideas, get second opinions, validate approaches
thinkdeep - Extended reasoning, edge case analysis, alternative perspectives
planner - Break down complex projects into structured, actionable plans
consensus - Get expert opinions from multiple AI models with stance steering

Code Analysis & Quality
debug - Systematic investigation and root cause analysis
precommit - Validate changes before committing, prevent regressions
codereview - Professional reviews with severity levels and actionable feedback
analyze (disabled by default - enable) - Understand architecture, patterns, dependencies across entire codebases

Development Tools (Disabled by default - enable)
refactor - Intelligent code refactoring with decomposition focus
testgen - Comprehensive test generation with edge cases
secaudit - Security audits with OWASP Top 10 analysis
docgen - Generate documentation with complexity analysis

Utilities
challenge - Prevent "You're absolutely right!" responses with critical analysis
tracer (disabled by default - enable) - Static analysis prompts for call-flow mapping

To optimize context window usage, only essential tools are enabled by default:
Enabled by default:
chat, thinkdeep, planner, consensus - Core collaboration tools
codereview, precommit, debug - Essential code quality tools
challenge - Critical thinking utility

Disabled by default:

analyze, refactor, testgen, secaudit, docgen, tracer

To enable additional tools, remove them from the DISABLED_TOOLS list:
Option 1: Edit your .env file
# Default configuration (from .env.example)
DISABLED_TOOLS=analyze,refactor,testgen,secaudit,docgen,tracer
# To enable specific tools, remove them from the list
# Example: Enable analyze tool
DISABLED_TOOLS=refactor,testgen,secaudit,docgen,tracer
# To enable ALL tools
DISABLED_TOOLS=
Option 2: Configure in MCP settings
// In ~/.claude/settings.json or .mcp.json
{
"mcpServers": {
"zen": {
"env": {
// Tool configuration
"DISABLED_TOOLS": "refactor,testgen,secaudit,docgen,tracer",
"DEFAULT_MODEL": "pro",
"DEFAULT_THINKING_MODE_THINKDEEP": "high",
// API configuration
"GEMINI_API_KEY": "your-gemini-key",
"OPENAI_API_KEY": "your-openai-key",
"OPENROUTER_API_KEY": "your-openrouter-key",
// Logging and performance
"LOG_LEVEL": "INFO",
"CONVERSATION_TIMEOUT_HOURS": "6",
"MAX_CONVERSATION_TURNS": "50"
}
}
}
}
Option 3: Enable all tools
// Remove or empty the DISABLED_TOOLS to enable everything
{
"mcpServers": {
"zen": {
"env": {
"DISABLED_TOOLS": ""
}
}
}
}
Note: The version and listmodels utility tools cannot be disabled.

AI Orchestration
Model Support
Developer Experience
Multi-model Code Review:
"Perform a codereview using gemini pro and o3, then use planner to create a fix strategy"
→ Claude reviews code systematically → Consults Gemini Pro → Gets O3's perspective → Creates unified action plan
Collaborative Debugging:
"Debug this race condition with max thinking mode, then validate the fix with precommit"
→ Deep investigation → Expert analysis → Solution implementation → Pre-commit validation
Architecture Planning:
"Plan our microservices migration, get consensus from pro and o3 on the approach"
→ Structured planning → Multiple expert opinions → Consensus building → Implementation roadmap
👉 Advanced Usage Guide for complex workflows, model configuration, and power-user features
📖 Documentation
🔧 Setup & Support
Apache 2.0 License - see LICENSE file for details.
Built with the power of Multi-Model AI collaboration 🤝