Universal MCP Server For AI Agent Roles, Skills & Cognitive Implants

What is Universal MCP Server For AI Agent Roles, Skills & Cognitive Implants about?

The project implements a universal MCP server that acts as a semantic router. It discovers the most suitable agent persona, enriches the prompt with relevant skill chunks and cognitive implants, and returns a ready-to-use context for the client. The system is self-contained: embeddings are generated locally with FastEmbed (ONNX), vector stores are cached on disk, and optional observability is handled through Langfuse.

How to use Universal MCP Server For AI Agent Roles, Skills & Cognitive Implants?

Clone the repository and run the provided init script:

git clone https://github.com/WonderMr/Agents.git
cd Agents
./scripts/init_repo.sh

Alternatively, create and activate the virtual environment manually:

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp env.example .env   # then add your API keys

Start the server (any MCP‑compatible client will connect):
```
.venv/bin/python src/server.py
```
Configure your client (Claude Code, Cursor, etc.) with an mcp.json (.mcp.json for Claude Code) that points to the command above.
Use the exposed tools such as route_and_load(query), list_agents(), load_implants(...), log_interaction(...), etc., from the client side.

Key Features

Semantic routing with cache‑first lookup for instant agent selection.
Dynamic loading of agents, skills, and implants based on query semantics.
Tiered enrichment (lite, standard, deep) to control context size.
Local embeddings via FastEmbed (ONNX) – no external model API required.
Auto-update daemon that fast-forwards the repository when the configured branch (default main) is checked out and the working tree is clean.
Repository memory subsystem (describe_repo, log_interaction, read_history).
Optional Langfuse tracing for full observability of tool calls.
Debug logging (JSON per‑call) toggleable with AGENTS_DEBUG.
Extensible capabilities registry to map high‑level capabilities to skill bundles.

Use Cases

Developer assistants that automatically load relevant code‑review, documentation, or security analysis skills.
Research copilots that fetch domain knowledge implants and provide structured summaries.
Customer support bots that route queries to specialized persona agents (e.g., billing, technical).
Personal knowledge bases where a user’s interaction history is logged and semantically searchable.
Multi‑agent orchestration in IDE extensions where each turn selects the best‑fit agent.

FAQ

Q: Do I need an API key for the routing engine? A: No external model API is required for routing; embeddings are generated locally. API keys are only needed for optional services such as Langfuse tracing or document OCR through the Anthropic API.

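Q: Can the server run on Windows? A: Yes, as long as Python 3.9+ and ONNX Runtime are available. The init script is POSIX sh, but the manual setup steps work in Windows PowerShell with the usual substitutions (for example, .venv\Scripts\Activate.ps1 in place of source .venv/bin/activate).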

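Q: How does auto-update avoid breaking my local work? A: It only fast-forwards the configured branch (default main) when that branch is checked out and the working tree is clean; it never merges, rebases, or switches branches. A failed reindex is rolled back to the previous commit, and any other error is logged while the server keeps serving the current code.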

Q: Where are interaction logs stored? A: In history.md at the repository root, rotated into history/YYYY-MM.md when the file exceeds 512 KB. The file is git‑ignored by default.

Q: How can I add a new agent? A: Create agents/<agent_name>/system_prompt.mdc with the required front‑matter (identity, routing keywords, etc.). The server discovers it on the next startup.

🤖 Agents Framework

Universal MCP Server for AI Agent Roles, Skills & Cognitive Implants

A semantic router that dynamically loads specialized agent personas, domain skills, and cognitive reasoning implants based on user queries. Works with any MCP-compatible client (Claude Code, Cursor, Windsurf, and others).

🚀 Quick Start

After Cloning

git clone <repository-url>
cd Agents

# Run initialization script
./scripts/init_repo.sh

The script will:

✅ Create Python virtual environment (.venv/)
✅ Install all dependencies
✅ Create .env configuration file
✅ Validate MCP server configuration

Manual Setup

# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp env.example .env
# Edit .env with your API keys

⚙️ Configuration

Environment Variables

Create a .env file with the following; none of these keys is required for core routing:

LANGFUSE_PUBLIC_KEY=pk-lf-... # Optional: observability
LANGFUSE_SECRET_KEY=sk-lf-... # Optional: observability
LANGFUSE_HOST=https://cloud.langfuse.com
ANTHROPIC_API_KEY=sk-ant-...  # Optional: for document OCR
AGENTS_DEBUG=0                # Set to 1 for JSON debug logging in logs/

Note: Embeddings are handled locally by fastembed (ONNX Runtime). The model is selected during setup, and no external API key is required for core routing.

Background Auto-Update

The server can keep itself current. On startup a daemon thread (non-blocking, so it never delays serving) fast-forwards the install's own git repo and rebuilds the vector stores; the pulled code takes effect on the next start (for per-session stdio servers, the next spawn). The heavy reindex runs in the background of the current session so the next one starts fast.

It is safe by default:

acts only when the checked-out branch is AGENTS_AUTO_UPDATE_BRANCH (default main) — a no-op on feature branches, so local development is never touched;
only when the working tree is clean, and only fast-forward (never merge, rebase, or switch branches);
a failed reindex (e.g. broken new code) is rolled back to the previous commit;
any error (offline, lock held by another process, timeout) is logged and the server keeps serving the current code. Dependencies are not auto-installed.

AGENTS_AUTO_UPDATE=1                     # 0 to disable
AGENTS_AUTO_UPDATE_REMOTE=origin
AGENTS_AUTO_UPDATE_BRANCH=main           # only updates when this branch is checked out
AGENTS_AUTO_UPDATE_TIMEOUT=30            # seconds per git op
AGENTS_AUTO_UPDATE_INTERVAL=900          # throttle network checks (0 = every start)
AGENTS_AUTO_UPDATE_REINDEX_TIMEOUT=600

Run a manual rebuild any time with python -m src.reindex.
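
The safety gates described above can be sketched as a small decision function. This is a minimal illustration, not the server's actual code: the `AutoUpdateConfig` dataclass, `load_config`, and `should_update` are hypothetical names, and treating the values shown above as defaults (including "anything other than 0 enables updates") is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoUpdateConfig:
    """Hypothetical container for the AGENTS_AUTO_UPDATE_* settings."""
    enabled: bool
    remote: str
    branch: str
    timeout_s: int
    interval_s: int
    reindex_timeout_s: int


def load_config(env=None):
    """Read the variables listed above; defaults mirror the example values (assumed)."""
    env = os.environ if env is None else env
    return AutoUpdateConfig(
        enabled=env.get("AGENTS_AUTO_UPDATE", "1") != "0",
        remote=env.get("AGENTS_AUTO_UPDATE_REMOTE", "origin"),
        branch=env.get("AGENTS_AUTO_UPDATE_BRANCH", "main"),
        timeout_s=int(env.get("AGENTS_AUTO_UPDATE_TIMEOUT", "30")),
        interval_s=int(env.get("AGENTS_AUTO_UPDATE_INTERVAL", "900")),
        reindex_timeout_s=int(env.get("AGENTS_AUTO_UPDATE_REINDEX_TIMEOUT", "600")),
    )


def should_update(cfg, current_branch, tree_is_clean, seconds_since_last_check=None):
    """Return (decision, reason) after applying each gate in order."""
    if not cfg.enabled:
        return False, "disabled"
    if current_branch != cfg.branch:
        return False, "not on update branch"  # feature branches are left alone
    if not tree_is_clean:
        return False, "working tree is dirty"
    if (cfg.interval_s > 0 and seconds_since_last_check is not None
            and seconds_since_last_check < cfg.interval_s):
        return False, "throttled"  # an interval of 0 means check on every start
    return True, "ok"
```

A caller would obtain `current_branch` and `tree_is_clean` from git (for instance `git rev-parse --abbrev-ref HEAD` and an empty `git status --porcelain`) and, when the decision is positive, run a fast-forward-only merge such as `git merge --ff-only`.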

🎯 How It Works

The server exposes MCP tools that any compatible client can call:

| Tool | Purpose |
| --- | --- |
| `route_and_load(query)` | Semantic routing: finds the best agent, enriches its prompt with relevant skills & implants |
| `get_agent_context(agent_name, query)` | Direct agent loading when the target is already known |
| `load_implants(query\|task_type)` | Load cognitive reasoning strategies by semantic query or preset bundle |
| `list_agents()` | Enumerate all available agents with metadata |
| `log_interaction(agent_name, query, response_content, intent?, action?, outcome?, files?, tags?)` | End-of-turn logger: appends to `history.md` (deduped by content hash) and, if configured, sends a Langfuse generation trace |
| `clear_session_cache()` | Reset session cache |
| `describe_repo(force_refresh=False)` | One-shot repo bootstrap: writes a structured summary into the managed Repository Memory section of CLAUDE.md |
| `read_history(limit?, since?, query?)` | Recent entries or lazy semantic recall over the action log |

Routing Flow

route_and_load(query) → Single-hop routing via semantic cache
Meta Detection → Greetings/short queries auto-route to universal_agent
Cache Hit → Returns enriched prompt (SUCCESS) or sampled response (SUCCESS_SAMPLED)
Cache Miss → Returns ROUTE_REQUIRED with agent candidates for client selection
Tier-Based Enrichment → lite (no extras) / standard (2 skills + 2 implants) / deep (4+ skills + 3 implants)
Multi-Turn → context_hash enables delta optimization on follow-up queries
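
The flow above can be sketched in a few lines of plain Python. This is an illustrative approximation under stated assumptions: `embed` is a toy bag-of-words stand-in for the FastEmbed embeddings, the greeting list and the 0.8 similarity threshold are invented for the example, and `route` is a hypothetical helper rather than the server's `route_and_load`. The `SUCCESS_SAMPLED` status and `context_hash` delta optimization are not modeled.

```python
import math
from collections import Counter

GREETINGS = {"hi", "hello", "hey", "thanks"}  # hypothetical meta-query list
SIMILARITY_THRESHOLD = 0.8                    # hypothetical cache-hit cutoff
# (skills, implants) per tier, from the flow above; "deep" uses the minimum of "4+".
TIER_BUDGETS = {"lite": (0, 0), "standard": (2, 2), "deep": (4, 3)}


def embed(text):
    """Toy bag-of-words vector standing in for a real embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def route(query, cache, agents, tier="standard"):
    """cache: list of (cached_query, agent_name); agents: known agent names."""
    words = query.lower().split()
    # Meta detection: greetings and very short queries skip the semantic lookup.
    if len(words) <= 1 or words[0].strip("!,.") in GREETINGS:
        return {"status": "SUCCESS", "agent": "universal_agent"}
    q = embed(query)
    best_score, best_agent = max(
        ((cosine(q, embed(cached)), agent) for cached, agent in cache),
        default=(0.0, None),
    )
    # Cache hit: return the agent together with the enrichment budget for the tier.
    if best_agent is not None and best_score >= SIMILARITY_THRESHOLD:
        return {"status": "SUCCESS", "agent": best_agent, "budget": TIER_BUDGETS[tier]}
    # Cache miss: hand the candidates back so the client can choose.
    return {"status": "ROUTE_REQUIRED", "candidates": agents}
```

On a `ROUTE_REQUIRED` result the client picks one of the candidates and can then load it directly, which is the role `get_agent_context(agent_name, query)` plays in the tool list.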

🏗️ Architecture

Agents/
├── agents/               # Agent personas (system prompts, 38 agents)
│   ├── software_engineer/
│   │   └── system_prompt.mdc
│   ├── common/           # Shared agent resources
│   ├── capabilities/     # Capability compositions (registry.yaml)
│   └── schemas/          # Validation schemas
├── skills/               # Reusable knowledge chunks (RAG)
│   └── skill-*.mdc
├── implants/             # Cognitive reasoning strategies (RAG)
│   └── implant-*.mdc
├── src/
│   ├── server.py         # MCP Server entrypoint (FastMCP)
│   ├── engine/
│   │   ├── router.py     # Semantic routing (cache-first)
│   │   ├── skills.py     # Skill retrieval (vector search)
│   │   ├── implants.py   # Implant retrieval (vector search)
│   │   ├── config.py     # Centralized configuration
│   │   ├── embedder.py   # FastEmbed wrapper (ONNX Runtime)
│   │   ├── vector_store.py # NumPy-based vector store
│   │   ├── enrichment.py # Tier-based context enrichment
│   │   ├── capabilities.py # Capability registry resolution
│   │   ├── context.py    # Context retrieval (history formatting)
│   │   └── language.py   # Language detection
│   └── utils/
│       ├── prompt_loader.py
│       ├── debug_logger.py     # Optional JSON debug logging
│       └── langfuse_compat.py  # Optional Langfuse layer
├── data/                 # Vector store cache (auto-initialized)
├── mcp.json              # MCP server configuration
├── pyproject.toml        # Python project metadata
└── requirements.txt

Key Components

| Component | Description |
| --- | --- |
| Agents | Specialized personas with unique system prompts |
| Skills | Domain-specific knowledge chunks (retrieved via RAG) |
| Implants | Cognitive patterns & reasoning strategies |
| Router | Semantic matching + caching for fast agent selection |

🔌 MCP Client Configuration

Claude Code (`.mcp.json` in project root)

{
  "mcpServers": {
    "Agents-Core": {
      "command": ".venv/bin/python",
      "args": ["src/server.py"]
    }
  }
}

Cursor (`mcp.json` in project root)

{
  "mcpServers": {
    "Agents-Core": {
      "command": ".venv/bin/python",
      "args": ["src/server.py"]
    }
  }
}
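
Both configurations above use paths relative to the project root. If a client launches the server from a different working directory, absolute paths may be more robust. The sketch below generates such an entry; `build_mcp_config` is a hypothetical helper, not part of the repository, and it assumes the POSIX virtual environment layout (`.venv/bin/python`).

```python
import json
from pathlib import Path


def build_mcp_config(repo_root, server_name="Agents-Core"):
    """Return an mcpServers entry with absolute paths for the given checkout."""
    root = Path(repo_root).resolve()
    return {
        "mcpServers": {
            server_name: {
                # POSIX venv layout assumed; Windows venvs use Scripts\python.exe.
                "command": str(root / ".venv" / "bin" / "python"),
                "args": [str(root / "src" / "server.py")],
            }
        }
    }


def render_mcp_config(repo_root):
    """Serialize the config as JSON text ready to be written to mcp.json."""
    return json.dumps(build_mcp_config(repo_root), indent=2)
```

Calling `render_mcp_config(".")` from the repository root yields text that can be saved as `.mcp.json` or `mcp.json`, depending on the client.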

Generic stdio

source .venv/bin/activate
python src/server.py
# Server communicates via stdin/stdout using MCP protocol

🧠 Creating New Agents

Create directory: agents/<agent_name>/
Create system_prompt.mdc with frontmatter:

---
identity:
  name: "my_agent"
  display_name: "My Agent"
  role: "Expert in X"
  tone: "Professional, Clear"
routing:
  domain_keywords: ["keyword1", "keyword2"]
  trigger_command: "/my_command"
---
# My Agent System Prompt

## Identity
You are an expert in X...

The agent will be auto-discovered by the MCP server on next startup.
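
Auto-discovery of this layout could look roughly like the following. It is a sketch under assumptions: `split_front_matter` and `discover_agents` are hypothetical names, the front matter is returned as raw text instead of being parsed as YAML (to stay within the standard library), and the "identity" and "routing" section check is a crude stand-in for whatever validation the server actually performs.

```python
from pathlib import Path


def split_front_matter(text):
    """Return (front_matter, body) for text that starts with a '---' delimited header."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text  # no closing delimiter: treat the whole file as body


def discover_agents(agents_dir):
    """Map directory name to front-matter text for each <name>/system_prompt.mdc."""
    found = {}
    for prompt in sorted(Path(agents_dir).glob("*/system_prompt.mdc")):
        front, _body = split_front_matter(prompt.read_text(encoding="utf-8"))
        # Crude check that both documented sections are present.
        if "identity:" in front and "routing:" in front:
            found[prompt.parent.name] = front
    return found
```

Directories without a `system_prompt.mdc` file are skipped by the glob, so only folders that follow the agent layout are picked up.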

Capabilities System

Instead of listing skills per agent, you can declare high-level capabilities:

capabilities: [development, dev-security]

The enrichment pipeline resolves capabilities to skill bundles via agents/capabilities/registry.yaml. Available capabilities: critical-analysis, content-structure, development, dense-summary, trust-weighted-research, bio-health, tech-documentation, dev-security, consultative-intake, creative-writing, psychology, 3d-printing, data-investigation, epistemic-analysis, code-review, decision-making, product-thinking, temporal-research, performance-engineering, prompt-design, prompt-security, roblox-development, dev-tools, blender-scripting, health-optimization, consumer-research, visualization, child-psychology.
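
Resolution of capabilities to skills can be pictured as a dictionary lookup with de-duplication. The registry contents and skill names below are invented for illustration (the real mapping lives in agents/capabilities/registry.yaml and its schema is not shown here), and `resolve_capabilities` is a hypothetical function name.

```python
# Hypothetical registry contents; real entries come from registry.yaml.
REGISTRY = {
    "development": ["skill-clean-code", "skill-testing"],
    "dev-security": ["skill-secure-coding", "skill-testing"],
}


def resolve_capabilities(capabilities, registry):
    """Expand capability names into a de-duplicated, order-preserving skill list."""
    skills = []
    unknown = []
    for cap in capabilities:
        if cap not in registry:
            unknown.append(cap)
            continue
        for skill in registry[cap]:
            if skill not in skills:  # a skill shared by two bundles is listed once
                skills.append(skill)
    if unknown:
        raise KeyError(f"unknown capabilities: {', '.join(unknown)}")
    return skills
```

With the sample registry, `resolve_capabilities(["development", "dev-security"], REGISTRY)` returns three skills, because `skill-testing` appears in both bundles.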

🧠 Repository Memory

The server ships with a per-repo memory subsystem so each new Claude session does not have to re-explore the codebase from scratch:

describe_repo — generates a compressed, LLM-consumable repo overview via MCP sampling and writes it into the managed Repository Memory section of CLAUDE.md. Idempotent: re-runs are no-ops unless the repo manifest changes or force_refresh=True.
log_interaction — end-of-turn logger. Appends intent / action / outcome entries (with optional files and tags) to history.md at the repo root; deduplicated by content hash; rotated to history/YYYY-MM.md when the file exceeds 512 KB. Also sends a Langfuse generation trace if keys are configured.
read_history — returns recent entries by recency/since filter, or runs a lazy semantic search backed by the same NumpyVectorStore used for routing.
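
The logging behavior described above (append, de-duplicate by content hash, rotate past 512 KB) can be approximated as follows. The entry layout, the 16-character SHA-256 prefix, and the `append_entry` name are assumptions made for this sketch; unlike the real tool it ignores files, tags, and Langfuse, and it only checks the live file for duplicates.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path

MAX_BYTES = 512 * 1024  # rotation threshold stated above


def append_entry(repo_root, intent, action, outcome, now=None):
    """Append one entry to history.md; return False if the same content is already logged."""
    root = Path(repo_root)
    log = root / "history.md"
    now = now or datetime.now(timezone.utc)
    body = f"intent: {intent}\naction: {action}\noutcome: {outcome}\n"
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()[:16]
    existing = log.read_text(encoding="utf-8") if log.exists() else ""
    if f"hash: {digest}" in existing:
        return False  # duplicate content, skip the write
    # Rotate the live file into history/YYYY-MM.md once it exceeds the threshold.
    if log.exists() and log.stat().st_size > MAX_BYTES:
        archive = root / "history" / f"{now:%Y-%m}.md"
        archive.parent.mkdir(exist_ok=True)
        with archive.open("a", encoding="utf-8") as fh:
            fh.write(existing)
        log.write_text("", encoding="utf-8")
    with log.open("a", encoding="utf-8") as fh:
        fh.write(f"## {now.isoformat()}\nhash: {digest}\n{body}\n")
    return True
```

Hashing only the intent, action, and outcome fields means a repeated call with identical content is dropped even when the timestamps differ.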

The full design and step-by-step rationale live in docs/memory-subsystem-spec.md.

⚠️ Privacy warning — history.md captures raw prompts and responses. If you paste secrets (API keys, tokens, credentials) into Claude, they will land in this file. It is gitignored by default to keep them out of git history; if you want the action log visible in PRs, remove history.md / history/ from .gitignore and review entries before pushing.

📊 Observability

The framework integrates with Langfuse for tracing:

All tool calls are automatically traced
Routing decisions are logged
Cache hits/misses are tracked

Configure the Langfuse keys in .env, or leave them blank for local-only operation.

🛠️ Development

Running Server Manually

source .venv/bin/activate
python src/server.py

Debug Logging

Enable detailed per-call JSON logging:

AGENTS_DEBUG=1 python src/server.py

Logs are written to logs/{YYYY-MM-DD}/{HH-MM-SS.fff}_{tool}_{direction}.json. Zero overhead when disabled.
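
To make the path pattern concrete, the sketch below builds one log file name from a timestamp. `debug_enabled` and `debug_log_path` are hypothetical helpers, and the direction values used in the example ("request", "response") are assumptions; only the path template and the AGENTS_DEBUG variable come from the text above.

```python
import os
from datetime import datetime
from pathlib import Path


def debug_enabled(env=None):
    """True when AGENTS_DEBUG is set to 1; callers can skip all logging work otherwise."""
    env = os.environ if env is None else env
    return env.get("AGENTS_DEBUG", "0") == "1"


def debug_log_path(tool, direction, now=None, base="logs"):
    """Build logs/{YYYY-MM-DD}/{HH-MM-SS.fff}_{tool}_{direction}.json for one call."""
    now = now or datetime.now()
    millis = f"{now.microsecond // 1000:03d}"  # .fff is milliseconds, zero-padded
    name = f"{now:%H-%M-%S}.{millis}_{tool}_{direction}.json"
    return Path(base) / f"{now:%Y-%m-%d}" / name
```

Checking `debug_enabled()` before building any path or payload is one simple way to keep the disabled case cheap.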

📝 License

MIT
