by ruvnet
Generate a customized AI agent harness for any repository, complete with a repo‑aware CLI, MCP server, scoped memory, governance policies, and out‑of‑the‑box support for Claude Code, OpenAI Codex, pi.dev, Hermes, OpenClaw, and RVM.
MetaHarness is a factory that mints a fully-featured, branded AI agent harness from a GitHub repository (or a blank project) in seconds. The harness bundles an npx-installable CLI, an MCP server, domain-specific skills, slash commands, memory namespaces, governance policies, and cryptographically signed releases.
```bash
# Browser version (no install)
open https://ruvnet.github.io/agent-harness-generator/
# CLI version: create a harness for a repo or a blank slate
npx metaharness my-agent --template vertical:coding --host claude-code
cd my-agent && npx . --help
```
- npx metaharness --wizard.
- npx metaharness score <repo>.
- harness CLI (e.g., harness doctor, harness validate, harness evolve).
- .zip with your branding and a custom npx <your-name> CLI.
- @metaharness/router selects the cheapest model that meets quality thresholds.

Do I need to run a server?
No. The Studio runs entirely client‑side, and the generated harness runs locally via the harness CLI.
Will my repository code be executed during analysis?
No. All analysis (metaharness analyze, metaharness genome, harness analyze-repo) is static-only; inferred build/test commands are labeled trust: inferred · execution: disabled and are never run.
Can I publish the generated harness as my own package?
Yes. The scaffold includes a package.json and a branded npx entry point. After renaming/scoping, run npm publish.
How does model routing work?
Install @metaharness/router; it predicts the cheapest model that satisfies a quality threshold based on your own evaluation logs.
Is the output compatible with multiple LLM providers?
The same harness can be targeted to any of the supported hosts by selecting the corresponding adapter package (e.g., @metaharness/claude-code).
What security guarantees are provided?
MCP defaults to deny all network, shell, and file‑write permissions; policies are audited via harness mcp-scan. Witness signatures certify provenance, and SBOMs are emitted in SPDX‑2.3 format.
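A default-deny policy of this kind can be pictured as a lookup that refuses anything not explicitly granted. The sketch below is illustrative only: the `McpPolicy` shape, the `isAllowed` helper, the `findHighRiskGrants` check, and the tool names are hypothetical, not the actual schema of .harness/mcp-policy.json or the implementation of harness mcp-scan.

```typescript
// Hypothetical policy shape; the real .harness/mcp-policy.json schema may differ.
type Capability = "network" | "shell" | "file-write";

interface McpPolicy {
  // Tool name -> capabilities explicitly granted to that tool.
  grants: Record<string, Capability[]>;
}

// Default-deny: a capability is allowed only if it is explicitly listed.
// Unknown tools have no grants, so every request from them is refused.
function isAllowed(policy: McpPolicy, tool: string, cap: Capability): boolean {
  const granted = policy.grants[tool] ?? [];
  return granted.includes(cap);
}

// Static check over the policy data: flags shell/network grants
// without executing any tool.
function findHighRiskGrants(policy: McpPolicy): string[] {
  const findings: string[] = [];
  for (const [tool, caps] of Object.entries(policy.grants)) {
    for (const cap of caps) {
      if (cap === "shell" || cap === "network") {
        findings.push(`${tool}: grants ${cap}`);
      }
    }
  }
  return findings;
}

// Example policy with made-up tool names.
const policy: McpPolicy = {
  grants: { "read-docs": [], "fetch-issue": ["network"] },
};

console.log(isAllowed(policy, "read-docs", "file-write"));
console.log(isAllowed(policy, "unknown-tool", "shell"));
console.log(findHighRiskGrants(policy));
```

The point of the design is that forgetting to list a tool fails closed: an omission denies access rather than granting it.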
npx metaharness · open the Studio →
(Repo: ruvnet/agent-harness-generator · CLI: metaharness · Library: @ruvnet/agent-harness-generator)
Every serious repo deserves its own agent. A repo-aware CLI, a repo-aware coding agent, a local MCP server, memory scoped to the project, skills generated from the actual file layout, governance policy, release verification, witness-signed provenance.
metaharness mints those, on demand, from a GitHub URL or a blank slate. It is not another agent framework. It is a factory for agent frameworks.
The model is replaceable. The harness is the product.
In under 60 seconds, in your browser, with nothing leaving your machine:
Output is an npm-publishable .zip with your name on it, your branding, your npx <your-name> CLI.
- npx metaharness score <repo> reads the repo (never runs it) and prints a one-screen report card (how well a harness fits, how likely it is to build, how safe the tools are, and the rough cost per run), so you know what you'll get before scaffolding.
- @metaharness/router routes each request to the right model from your own results: same quality, far less spend. Works out of the box with zero native deps; train it on your data for a sharper fit (npm i @metaharness/router). Add the optional @ruvector/tiny-dancer to train a fast native model instead: same training data, no API change.
- @metaharness/darwin wired in: run npm run evolve and the harness mutates its own config, tests each change in a sandbox, and keeps only what measurably improves. The model stays frozen; the harness evolves. Safe by default (no network, no API key; pure refactor/tuning behind a safety gate). Validated on real SWE-bench Lite bug-fixing. Use --no-darwin to skip.

A generated harness is a starting point you own, not a fixed framework. Open it and make it yours:

- harness doctor / harness validate keep it healthy as you trim.
- npm publish: now anyone on your team runs npx @your-org/your-harness and gets the same repo-tuned agent. One command, org-wide, versioned like any other dependency. (The 19 @metaharness/* examples are exactly this pattern, published live.)

Make older, cheaper models punch like frontier ones. The right harness isn't a pile of extra steps bolted onto an expensive model; it's putting the right model on each task and getting out of the way. Our DRACO benchmark proves it: a small, cheap model delivers frontier-quality research at roughly one-tenth the cost, and a smart router squeezes out the rest. Stop paying frontier prices for work a $0.10 model does just as well.
That router ships as @metaharness/router
— route(query) returns the cheapest model predicted to clear your quality bar,
learned from your own eval logs. npm i @metaharness/router.
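The selection rule described here can be sketched as follows. This is not the @metaharness/router API: the `EvalLog` shape, the `summarize` and `pickCheapest` helpers, the model names, and the use of a simple per-model mean as the quality estimate are all assumptions made for illustration. The real router predicts per query, which this sketch does not attempt.

```typescript
// Hypothetical eval-log record: one scored run of a model.
interface EvalLog {
  model: string;
  costUsd: number; // cost of this run
  quality: number; // score in [0, 1]
}

interface ModelStats {
  model: string;
  meanCost: number;
  meanQuality: number;
}

// Aggregate logs into per-model mean cost and mean quality.
function summarize(logs: EvalLog[]): ModelStats[] {
  const byModel = new Map<string, EvalLog[]>();
  for (const log of logs) {
    const runs = byModel.get(log.model) ?? [];
    runs.push(log);
    byModel.set(log.model, runs);
  }
  const stats: ModelStats[] = [];
  byModel.forEach((runs, model) => {
    const meanCost = runs.reduce((sum, r) => sum + r.costUsd, 0) / runs.length;
    const meanQuality = runs.reduce((sum, r) => sum + r.quality, 0) / runs.length;
    stats.push({ model, meanCost, meanQuality });
  });
  return stats;
}

// Return the cheapest model whose mean quality clears the bar,
// or undefined when no model qualifies.
function pickCheapest(logs: EvalLog[], qualityBar: number): string | undefined {
  const eligible = summarize(logs).filter((s) => s.meanQuality >= qualityBar);
  eligible.sort((a, b) => a.meanCost - b.meanCost);
  return eligible[0]?.model;
}

// Made-up logs: "small" averages 0.8125 quality, "large" scores 0.9375.
const logs: EvalLog[] = [
  { model: "small", costUsd: 0.01, quality: 0.75 },
  { model: "small", costUsd: 0.01, quality: 0.875 },
  { model: "large", costUsd: 0.1, quality: 0.9375 },
];

console.log(pickCheapest(logs, 0.8)); // small
console.log(pickCheapest(logs, 0.9)); // large
```

Raising the quality bar pushes the choice toward the more expensive model, and a bar that no model clears yields no recommendation rather than a silent fallback.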
```bash
# In the browser: zero install, nothing leaves the page
open https://ruvnet.github.io/agent-harness-generator/
# Or in the terminal: the same harness (behaviourally equivalent output)
npx metaharness my-bot --template vertical:coding --host claude-code
cd my-bot && npx . --help
```

Don't know what to pick? Run the wizard:

```bash
npx metaharness --wizard
```

Already have a repo you want a harness for?

```bash
harness analyze-repo .                     # local, deterministic analysis only
harness analyze-repo . --scaffold my-bot   # materialise the recommended harness
```
No repository code is executed. Inferred build/test commands are emitted as trust: inferred · execution: disabled.
📖 Read the plain-language user guide →
The same harness output runs on nine agent hosts — eight interactive, plus GitHub Actions (CI/CD):
| Host | What ships | Notes |
|---|---|---|
| Claude Code | MCP server + hooks + 3-scope settings | Richest surface; Ruflo-native |
| OpenAI Codex | MCP via ~/.codex/config.toml | TOML, no hooks |
| pi.dev | Pi extension via pi.registerTool() | No MCP by design |
| Hermes | MCP runtime, <think> scrubbing | Per Hermes issue #741 |
| OpenClaw | ~/.openclaw/openclaw.json + workspace skills | Personal-AI gateway |
| RVM | Bare-metal microhypervisor + capability tokens | Hardware isolation for untrusted peers |
| GitHub Copilot | MCP via .vscode/mcp.json | VSCode 1.99+ (ADR-032) |
| OpenCode | MCP via .opencode/opencode.json | sst/opencode TUI (ADR-036) |
| GitHub Actions | .github/workflows/ + composite action.yml | Non-interactive CI/CD; default-deny via permissions: (ADR-033) |
See ADR-004 — Host integration model and ADR-033 — GitHub Actions host.
MCP is included as a first-class adapter surface, not the identity. It is gated and default-deny (ADR-022):
- off · local (stdio) · remote (HTTPS + auth)
- src/mcp/{server,tools,resources,prompts,policy,audit}.ts + a scannable .harness/mcp-policy.json
- harness mcp-scan <path>: "npm audit for agent tools", a static-only scan flagging shell/network grants, missing audit/timeouts, wildcard permissions, unguarded secrets, unpinned deps. Exits 1 on any HIGH.

```bash
npx metaharness --list
npx metaharness my-bot --template vertical:coding
```
| Category | Templates |
|---|---|
| Starter / Operations | minimal, vertical:devops |
| Engineering | vertical:coding, vertical:ai, vertical:repo-maintainer (iter 113) |
| Knowledge | vertical:research, vertical:ruview, vertical:education |
| Finance / Pro | vertical:trading, vertical:legal, vertical:health |
| Customer / Growth | vertical:support, vertical:crm, vertical:marketing, vertical:advertising, vertical:sales |
| Business / Frontier | vertical:business, vertical:agentics, vertical:gaming, vertical:exotic |
Each ships bespoke domain agents (with system prompts), skills, commands, and per-host settings — all default-deny.
Don't want to pick flags? Each host and vertical has a dedicated
@metaharness/* wrapper — published, one npx away, no template/host
flags to remember. A scaffold from a wrapper is byte-identical to the
equivalent metaharness invocation.
Host integrations
Vertical workflows (ready-made multi-agent pods)
All 18 are live on npm under @metaharness. Source + per-package README:
examples-packages/ · plain-language deep-dive gists:
examples-packages/GISTS.md.
After scaffolding, every harness has a harness CLI:
| You're trying to … | Subcommand |
|---|---|
| Smoke-check the scaffold | harness doctor |
| Run every release gate | harness validate |
| Check kernel ↔ harness compatibility | harness diag |
| Score the harness 0-100 with badges | harness score |
| Pre-scaffold: is this REPO ready for an agent? | harness genome <repo> |
| Pre-scaffold: fit/cost/safety report card for a repo | metaharness score <repo> |
| MCP threat-model artifact for a PR review | harness threat-model |
| Declare OIA v0.1 layer alignment | harness oia-manifest |
| File a useful support ticket | harness diag --bundle > bundle.json |
| Diff two harnesses | harness compare a/ b/ |
| Share MCP + Bash + claims config for review | harness export-config |
| Run npm-audit per-harness | harness audit --bundle > audit.json |
| Emit SPDX-2.3 SBOM | harness sbom |
| Drift-detect against the latest template | harness upgrade |
| Sign / verify the witness | harness sign · harness verify |
| Pin the manifest to IPFS | harness publish --confirm |
| Recommend a harness from a repo | harness analyze-repo |
21 subcommands total. Every one respects --help / -h. Shell completion: harness completions bash | zsh | fish.
📖 Full reference: docs/USAGE.md
v0.1.x beta: published and usable, with the credibility/doc reconciliation in issue #4 / ADR-042 in progress. The release pipeline is mature: the CI matrix is green across Rust × 3 OS + WASM × 3 OS + Node 20+22 × 3 OS + Bench + pack+install × 3 OS, and the release script (node scripts/release.mjs <bump> --push) atomically bumps 15 sources, runs all gates, and tags.

| Layer | Status |
|---|---|
| Rust kernel (WASM + NAPI-RS) | Shipped — 7 subsystems |
| 6 host adapters | claude-code · codex · pi-dev · hermes · openclaw · rvm |
| 17 harness subcommands | Shipped |
| 7 Codex skills | Shipped |
| Claude marketplace plugin | Shipped + schema-validated |
| Witness signing (Ed25519) | Shipped + tamper-tested |
| MCP tool dispatch | 11 end-to-end cases |
| Test suite | 568/568 across 67 files |
| CI matrix | 16 jobs green |
| Security pipeline | cargo-audit · cargo-deny · npm-audit · CodeQL · SBOM (SPDX-2.3) |
| Publish pipeline | GCP WIF + 2 gates + 11 packages + IPFS pin |
| Agent Harness Studio | Live at https://ruvnet.github.io/agent-harness-generator/ |
```
You (harness author)
  └→ agent-harness-generator        ← the factory
       └→ Your harness (.zip)       ← what you ship
            ├ npx <your-name>       ← your identity
            ├ <your agents>         ← your content
            └ @metaharness/kernel   ← shared primitives (Rust + WASM + NAPI-RS)
                 └→ Host adapter (Claude Code / Codex / pi.dev / Hermes / OpenClaw / RVM)
                      └→ LLM providers
```
You operate the factory. The factory produces your harness. Your users never see the factory — only the brand and CLI you ship. The kernel ships as @metaharness/kernel (Rust → wasm-pack + NAPI-RS); your content stays yours.
📖 Deeper: docs/ARCHITECTURE.md · docs/adrs/INDEX.md (31 ADRs)
| Concern | Where |
|---|---|
| CI | ci.yml — Rust 3-platform × fmt/clippy/test/doc + WASM build + size budget + Node 20/22 × 3-platform |
| Publish | publish.yml — GCP WIF → Secret Manager → smoke → npm publish --provenance (SLSA L2) |
| Security | security.yml — cargo-audit + cargo-deny + npm-audit + CodeQL + weekly cron |
| Provenance | ADR-011 — Ed25519-signed witness manifest, byte-deterministic across runners |
| Studio liveness | pages-monitor.yml — daily HTTP probe of live Studio |
| Research quality (DRACO) | draco.yml — cross-domain deep-research benchmark (ADR-037). Deterministic subset gates the scorer/runner machinery on every push (offline); a weekly judged cadence runs the real OpenRouter-fusion score. 5 dimensions (grounding/coverage/balance/cleanliness/faithfulness); the verifier + judge are different model families than the synthesizer (fusion). See packages/bench/draco/. |
```bash
git clone https://github.com/ruvnet/agent-harness-generator
cd agent-harness-generator
cargo test --workspace
cargo clippy --workspace --all-targets -- -D warnings
npm install
npm run build:wasm
npm test
node scripts/healthcheck.mjs
```
See CONTRIBUTING.md.
MIT — see LICENSE.
MetaHarness is a CLI and browser Studio that turns any GitHub repo (or a
blank slate) into a custom AI agent harness. The output is a branded,
npm-publishable package with its own npx <name> CLI, MCP server, memory,
governance policy, and Ed25519 witness-signed releases. Runs on Claude
Code, OpenAI Codex, pi.dev, Hermes, OpenClaw, and RVM.
Frameworks help developers build agents. MetaHarness helps repositories ship agents. The model is replaceable; the harness is the product.
Do I need to run a server?
No. The Studio is 100% client-side (GitHub Pages). The CLI runs locally. There is no MetaHarness account, no hosted backend, no telemetry.
Will my repository code be executed during analysis?
No. metaharness analyze and metaharness genome perform deterministic static analysis only. Inferred build/test commands are marked trust: inferred · execution: disabled.
Six today: Claude Code, OpenAI Codex, pi.dev, Hermes (Nous Research), OpenClaw, and RVM. GitHub Copilot and GitHub Actions are proposed in ADR-032 and ADR-033.
Rust, TypeScript / JavaScript, Python, and Go are detected deterministically via lockfile and manifest probing. Lexical scoring is the default; optional in-browser MiniLM embeddings via Transformers.js boost recall for unusual repos.
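Lockfile and manifest probing can be approximated as a pure lookup over file names, which is why no repository code needs to run. In the sketch below, the `MARKERS` table, the `detectLanguages` function, and the guessed test commands are assumptions for illustration and not the detector's real rules; only the trust: inferred and execution: disabled labels are taken from this README.

```typescript
// Hypothetical marker table: manifest file name -> language and a guessed test command.
const MARKERS: Record<string, { language: string; testCommand: string }> = {
  "Cargo.toml": { language: "Rust", testCommand: "cargo test" },
  "package.json": { language: "TypeScript / JavaScript", testCommand: "npm test" },
  "pyproject.toml": { language: "Python", testCommand: "pytest" },
  "go.mod": { language: "Go", testCommand: "go test ./..." },
};

interface InferredCommand {
  language: string;
  command: string;
  trust: "inferred";
  execution: "disabled";
}

// Pure function over a file listing: nothing is executed, and the
// result depends only on which marker files are present.
function detectLanguages(fileNames: string[]): InferredCommand[] {
  const present = new Set(fileNames);
  return Object.entries(MARKERS)
    .filter(([marker]) => present.has(marker))
    // Sort by marker name so the same input always yields the same order.
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([, info]): InferredCommand => ({
      language: info.language,
      command: info.testCommand,
      trust: "inferred",
      execution: "disabled",
    }));
}

const detected = detectLanguages(["README.md", "go.mod", "Cargo.toml"]);
console.log(detected.map((d) => `${d.language}: ${d.command} (${d.trust}, ${d.execution})`));
```

Because the commands are only recorded as data, a reviewer can inspect them before deciding whether to enable any of them.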
Can I publish the generated harness as my own package?
Yes. The generated harness ships with package.json, bin, a working CLI, and harness validate to gate releases. harness sign adds the Ed25519 witness; harness sbom emits SPDX-2.3.
Keywords: metaharness, AI agent CLI, AI agent scaffold, AI agent generator, repo to agent, GitHub repo to AI agent, agent harness, agent harness generator, agent framework alternative, agentic AI, agentic workflow, autonomous AI agents, multi-agent framework, multi-agent system, MCP, MCP server, model context protocol, Claude Code plugin, OpenAI Codex plugin, Anthropic agents, GPT agent, Codex agent, pi.dev extension, hermes agent, Nous Research, OpenClaw, RVM agent, vertical AI agents, custom AI CLI, npx metaharness, npm create AI agent, Rust WASM agent kernel, NAPI-RS, wasm-bindgen, agent memory, ReasoningBank, HNSW vector search, emergent time, witness manifest, Ed25519 signed, provenance, SBOM, SPDX, SLSA, plugin marketplace, IPFS registry, drift detection, anti-slop, TDD agents, self-evolving agents, federated agents, swarm intelligence, GCP Workload Identity Federation, Secret Manager, npm provenance, repo-aware AI, repo-native CLI, repo factory.