by Kirachon
Provides semantic code search, AI‑powered prompt enhancement, planning, execution tracking, persistent memories, and advanced code review through a local‑first MCP server built on the Auggie SDK.
The server enables agents to retrieve and reason over a project's codebase without sending any data to the cloud. It indexes source files, stores embeddings, and exposes 41 specialized tools – from simple file lookup to full‑stack planning, approval workflows, and deterministic code review.
npm install && npm run build
node dist/index.js --workspace /path/to/project --index --watch
– --workspace selects the codebase.
– --index performs an initial indexing pass.
– --watch enables incremental indexing on file changes.
Then try semantic_search("authentication flow") or create_plan({ task: "Add JWT auth" }).
Q: Do I need an internet connection?
A: No. All indexing, embedding, and retrieval happen locally. Only the optional Auggie API (for embedding models) requires network access, which can be disabled via CONTEXT_ENGINE_OFFLINE_ONLY=true.
Q: How are embeddings generated? A: The Auggie SDK handles model loading and inference. It can use a local model or call the Augment API depending on configuration.
Q: Can I use the server with other MCP‑compatible tools? A: Yes. The server follows the standard MCP tool API, so any client that supports MCP (Codex, Claude Desktop, Cursor, Antigravity, etc.) can connect.
Q: How do I enable static analysis for code review?
A: Include enable_static_analysis: true and specify the desired analyzers (e.g., tsc, semgrep) in the review_diff options.
Q: What is the workflow for planning?
A: create_plan → save_plan (optional) → start_step / complete_step → view_progress. Plans are versioned and can be rolled back.
Q: How are memories persisted?
A: Memories are saved as markdown files under .memories/ and loaded automatically when get_context_for_prompt is called.
Q: How can I expose metrics?
A: Start the server with --http and set CE_METRICS=true and CE_HTTP_METRICS=true. Metrics are available at GET /metrics.
A local-first, agent-agnostic Model Context Protocol (MCP) server implementation using the Auggie SDK as the core context engine.
📚 New here? Check out INDEX.md for a complete documentation guide!
🚀 Quick Start: QUICKSTART.md → GETTING_STARTED.md → API_REFERENCE.md
🪟 Windows Deployment: docs/WINDOWS_DEPLOYMENT_GUIDE.md
🏗️ Architecture: TECHNICAL_ARCHITECTURE.md for deep technical dive
This implementation follows a clean 5-layer architecture as outlined in plan.md:
┌────────────────────────────┐
│ Coding Agents (Clients) │ Layer 4: Claude, Cursor, etc.
│ Codex | Claude | Cursor │
└────────────▲───────────────┘
│ MCP (tools)
┌────────────┴───────────────┐
│ MCP Interface Layer │ Layer 3: server.ts, tools/
│ (standardized tool API) │
└────────────▲───────────────┘
│ internal API
┌────────────┴───────────────┐
│ Context Service Layer │ Layer 2: serviceClient.ts
│ (query orchestration) │
└────────────▲───────────────┘
│ domain calls
┌────────────┴───────────────┐
│ Core Context Engine │ Layer 1: Auggie SDK
│ (indexing, retrieval) │
└────────────▲───────────────┘
│ storage
┌────────────┴───────────────┐
│ Storage / Index Backend │ Layer 5: Auggie's internal
│ (vectors, metadata) │
└────────────────────────────┘
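The delegation between layers can be illustrated with a small TypeScript sketch: a Layer 3 tool handler validates arguments and hands the query to the Layer 2 service. All names here (ContextService, SearchResult, handleSemanticSearch) are hypothetical, not the actual identifiers in server.ts or serviceClient.ts:

```typescript
// Hypothetical sketch of the Layer 3 -> Layer 2 delegation pattern.
interface SearchResult {
  path: string;
  score: number;
  snippet: string;
}

// Layer 2: query orchestration (illustrative interface).
interface ContextService {
  search(query: string, topK: number): Promise<SearchResult[]>;
}

// Layer 3: MCP tool handler — validates arguments, delegates, formats output.
async function handleSemanticSearch(
  service: ContextService,
  args: { query: string; top_k?: number }
): Promise<string> {
  const results = await service.search(args.query, args.top_k ?? 5);
  // Render results as markdown, mirroring semantic_search's output style.
  return results
    .map(r => `### ${r.path} (score: ${r.score.toFixed(2)})\n${r.snippet}`)
    .join("\n\n");
}
```

The point of the split is that the handler owns MCP-facing concerns (argument defaults, output formatting) while the service owns retrieval, so either side can change independently.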
index_workspace(force?) - Index workspace files for semantic search
- force (optional): Force re-indexing even if files haven't changed
codebase_retrieval(query, top_k?) - PRIMARY semantic search with JSON output for programmatic use
- query: Natural language search query
- top_k (optional): Number of results to return (default: 5)
semantic_search(query, top_k?, mode?, bypass_cache?, timeout_ms?) - Semantic code search with markdown-formatted output
- query: Natural language search query
- top_k (optional): Number of results to return (default: 5)
- mode (optional): "fast" (default) or "deep" for higher recall at higher latency
- bypass_cache (optional): When true, bypass caches for this call
- timeout_ms (optional): Cap time spent in the retrieval pipeline (ms)
get_file(path) - Retrieve complete file contents
- path: Relative path to file from workspace root
get_context_for_prompt(query, max_files?, token_budget?, include_related?, min_relevance?, bypass_cache?) - Get a comprehensive context bundle for prompt enhancement
- query: Context request description
- max_files (optional): Maximum files to include (default: 5)
- token_budget (optional): Token budget for the bundle (default: 8000)
- include_related (optional): Include related/imported files (default: true)
- min_relevance (optional): Minimum relevance score (default: 0.3)
- bypass_cache (optional): When true, bypass caches for this call
enhance_prompt(prompt) - AI-powered prompt enhancement with codebase context
- prompt: Simple prompt to enhance
index_status() - View index health metadata (status, fileCount, lastIndexed, isStale)
reindex_workspace() - Clear and rebuild the entire index from scratch
clear_index() - Remove index state without rebuilding
tool_manifest() - Discovery tool for available capabilities
add_memory(category, content, title?) - Store persistent memories for future sessions
- category: 'preferences', 'decisions', or 'facts'
- content: The memory content to store (max 5000 characters)
- title (optional): Title for the memory
list_memories(category?) - List all stored memories
- category (optional): Filter to a specific category
create_plan(task, options?) - Generate structured execution plans with DAG analysis
- task: Task or goal to plan for
- generate_diagrams (optional): Generate Mermaid diagrams (default: true)
refine_plan(current_plan, feedback?, clarifications?) - Refine existing plans based on feedback
visualize_plan(plan, diagram_type?) - Generate visual representations (Mermaid diagrams)
execute_plan(plan, ...) - Execute plan steps with AI-powered code generation
save_plan(plan, name?, tags?, overwrite?) - Save plans to persistent storage
load_plan(plan_id | name) - Load previously saved plans
list_plans(status?, tags?, limit?) - List saved plans with filtering
delete_plan(plan_id) - Delete saved plans from storage
request_approval(plan_id, step_numbers?) - Create approval requests for plans or specific steps
respond_approval(request_id, action, comments?) - Respond to approval requests
start_step(plan_id, step_number) - Mark a step as in-progress
complete_step(plan_id, step_number, notes?, files_modified?) - Mark a step as completed
fail_step(plan_id, step_number, error, ...) - Mark a step as failed
view_progress(plan_id) - View execution progress and statistics
view_history(plan_id, limit?, include_plans?) - View the version history of a plan
compare_plan_versions(plan_id, from_version, to_version) - Generate a diff between versions
rollback_plan(plan_id, version, reason?) - Roll back to a previous plan version
review_changes(diff, file_contexts?, options?) - AI-powered code review with structured output
review_git_diff(target?, base?, include_patterns?, options?) - Review code changes from git automatically
review_diff(diff, changed_files?, options?) - Enterprise review with risk scoring and static analysis
check_invariants(diff, changed_files?, invariants_path?) - Run YAML invariants deterministically (no LLM)
run_static_analysis(changed_files?, options?) - Run local static analyzers (tsc, semgrep)
reactive_review_pr(...) - Start a session-based, parallelized code review
get_review_status(session_id) - Track progress of a reactive review
pause_review(session_id) - Pause a running review session
resume_review(session_id) - Resume a paused session
get_review_telemetry(session_id) - Detailed metrics (tokens, speed, cache hits)
scrub_secrets(content) - Mask API keys and sensitive data
validate_content(content, content_type, ...) - Multi-tier validation for AI-generated content
Version 1.8.0 introduces major performance improvements to the reactive code review system, reducing review times from 30-50 minutes to 3-15 seconds for typical PRs.
| Phase | Feature | Performance Gain | Description |
|---|---|---|---|
| Phase 1 | AI Agent Executor | 15-50x | Executes reviews directly via the AI agent instead of external API calls. |
| Phase 2 | Multi-Layer Cache | 2-4x (cached) | 3-layer system: Memory (fastest) -> Commit (git-aware) -> File Hash (content-based). |
| Phase 3 | Continuous Batching | 2-3x | Accumulates and processes multiple files in a single AI request. |
| Phase 4 | Worker Pool Optimization | 1.5-2x | CPU-aware parallel execution with intelligent load balancing. |
| Scenario | v1.7.1 | v1.8.0 | Improvement |
|---|---|---|---|
| Cold Run (10 steps) | 30-50 min | ~60-90 sec | 25-45x ⚡ |
| Cached Run | 30-50 min | ~10-30 sec | 60-180x ⚡ |
| Batched Run | 30-50 min | ~5-15 sec | 120-360x ⚡ |
| Full Optimization | 30-50 min | 3-10 sec | 180-600x 🚀 |
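Continuous batching (Phase 3 above) can be sketched in a few lines of TypeScript: requests accumulate until the batch is full, then flush as a single processing call. This is an illustrative pattern only, not the server's actual implementation; all names are hypothetical:

```typescript
// Illustrative continuous-batching sketch: callers submit items individually,
// but the expensive processing call runs once per full batch.
class Batcher<T, R> {
  private pending: { item: T; resolve: (r: R) => void }[] = [];

  constructor(
    private maxBatch: number,
    private process: (items: T[]) => Promise<R[]>
  ) {}

  // Queue one item; its promise resolves when its batch is processed.
  submit(item: T): Promise<R> {
    return new Promise(resolve => {
      this.pending.push({ item, resolve });
      if (this.pending.length >= this.maxBatch) void this.flush();
    });
  }

  // Drain the queue as one processing call and fan results back out.
  async flush(): Promise<void> {
    const batch = this.pending.splice(0, this.pending.length);
    if (batch.length === 0) return;
    const results = await this.process(batch.map(b => b.item));
    batch.forEach((b, i) => b.resolve(results[i]));
  }
}
```

In the review pipeline, `process` would correspond to one AI request covering several files, which is where the 2-3x gain comes from. A production version would also flush on a deadline timer so a partially filled batch is never stranded.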
Version 1.9.0 introduces optional static analysis and deterministic invariants checking for enhanced code review capabilities.
| Analyzer | Description | Opt-in |
|---|---|---|
| TypeScript | Type checking via tsc --noEmit | Default |
| Semgrep | Pattern-based security/quality checks | Optional (requires installation) |
review_diff({
diff: "<unified diff>",
changed_files: ["src/file.ts"],
options: {
enable_static_analysis: true,
static_analyzers: ["tsc", "semgrep"],
static_analysis_timeout_ms: 60000
}
})
run_static_analysis({
changed_files: ["src/file.ts"],
options: {
analyzers: ["tsc", "semgrep"],
timeout_ms: 60000,
max_findings_per_analyzer: 20
}
})
check_invariants({
diff: "<unified diff>",
changed_files: ["src/file.ts"],
invariants_path: ".review-invariants.yml"
})
Create .review-invariants.yml in your workspace root:
invariants:
- id: no-console-log
pattern: "console\\.log"
message: "Remove console.log statements before committing"
severity: MEDIUM
- id: no-todo-comments
pattern: "TODO|FIXME"
message: "Resolve TODO/FIXME comments"
severity: LOW
- id: require-error-handling
pattern: "catch\\s*\\(\\s*\\)"
message: "Empty catch blocks should log or handle errors"
severity: HIGH
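Because each invariant is a plain regex, the check is fully deterministic and needs no LLM. A minimal TypeScript sketch of the idea, assuming patterns are applied to lines added by the unified diff (the actual check_invariants implementation may differ):

```typescript
// Field names mirror the YAML above; the matching strategy is an assumption.
interface Invariant {
  id: string;
  pattern: string;
  message: string;
  severity: string;
}

interface Finding {
  id: string;
  line: string;
  message: string;
  severity: string;
}

function checkInvariants(diff: string, invariants: Invariant[]): Finding[] {
  // Consider only lines the diff adds ("+", excluding the "+++" file header).
  const added = diff
    .split("\n")
    .filter(l => l.startsWith("+") && !l.startsWith("+++"));

  const findings: Finding[] = [];
  for (const inv of invariants) {
    const re = new RegExp(inv.pattern);
    for (const line of added) {
      if (re.test(line)) {
        findings.push({ id: inv.id, line, message: inv.message, severity: inv.severity });
      }
    }
  }
  return findings;
}
```

The same diff and the same YAML always yield the same findings, which is what makes this layer safe to gate CI on.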
The review_diff tool now reports detailed timing breakdowns in stats.timings_ms:
{
"stats": {
"timings_ms": {
"preflight": 45,
"invariants": 12,
"static_analysis": 3200,
"context_fetch": 890,
"secrets_scrub": 5,
"llm_structural": 1200,
"llm_detailed": 2400
}
}
}
This allows you to identify which phase dominates review latency and tune the corresponding analyzers, caches, or timeouts accordingly.
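The breakdown is easy to consume programmatically. A small illustrative helper (not part of the server API) that surfaces the dominant phase:

```typescript
// Find the phase with the largest share of the review time.
function slowestPhase(timings: Record<string, number>): [string, number] {
  return Object.entries(timings).reduce((a, b) => (b[1] > a[1] ? b : a));
}

// Values copied from the stats.timings_ms example above.
const timings = {
  preflight: 45,
  invariants: 12,
  static_analysis: 3200,
  context_fetch: 890,
  secrets_scrub: 5,
  llm_structural: 1200,
  llm_detailed: 2400,
};
// static_analysis dominates at 3200 ms of the 7752 ms total
```

Here static analysis is the bottleneck, so static_analysis_timeout_ms or the analyzer selection would be the first things to tune.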
The Context Engine now includes a complete planning and execution system:
create_plan({
task: "Implement user authentication with JWT tokens",
generate_diagrams: true
})
save_plan({
plan: "<plan JSON>",
name: "JWT Authentication",
tags: ["auth", "security"]
})
// Start a step
start_step({ plan_id: "plan_abc123", step_number: 1 })
// Complete it
complete_step({
plan_id: "plan_abc123",
step_number: 1,
notes: "Created User model"
})
// Check progress
view_progress({ plan_id: "plan_abc123" })
// View version history
view_history({ plan_id: "plan_abc123" })
// Compare versions
compare_plan_versions({
plan_id: "plan_abc123",
from_version: 1,
to_version: 2
})
// Rollback if needed
rollback_plan({ plan_id: "plan_abc123", version: 1 })
See EXAMPLES.md for complete planning workflow examples.
The Context Engine includes a cross-session memory system that persists preferences, decisions, and project facts across sessions.
| Category | Purpose | Examples |
|---|---|---|
| preferences | Coding style and tool preferences | "Prefer TypeScript strict mode", "Use Jest for testing" |
| decisions | Architecture and design decisions | "Chose JWT over sessions", "Using PostgreSQL" |
| facts | Project facts and environment info | "API runs on port 3000", "Uses monorepo structure" |
// Store a preference
add_memory({
category: "preferences",
content: "Prefers functional programming patterns over OOP"
})
// Store an architecture decision with a title
add_memory({
category: "decisions",
title: "Authentication Strategy",
content: "Chose JWT with refresh tokens for stateless authentication. Sessions were considered but rejected due to horizontal scaling requirements."
})
// Store a project fact
add_memory({
category: "facts",
content: "The API uses PostgreSQL 15 with pgvector extension for embeddings"
})
Memories are automatically included in get_context_for_prompt results when relevant:
// Memories are retrieved alongside code context
const context = await get_context_for_prompt({
query: "How should I implement authentication?"
})
// Returns: code context + relevant memories about auth decisions
Memories are stored in .memories/ as markdown files:
preferences.md - Coding style preferences
decisions.md - Architecture decisions
facts.md - Project facts
These files are human-editable and can be version controlled with Git.
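As a rough sketch of what appending a memory to its category file could look like: the exact markdown layout the server uses is not documented here, so the field names and formatting below are assumptions.

```typescript
// Hypothetical shape of one memory entry; the server's real schema may differ.
interface Memory {
  title?: string;
  content: string;
  created: string; // ISO date
}

// Render an entry as a markdown section, ready to append to e.g. decisions.md.
function renderMemory(m: Memory): string {
  const heading = m.title ? `## ${m.title}` : "## Memory";
  return `${heading}\n_${m.created}_\n\n${m.content}\n`;
}
```

Keeping each entry as a plain markdown section is what makes the files human-editable and diff-friendly under Git.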
npm install -g @augmentcode/auggie
Run auggie login, or set environment variables:
export AUGMENT_API_TOKEN="your-token"
export AUGMENT_API_URL="https://api.augmentcode.com"
# Clone or navigate to the repository
cd context-engine
# Install dependencies
npm install
# Build the project
npm run build
For Windows users, a convenient batch file is provided for managing the server:
# Start the server with indexing and file watching
manage-server.bat start
# Check server status
manage-server.bat status
# Restart the server
manage-server.bat restart
# Stop the server
manage-server.bat stop
The management script automatically:
- indexes the workspace on start (--index)
- enables the file watcher (--watch)
- writes logs to server.log
- records the process ID in server.pid
# Start server with current directory
node dist/index.js
# Start with specific workspace
node dist/index.js --workspace /path/to/project
# Index workspace before starting
node dist/index.js --workspace /path/to/project --index
# Enable file watcher for automatic incremental indexing (v1.1.0)
node dist/index.js --workspace /path/to/project --watch
| Option | Alias | Description |
|---|---|---|
| --workspace <path> | -w | Workspace directory to index (default: current directory) |
| --index | -i | Index the workspace before starting the server |
| --watch | -W | Enable filesystem watcher for incremental indexing |
| --http | - | Enable HTTP server (in addition to stdio) |
| --http-only | - | Enable HTTP server only (for VS Code integration) |
| --port <port> | -p | HTTP server port (default: 3333) |
| --help | -h | Show help message |
Build the project:
npm run build
Add the MCP server to Codex CLI:
codex mcp add context-engine -- node /absolute/path/to/context-engine/dist/index.js --workspace /path/to/your/project
Or edit ~/.codex/config.toml directly:
[mcp_servers.context-engine]
command = "node"
args = [
"/absolute/path/to/context-engine/dist/index.js",
"--workspace",
"/path/to/your/project"
]
Restart Codex CLI
Type /mcp in the TUI to verify the server is connected
For other MCP clients, add this server to your client's MCP configuration:
{
"mcpServers": {
"context-engine": {
"command": "node",
"args": [
"/absolute/path/to/context-engine/dist/index.js",
"--workspace",
"/path/to/your/project"
]
}
}
}
See QUICKSTART.md - Step 5B for detailed instructions for each client.
# Watch mode for development
npm run dev
# Build for production
npm run build
# Run the server
npm start
context-engine/
├── src/
│ ├── index.ts # Entry point with CLI parsing
│ ├── mcp/
│ │ ├── server.ts # MCP server implementation
│ │ ├── serviceClient.ts # Context service layer
│ │ ├── tools/
│ │ │ ├── index.ts # index_workspace tool
│ │ │ ├── search.ts # semantic_search tool
│ │ │ ├── file.ts # get_file tool
│ │ │ ├── context.ts # get_context_for_prompt tool
│ │ │ ├── enhance.ts # enhance_prompt tool
│ │ │ ├── status.ts # index_status tool (v1.1.0)
│ │ │ ├── lifecycle.ts # reindex/clear tools (v1.1.0)
│ │ │ ├── manifest.ts # tool_manifest tool (v1.1.0)
│ │ │ ├── plan.ts # Planning tools (v1.4.0)
│ │ │ └── planManagement.ts # Plan persistence/workflow tools (v1.4.0)
│ │ ├── services/ # Business logic services (v1.4.0)
│ │ │ ├── planningService.ts # Plan generation, DAG analysis
│ │ │ ├── planPersistenceService.ts # Save/load/list plans
│ │ │ ├── approvalWorkflowService.ts # Approval request handling
│ │ │ ├── executionTrackingService.ts # Step progress tracking
│ │ │ └── planHistoryService.ts # Version history, rollback
│ │ ├── types/ # TypeScript type definitions (v1.4.0)
│ │ │ └── planning.ts # Planning-related types
│ │ └── prompts/ # AI prompt templates (v1.4.0)
│ │ └── planning.ts # Planning system prompts
│ ├── watcher/ # File watching (v1.1.0)
│ │ ├── FileWatcher.ts # Core watcher logic
│ │ ├── types.ts # Event types
│ │ └── index.ts # Exports
│ └── worker/ # Background indexing (v1.1.0)
│ ├── IndexWorker.ts # Worker thread
│ └── messages.ts # IPC messages
├── tests/ # Unit tests (186 tests)
├── plan.md # Architecture documentation
├── package.json
├── tsconfig.json
└── README.md
Once connected to Codex CLI, you can use natural language:
The server will automatically use the appropriate tools to provide relevant context.
| Variable | Description | Default |
|---|---|---|
| AUGMENT_API_TOKEN | Auggie API token (or use auggie login) | - |
| AUGMENT_API_URL | Auggie API URL | https://api.augmentcode.com |
| CONTEXT_ENGINE_OFFLINE_ONLY | Enforce offline-only policy (v1.1.0) | false |
| REACTIVE_ENABLED | Enable reactive review features | false |
| REACTIVE_USE_AI_AGENT_EXECUTOR | Use local AI agent for reviews (Phase 1) | false |
| REACTIVE_ENABLE_MULTILAYER_CACHE | Enable 3-layer caching (Phase 2) | false |
| REACTIVE_ENABLE_BATCHING | Enable request batching (Phase 3) | false |
| REACTIVE_OPTIMIZE_WORKERS | Enable CPU-aware worker optimization (Phase 4) | false |
| REACTIVE_PARALLEL_EXEC | Enable concurrent worker execution | false |
| CE_INDEX_STATE_STORE | Persist per-file index hashes to .augment-index-state.json | false |
| CE_SKIP_UNCHANGED_INDEXING | Skip re-indexing unchanged files (requires CE_INDEX_STATE_STORE=true) | false |
| CE_HASH_NORMALIZE_EOL | Normalize CRLF/LF when hashing (recommended with state store across Windows/Linux) | false |
| CE_METRICS | Enable in-process metrics collection (Prometheus format) | false |
| CE_HTTP_METRICS | Expose GET /metrics when running with --http | false |
| CE_AI_REQUEST_TIMEOUT_MS | Default timeout for AI calls (searchAndAsk) in milliseconds | 120000 |
| CE_SEARCH_AND_ASK_QUEUE_MAX | Max queued searchAndAsk requests before rejecting (0 = unlimited) | 50 |
| CE_TSC_INCREMENTAL | Enable incremental tsc runs for static analysis | true |
| CE_TSC_BUILDINFO_DIR | Directory to store tsbuildinfo cache (defaults to OS temp) | (os tmp) |
| CE_SEMGREP_MAX_FILES | Max files per semgrep invocation before chunking | 100 |
| CE_PLAN_AI_REQUEST_TIMEOUT_MS | Timeout for planning AI calls in milliseconds (create_plan, refine_plan, step execution) | 300000 |
| CE_HTTP_PLAN_TIMEOUT_MS | HTTP POST /api/v1/plan request timeout in milliseconds | 360000 |
To expose a Prometheus-style endpoint, start the server in HTTP mode and enable both flags:
export CE_METRICS=true
export CE_HTTP_METRICS=true
node dist/index.js --workspace /path/to/project --http --port 3333
Then fetch:
curl http://localhost:3333/metrics
To enforce that no data is sent to remote APIs, set:
export CONTEXT_ENGINE_OFFLINE_ONLY=true
When enabled, the server will fail to start if a remote API URL is configured. This is useful for enterprise environments with strict data locality requirements.
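The guard described above amounts to a simple startup check. A hedged sketch of the policy (the real check lives in the server's bootstrap code and may differ in detail):

```typescript
// Refuse to start when offline-only mode and a remote API URL conflict.
// Illustrative only; function name and error message are assumptions.
function assertOfflinePolicy(env: Record<string, string | undefined>): void {
  const offlineOnly = env.CONTEXT_ENGINE_OFFLINE_ONLY === "true";
  if (offlineOnly && env.AUGMENT_API_URL) {
    throw new Error(
      "Offline-only mode is enabled but AUGMENT_API_URL is set; refusing to start."
    );
  }
}
```

Failing fast at startup, rather than silently skipping remote calls, is what makes the policy auditable in strict data-locality environments.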
- Check ~/.codex/config.toml for syntax errors
- Run codex mcp list to see configured servers
- Use the /mcp command in the TUI to check connection status
For authentication errors, run auggie login or verify environment variables are set correctly.
Index your workspace first:
node dist/index.js --workspace /path/to/project --index
- Enable automatic incremental indexing with the --watch flag
- If files are missing from results, check whether .gitignore or .contextignore excludes them
If you see an error about offline-only mode:
- unset the CONTEXT_ENGINE_OFFLINE_ONLY environment variable, or
- remove the remote AUGMENT_API_URL
The create_plan tool can take longer than default MCP client timeouts for complex tasks. If you experience timeout errors, increase the timeout in your MCP client configuration:
Edit ~/.codex/config.toml and add or modify the tool_timeout_sec setting under the [mcp_servers.context-engine] section:
[mcp_servers.context-engine]
command = "node"
args = ["/absolute/path/to/context-engine/dist/index.js", "--workspace", "/path/to/your/project"]
tool_timeout_sec = 600 # 10 minutes for complex planning tasks
Consult your client's documentation for timeout configuration. Common locations:
- Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows)
- Cursor: .cursor/mcp.json in your workspace
Add a timeout setting appropriate for your client's configuration format. A value of 600 seconds (10 minutes) is recommended for complex planning tasks.
# Run all tests
npm test
# Quieter ESM run (use if you see pipe/stream errors)
node --experimental-vm-modules node_modules/jest/bin/jest.js --runInBand --silent
# Run tests in watch mode
npm run test:watch
# Run tests with coverage
npm run test:coverage
# Interactive MCP testing
npm run inspector
Test Status: 397 tests passing (100% completion) ✅
MIT