by Hmbown
Enables AI assistants to work with documents that exceed their context window by storing content externally, providing search, code execution, evidence tracking, and recursive sub‑agents via an MCP server.
Aleph implements the Recursive Language Model (RLM) paradigm, allowing large‑scale textual or JSON data to be kept outside the model's prompt context. The server supplies tools for searching, peeking, running Python code, chunking, and spawning sub‑agents, while automatically recording citations that link answers back to source fragments.
pip install aleph-rlm[mcp]
aleph-rlm install # auto‑configure popular MCP clients
aleph-rlm doctor # verify installation
aleph-rlm install <client> # configure a specific client
Typical workflow: load the content with load_context(context="<large document>", context_id="doc"), use search_context, peek_context, or exec_python to retrieve relevant slices or compute statistics, and use chunk_context and sub_query to spawn recursive agents that process chunks in parallel before synthesising a final response.
Q: Do I need an OpenAI API key?
A: Only if you use the api backend for sub_query. Set OPENAI_API_KEY (and optionally OPENAI_BASE_URL) in your environment.
Q: Is the Python sandbox secure?
A: It is a best‑effort sandbox and not hardened. Do not run untrusted code; run Aleph inside a container for extra safety.
Q: Why are action tools disabled by default?
A: To avoid accidental file system or command execution. Enable them with the --enable-actions flag when you trust the environment.
Q: How does recursion work?
A: Use chunk_context to split a large document, then call sub_query on each chunk. Aleph coordinates parallel sub‑agents and aggregates their results.
Q: Which MCP clients are supported?
A: Claude Desktop, Cursor, Windsurf, VS Code, Claude Code, Codex CLI, and any client adhering to the Model Context Protocol.
"What my eyes beheld was simultaneous, but what I shall now write down will be successive, because language is successive." — Jorge Luis Borges, "The Aleph" (1945)
Aleph is an MCP server that lets AI assistants work with documents too large to fit in their context window.
It implements the Recursive Language Model (RLM) paradigm from arXiv:2512.24601.
LLMs have a fundamental limitation: they can only "see" what fits in their context window. When you paste a large document into a prompt, models often miss important details buried in the middle—a phenomenon called "lost in the middle."
The usual approach: cram as much of the document as possible into the prompt, truncating or summarizing the rest, and hope nothing important is lost.
The RLM approach (what Aleph enables): keep the document outside the prompt and give the model tools to search it, view slices, run code over it, and recurse into sub-agents, so only the relevant pieces ever enter the context window.
Think of Borges' Aleph: a point containing all points. You don't hold it all in attention at once—you move through it, zooming and searching, returning with what matters.
Aleph is an MCP server—a standardized way for AI assistants to use external tools. It works with Claude Desktop, Cursor, Windsurf, VS Code, Claude Code, Codex CLI, and other MCP-compatible clients.
When you install Aleph, your AI assistant gains:
| Capability | What it means |
|---|---|
| External memory | Store documents outside the context window as searchable state |
| Navigation tools | Search by regex, view specific line ranges, jump to matches |
| Compute sandbox | Run Python code over the loaded content (parsing, stats, transforms) |
| Evidence tracking | Automatically cite which parts of the source informed each answer |
| Recursive agents | Spawn sub-agents to process chunks in parallel, then aggregate |
The content you load can be anything representable as text or JSON: code repositories, build logs, incident reports, database exports, API responses, research papers, legal documents, etc.
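For instance, a database export might be explored like this (a sketch only: the tool names come from the reference later in this document, and the arguments shown are illustrative assumptions rather than documented defaults):
load_context(context="<JSON export of the orders table>", context_id="orders")  # stored externally
search_context(pattern="status.*failed", context_id="orders")                   # locate failed records
peek_context(start=2400, end=2430, unit="lines", context_id="orders")           # view one region in detail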
pip install aleph-rlm[mcp]
# Auto-configure popular MCP clients
aleph-rlm install
# Verify installation
aleph-rlm doctor
Add to your MCP client config (Claude Desktop, Cursor, etc.):
{
  "mcpServers": {
    "aleph": {
      "command": "aleph-mcp-local",
      "args": ["--enable-actions"]
    }
  }
}
Claude Code auto-discovers MCP servers. Run aleph-rlm install claude-code or add to ~/.claude/settings.json:
{
  "mcpServers": {
    "aleph": {
      "command": "aleph-mcp-local",
      "args": ["--enable-actions"]
    }
  }
}
Install the /aleph skill for the RLM workflow prompt:
mkdir -p ~/.claude/commands
cp /path/to/aleph/docs/prompts/aleph.md ~/.claude/commands/aleph.md
Add to ~/.codex/config.toml:
[mcp_servers.aleph]
command = "aleph-mcp-local"
args = ["--enable-actions"]
Or run: aleph-rlm install codex
Install the /aleph skill for Codex:
mkdir -p ~/.codex/skills/aleph
cp /path/to/aleph/ALEPH.md ~/.codex/skills/aleph/SKILL.md
Once installed, you interact with Aleph through your AI assistant. Here's the typical flow:
load_context(context="<your large document>", context_id="doc")
The assistant stores this externally—it doesn't consume context window tokens.
search_context(pattern="error|exception|fail", context_id="doc")
peek_context(start=120, end=150, unit="lines", context_id="doc")
The assistant searches and views only the relevant slices.
# exec_python — runs in the sandbox with your content as `ctx`
matches = search(r"timeout.*\d+ seconds")
stats = {"total_matches": len(matches), "lines": [m["line_no"] for m in matches]}
The assistant's final answer includes evidence trails back to specific source locations.
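A hedged sketch of how that trail might be returned, using the get_evidence and finalize tools listed in the reference below (the arguments are assumptions, not documented signatures):
get_evidence(context_id="doc")                                                   # citations recorded during exploration
finalize(answer="Root cause: 30-second timeout in the retry loop.", context_id="doc")  # answer returned with its evidence trail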
The /aleph command: if you've installed the skill, just use:
/aleph: Find the root cause of this test failure and propose a fix.
For AI assistants using Aleph, see ALEPH.md for the detailed workflow.
When content is too large even for slice-based exploration, Aleph supports recursive decomposition:
# exec_python
chunks = chunk(100_000) # split into ~100K char pieces
results = [sub_query("Extract key findings.", context_slice=c) for c in chunks]
final = sub_query("Synthesize into a summary:", context_slice="\n\n".join(results))
sub_query can use an API backend (OpenAI-compatible) or spawn a local CLI (Claude, Codex, Aider)—whichever is available.
Core exploration:
| Tool | Purpose |
|---|---|
| load_context | Store text/JSON in external memory |
| search_context | Regex search with surrounding context |
| peek_context | View specific line or character ranges |
| exec_python | Run Python code over the content |
| chunk_context | Split content into navigable chunks |
Workflow management:
| Tool | Purpose |
|---|---|
| think | Structure reasoning for complex problems |
| get_evidence | Retrieve collected citations |
| summarize_so_far | Summarize progress on long tasks |
| finalize | Complete with answer and evidence |
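On a long task these might be combined roughly as follows (illustrative only; the tool names are from the table above and the arguments are assumptions):
think(thought="Matches cluster in the networking module; inspect retry configuration next.")
summarize_so_far(context_id="doc")  # condense progress so far before continuing exploration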
Recursion:
| Tool | Purpose |
|---|---|
| sub_query | Spawn a sub-agent on a content slice |
Optional actions (disabled by default, enable with --enable-actions):
| Tool | Purpose |
|---|---|
| load_file | Load a workspace file into a context |
| read_file, write_file | File system access |
| run_command, run_tests | Shell execution |
| save_session, load_session | Persist/restore state |
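With --enable-actions, a debugging session might look roughly like this (a sketch only; the tool names are from the table above and the arguments are assumptions):
load_file(path="logs/build.log", context_id="build")      # pull a workspace file into external memory
search_context(pattern="error|fail", context_id="build")
run_tests()                                               # re-run the suite; exact arguments depend on your setup
save_session(path=".aleph/session.json")                  # persist loaded contexts and evidence for later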
Action tools that return JSON support output="object" for structured responses without double-encoding.
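For example (illustrative; the path argument is an assumption):
read_file(path="pyproject.toml", output="object")  # returns structured data instead of a JSON-encoded string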
Environment variables for sub_query:
# Backend selection (auto-detects by default)
export ALEPH_SUB_QUERY_BACKEND=auto # or: api | claude | codex | aider
# API credentials (for API backend)
export OPENAI_API_KEY=...
export OPENAI_BASE_URL=https://api.openai.com/v1
export ALEPH_SUB_QUERY_MODEL=gpt-4o-mini
Note: Some MCP clients don't reliably pass env vars from their config to the server process. If sub_query reports "API key not found" despite your client's MCP settings, add the exports to your shell profile (~/.zshrc or ~/.bashrc) and restart your terminal/client.
See docs/CONFIGURATION.md for all options.
Recent changes:
- load_file and auto-created contexts for action tools when a context_id is provided
- include_raw for read_file
- output="object" for structured responses and consistent JSON error payloads
- record_evidence flags; cite now validates line ranges
- run_tests reporting (exit codes/errors) and sub_query backend validation; added sandbox import introspection helpers

Development setup:
git clone https://github.com/Hmbown/aleph.git
cd aleph
pip install -e '.[dev,mcp]'
pytest
See DEVELOPMENT.md for architecture details.
MIT