by epheterson
Search and read over 100 million offline articles from ZIM archives via a fast, API‑first knowledge server.
Zimi provides a unified interface—desktop app, Docker container, or Python CLI—to browse, download, search, and read ZIM‑based offline archives such as Wikipedia, Stack Overflow, DevDocs, WikiHow, and many others. It exposes a JSON API and an MCP server so scripts, bots, and AI agents can programmatically retrieve knowledge.
The web UI is served at http://localhost:8899.
mkdir zims
docker run -v $(pwd)/zims:/zims -p 8899:8899 epheterson/zimi
pip install -r requirements.txt
ZIM_DIR=./zims python3 zimi.py serve --port 8899
The CLI also offers commands like search, read, list, and suggest for quick one-off queries.
curl "http://localhost:8899/search?q=python+asyncio&limit=5"
curl "http://localhost:8899/read?zim=wikipedia&path=A/Water_purification"
Reference zimi_mcp.py in an MCP configuration to let AI agents invoke tools such as search, read, suggest, list_sources, and random.
Q: Do I need an internet connection after ZIM files are downloaded?
A: No. All search and read operations work completely offline; the UI only contacts the server locally.
Q: How much disk space does Zimi require?
A: It depends on the chosen ZIM archives. A full English Wikipedia dump is ~100 GB; title indexes add about 2-3% of the ZIM size.
Q: Can I protect the library management UI?
A: Yes. Set ZIMI_MANAGE_PASSWORD (or configure via the UI) and enable ZIMI_MANAGE=1. The password file is stored in ZIMI_DATA_DIR/password.
Q: How are full-text searches performed?
A: A fast SQLite title phase runs first. Zimi then uses Xapian for full-text search, which is guarded by a global lock.
Q: Is there a way to automate ZIM updates?
A: Set ZIMI_AUTO_UPDATE=1 and choose a frequency with ZIMI_UPDATE_FREQ (daily, weekly, monthly).
Q: How do I integrate Zimi with Claude Code?
A: Add an entry in the Claude Code mcpServers configuration pointing to zimi_mcp.py (see the project README for JSON examples).
Search and read 100M+ articles offline. Wikipedia, Stack Overflow, dev docs, WikiHow, and thousands more — all on your machine, no internet required.
Kiwix packages the world's knowledge into ZIM files — compressed offline archives of entire websites. Zimi is the fastest way to search and read them.
Three ways to run it: as a desktop app, as a Docker container, or from the Python CLI.
What you get:
[Screenshots: Search Results, Article Reader, Catalog Browser, Homepage]
Download the latest release for your platform from GitHub Releases:
- Zimi.dmg: open the DMG, drag to Applications, launch.
- zimi-windows-amd64.zip: extract and run Zimi.exe.
- zimi-linux-amd64.tar.gz: extract and run ./Zimi.

On first launch, Zimi asks you to pick a folder for storing ZIM files. Then it opens the full UI in a native window. Browse the catalog, download your first ZIM, and start reading.
You can also access the same UI from any browser at http://localhost:8899 while the app is running.
Starting fresh? Run with an empty directory — the catalog browser lets you download ZIMs from the UI:
mkdir zims
docker run -v ./zims:/zims -p 8899:8899 epheterson/zimi
Already have ZIM files? Mount them and go:
docker run -v /path/to/zims:/zims -p 8899:8899 epheterson/zimi
Open http://localhost:8899 to search, read, and manage your library.
pip install -r requirements.txt
ZIM_DIR=./zims python3 zimi.py serve --port 8899
Or use the CLI directly:
python3 zimi.py search "water purification" --limit 10
python3 zimi.py read wikipedia "A/Water_purification"
python3 zimi.py list
python3 zimi.py suggest "pytho"
Every feature in the UI is backed by a JSON API you can hit directly:
# Search across all sources
curl "http://localhost:8899/search?q=python+asyncio&limit=5"
# Fast title-only search (instant, no full-text)
curl "http://localhost:8899/search?q=python+asyncio&fast=1"
# Search within a specific source
curl "http://localhost:8899/search?q=linked+list&zim=stackoverflow&limit=10"
# Read an article as plain text
curl "http://localhost:8899/read?zim=wikipedia&path=A/Water_purification"
# Title autocomplete
curl "http://localhost:8899/suggest?q=pytho&limit=5"
# List all sources
curl "http://localhost:8899/list"
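The same calls can be scripted. Below is a minimal sketch using only the Python standard library. It assumes a Zimi server is listening on http://localhost:8899, as in the curl examples, and it makes no assumption about the shape of the JSON that comes back. The helper names `build_url` and `get_json` are illustrative and not part of Zimi.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8899"  # assumption: port used in the examples above


def build_url(endpoint, **params):
    """Build a request URL, dropping parameters whose value is None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/{endpoint}"
    return f"{url}?{query}" if query else url


def get_json(endpoint, **params):
    """GET an endpoint and decode the JSON body (response shape is not assumed)."""
    with urlopen(build_url(endpoint, **params), timeout=30) as resp:
        return json.load(resp)
```

With a server running, `get_json("search", q="python asyncio", limit=5)` mirrors the first curl call and `get_json("list")` mirrors the last one.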
| Endpoint | Description |
|---|---|
| GET /search?q=...&limit=5&zim=...&fast=1 | Full-text search (cross-ZIM or scoped). fast=1 returns title matches only. |
| GET /read?zim=...&path=...&max_length=8000 | Read article as plain text |
| GET /suggest?q=...&limit=10&zim=... | Title autocomplete |
| GET /list | List all ZIM sources with metadata |
| GET /catalog?zim=... | PDF catalog for zimgit-style ZIMs |
| GET /snippet?zim=...&path=... | Short text snippet |
| GET /random?zim=... | Random article |
| GET /collections | List all collections |
| POST /collections | Create/update a collection |
| DELETE /collections?name=... | Delete a collection |
| GET /health | Health check (includes version) |
| GET /w/<zim>/<path> | Serve raw ZIM content (HTML, images) |
Zimi includes an MCP (Model Context Protocol) server that exposes search/read tools to AI agents.
{
  "mcpServers": {
    "zimi": {
      "command": "python3",
      "args": ["/path/to/zimi_mcp.py"],
      "env": { "ZIM_DIR": "/path/to/zims" }
    }
  }
}

{
  "mcpServers": {
    "zimi": {
      "command": "ssh",
      "args": ["your-server", "docker", "exec", "-i", "zimi", "python3", "/app/zimi_mcp.py"]
    }
  }
}
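If you prefer to generate the local entry rather than hand-edit JSON, the sketch below builds the same structure as the first example. The function name `mcp_config` is hypothetical, and where the resulting JSON must be saved depends on your MCP client, so the sketch only prints it.

```python
import json


def mcp_config(script_path, zim_dir):
    """Return an mcpServers entry that launches zimi_mcp.py locally."""
    return {
        "mcpServers": {
            "zimi": {
                "command": "python3",
                "args": [script_path],
                "env": {"ZIM_DIR": zim_dir},
            }
        }
    }


# Placeholder paths, as in the example above; substitute your own.
print(json.dumps(mcp_config("/path/to/zimi_mcp.py", "/path/to/zims"), indent=2))
```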
| Tool | Description |
|---|---|
| search | Full-text search across all ZIM sources. Supports collection parameter. |
| read | Read an article as plain text |
| suggest | Title autocomplete. Supports collection parameter. |
| list_sources | List all available sources |
| random | Random article |
services:
  zimi:
    image: epheterson/zimi
    container_name: zimi
    restart: unless-stopped
    ports:
      - "8899:8899"
    volumes:
      - ./zims:/zims
| Variable | Default | Description |
|---|---|---|
| ZIM_DIR | /zims | Path to directory containing ZIM files |
| ZIMI_DATA_DIR | $ZIM_DIR/.zimi | Data directory for indexes, cache, and config |
| ZIMI_MANAGE | 1 | Library manager (browse/download ZIMs). Set to 0 to disable. |
| ZIMI_MANAGE_PASSWORD | (none) | Password to protect library management. Can also be set from the UI. |
| ZIMI_AUTO_UPDATE | 0 | Auto-update ZIMs. Set to 1 to enable. |
| ZIMI_UPDATE_FREQ | weekly | Auto-update frequency: daily, weekly, or monthly. |
| ZIMI_RATE_LIMIT | 60 | API rate limit (requests/minute per IP). Set to 0 to disable. |
Forgot your password? Delete the password file from your data directory ($ZIMI_DATA_DIR/password, default zims/.zimi/password) and restart Zimi.
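For scripts that need to mirror these settings, here is a small sketch that merges environment values over the documented defaults. It is not Zimi's own parsing code, which is not shown here: the function name `resolve_settings` is hypothetical, values are kept as strings, and the only validation is the frequency check.

```python
import os

# Defaults copied from the configuration table.
DEFAULTS = {
    "ZIM_DIR": "/zims",
    "ZIMI_MANAGE": "1",
    "ZIMI_AUTO_UPDATE": "0",
    "ZIMI_UPDATE_FREQ": "weekly",
    "ZIMI_RATE_LIMIT": "60",
}
VALID_FREQS = {"daily", "weekly", "monthly"}


def resolve_settings(env=None):
    """Return a dict of settings, with environment values overriding defaults."""
    env = os.environ if env is None else env
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    # ZIMI_DATA_DIR defaults to $ZIM_DIR/.zimi
    settings["ZIMI_DATA_DIR"] = env.get("ZIMI_DATA_DIR", settings["ZIM_DIR"] + "/.zimi")
    # No default password: None means management is unprotected.
    settings["ZIMI_MANAGE_PASSWORD"] = env.get("ZIMI_MANAGE_PASSWORD")
    if settings["ZIMI_UPDATE_FREQ"] not in VALID_FREQS:
        raise ValueError(f"ZIMI_UPDATE_FREQ must be one of {sorted(VALID_FREQS)}")
    return settings
```

Passing a plain dict instead of the real environment makes it easy to check what a given combination of variables would resolve to.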
Zimi stores its data (metadata cache, title indexes, password, collections) in ZIMI_DATA_DIR, which defaults to .zimi/ inside your ZIM directory.
zims/
  .zimi/                  # ZIMI_DATA_DIR
    cache.json            # ZIM metadata cache
    password              # Management password hash
    collections.json      # Saved collections
    titles/               # SQLite title indexes (one per ZIM)
      wikipedia.db
      stackoverflow.db
  wikipedia.zim
  stackoverflow.zim
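The layout above can be expressed as a small path helper, which is handy for backup or cleanup scripts. This is a sketch: `data_paths` is a hypothetical name, and it assumes each title index is named after its ZIM file (wikipedia.zim maps to wikipedia.db), as the tree suggests.

```python
from pathlib import PurePosixPath


def data_paths(zim_dir, zim_name, data_dir=None):
    """Return the expected locations of Zimi's data files for one ZIM."""
    # ZIMI_DATA_DIR defaults to .zimi/ inside the ZIM directory.
    root = PurePosixPath(data_dir) if data_dir else PurePosixPath(zim_dir) / ".zimi"
    return {
        "cache": root / "cache.json",
        "password": root / "password",
        "collections": root / "collections.json",
        # Assumption: index file name matches the ZIM file's stem.
        "title_index": root / "titles" / f"{zim_name}.db",
    }
```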
ZIM files are compressed offline archives of entire websites. You can download them from the catalog browser in Zimi, or grab them directly:
Popular ZIMs:
| Source | Size | Articles |
|---|---|---|
| Wikipedia (English, all) | ~100 GB | 6.8M |
| Stack Overflow | ~75 GB | 31M |
| Wikipedia (English, top) | ~12 GB | 200K |
| DevDocs | ~0.5 GB each | varies |
| WikiHow | ~4 GB | 240K |
Place .zim files in your ZIM directory and restart Zimi (or hit refresh in the UI).
- zimi.py: HTTP server + CLI + core library (search, read, suggest, random)
- zimi_mcp.py: MCP server wrapping core functions for AI agent integration
- zimi_desktop.py: Desktop app wrapper using pywebview (native window)
- templates/index.html: Single-page web UI (vanilla JS, no build step)
- tests.py: Unit and integration tests

Search uses a two-phase progressive approach: a fast SQLite title phase returns title matches first, followed by Xapian full-text search, which is guarded by a global lock.
SQLite title indexes are built automatically in the background on first startup. Connection pooling and pre-warming eliminate cold-start latency.
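From a client's point of view, a progressive search can be approximated with the two documented request forms: a title-only query with fast=1, then the same query without it. The sketch below illustrates the client side only, not Zimi's internal implementation. The `fetch` argument is a hypothetical callable that takes a URL and returns the decoded response; it is injected so the logic can be tried without a server.

```python
from urllib.parse import urlencode


def search_urls(base_url, query, limit=5):
    """Return (title-only URL, full-text URL) for the same query."""
    fast = urlencode({"q": query, "limit": limit, "fast": 1})
    full = urlencode({"q": query, "limit": limit})
    return f"{base_url}/search?{fast}", f"{base_url}/search?{full}"


def progressive_search(fetch, base_url, query, limit=5):
    """Yield ("titles", result) first, then ("full", result)."""
    fast_url, full_url = search_urls(base_url, query, limit)
    yield "titles", fetch(fast_url)  # quick title matches for immediate display
    yield "full", fetch(full_url)    # slower full-text results replace or extend them
```

Because the function is a generator, a caller can render the title matches as soon as they arrive and update the view when the full-text results follow.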
Storage: Title indexes use roughly 2-3% of your total ZIM size on disk (e.g. ~15 GB for 575 GB of ZIMs). They are stored in ZIMI_DATA_DIR/titles/ and can be safely deleted — they'll rebuild on next startup.
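To budget disk space before downloading, the 2-3% rule of thumb can be turned into a quick estimate. A minimal sketch, with `title_index_size_gb` as a hypothetical helper name; the percentages come from the figure quoted above and actual sizes will vary by ZIM.

```python
def title_index_size_gb(total_zim_gb, low_pct=2, high_pct=3):
    """Return the (low, high) estimated title index size in GB."""
    return total_zim_gb * low_pct / 100, total_zim_gb * high_pct / 100


low, high = title_index_size_gb(575)
print(f"Estimated title index size: {low} to {high} GB")
```

For 575 GB of ZIMs this gives a range of 11.5 to 17.25 GB, which brackets the ~15 GB example.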
pip install -r requirements-desktop.txt
pyinstaller --noconfirm zimi_desktop.spec
open dist/Zimi.app # macOS
See RELEASING.md for detailed build instructions and platform notes.
# Unit tests (no server needed)
python3 tests.py
# Performance tests (requires running server)
python3 tests.py --perf --perf-host http://localhost:8899