by andrea9293
Provides document management and AI-powered semantic search for storing, retrieving, and querying text, markdown, and PDF files locally without external databases.
MCP Documentation Server offers a TypeScript-based solution that lets you upload documents, automatically split them into context-aware chunks, embed those chunks with high-quality models, and perform fast semantic search. All data is kept in a local ~/.mcp-documentation-server/ directory, making it ideal for private knowledge bases, API docs, or internal guides.
npx @andrea9293/mcp-documentation-server
{
"mcpServers": {
"documentation": {
"command": "npx",
"args": ["-y", "@andrea9293/mcp-documentation-server"],
"env": { "MCP_EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2" }
}
}
}
Tools such as add_document, search_documents, get_context_window, list_documents, etc. can be called using the MCP client or direct HTTP calls. Supported formats: .txt, .md, and .pdf files. Installation: npx, with an optional global install.

Q: Do I need an API key?
A: No external API key is required; the server runs entirely locally. You only need to set MCP_EMBEDDING_MODEL if you want a model other than the default.
Q: Can I change the embedding model after adding documents?
A: Changing the model invalidates existing embeddings. Re-add or re-process all documents after switching models.
Q: How are large PDFs handled?
A: Text is extracted (no OCR) and split into chunks automatically; processing time depends on file size.
Q: Where are the files stored?
A: In ~/.mcp-documentation-server/ under data/ for JSON documents and uploads/ for raw files.
Q: Is it cross-platform?
A: Yes. It runs on any system with Node.js (Linux, macOS, Windows).
A TypeScript-based Model Context Protocol (MCP) server that provides local-first document management and semantic search using embeddings. The server exposes a collection of MCP tools and is optimized for performance with on-disk persistence, an in-memory index, and caching.
NEW! Enhanced with Google Gemini AI for advanced document analysis and contextual understanding. Ask complex questions and get intelligent summaries, explanations, and insights from your documents. To get an API key, go to Google AI Studio.
A DocumentIndex provides instant retrieval, and an EmbeddingCache avoids recomputing embeddings and speeds up repeated queries. All data is stored under ~/.mcp-documentation-server/.

Example configuration for an MCP client (e.g., Claude Desktop); GEMINI_API_KEY is optional and enables AI-powered search:
{
"mcpServers": {
"documentation": {
"command": "npx",
"args": [
"-y",
"@andrea9293/mcp-documentation-server"
],
"env": {
"GEMINI_API_KEY": "your-api-key-here", // Optional, enables AI-powered search
"MCP_EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2",
}
}
}
}
Add documents with the add_document tool, or by placing .txt, .md, or .pdf files into the uploads folder and calling process_uploads. Use search_documents to get ranked chunk hits, and get_context_window to fetch neighboring chunks and provide LLMs with richer context.

The server exposes several tools (validated with Zod schemas) for document lifecycle and search; a minimal client sketch follows the list:
add_document — Add a document (title, content, metadata)
list_documents — List stored documents and metadata
get_document — Retrieve a full document by id
delete_document — Remove a document, its chunks, and associated original files
process_uploads — Convert files in uploads folder into documents (chunking + embeddings + backup preservation)
get_uploads_path — Returns the absolute uploads folder path
list_uploads_files — Lists files in uploads folder
search_documents_with_ai — 🤖 AI-powered search using Gemini for advanced document analysis (requires GEMINI_API_KEY)
search_documents — Semantic search within a document (returns chunk hits and LLM hint)
get_context_window — Return a window of chunks around a target chunk index
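These tools can be invoked from any MCP client. The following is a minimal TypeScript sketch using the @modelcontextprotocol/sdk client over stdio; the import paths, transport options, and result shapes are assumptions based on that SDK, and "doc-123" is a placeholder id matching the examples further down this page:

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  // Launch the documentation server over stdio, the same way an MCP client config would.
  const transport = new StdioClientTransport({
    command: "npx",
    args: ["-y", "@andrea9293/mcp-documentation-server"],
    // Inherit the current environment and pin the embedding model.
    env: { ...(process.env as Record<string, string>), MCP_EMBEDDING_MODEL: "Xenova/all-MiniLM-L6-v2" },
  });

  const client = new Client({ name: "docs-example", version: "1.0.0" }, { capabilities: {} });
  await client.connect(transport);

  // Add a document (title, content, metadata).
  await client.callTool({
    name: "add_document",
    arguments: {
      title: "Python Basics",
      content: "Python is a high-level programming language...",
      metadata: { category: "programming", tags: ["python", "tutorial"] },
    },
  });

  // Semantic search within a document; "doc-123" is a placeholder document id.
  const hits = await client.callTool({
    name: "search_documents",
    arguments: { document_id: "doc-123", query: "variable assignment", limit: 5 },
  });
  console.log(hits);

  await client.close();
}

main().catch(console.error);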
Configure behavior via environment variables. Important options:

MCP_EMBEDDING_MODEL — embedding model name (default: Xenova/all-MiniLM-L6-v2). Changing the model requires re-adding documents.
GEMINI_API_KEY — Google Gemini API key for AI-powered search features (optional, enables search_documents_with_ai).
MCP_INDEXING_ENABLED — enable/disable the DocumentIndex (true/false). Default: true.
MCP_CACHE_SIZE — LRU embedding cache size (integer). Default: 1000.
MCP_PARALLEL_ENABLED — enable parallel chunking (true/false). Default: true.
MCP_MAX_WORKERS — number of parallel workers for chunking/indexing. Default: 4.
MCP_STREAMING_ENABLED — enable streaming reads for large files. Default: true.
MCP_STREAM_CHUNK_SIZE — streaming buffer size in bytes. Default: 65536 (64KB).
MCP_STREAM_FILE_SIZE_LIMIT — threshold (bytes) to switch to streaming path. Default: 10485760 (10MB).

Example .env (defaults applied when variables are not set):
MCP_INDEXING_ENABLED=true # Enable O(1) indexing (default: true)
GEMINI_API_KEY=your-api-key-here # Google Gemini API key (optional)
MCP_CACHE_SIZE=1000 # LRU cache size (default: 1000)
MCP_PARALLEL_ENABLED=true # Enable parallel processing (default: true)
MCP_MAX_WORKERS=4 # Parallel worker count (default: 4)
MCP_STREAMING_ENABLED=true # Enable streaming (default: true)
MCP_STREAM_CHUNK_SIZE=65536 # Stream chunk size (default: 64KB)
MCP_STREAM_FILE_SIZE_LIMIT=10485760 # Streaming threshold (default: 10MB)
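The same variables can also be passed inline when launching the server manually; for example (a sketch combining the documented variables with the npx command above):

MCP_EMBEDDING_MODEL=Xenova/paraphrase-multilingual-mpnet-base-v2 MCP_CACHE_SIZE=2000 npx -y @andrea9293/mcp-documentation-server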
Default storage layout (data directory):
~/.mcp-documentation-server/
├── data/ # Document JSON files
└── uploads/ # Drop files (.txt, .md, .pdf) to import
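Files dropped into uploads/ are imported with the process_uploads tool; a minimal call (assuming the tool takes no arguments) looks like:

{
"tool": "process_uploads",
"arguments": {}
}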
Add a document via MCP tool:
{
"tool": "add_document",
"arguments": {
"title": "Python Basics",
"content": "Python is a high-level programming language...",
"metadata": {
"category": "programming",
"tags": ["python", "tutorial"]
}
}
}
Search a document:
{
"tool": "search_documents",
"arguments": {
"document_id": "doc-123",
"query": "variable assignment",
"limit": 5
}
}
Advanced Analysis (requires GEMINI_API_KEY):
{
"tool": "search_documents_with_ai",
"arguments": {
"document_id": "doc-123",
"query": "explain the main concepts and their relationships"
}
}
Complex Questions:
{
"tool": "search_documents_with_ai",
"arguments": {
"document_id": "doc-123",
"query": "what are the key architectural patterns and how do they work together?"
}
}
Summarization Requests:
{
"tool": "search_documents_with_ai",
"arguments": {
"document_id": "doc-123",
"query": "summarize the core principles and provide examples"
}
}
Fetch context window:
{
"tool": "get_context_window",
"arguments": {
"document_id": "doc-123",
"chunk_index": 5,
"before": 2,
"after": 2
}
}
Smart Caching: File mapping prevents re-uploading the same content
Efficient Processing: Only relevant sections are analyzed by Gemini
Contextual Results: More accurate and comprehensive answers
Natural Interaction: Ask questions in plain English
Embedding models are downloaded on first use; some models require several hundred MB of downloads.
The DocumentIndex persists an index file and can be rebuilt if necessary.
The EmbeddingCache can be warmed by calling process_uploads, issuing curated queries, or using a preload API when available.
Set via MCP_EMBEDDING_MODEL environment variable:
Xenova/all-MiniLM-L6-v2 (default) - Fast, good quality (384 dimensions)
Xenova/paraphrase-multilingual-mpnet-base-v2 (recommended) - Best quality, multilingual (768 dimensions)

The system automatically manages the correct embedding dimension for each model. Embedding providers expose their dimension via getDimensions().
⚠️ Important: Changing models requires re-adding all documents as embeddings are incompatible.
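For example, to switch to the multilingual model via .env or the client's env block (re-add documents afterwards):

MCP_EMBEDDING_MODEL=Xenova/paraphrase-multilingual-mpnet-base-v2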
git clone https://github.com/andrea9293/mcp-documentation-server.git
cd mcp-documentation-server
npm run dev
npm run build
npm run inspect
Contributing: create a feature branch with git checkout -b feature/name.

License: MIT - see LICENSE file.
Built with FastMCP and TypeScript 🚀
Explore related MCPs that share similar capabilities and solve comparable challenges
by modelcontextprotocol
A basic implementation of persistent memory using a local knowledge graph. This lets Claude remember information about the user across chats.
by topoteretes
Provides dynamic memory for AI agents through modular ECL (Extract, Cognify, Load) pipelines, enabling seamless integration with graph and vector stores using minimal code.
by basicmachines-co
Enables persistent, local‑first knowledge management by allowing LLMs to read and write Markdown files during natural conversations, building a traversable knowledge graph that stays under the user’s control.
by smithery-ai
Provides read and search capabilities for Markdown notes in an Obsidian vault for Claude Desktop and other MCP clients.
by chatmcp
Summarize chat messages by querying a local chat database and returning concise overviews.
by dmayboroda
Provides on‑premises conversational retrieval‑augmented generation (RAG) with configurable Docker containers, supporting fully local execution, ChatGPT‑based custom GPTs, and Anthropic Claude integration.
by qdrant
Provides a Model Context Protocol server that stores and retrieves semantic memories using Qdrant vector search, acting as a semantic memory layer.
by doobidoo
Provides a universal memory service with semantic search, intelligent memory triggers, OAuth‑enabled team collaboration, and multi‑client support for Claude Desktop, Claude Code, VS Code, Cursor and over a dozen AI applications.
by GreatScottyMac
Provides a project‑specific memory bank that stores decisions, progress, architecture, and custom data, exposing a structured knowledge graph via MCP for AI assistants and IDE tools.
Or add the server via the Claude Code CLI:

claude mcp add documentation npx -y @andrea9293/mcp-documentation-server