by ahmedEid1
An AI‑first learning platform that equips users with a citation‑rich tutor, multi‑modal course authoring, spaced‑repetition reviews, and verifiable Open Badges, all powered by interchangeable LLM providers and a Model Context Protocol (MCP) server.
Lumen delivers a production‑grade learning management system where an autonomous tutor reasons over course‑specific knowledge, cites source chunks, and guides learners through quizzes and reviews. The backend is built with FastAPI, Celery, Postgres 17 + pgvector, Redis, and MinIO, while the frontend uses Next.js 15. All LLM interactions are routed through a provider‑agnostic abstraction (Groq Llama 3.3 70B by default) and observed via a cost‑meter and trace logs.
- Start the stack (docker compose up) on a local machine or an ARM‑based EC2 t4g.small (see the repo's runbook).
- Populate .env with your LLM provider credentials (Groq key recommended).
- Open http://localhost:3000 and log in with one of the seeded accounts (admin, instructor, student).
- Call the MCP tools (e.g. ask_tutor, create_course_draft) through standard I/O or HTTP.

Q: Do I need to pay for the LLM calls? A: The demo uses Groq’s free tier for Llama 3.3 70B. Production deployments can switch to Anthropic or OpenAI; the cost‑meter enforces a per‑user 24‑hour budget (default $1/day).
Q: Can I run Lumen without an internet connection? A: Core LMS services (Postgres, Redis, MinIO, FastAPI) run offline, but AI features require an external LLM provider.
Q: How is data stored securely? A: All persistence is in encrypted Postgres and MinIO buckets; Open Badges are signed with Ed25519 keys.
Q: What is required to add Lumen to Claude Desktop?
A: Generate an MCP auth token (make mcp-token) and configure the desktop’s mcpServers entry with the command python -m app.mcp --transport stdio.
Q: Is the platform extensible? A: Yes. New agents can be added to the orchestrator, additional MCP tools can be registered, and the LLM provider abstraction allows plugging in any OpenAI‑compatible service.
Lumen started in late 2020 as a Django side-project — a learning platform for myself. Five years and one model revolution later, the original prototype is gone and what remains is the question: can an agent actually teach? Not just summarize and quote. So I rebuilt it. Custom orchestrator, no LangChain. Groq Llama 3.3 for the latency-per-dollar that makes "watch it think" real. Public evals so you can audit the agent's competence yourself.
— Ahmed Hobeishy
An open-source, AI-first LMS built as a portfolio anchor for agentic-AI engineering work.
Public deploy on AWS t4g.small (Graviton2 ARM, 2 vCPU + 2 GB RAM) — Caddy 2 fronts a single docker-compose.prod.yml running FastAPI + Celery + Postgres 17 (pgvector) + Redis 7 + MinIO. Real LLM calls via Groq Llama 3.3 70B; retrieval embeddings via Cloudflare Workers AI (@cf/baai/bge-small-en-v1.5). Runbook: docs/deployment/aws-vps.md.
Silent 1:50 captioned walkthrough — landing → multi-agent tutor → agent reasoning panel → observable trace → self-critique authoring replay → admin observability. A voiced Loom is queued for re-record once the live demo lands; script at docs/release/loom-recording-script.md.

Lumen is the live demo of how multi-agent systems, retrieval-augmented generation, the Model Context Protocol, and evaluation rigor come together inside a real product. It is the centrepiece of an agentic-AI engineering portfolio — a self-hostable LMS that doubles as a working argument for "I build production-grade AI systems, not toy demos."
If you've cloned this and want to take the streaming demo live, the
4-step operator handoff (flag flip → sealed eval run → screencap →
repo rename) is at docs/release/operator-handoff.md.
Until those steps run, code lands flag-OFF and the public /eval
surface stays honest-empty — by design.
If you're reviewing the code, these are the highest-signal files:
- apps/backend/app/services/tutor_orchestrator_stream.py (async-iterator event sequence) + apps/backend/app/workers/tasks/tutor_streaming.py (Celery task with atomic phase fence + after_commit enqueue + contextlib.suppress-wrapped cleanup).
- apps/backend/app/api/v1/tutor_streaming.py (4 endpoints, all flag-gated) + apps/backend/app/services/redis_streams.py (XADD/XREAD + trim-detect).
- apps/backend/app/core/lua/ (reserve/reconcile/check/release in microcents, integer math, zero FP drift).
- apps/backend/app/evals/adversarial.py + apps/backend/evals/security/probes.jsonl (15-probe corpus; per-probe outputs deliberately NOT published).
- apps/frontend/src/lib/tutor/sse-parser.ts + use-tutor-stream.ts (useSyncExternalStore reducer + iOS Safari 15.0-15.3 UA sniff).
- docs/adr/0017–0019 (Celery prefork; Redis Streams not pub/sub; atomic phase fence + after_commit).
- /eval, /eval/methodology, /case-study.

Architecture B+: AI-first OSS LMS. Provider-agnostic LLM layer; the live demo runs Groq Llama 3.3 70B for $0, prod-ready for Anthropic or OpenAI via the same LLMProvider abstraction. Every agent call goes through the cost-meter so observability and the per-user 24h budget guard work identically across providers. See docs/architecture.md for the full topology.
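
The reserve/reconcile/check/release accounting in integer microcents can be pictured with a short Python sketch. This is an in-memory illustration only: the class name BudgetLedger, its method signatures, and the conversion constant are assumptions made for the example, and it does not model the 24-hour window or the atomicity that the repo's Lua scripts would get from running inside Redis.

```python
from dataclasses import dataclass, field

# 1 USD = 100 cents, 1 cent = 1_000_000 microcents.
MICROCENTS_PER_USD = 100_000_000


class BudgetExceeded(Exception):
    """Raised when a reservation would push a user past the limit."""


@dataclass
class BudgetLedger:
    """Hypothetical in-memory stand-in for the budget scripts."""

    limit_microcents: int = 1 * MICROCENTS_PER_USD  # $1/day default
    spent: dict[str, int] = field(default_factory=dict)
    reserved: dict[str, int] = field(default_factory=dict)

    def check(self, user_id: str) -> int:
        """Return the remaining budget, counting in-flight reservations."""
        used = self.spent.get(user_id, 0) + self.reserved.get(user_id, 0)
        return self.limit_microcents - used

    def reserve(self, user_id: str, estimate: int) -> None:
        """Hold an estimated cost before the LLM call starts."""
        if estimate > self.check(user_id):
            raise BudgetExceeded("llm.budget_exceeded")
        self.reserved[user_id] = self.reserved.get(user_id, 0) + estimate

    def reconcile(self, user_id: str, estimate: int, actual: int) -> None:
        """Swap the hold for the real cost once the call finishes."""
        self.reserved[user_id] -= estimate
        self.spent[user_id] = self.spent.get(user_id, 0) + actual

    def release(self, user_id: str, estimate: int) -> None:
        """Drop a hold when the call fails before incurring any cost."""
        self.reserved[user_id] -= estimate
```

Keeping every amount as an int means repeated additions never accumulate floating-point error, which is the point of the "zero FP drift" note. Reserving before the call also stops several concurrent requests from each passing the check and jointly overshooting the limit.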
The resume bullets, with links to the code. Every item below is on the release branch today (1.1.0-agentic).
- apps/backend/app/services/tutor_orchestrator.py reads the learner's question and picks among five sub-agents under apps/backend/app/services/tutor_subagents/ (retriever, web_searcher, code_runner, quiz_generator, concept_explainer), with a hard cap of 5 tool-call rounds per turn. Every step lands in agent_tracer.py so the frontend can render the plan and which tools fired. The moat is showing how the agent thinks, not just what it said.
- apps/backend/app/services/authoring_orchestrator.py drives researcher → outliner → critic → reviser → lesson-drafter → final-critic via the modules under authoring_subagents/; max three revision loops; the full chain persists as CourseDraftTrace so an instructor replays the reasoning before accepting a draft.
- apps/backend/app/mcp/server.py exposes nine tools (list_courses, get_course, ask_tutor, list_my_due_reviews, grade_review_card, create_course_draft, ingest_url_to_draft, list_my_progress, search_lesson_content) over stdio + HTTP; OAuth client-credentials for service-to-service; installable in Claude Desktop with the JSON snippet below. Registry metadata at apps/backend/app/mcp/registry_metadata.json ready for mcp-publisher publish against registry.modelcontextprotocol.io.
- apps/backend/evals/. Run with make eval or python -m app.evals run --suite tutor. Judge scores each item 0–5 on suite-specific axes; reports land as JSONL with mean + regression vs. previous run. CI smoke gate runs a 3-item subset on every PR. Admin dashboard at /admin/evals.
- llm_calls table (Alembic 0022) via apps/backend/app/services/llm_call_log.py. The per-user 24h budget guard returns HTTP 429 llm.budget_exceeded once the threshold trips. /admin/observability adds Celery queue depth, retrieval-quality drill-down, and a per-trace expander; learners get a per-turn trace drill-down at /dashboard/tutor/{conversation_id}/turn/{message_id} powered by learner_traces.py + agent_tracer.py (I4).
- apps/backend/app/services/learning_path.py takes a learner goal ("become a backend engineer in 6 months"), assembles an 8-course plan respecting prerequisites and FSRS load, schedules it weekly, and re-plans monthly as new courses and progress data arrive.

| Feature | Status |
|---|---|
| Course-scoped RAG tutor with citations (Phase E1) | ✅ shipped (1.0.0-rebuild) |
| AI-assisted authoring (Phase E2) | ✅ shipped (1.0.0-rebuild) |
| Multi-modal ingest — YouTube / Notion / Google Docs (E3) | ✅ shipped (1.0.0-rebuild) |
| FSRS-6 spaced-repetition reviews (Phase E4) | ✅ shipped (1.0.0-rebuild) |
| Open Badges 3.0 / W3C VC credentials (Phase E5) | ✅ shipped (1.0.0-rebuild) |
| Tiptap block editor (Phase E6) | ✅ shipped (1.0.0-rebuild) |
| Mastery dashboard (Phase E7) | ✅ shipped (1.0.0-rebuild) |
| pgvector + provider-agnostic embeddings (Phase E0) | ✅ shipped (1.0.0-rebuild) |
| WCAG 2.2 AA axe-core CI gate (Phase D5) | ✅ shipped (1.0.0-rebuild) |
| LLM cost meter + per-user 24h budget guard (H1) | ✅ shipped (wave 1) |
| Eval harness + golden datasets + judge dashboard (H2) | ✅ shipped (wave 1) |
| Playwright e2e against the live stack (H3) | ✅ shipped (wave 1) |
| Production-exposure security pass (H6) | ✅ shipped (wave 1) |
| AWS t4g.small single-VM deploy runbook (H4) | ✅ shipped (1.1.0-agentic) |
| README rewrite for agentic-AI positioning (H5) | ✅ shipped (1.1.0-agentic) |
| Agent-trace + retrieval observability surface (H7) | ✅ shipped (1.1.0-agentic) |
| Lumen MCP server (I1) | ✅ shipped (1.1.0-agentic) |
| Multi-agent planner-orchestrator tutor (I2) | ✅ shipped (1.1.0-agentic) |
| Self-critique authoring agent (I3) | ✅ shipped (1.1.0-agentic) |
| Agent-trace observability surface for learners (I4) | ✅ shipped (1.1.0-agentic) |
| Personalized learning-path agent (I5) | ✅ shipped (1.1.0-agentic) |
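
The 5-round cap on the tutor's tool calls (item I2) can be pictured as a bounded planning loop. The sketch below is a minimal illustration, assuming a planner callable that returns either a tool request or a final answer. The names run_turn, Step, and the shape of the trace are hypothetical and are not taken from tutor_orchestrator.py.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_TOOL_ROUNDS = 5  # hard cap per turn, as stated in the README

Trace = list[tuple[str, str]]  # (tool name, tool result) pairs


@dataclass
class Step:
    """One planner decision: call a tool, or finish with an answer."""

    tool: Optional[str] = None
    argument: str = ""
    answer: Optional[str] = None


def run_turn(
    question: str,
    plan: Callable[[str, Trace], Step],
    tools: dict[str, Callable[[str], str]],
) -> tuple[str, Trace]:
    """Run one tutor turn; return the answer and the trace of tool calls."""
    trace: Trace = []
    for _ in range(MAX_TOOL_ROUNDS):
        step = plan(question, trace)
        if step.answer is not None:
            return step.answer, trace
        if step.tool is None or step.tool not in tools:
            raise ValueError(f"planner requested unknown tool: {step.tool!r}")
        result = tools[step.tool](step.argument)
        trace.append((step.tool, result))  # recorded so a UI could replay it
    # Cap reached: stop calling tools and answer from what was gathered.
    return "Round limit reached; answering from gathered context.", trace
```

Returning the trace alongside the answer is what makes a "which tools fired" view possible. The cap guarantees that a planner stuck in a loop still terminates after five calls.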
Authoring suite, n=10, judge = Llama 3.3 70B (Groq): mean overall 3.85/5. Per-axis breakdown — coverage 4.0, scope 4.0, learning_arc 3.9, brief_fidelity 3.5. All 10/10 items judged, zero judge errors. Full JSONL: docs/eval/authoring-n10-groq-20260525.jsonl (10 individual items + summary record). Reproduce locally with the snippet below.
# Real eval run, n=10 — needs LLM_PROVIDER=openai + OPENAI_API_BASE=https://api.groq.com/openai/v1
# + OPENAI_API_KEY=<your-groq-key> + LLM_MODEL=llama-3.3-70b-versatile in .env:
docker compose exec api python -m app.evals run --suite authoring
| Suite | n | Score (latest) | Notes |
|---|---|---|---|
| authoring | 10 | 3.85/5 | Real Groq signal: no retrieval needed, judge directly compares generated outline vs. ideal. |
| tutor | 30 | 2.33/5 | Real retrieval + real LLM. Embeddings via Cloudflare Workers AI (@cf/baai/bge-small-en-v1.5, 384-dim, free tier), LLM + judge via Groq Llama 3.3 70B. 10/30 judged (20 skipped because their courses were not seeded), faithfulness 3.3, helpfulness 2.8, citation_correctness 0.9. The low citation score reflects a mismatch between the eval's expected must_cite_ids and what the retriever actually pulls: relevant chunks land but not the specific ones the dataset hardcodes. Report: docs/eval/tutor-n30-groq-cloudflare-20260525.jsonl. Prior run with noop embeddings (2.0/5) kept at docs/eval/tutor-n30-groq-noopembed-20260525.jsonl for comparison. |
| ingest | 10 | 0.83/5 | Of 10 YouTube items, 4 were fully ingested + judged; 6 hit upstream transcript fetch errors (rate-limited cloud IPs, age-restricted videos, etc). The judged 4 scored low on chapter_count_accuracy + structure_quality because the v1 chunker emits one module-per-video instead of detecting chapter boundaries (known follow-up). Report: docs/eval/ingest-n10-groq-20260525.jsonl. |
Each item is scored 0–5 by an LLM-as-judge on suite-specific axes (faithfulness, citation_correctness, helpfulness for tutor; coverage, learning_arc, scope, brief_fidelity for authoring; chunking_quality, metadata_completeness for ingest). Reports carry per-axis means, an overall mean, and a regression diff vs. the previous run. CI gates a 3-item smoke on every PR via .github/workflows/pnpm-eval-smoke.yml.
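
The aggregation described above (per-axis means, an overall mean, a diff against the previous run) can be sketched in a few lines. This is an illustration under assumptions: summarize and regression are hypothetical helper names, and treating the overall score as the mean of the axis means is inferred from the published figures rather than read from the harness.

```python
from statistics import mean


def summarize(items: list[dict[str, float]]) -> dict[str, float]:
    """Average each axis across judged items, then average the axes."""
    axes = sorted({axis for item in items for axis in item})
    summary = {
        axis: mean(item[axis] for item in items if axis in item) for axis in axes
    }
    summary["overall"] = mean(summary[axis] for axis in axes)
    return summary


def regression(
    current: dict[str, float], previous: dict[str, float]
) -> dict[str, float]:
    """Per-key delta vs. the previous run; negative means a regression."""
    return {k: round(current[k] - previous[k], 2) for k in current if k in previous}


# The published authoring axis means, fed in as a single row.
authoring = summarize(
    [{"coverage": 4.0, "scope": 4.0, "learning_arc": 3.9, "brief_fidelity": 3.5}]
)
print(round(authoring["overall"], 2))
```

The four authoring axis means sum to 15.4, and 15.4 / 4 = 3.85, which matches the reported overall score. The same check on the tutor axes, (3.3 + 2.8 + 0.9) / 3, rounds to 2.33.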
Prereqs. Docker Desktop 4.30+ (or Docker Engine 27 + Compose v2). Optional: an LLM API key — a Groq key is recommended for the free tier; without one, the AI features fall back to the deterministic noop provider so the rest of the app still works.
git clone https://github.com/ahmedEid1/E-Learning-Platform.git
cd E-Learning-Platform
cp .env.example .env
docker compose up
make migrate
make seed
Then open http://localhost:3000 and log in with one of the seeded accounts:
| Role | Email | Password |
|---|---|---|
| Admin | admin@lumen.test | Admin!2026 |
| Instructor | teacher@lumen.test | Teach!2026 |
| Student | student@lumen.test | Learn!2026 |
For real LLM features (tutor, authoring, ingest, evals), set the following in .env and restart:
LLM_PROVIDER=openai
OPENAI_API_BASE=https://api.groq.com/openai/v1
OPENAI_API_KEY=<your-groq-key>
LLM_MODEL=llama-3.3-70b-versatile
The same LLMProvider abstraction also accepts native Anthropic (LLM_PROVIDER=anthropic) and OpenAI (LLM_PROVIDER=openai with the default base URL) configurations — no code changes, switch by env var.
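
Switching providers by env var can be sketched as a small factory. Only the LLMProvider name and the env variable names come from this README. NoopProvider, OpenAICompatibleProvider, build_provider, and the complete method are hypothetical, the Anthropic branch is left out, and no network call is made.

```python
from dataclasses import dataclass
from typing import Mapping, Protocol


class LLMProvider(Protocol):
    """Minimal interface every provider is assumed to satisfy."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class NoopProvider:
    """Deterministic fallback used when no API key is configured."""

    def complete(self, prompt: str) -> str:
        return f"[noop] {prompt[:40]}"


@dataclass
class OpenAICompatibleProvider:
    """Holds settings for any OpenAI-compatible endpoint (Groq included)."""

    base_url: str
    api_key: str
    model: str

    def complete(self, prompt: str) -> str:
        # A real implementation would POST to f"{self.base_url}/chat/completions".
        raise NotImplementedError("network call omitted in this sketch")


def build_provider(env: Mapping[str, str]) -> LLMProvider:
    """Pick a provider purely from environment-style settings."""
    name = env.get("LLM_PROVIDER", "noop")
    key = env.get("OPENAI_API_KEY", "")
    if name == "openai" and key:
        return OpenAICompatibleProvider(
            base_url=env.get("OPENAI_API_BASE", "https://api.openai.com/v1"),
            api_key=key,
            model=env.get("LLM_MODEL", ""),
        )
    return NoopProvider()  # keeps the rest of the app usable without a key
```

In an application this would be called as build_provider(os.environ). Taking a Mapping instead of reading os.environ directly keeps the factory easy to test with plain dicts.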
The live demo runs on one AWS EC2 t4g.small Graviton2 VM (2 vCPU + 2 GB RAM + 30 GB gp3, ARM64 Ubuntu 24.04) — covered by AWS's t4g.small free-trial promo through Dec 31 2026 and absorbed by the new-account Free Plan credits before that. The unmodified docker-compose.prod.yml brings up FastAPI + Celery worker + beat + Postgres-pgvector + Redis + MinIO + a containerised Caddy 2 that auto-fetches a Let's Encrypt cert. The 2 GB RAM cap is handled by a 4 GB swapfile + tuned Postgres config in the bootstrap script. Cloudflare's DNS proxy in front is an optional next step, not a prerequisite.
tl;dr after the EC2 instance is running and you've SSHed in:
ssh -i ~/.ssh/lumen-prod.pem ubuntu@<elastic-ip>
curl -fsSL https://raw.githubusercontent.com/ahmedEid1/E-Learning-Platform/main/scripts/aws-bootstrap.sh | sudo bash
# log out, log back in as the new admin user, then:
git clone https://github.com/ahmedEid1/E-Learning-Platform.git lumen && cd lumen
cp .env.example .env.production # fill APP_DOMAIN + secrets (see runbook step 5)
docker compose -f docker-compose.prod.yml --env-file .env.production up -d
Full runbook (EC2 creation through TLS + smokes): docs/deployment/aws-vps.md. Cost callout: the new-account $200 credits absorb the Elastic IP; the LLM budget guard is ~$1/user/day by default and the operator can dial it lower in .env. Migration path off AWS at end-of-trial: rerun the same runbook against Oracle Always Free A1 (if capacity ever appears) or Hetzner CAX11; the compose stack is identical because all three targets are ARM64 Ubuntu 24.04.
Lumen ships an MCP server (Phase I, item I1) that exposes its catalog, RAG tutor, FSRS review queue, AI authoring pipeline, and multi-modal ingest as nine tools. Add it as an MCP source in Claude Desktop:
// ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
// %APPDATA%\Claude\claude_desktop_config.json (Windows)
{
  "mcpServers": {
    "lumen": {
      "command": "uvx",
      "args": ["--from", "lumen-backend", "python", "-m", "app.mcp", "--transport", "stdio"],
      "env": {
        "LUMEN_MCP_AUTH_TOKEN": "<your-token>",
        "DATABASE_URL": "<postgres-url>"
      }
    }
  }
}
Generate the LUMEN_MCP_AUTH_TOKEN value with make mcp-token against your running Lumen instance — that prints a fresh OAuth client_id + client_secret pair; paste the secret as the env value. For Claude Code, the equivalent one-liner is claude mcp add lumen -- python -m app.mcp --transport stdio (set LUMEN_MCP_AUTH_TOKEN in your shell first). Full operator guide: docs/mcp.md.
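
If the Claude Desktop config already lists other MCP servers, the lumen entry can be merged in rather than pasted over the file. The helper below is a hypothetical convenience script, not part of the repo. It assumes the config file is plain JSON (the // path comments in the snippet above are annotations and should not be saved into the file).

```python
import json
from pathlib import Path


def add_lumen_server(config_path: Path, token: str, database_url: str) -> dict:
    """Add or update the "lumen" entry, leaving other servers intact."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["lumen"] = {
        "command": "uvx",
        "args": ["--from", "lumen-backend", "python", "-m", "app.mcp",
                 "--transport", "stdio"],
        "env": {"LUMEN_MCP_AUTH_TOKEN": token, "DATABASE_URL": database_url},
    }
    config_path.write_text(json.dumps(config, indent=2))
    return config
```

Call it with the path for your platform, the secret printed by make mcp-token, and your Postgres URL, then restart Claude Desktop so it re-reads the file.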
Once installed, ask Claude 'list my Lumen courses' and watch the MCP tool calls fire in the desktop sidebar — the planner picks among list_courses, get_course, ask_tutor, list_my_due_reviews, grade_review_card, create_course_draft, ingest_url_to_draft, list_my_progress, and search_lesson_content.
Ahmed Hobeishy — full-stack engineer (Python + TypeScript + DevOps), based in Essen, Germany. Building Lumen as the centrepiece of an agentic-AI engineering portfolio. Currently open to senior agentic-AI engineering roles.
MIT — see LICENSE.
Status: actively built. 1.1.0-agentic shipped 2026-05-22 (Phase H + all five Phase I items — MCP server, multi-agent tutor, self-critique authoring, learner-trace surface, learning-path agent). Wave 2 portfolio-activation prep completed 2026-05-25 (eval harness wiring + agentic-demo seed + screenshot pack + single-VM deploy runbook + MCP registry metadata + README truthing). The deploy target pivoted from Oracle Always Free to AWS t4g.small after Frankfurt Always-Free capacity stayed saturated for 24h and Oracle's PAYG region-subscription cap blocked the Stockholm fallback; the new AWS runbook (docs/deployment/aws-vps.md) ships ~$0/mo wall-clock on a new-account Free Plan through end-of-2026. Remaining work is operator-side: provision the EC2 instance and run the deploy runbook, mint the live tutor-eval score against Groq, record the 90-second Loom, and start applying. The MCP server is already published to registry.modelcontextprotocol.io as io.github.ahmedEid1/lumen v1.1.0.