by Lekssays
Provides a containerized static code analysis service powered by Joern's Code Property Graph and exposed through an MCP interface.
Codebadger offers a ready‑to‑run MCP server that generates and queries Code Property Graphs (CPGs) for many languages (Java, C/C++, JavaScript, Python, Go, Kotlin, C#, Ghidra, Jimple, PHP, Ruby, Swift). It enables deep program analysis such as control‑flow, type inspection, and a rich set of taint‑ and vulnerability‑detectors.
python -m venv venv # optional
source venv/bin/activate # optional; on Windows: venv\Scripts\activate
pip install -r requirements.txt
docker compose up -d
python main.py &
The server listens at http://localhost:4242.
To stop the services, run docker-compose down (or bash cleanup.sh for a full reset).
Q: Which languages are supported?
A: Java, C/C++, JavaScript, Python, Go, Kotlin, C#, Ghidra, Jimple, PHP, Ruby, and Swift.
Q: How do I integrate Codebadger with GitHub Copilot?
A: Add an entry to ~/.config/Code/User/mcp.json pointing to http://localhost:4242/mcp under the codebadger server key.
Q: Can I run Codebadger without Docker?
A: The Joern engine runs inside Docker containers; the MCP server itself is a Python process, so Docker is required for full functionality.
Q: How is telemetry configured?
A: Set environment variables like OTEL_ENABLED=true and OTEL_EXPORTER_OTLP_ENDPOINT or configure them in config.yaml under the telemetry section.
Q: How do I add my own detector?
A: Place a Scala query in src/tools/queries/, register a Python wrapper in src/tools/custom_tools.py, then restart the server.
Q: What is the default port?
A: 4242 (configurable via MCP_PORT).
A containerized Model Context Protocol (MCP) server providing static code analysis using Joern's Code Property Graph (CPG) technology with support for Java, C/C++, JavaScript, Python, Go, Kotlin, C#, Ghidra, Jimple, PHP, Ruby, and Swift.
codebadger and its accompanying paper — Bridging Code Property Graphs and Language Models for Program Analysis — have been accepted at the Software Vulnerability Management Workshop @ ICSE 2026. 🎉
@article{lekssays2026bridging,
title={Bridging Code Property Graphs and Language Models for Program Analysis},
author={Lekssays, Ahmed},
journal={arXiv preprint arXiv:2603.24837},
year={2026}
}
If codebadger helped you discover a real-world vulnerability, we'd love to hear about it. Open a pull request adding it to TROPHIES.md — include the CVE ID, project, a one-line description, and the date.
Before you begin, make sure you have Docker, Docker Compose, and Python installed.
To verify your setup:
docker --version
docker-compose --version
python --version
# Create and activate a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate # on Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
docker compose up -d
This starts:
Verify services are running:
docker compose ps
# Start the server
python main.py &
The MCP server will be available at http://localhost:4242.
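Because the server is started in the background, scripts that follow may need to wait until the port accepts connections. The sketch below is one way to do that with the standard library only; the helper name `wait_for_port` and its timeout values are illustrative and not part of codebadger.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only proves that something is listening,
            # not that the MCP endpoint itself is healthy.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port("localhost", 4242)` after `python main.py &` returns `True` as soon as the default port is open; pass a different port if `MCP_PORT` has been changed.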
# Stop MCP server (Ctrl+C in its terminal, or pkill -f "python main.py" if it was started in the background)
# Stop Docker services
docker-compose down
# Optional: Clean up everything
bash cleanup.sh
Use the provided cleanup script to reset your environment:
bash cleanup.sh
This will:
__pycache__, .pytest_cache)
Edit the MCP configuration file for VS Code (GitHub Copilot):
Path:
~/.config/Code/User/mcp.json
Example configuration:
{
"inputs": [],
"servers": {
"codebadger": {
"url": "http://localhost:4242/mcp",
"type": "http"
}
}
}
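If the configuration file already lists other servers, the entry can be merged in rather than pasted over the file. The following is a minimal sketch under two assumptions: the file is strict JSON (no comments), and the helper name `add_codebadger_server` is made up for this example.

```python
import json
from pathlib import Path


def add_codebadger_server(config_path: Path, url: str = "http://localhost:4242/mcp") -> dict:
    """Merge a codebadger entry into an MCP config file, keeping other servers."""
    config = {}
    if config_path.exists():
        # Assumes strict JSON; a file containing comments would fail to parse here.
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config.setdefault("inputs", [])
    servers = config.setdefault("servers", {})
    servers["codebadger"] = {"url": url, "type": "http"}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For the VS Code path shown above, this would be called as `add_codebadger_server(Path.home() / ".config/Code/User/mcp.json")`. Back up the file first if it holds settings you care about.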
To integrate codebadger into Claude Desktop, edit:
Path:
Claude → Settings → Developer → Edit Config → claude_desktop_config.json
Add the following:
{
"mcpServers": {
"codebadger": {
"url": "http://localhost:4242/mcp",
"type": "http"
}
}
}
- generate_cpg: Generate a Code Property Graph (CPG) for a codebase (local path or GitHub URL).
- get_cpg_status: Check whether a CPG exists and retrieve status metadata.
- run_cpgql_query: Execute a raw CPGQL query against a CPG and return structured results.
- get_cpgql_syntax_help: Show CPGQL syntax helpers, tips, and common error fixes.
- list_methods: List methods/functions with optional regex/file filters.
- list_files: Show source files as a paginated tree view.
- get_method_source: Retrieve the source code for a named method.
- list_calls: List call sites between functions (caller → callee).
- get_call_graph: Build a human-readable call graph (incoming or outgoing).
- list_parameters: Get parameter names, types, and order for a method.
- get_codebase_summary: High-level metrics (files, methods, calls, language).
- get_code_snippet: Return a file snippet by start/end line numbers.
- get_cfg: Produce a control-flow graph (nodes and edges) for a method.
- get_type_definition: Inspect struct/class types and their members.
- get_macro_expansion: Heuristically detect likely macro-expanded calls.
- find_taint_sources: Find likely external input points (sources).
- find_taint_sinks: Locate dangerous sinks where tainted data can flow.
- find_taint_flows: Detect dataflows from sources to sinks (taint analysis).
- get_program_slice: Build backward/forward program slices for a call.
- get_variable_flow: Trace data dependencies for a variable at a location.
- find_bounds_checks: Search for bounds-checks near a buffer access.
- find_use_after_free: Heuristic detection of use-after-free patterns.
- find_double_free: Detect potential double-free issues.
- find_null_pointer_deref: Find likely null pointer dereferences.
- find_integer_overflow: Detect integer overflow patterns.
- find_format_string_vulns: Detect format string vulnerabilities (CWE-134) where non-literal format arguments are passed to printf-family functions.
- find_heap_overflow: Detect heap overflow vulnerabilities (CWE-122) where writes to heap buffers may exceed their allocated size.
- find_stack_overflow: Detect stack buffer overflow vulnerabilities (CWE-121) where writes to fixed-size local arrays (e.g. char buf[64]) may exceed their declared dimension.
- find_toctou: Detect Time-of-Check-Time-of-Use race conditions (CWE-367) where a file is checked with access()/stat() and then opened or operated on in a separate step.
- find_uninitialized_reads: Detect uninitialized variable reads (CWE-457) where local variables are used before being assigned a value.

You can add your own detectors without modifying the core codebase:
1. Place a Scala query in src/tools/queries/your_query.scala.
2. Register a Python wrapper in src/tools/custom_tools.py.
3. Restart the server.

See CUSTOM_TOOLS_GUIDE.md for the full step-by-step guide, CPGQL reference, and design decisions.
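Any of the tools listed above, built-in or custom, is invoked by an MCP client through a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a request body to show its shape. The argument name `name_filter` is hypothetical, since the real parameter names come from each tool's schema, and an actual session also needs the MCP initialization handshake, so in practice an MCP client library or editor integration handles this for you.

```python
import itertools
import json

# Each JSON-RPC request carries an id that the response echoes back.
_request_ids = itertools.count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build the body of an MCP `tools/call` JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "name_filter" is a made-up argument name used purely for illustration.
request = build_tool_call("list_methods", {"name_filter": ".*parse.*"})
print(json.dumps(request, indent=2))
```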
Thanks for contributing! Here's a quick guide to get started with running tests and contributing code.
python -m venv venv
source venv/bin/activate # on Windows: venv\Scripts\activate
pip install -r requirements.txt
docker-compose up -d
pytest tests/ -q
# Start MCP server in background
python main.py &
# Run integration tests
pytest tests/integration -q
# Stop MCP server
pkill -f "python main.py"
pytest tests/ -q
bash cleanup.sh
docker-compose down
Please follow these guidelines when contributing:
The MCP server can be configured via environment variables or config.yaml.
Key settings (optional - defaults shown):
# Server
MCP_HOST=0.0.0.0
MCP_PORT=4242
# Joern
JOERN_BINARY_PATH=joern
JOERN_JAVA_OPTS="-Xmx4G -Xms2G -XX:+UseG1GC -Dfile.encoding=UTF-8"
# CPG Generation
CPG_GENERATION_TIMEOUT=600
MAX_REPO_SIZE_MB=500
# Query
QUERY_TIMEOUT=30
QUERY_CACHE_ENABLED=true
QUERY_CACHE_TTL=300
# Telemetry (OpenTelemetry)
OTEL_ENABLED=false
OTEL_SERVICE_NAME=codebadger
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc
Create a config.yaml from config.example.yaml:
cp config.example.yaml config.yaml
Then customize as needed.
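To make the defaults above concrete, here is a rough sketch of reading a subset of these settings from the environment with the documented fallback values. It is not codebadger's own loader: the `Settings` class, `load_settings`, and the accepted boolean spellings are assumptions of this example, and it does not model how config.yaml and environment variables are combined.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    cpg_generation_timeout: int
    max_repo_size_mb: int
    query_timeout: int
    query_cache_enabled: bool
    query_cache_ttl: int


def _as_bool(value: str) -> bool:
    # Which spellings count as "true" is an assumption of this sketch.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from env, falling back to the defaults listed in this README."""
    env = os.environ if env is None else env
    return Settings(
        host=env.get("MCP_HOST", "0.0.0.0"),
        port=int(env.get("MCP_PORT", "4242")),
        cpg_generation_timeout=int(env.get("CPG_GENERATION_TIMEOUT", "600")),
        max_repo_size_mb=int(env.get("MAX_REPO_SIZE_MB", "500")),
        query_timeout=int(env.get("QUERY_TIMEOUT", "30")),
        query_cache_enabled=_as_bool(env.get("QUERY_CACHE_ENABLED", "true")),
        query_cache_ttl=int(env.get("QUERY_CACHE_TTL", "300")),
    )
```

Passing a plain dictionary, for example `load_settings({"MCP_PORT": "9000"})`, makes it easy to check an override without touching the real environment.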
CodeBadger has built-in OpenTelemetry support for distributed tracing. When enabled, all MCP tool calls are automatically traced, plus custom spans for CPG generation, Joern server management, and query execution.
requirements.txt):
pip install opentelemetry-sdk opentelemetry-exporter-otlp
export OTEL_ENABLED=true
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
python main.py
Or via config.yaml:
telemetry:
enabled: true
service_name: codebadger
otlp_endpoint: http://localhost:4317
otlp_protocol: grpc # or "http/protobuf"
# Start Jaeger (provides UI at http://localhost:16686)
docker run -d --name jaeger \
-p 16686:16686 \
-p 4317:4317 \
jaegertracing/all-in-one:latest
# Start CodeBadger with telemetry
OTEL_ENABLED=true python main.py
| Span | Description |
|---|---|
| tools/call {name} | Every MCP tool invocation (automatic via FastMCP) |
| cpg.generate | Full CPG generation pipeline |
| cpg.joern_cli_exec | Joern CLI command execution inside Docker |
| cpg.spawn_server | Joern server instance creation |
| cpg.load_cpg | CPG file loading into Joern server |
| query.execute | CPGQL query execution with timing and success attributes |
| Setting | Env Variable | Default | Description |
|---|---|---|---|
| enabled | OTEL_ENABLED | false | Enable/disable telemetry |
| service_name | OTEL_SERVICE_NAME | codebadger | Service name in traces |
| otlp_endpoint | OTEL_EXPORTER_OTLP_ENDPOINT | http://localhost:4317 | OTLP collector endpoint |
| otlp_protocol | OTEL_EXPORTER_OTLP_PROTOCOL | grpc | Export protocol (grpc or http/protobuf) |
When telemetry is disabled (default), all tracing is no-op with zero overhead.
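The idea of tracing that turns into a no-op when disabled can be illustrated with a small stand-in that uses only the standard library. This is not OpenTelemetry and not how CodeBadger implements it; `span`, `telemetry_enabled`, and the `recorded` list are hypothetical names, and the set of values treated as "enabled" is an assumption.

```python
import os
import time
from contextlib import contextmanager, nullcontext
from typing import Mapping, Optional

# Stand-in for an exporter: finished spans are appended here as (name, seconds).
recorded = []


def telemetry_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    env = os.environ if env is None else env
    return env.get("OTEL_ENABLED", "false").strip().lower() in {"1", "true", "yes"}


@contextmanager
def _timed_span(name: str):
    start = time.perf_counter()
    try:
        yield
    finally:
        recorded.append((name, time.perf_counter() - start))


def span(name: str, enabled: bool):
    """Return a timing span when enabled, otherwise a context manager that does nothing."""
    return _timed_span(name) if enabled else nullcontext()


with span("query.execute", telemetry_enabled({"OTEL_ENABLED": "false"})):
    pass  # the traced work would run here; nothing is recorded when disabled
```

The calling code looks the same in both modes, which is what lets the disabled path stay cheap.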