by ondata
Provides a Model Context Protocol server that lets AI assistants query CKAN open data portals for datasets, organizations, groups, tags, and DataStore tables using natural language, without requiring direct knowledge of CKAN APIs.
Enables AI‑driven conversations with any CKAN‑powered open data portal. The server translates natural‑language requests into CKAN API calls (package search, DataStore SQL, organization and group queries, tag listings, etc.) and returns structured results that LLMs can incorporate directly.
Install globally (npm install -g @aborruso/ckan-mcp-server) or run instantly via npx @aborruso/ckan-mcp-server@latest. Alternatively, point your AI client to the hosted endpoint https://ckan-mcp-server.andy-pr.workers.dev/mcp.
mcpServers configuration.
ckan:// links for direct resource access (datasets, resources, organizations, groups, tags, formats).
dati.gov.it datasets via MQA scores.
Q: Do I need an API key?
A: No. CKAN portals expose public APIs; the server only forwards requests.
Q: What if a portal doesn’t have the DataStore extension?
A: ckan_datastore_search and ckan_datastore_search_sql will return an error indicating the resource is not datastore‑active.
Q: How are rate limits handled?
A: The hosted endpoint enforces a shared 100k requests/day quota. Self‑hosted installations inherit only the portal’s own limits.
Q: Can I query multiple portals in one session?
A: Yes. Specify the server_url parameter for each tool call to target different CKAN instances.
Q: How do I troubleshoot no‑result responses?
A: Use facet queries (facet_field) to explore available values, widen the query, or verify field existence with field:* syntax.
Turn any (CKAN) open data portal into a conversation.
Give your AI assistant direct access to any CKAN open data portal — search datasets, explore organizations, query tabular data, and read metadata, all through natural language.
CKAN is the open-source platform behind most public open data portals worldwide (Italy's dati.gov.it, the US data.gov, Canada's open.canada.ca, and many more). Navigating these portals usually requires knowing their structure, APIs, and search syntax. This MCP server removes that barrier: once connected, your AI tool can do it all for you.
This is possible because of open standards and open source. CKAN exposes a fully documented, public API. Metadata follows DCAT, an open W3C standard for describing datasets. Both are free to use, free to build on, and maintained by open communities. This server stands on that foundation.
Who is this for? Everyone. Journalists looking for data to verify a story. Researchers exploring public datasets. Public servants checking what data their administration publishes. Developers building data pipelines. No CKAN knowledge required.
Two ways to use it — pick the one that suits you:
| | Option A: Install locally | Option B: No install |
|---|---|---|
| How | npm install -g @aborruso/ckan-mcp-server | Point your tool to the hosted HTTP endpoint |
| Best for | Runs on your machine, works with any local tool | Quick start, zero setup |
| Limits | None | 100k requests/day shared quota |
Hosted endpoint: https://ckan-mcp-server.andy-pr.workers.dev/mcp
Recommendation: Option B is a great way to get started and try things out without any setup. Once you're familiar with what the server can do, switching to Option A (local install) gives you unlimited usage with no shared quotas.
👉 Want to explore the codebase? The AI-generated DeepWiki is a great starting point.
License: MIT — see LICENSE.txt for complete details. Third-party notices: NOTICE.md.

ChatGPT | Claude Desktop | Claude Code | Gemini CLI | VS Code | Codex CLI
This server works with any MCP-compatible client. The sections below cover some of the most popular ones — if your tool isn't listed, check its documentation for MCP configuration and use the same endpoint URL or command.
All examples below work with both the local installation and the hosted endpoint. Where both options differ, both are shown.
Using local installation? You need to install the server first — see Run locally.
Requires a ChatGPT Plus, Team, or Enterprise plan.
https://ckan-mcp-server.andy-pr.workers.dev/mcp
For a step-by-step walkthrough with screenshots, see the full ChatGPT guide.
Using the hosted endpoint (no install) — via connector UI:
https://ckan-mcp-server.andy-pr.workers.dev/mcp
For a detailed walkthrough with screenshots, see the full Claude guide.
Using the hosted endpoint (no install) — via config file:
Configuration file location:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "ckan": {
      "url": "https://ckan-mcp-server.andy-pr.workers.dev/mcp"
    }
  }
}
Using local installation:
{
  "mcpServers": {
    "ckan": {
      "command": "npx",
      "args": ["@aborruso/ckan-mcp-server@latest"]
    }
  }
}
Using the hosted endpoint (no install):
claude mcp add -s user -t http ckan https://ckan-mcp-server.andy-pr.workers.dev/mcp
Using local installation:
claude mcp add -s user ckan npx @aborruso/ckan-mcp-server@latest
--scope user (short form: -s user) makes the server available globally across all your projects, not just the current one.
To add it only for a specific project, run from the project folder without the --scope user flag:
claude mcp add --transport http ckan https://ckan-mcp-server.andy-pr.workers.dev/mcp
gemini mcp add -s user -t http ckan https://ckan-mcp-server.andy-pr.workers.dev/mcp
Or add manually to ~/.gemini/settings.json:
{
  "mcpServers": {
    "ckan": {
      "httpUrl": "https://ckan-mcp-server.andy-pr.workers.dev/mcp"
    }
  }
}
Add to your User Settings or .vscode/settings.json:
Using the hosted endpoint (no install):
{
  "mcpServers": {
    "ckan": {
      "url": "https://ckan-mcp-server.andy-pr.workers.dev/mcp",
      "type": "http"
    }
  }
}
Using local installation:
{
  "mcpServers": {
    "ckan": {
      "command": "npx",
      "args": ["@aborruso/ckan-mcp-server@latest"]
    }
  }
}
Add to ~/.codex/config.toml:
Using the hosted endpoint (no install):
[mcp_servers.ckan]
url = "https://ckan-mcp-server.andy-pr.workers.dev/mcp"
Using local installation:
[mcp_servers.ckan]
command = "npx"
args = ["-y", "@aborruso/ckan-mcp-server@latest"]
The quickest way. Install the package globally and it's immediately available as a command:
npm install -g @aborruso/ckan-mcp-server
The server will be available as ckan-mcp-server, or you can run it without installing via:
npx @aborruso/ckan-mcp-server@latest
For development or if you want to run the latest unreleased code:
git clone https://github.com/ondata/ckan-mcp-server.git
cd ckan-mcp-server
npm install
npm run build
node dist/index.js
Direct data access via ckan:// URI scheme:
ckan://{server}/dataset/{id} - Dataset metadata
ckan://{server}/resource/{id} - Resource metadata and download URL
ckan://{server}/organization/{name} - Organization details
ckan://{server}/group/{name}/datasets - Datasets by group (theme)
ckan://{server}/organization/{name}/datasets - Datasets by organization
ckan://{server}/tag/{name}/datasets - Datasets by tag
ckan://{server}/format/{format}/datasets - Datasets by resource format (res_format + distribution_format)
Examples:
ckan://dati.gov.it/dataset/vaccini-covid
ckan://demo.ckan.org/resource/abc-123
ckan://data.gov/organization/sample-org
ckan://dati.gov.it/group/ambiente/datasets
ckan://dati.gov.it/organization/regione-toscana/datasets
ckan://dati.gov.it/tag/turismo/datasets
ckan://dati.gov.it/format/csv/datasets
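To make the layout of these URIs concrete, here is a minimal sketch of how a ckan:// URI could be split into its parts. The parseCkanUri function and the CkanUri type are hypothetical names introduced for illustration; they are not taken from the project source, and the server's own parser may behave differently.

```typescript
// Hypothetical helper, not part of the server: splits a ckan:// URI into its parts.
type CkanUri = {
  server: string;        // portal hostname, e.g. "dati.gov.it"
  kind: string;          // "dataset", "resource", "organization", "group", "tag" or "format"
  id: string;            // dataset/resource id, or organization/group/tag/format name
  listDatasets: boolean; // true when the URI ends with "/datasets"
};

function parseCkanUri(uri: string): CkanUri {
  // ckan://{server}/{kind}/{id} with an optional trailing /datasets segment
  const match = /^ckan:\/\/([^/]+)\/([^/]+)\/([^/]+)(\/datasets)?$/.exec(uri);
  if (!match) {
    throw new Error(`Not a recognized ckan:// URI: ${uri}`);
  }
  const [, server, kind, id, suffix] = match;
  return { server, kind, id: decodeURIComponent(id), listDatasets: suffix !== undefined };
}

console.log(parseCkanUri("ckan://dati.gov.it/group/ambiente/datasets"));
console.log(parseCkanUri("ckan://dati.gov.it/dataset/vaccini-covid"));
```

The first call yields kind "group" with listDatasets set to true; the second yields kind "dataset" with listDatasets set to false.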
Once connected, just ask in plain language. No query syntax needed:
"Search dati.gov.it for datasets about air quality in Milan, then summarize what each contains — time coverage, license, and best download format."
The server finds 31 datasets, groups them by structural pattern, and returns a clear summary — including series names, years covered, publisher, and format. No CKAN knowledge required.
The examples below show natural language requests alongside the actual tool call the LLM will generate internally and send to the CKAN portal. You never write these queries yourself — they are shown here to illustrate how your question gets translated under the hood.
ckan_package_search({
  server_url: "https://www.dati.gov.it/opendata",
  q: "popolazione",
  rows: 20
})
ckan_package_search({
  server_url: "https://www.dati.gov.it/opendata",
  q: "hotel OR alberghi OR \"strutture ricettive\" OR ospitalità OR ricettività",
  query_parser: "text",
  rows: 0 // returns only the total count, no dataset records; useful to check how many results match before fetching them
})
Note: when query_parser: "text" is used, Solr special characters in the query are escaped automatically.
ckan_find_relevant_datasets({
  server_url: "https://www.dati.gov.it/opendata",
  query: "mobilità urbana",
  limit: 5
})
ckan_package_search({
  server_url: "https://www.dati.gov.it/opendata",
  fq: "organization:regione-toscana",
  sort: "metadata_modified desc"
})
ckan_package_search({
  server_url: "https://www.dati.gov.it/opendata",
  facet_field: ["organization", "tags", "res_format"],
  rows: 0 // skip dataset records, return only the facet counts
})
ckan_tag_list({
  server_url: "https://www.dati.gov.it/opendata",
  tag_query: "salute",
  limit: 25
})
ckan_group_search({
  server_url: "https://www.dati.gov.it/opendata",
  pattern: "ambiente"
})
What is DataStore? CKAN DataStore is an optional extension that imports tabular resources (CSV, Excel) into a queryable database. It allows filtering, sorting, and field selection directly on the data, without downloading the file. Not all portals have it enabled, and not all datasets use it even when the portal supports it. Check datastore_active: true on a resource to confirm availability.
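As an illustration of that check, the sketch below filters a dataset's resources down to those flagged with datastore_active. The CkanResource type declares only the fields used here, the queryableResources name is hypothetical, and the sample records are invented for the example.

```typescript
// Minimal shape of a CKAN resource as it appears inside a dataset record.
// Only the fields used here are declared; real records carry many more.
type CkanResource = {
  id: string;
  format?: string;
  datastore_active?: boolean;
};

// Hypothetical helper: keeps only the resources that the DataStore tools can query.
function queryableResources(resources: CkanResource[]): CkanResource[] {
  return resources.filter((resource) => resource.datastore_active === true);
}

// Sample data invented for illustration.
const sample: CkanResource[] = [
  { id: "res-1", format: "CSV", datastore_active: true },
  { id: "res-2", format: "PDF", datastore_active: false },
  { id: "res-3", format: "CSV" }, // flag missing: treated as not queryable
];

console.log(queryableResources(sample).map((resource) => resource.id)); // prints [ 'res-1' ]
```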
// Traffic ordinances (ordinanze viabili) of the Comune di Messina: resource with datastore_active: true
ckan_datastore_search({
  server_url: "https://dati.comune.messina.it",
  resource_id: "17301b8b-2a5b-425f-80b0-5b75bb1793e9",
  filters: { "tipo": "lavori" },
  sort: "data_pubblicazione desc",
  limit: 10
})
👏 A shout-out to Comune di Messina and all public administrations that enable the DataStore extension: by doing so, they make their data dramatically easier to query and explore — including through AI tools like this one.
// Count traffic ordinances (ordinanze viabili) by tipo, Comune di Messina
ckan_datastore_search_sql({
  server_url: "https://dati.comune.messina.it",
  sql: "SELECT tipo, COUNT(*) AS total FROM \"17301b8b-2a5b-425f-80b0-5b75bb1793e9\" GROUP BY tipo ORDER BY total DESC LIMIT 5"
})
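In DataStore SQL the resource ID acts as the table name, so it has to be wrapped in double quotes as in the query above. The following is a minimal sketch of building such a query string; quoteIdentifier and buildCountByFieldSql are hypothetical helpers written for this example, not functions exposed by the server.

```typescript
// Hypothetical helper: quotes a PostgreSQL identifier by wrapping it in
// double quotes and doubling any embedded double quote.
function quoteIdentifier(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Hypothetical helper: builds a "count rows per value of a field" query
// for a DataStore resource, most frequent values first.
function buildCountByFieldSql(resourceId: string, field: string, limit: number): string {
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new Error("limit must be a positive integer");
  }
  const table = quoteIdentifier(resourceId);
  const column = quoteIdentifier(field);
  return `SELECT ${column}, COUNT(*) AS total FROM ${table} GROUP BY ${column} ORDER BY total DESC LIMIT ${limit}`;
}

console.log(buildCountByFieldSql("17301b8b-2a5b-425f-80b0-5b75bb1793e9", "tipo", 5));
```

Unlike the hand-written query, this sketch also quotes the column name, which keeps field names with spaces or uppercase letters working.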
Some examples of supported portals:
Datashades.info/portals maintains a live registry of ~950 CKAN portals from around the world, with metadata on version, plugins, dataset counts, and geographic coordinates.
The portal data is available as a public JSON API — no authentication required:
| Endpoint | Description |
|---|---|
| GET https://datashades.info/api/portal/list | Full list of portals with CKAN version, plugins, dataset/resource/organization counts, and country coordinates |
| GET https://datashades.info/api/portal/stats | Aggregate statistics across all monitored portals |
| GET https://datashades.info/api/portal/historical/stats | Historical trend data for the monitored portals |
CKAN uses Apache Solr as its default search engine. Understanding Solr syntax unlocks the full power of dataset search — from simple keywords to complex boolean expressions, fuzzy matching, proximity searches, and date math.
# Basic search
q: "popolazione"
# Field search
q: "title:popolazione"
q: "notes:sanità"
# Boolean operators
q: "popolazione AND sicilia"
q: "popolazione OR abitanti"
q: "popolazione NOT censimento"
# Filters (fq)
fq: "organization:comune-palermo"
fq: "tags:sanità"
fq: "res_format:CSV"
# Wildcard
q: "popolaz*"
# Date range
fq: "metadata_modified:[2023-01-01T00:00:00Z TO *]"
These real-world examples demonstrate powerful Solr query combinations tested on the Italian open data portal (dati.gov.it):
Find healthcare datasets (tolerating spelling errors) modified in the last 6 months, prioritizing title matches:
ckan_package_search({
  server_url: "https://www.dati.gov.it/opendata",
  q: "(title:sanità~2^3 OR title:salute~2^3 OR notes:sanità~1) AND metadata_modified:[NOW-6MONTHS TO *]",
  sort: "score desc, metadata_modified desc",
  rows: 30
})
Techniques used:
sanità~2 - Fuzzy search with edit distance 2 (finds "sanita", "sanitá", minor typos)
^3 - Boosts title matches 3x higher in relevance scoring
NOW-6MONTHS - Dynamic date math for rolling time windows
Results: 949 datasets including hospital units, healthcare organizations, medical services
Environmental datasets where "inquinamento" and "aria" (air pollution) appear close together, excluding water-related datasets:
ckan_package_search({
  server_url: "https://www.dati.gov.it/opendata",
  q: "(notes:\"inquinamento aria\"~5 OR title:\"qualità aria\"~3) AND NOT (title:acqua OR title:mare)",
  facet_field: ["organization", "res_format"],
  rows: 25
})
Techniques used:
"inquinamento aria"~5 - Proximity search (words within 5 positions)
~3 - Tighter proximity for title matches
NOT (title:acqua OR title:mare) - Exclude water/sea datasets
Results: 305 datasets
Regional datasets published in the last month that have at least one resource format declared:
ckan_package_search({
  server_url: "https://www.dati.gov.it/opendata",
  q: "organization:regione* AND metadata_created:[NOW-1MONTH TO *] AND res_format:*",
  sort: "metadata_modified desc",
  facet_field: ["organization"],
  rows: 10
})
Techniques used:
regione* - Wildcard matches all regional organizations
res_format:* - Field existence check (has at least one resource format declared)
NOW-1MONTH - Rolling one-month window
Results: 293 datasets
Datasets from the Italian Ministry of Labour modified during 2025, with facets by format and tags:
ckan_package_search({
  server_url: "https://www.dati.gov.it/opendata",
  q: "organization:ministero-del-lavoro AND metadata_modified:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]",
  sort: "metadata_modified desc",
  facet_field: ["res_format", "tags"],
  rows: 10
})
Techniques used:
[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z] - Explicit date range (full year)
organization:ministero-del-lavoro - Filter by specific organization
Results: 83 datasets
Boolean Operators: AND, OR, NOT, +required, -excluded
Wildcards: * (multiple chars), ? (single char) - Note: left truncation not supported
Fuzzy: ~N (edit distance), e.g., health~2
Proximity: "phrase"~N (words within N positions)
Boosting: ^N (relevance multiplier), e.g., title:water^2
Ranges:
Inclusive: [a TO b], e.g., num_resources:[5 TO 10]
Exclusive: {a TO b}, e.g., num_resources:{0 TO 100}
Open-ended: [2024-01-01T00:00:00Z TO *]
Date Math: NOW, NOW-1YEAR, NOW-6MONTHS, NOW-7DAYS, NOW/DAY
Field Existence: field:* (field exists), NOT field:* (field missing)
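The operators in this reference compose mechanically, which the sketch below tries to show by assembling a fuzzy, boosted, date-bounded query from small pieces. The term and anyOf helpers and the TermOptions type are hypothetical names made up for this example; the server does not require or expose them, since the LLM writes the query string directly.

```typescript
// Hypothetical helpers for composing Solr query strings; illustration only.
type TermOptions = {
  field?: string; // restrict the term to a field, e.g. "title"
  fuzzy?: number; // edit distance, rendered as ~N
  boost?: number; // relevance multiplier, rendered as ^N
};

// Renders a single term with optional field, fuzzy and boost modifiers.
function term(value: string, options: TermOptions = {}): string {
  let out = value;
  if (options.fuzzy !== undefined) out += `~${options.fuzzy}`;
  if (options.boost !== undefined) out += `^${options.boost}`;
  return options.field ? `${options.field}:${out}` : out;
}

// Joins clauses with OR and wraps them in parentheses.
function anyOf(...clauses: string[]): string {
  return `(${clauses.join(" OR ")})`;
}

const q = [
  anyOf(
    term("sanità", { field: "title", fuzzy: 2, boost: 3 }),
    term("salute", { field: "title", fuzzy: 2, boost: 3 }),
    term("sanità", { field: "notes", fuzzy: 1 }),
  ),
  "metadata_modified:[NOW-6MONTHS TO *]",
].join(" AND ");

console.log(q);
// prints (title:sanità~2^3 OR title:salute~2^3 OR notes:sanità~1) AND metadata_modified:[NOW-6MONTHS TO *]
```

The helpers do no escaping, so they only suit terms that contain no Solr special characters.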
CKAN portals can be source catalogs (data published directly by the organization) or harvesting aggregators (data collected from many other portals). This distinction matters a lot when filtering by date.
| Field | Meaning on source portal | Meaning on aggregator |
|---|---|---|
| issued | When the publisher released the dataset | When the publisher released the dataset |
| metadata_created | When the record was first created | When the record was first harvested |
| metadata_modified | When the record was last updated | When the record was last re-harvested |
On an aggregator like dati.gov.it, metadata_modified is updated every time the portal re-harvests — even if the dataset content hasn't changed. This makes it unsuitable for finding "recently updated content".
Example — same dataset, three different timestamps on dati.gov.it (aggregator):
{
"issued": "2024-12-10",
"metadata_created": "2024-12-16",
"metadata_modified": "2026-02-28"
}
metadata_modified is February 2026 only because the portal re-harvested it then, not because the data changed.
Which date fields are filterable on dati.gov.it?
All three fields are Solr-indexed and usable in queries:
| Field | Solr-indexed | What queries return |
|---|---|---|
| issued | ✅ | Datasets by publisher release date; the most meaningful field, but ~14% of datasets lack it |
| metadata_created | ✅ | Datasets by first harvesting date on dati.gov.it |
| metadata_modified | ✅ | Datasets by last re-harvesting date (often noisy) |
Query examples (dati.gov.it):
// Datasets about road accidents published by the original source in 2025
ckan_package_search({
  server_url: "https://www.dati.gov.it/opendata",
  q: "incidenti stradali",
  fq: "issued:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]"
})
// → ~121 results (only datasets where publisher filled in `issued`)
// Datasets first appearing on dati.gov.it in 2025
ckan_package_search({
  server_url: "https://www.dati.gov.it/opendata",
  q: "incidenti stradali",
  fq: "metadata_created:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]"
})
// → ~164 results (includes older datasets harvested for the first time in 2025)
Note on issued coverage: ~59,700 of 69,000+ datasets on dati.gov.it have issued populated. Queries on issued are accurate but incomplete: datasets without the field are silently excluded. Prefer issued for content-date queries; use metadata_created only as a fallback for "when did this appear on the portal".
Recommendation: use issued to find datasets by publication date. Use metadata_created to find datasets that appeared on the portal recently.
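Since the same range syntax applies to all three date fields, switching between them is a one-word change. Here is a small sketch of a helper that builds a full-year fq filter for whichever field is chosen; yearRangeFilter and the DateField type are hypothetical names used only for illustration.

```typescript
// The three date fields discussed above.
type DateField = "issued" | "metadata_created" | "metadata_modified";

// Hypothetical helper: builds an fq range filter covering one calendar year (UTC).
function yearRangeFilter(field: DateField, year: number): string {
  if (!Number.isInteger(year) || year < 1000 || year > 9999) {
    throw new Error("year must be a four-digit integer");
  }
  return `${field}:[${year}-01-01T00:00:00Z TO ${year}-12-31T23:59:59Z]`;
}

console.log(yearRangeFilter("issued", 2025));
// prints issued:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]
console.log(yearRangeFilter("metadata_created", 2025));
// prints metadata_created:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]
```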
ckan-mcp-server/
├── src/
│   ├── index.ts              # Entry point
│   ├── server.ts             # MCP server setup
│   ├── worker.ts             # Cloudflare Workers entry
│   ├── types.ts              # Types & schemas
│   ├── utils/
│   │   ├── http.ts           # CKAN API client
│   │   ├── formatting.ts     # Output formatting
│   │   └── url-generator.ts
│   ├── tools/
│   │   ├── package.ts        # Package search/show
│   │   ├── organization.ts   # Organization tools
│   │   ├── datastore.ts      # DataStore queries
│   │   ├── status.ts         # Server status
│   │   ├── tag.ts            # Tag tools
│   │   └── group.ts          # Group tools
│   ├── resources/            # MCP Resource Templates
│   │   ├── index.ts
│   │   ├── uri.ts
│   │   ├── dataset.ts
│   │   ├── resource.ts
│   │   └── organization.ts
│   ├── prompts/              # MCP Guided Prompts
│   │   ├── index.ts
│   │   ├── theme.ts
│   │   ├── organization.ts
│   │   ├── format.ts
│   │   ├── recent.ts
│   │   └── dataset-analysis.ts
│   └── transport/
│       ├── stdio.ts
│       └── http.ts
├── tests/                    # Test suite
├── dist/                     # Compiled output (generated)
├── package.json
└── README.md
# Build (esbuild, ~4ms)
npm run build
# Watch mode
npm run watch
# Run all tests
npm test
# Watch mode for tests
npm run test:watch
# Coverage report
npm run test:coverage
The MCP Inspector lets you browse tools, test calls interactively, and debug responses in a web UI:
npm install -g @modelcontextprotocol/inspector
npm run build
npx @modelcontextprotocol/inspector node dist/index.js
Opens at http://localhost:5173.
# Start server
TRANSPORT=http PORT=3001 node dist/index.js
# List available tools
curl -s -X POST http://localhost:3001/mcp \
-H 'Content-Type: application/json' \
-H 'Accept: application/json, text/event-stream' \
-d '{"jsonrpc":"2.0","method":"tools/list","id":1}'
# Call a tool
curl -s -X POST http://localhost:3001/mcp \
-H 'Content-Type: application/json' \
-H 'Accept: application/json, text/event-stream' \
-d '{
"jsonrpc":"2.0","method":"tools/call",
"params":{"name":"ckan_package_search","arguments":{"server_url":"https://www.dati.gov.it/opendata","q":"ambiente","rows":3}},
"id":1
}' | jq -r '.result.content[0].text'
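The same tools/call request can be issued from code. The sketch below mirrors the curl example in TypeScript, assuming a runtime with a global fetch (for example Node 18 or later); buildToolCall and callTool are hypothetical helper names, and the response is returned as raw text because the server may answer with either JSON or an event stream.

```typescript
// Shape of the JSON-RPC 2.0 body used by the curl example above.
type ToolCallRequest = {
  jsonrpc: "2.0";
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
  id: number;
};

// Hypothetical helper: builds the request body for a tool call.
function buildToolCall(name: string, args: Record<string, unknown>, id = 1): ToolCallRequest {
  return { jsonrpc: "2.0", method: "tools/call", params: { name, arguments: args }, id };
}

// Hypothetical helper: posts the request with the same headers as the curl
// example and returns the raw response text without parsing it.
async function callTool(endpoint: string, request: ToolCallRequest): Promise<string> {
  const response = await fetch(endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify(request),
  });
  if (!response.ok) {
    throw new Error(`HTTP ${response.status}`);
  }
  return response.text();
}

const request = buildToolCall("ckan_package_search", {
  server_url: "https://www.dati.gov.it/opendata",
  q: "ambiente",
  rows: 3,
});
console.log(JSON.stringify(request));
```

With the server started as shown above, callTool("http://localhost:3001/mcp", request) sends the request.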
Some CKAN portals expose non-standard web URLs for viewing datasets or organizations. To support those cases, this project ships with src/portals.json, which maps known portal API URLs (and aliases) to custom view URL templates.
When generating a dataset or organization view link, the server:
Matches server_url against api_url and api_url_aliases in src/portals.json
Uses the dataset_view_url / organization_view_url template when available
Otherwise falls back to the standard CKAN paths ({server_url}/dataset/{name} and {server_url}/organization/{name})
Wrong URL for Italian portal: use https://www.dati.gov.it/opendata (not https://dati.gov.it).
Connection error
Error: Server not found: https://example.gov
Verify the URL is reachable and use ckan_status_show to confirm the portal is responding.
No results — broaden your query or check what's available with facets:
ckan_package_search({
  server_url: "https://www.dati.gov.it/opendata",
  q: "*:*",
  facet_field: ["tags", "organization"],
  rows: 0
})
LLM uses external data when no results are found — when a tool returns no results, some LLMs (e.g. ChatGPT) may supplement the answer with information from their training data without warning. This is a known LLM behavior, not a server issue. To avoid it, instruct the model in your system prompt to only use data returned by the MCP tools and not rely on external sources.
For issues or questions, open an issue on GitHub.
Created with ❤️ by onData for the open data community