by privetin
Interact with Hugging Face datasets via an MCP server, enabling browsing, searching, filtering, and analysis of public and private collections.
Dataset Viewer provides an MCP server that bridges the Hugging Face Dataset Viewer API with client applications. It lets users explore datasets hosted on the Hugging Face Hub, retrieve metadata, paginate rows, run SQL‑like filters, search text, and download entire splits in Parquet format.
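As a rough illustration of the API being bridged, the sketch below builds a request URL for the Dataset Viewer API's /filter endpoint. It assumes the public base URL https://datasets-server.huggingface.co and the endpoint's where, orderby, offset, and length query parameters. The helper name filter_url is hypothetical, the config and split names are assumed, and the where clause is the illustrative one from the tool list, not a real column of that dataset. How this server calls the API internally is not shown in the source.

```python
from typing import Optional
from urllib.parse import urlencode

# Public base URL of the Hugging Face Dataset Viewer API (assumed here).
API_BASE = "https://datasets-server.huggingface.co"


def filter_url(dataset: str, config: str, split: str, where: str,
               orderby: Optional[str] = None,
               offset: int = 0, length: int = 100) -> str:
    """Build a /filter request URL (hypothetical helper, not part of this server)."""
    params = {"dataset": dataset, "config": config, "split": split, "where": where}
    if orderby:
        params["orderby"] = orderby
    params.update({"offset": offset, "length": length})
    # urlencode percent-escapes spaces, slashes, and comparison operators.
    return f"{API_BASE}/filter?{urlencode(params)}"


# Config and split names below are assumptions for illustration.
print(filter_url("stanfordnlp/imdb", "plain_text", "train", where="score > 0.5"))
```

Building the query string with urlencode rather than string concatenation keeps SQL fragments such as comparison operators safely escaped.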
- Install the prerequisites (uv).
- Set HUGGINGFACE_TOKEN for private dataset access.
- Start the server (uv run dataset-viewer).
- Call the tools (validate, get_info, get_rows, get_first_rows, get_statistics, search_dataset, filter, and get_parquet) through the MCP client.
- Use the URI scheme (dataset://) for direct referencing.

Q: Do I need a Hugging Face token? A: Only when accessing private datasets; public datasets work without a token.
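The token rule can be sketched as a small helper: send an Authorization header only when a token is available. This is a minimal sketch, assuming the token comes from either an explicit auth_token argument or the HUGGINGFACE_TOKEN environment variable. The function name build_auth_headers is hypothetical, and the precedence between the two sources is an assumption, not something the source specifies.

```python
import os
from typing import Optional


def build_auth_headers(auth_token: Optional[str] = None) -> dict:
    """Return HTTP headers for a Hub request; empty for anonymous access."""
    # Assumption: an explicit token wins, else fall back to the environment.
    token = auth_token or os.environ.get("HUGGINGFACE_TOKEN")
    if not token:
        return {}  # public datasets need no Authorization header
    # The Hugging Face Hub accepts user tokens as Bearer credentials.
    return {"Authorization": f"Bearer {token}"}


print(build_auth_headers("hf_example_token"))
```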
Q: Which Python version is required? A: Python 3.12 or newer.
Q: How do I install the server?
A: Clone the repo, create a virtual environment with uv venv, activate it, and then run: uv add -e .
Q: How is pagination handled?
A: Use the page parameter (0-based) with get_rows or filter to retrieve successive batches.
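To make the paging pattern concrete, the sketch below builds one MCP tools/call request per page. An MCP client library would normally construct and send these messages for you; the code only illustrates how the page argument advances. The helper name paged_calls and the config and split values are assumptions for illustration.

```python
import json
from typing import Iterator


def paged_calls(tool: str, base_args: dict, pages: int,
                first_id: int = 1) -> Iterator[dict]:
    """Yield one JSON-RPC tools/call request per 0-based page (hypothetical helper)."""
    for page in range(pages):
        yield {
            "jsonrpc": "2.0",
            "id": first_id + page,
            "method": "tools/call",
            # Each request repeats the base arguments and changes only `page`.
            "params": {"name": tool, "arguments": {**base_args, "page": page}},
        }


# Config and split names are assumed values for illustration.
args = {"dataset": "stanfordnlp/imdb", "config": "plain_text", "split": "train"}
for request in paged_calls("get_rows", args, pages=3):
    print(json.dumps(request))
```

In practice a client would stop requesting pages once a response comes back with no rows, instead of fixing the page count in advance.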
Q: Can I retrieve the whole dataset at once?
A: Yes, via the get_parquet tool, which returns the dataset in Parquet format.
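The exact shape of the get_parquet result is not documented here. The sketch below assumes it mirrors the Dataset Viewer API's /parquet response, which lists one entry per Parquet file with its config, split, and url. The helper name urls_by_split and the sample payload (including its example.invalid URLs) are made up for illustration.

```python
from collections import defaultdict


def urls_by_split(parquet_response: dict) -> dict:
    """Group Parquet file URLs by (config, split); hypothetical helper."""
    grouped = defaultdict(list)
    for entry in parquet_response.get("parquet_files", []):
        grouped[(entry["config"], entry["split"])].append(entry["url"])
    return dict(grouped)


# Illustrative payload only; not real output from the server or the API.
sample = {
    "parquet_files": [
        {"config": "default", "split": "train", "url": "https://example.invalid/train/0000.parquet"},
        {"config": "default", "split": "train", "url": "https://example.invalid/train/0001.parquet"},
        {"config": "default", "split": "test", "url": "https://example.invalid/test/0000.parquet"},
    ]
}

for (config, split), urls in urls_by_split(sample).items():
    print(config, split, len(urls))
```

Large splits may be sharded into several files, so grouping by split before downloading avoids treating each shard as a separate dataset.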
An MCP server for interacting with the Hugging Face Dataset Viewer API, providing capabilities to browse and analyze datasets hosted on the Hugging Face Hub.
It supports a dataset:// URI scheme for accessing Hugging Face datasets.

The server provides the following tools:

validate
- dataset: Dataset identifier (e.g. 'stanfordnlp/imdb')
- auth_token (optional): For private datasets

get_info
- dataset: Dataset identifier
- auth_token (optional): For private datasets

get_rows
- dataset: Dataset identifier
- config: Configuration name
- split: Split name
- page (optional): Page number (0-based)
- auth_token (optional): For private datasets

get_first_rows
- dataset: Dataset identifier
- config: Configuration name
- split: Split name
- auth_token (optional): For private datasets

get_statistics
- dataset: Dataset identifier
- config: Configuration name
- split: Split name
- auth_token (optional): For private datasets

search_dataset
- dataset: Dataset identifier
- config: Configuration name
- split: Split name
- query: Text to search for
- auth_token (optional): For private datasets

filter
- dataset: Dataset identifier
- config: Configuration name
- split: Split name
- where: SQL WHERE clause (e.g. "score > 0.5")
- orderby (optional): SQL ORDER BY clause
- page (optional): Page number (0-based)
- auth_token (optional): For private datasets

get_parquet
- dataset: Dataset identifier
- auth_token (optional): For private datasets

To install from source:

git clone https://github.com/privetin/dataset-viewer.git
cd dataset-viewer
# Create virtual environment
uv venv
# Activate virtual environment
# On Unix:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate
# Install in development mode
uv add -e .
HUGGINGFACE_TOKEN: Your Hugging Face API token for accessing private datasets

Add the following to your Claude Desktop config file:
On Windows: %APPDATA%\Claude\claude_desktop_config.json
On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "dataset-viewer": {
      "command": "uv",
      "args": [
        "--directory",
        "parent_to_repo/dataset-viewer",
        "run",
        "dataset-viewer"
      ]
    }
  }
}
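If the config file already lists other servers, the entry has to be merged in, not pasted over the file. The sketch below shows one way to do that in Python. The function name add_dataset_viewer is hypothetical, parent_to_repo/dataset-viewer is the placeholder path from the config above, and the optional env block assumes the server reads HUGGINGFACE_TOKEN from its environment.

```python
import json
from typing import Optional


def add_dataset_viewer(config: dict, repo_dir: str,
                       token: Optional[str] = None) -> dict:
    """Return a copy of `config` with a dataset-viewer entry merged in."""
    entry = {
        "command": "uv",
        "args": ["--directory", repo_dir, "run", "dataset-viewer"],
    }
    if token:
        # Only needed for private datasets.
        entry["env"] = {"HUGGINGFACE_TOKEN": token}
    # Preserve any servers that are already configured.
    servers = {**config.get("mcpServers", {}), "dataset-viewer": entry}
    return {**config, "mcpServers": servers}


existing = {"mcpServers": {"other-server": {"command": "example"}}}
print(json.dumps(add_dataset_viewer(existing, "parent_to_repo/dataset-viewer"), indent=2))
```

To apply it, load the JSON from the config path for your platform, pass the resulting dict through the function, and write the result back.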
MIT License - see LICENSE for details
To pass a token for private datasets, add an env block to the same entry:

{
  "mcpServers": {
    "dataset-viewer": {
      "command": "uv",
      "args": [
        "--directory",
        "parent_to_repo/dataset-viewer",
        "run",
        "dataset-viewer"
      ],
      "env": {
        "HUGGINGFACE_TOKEN": "<YOUR_HUGGINGFACE_TOKEN>"
      }
    }
  }
}