An MCP server for the Hugging Face Dataset Viewer API that enables searching, fetching, and filtering datasets on the Hugging Face Hub. It allows users to explore schemas, perform full-text searches, and analyze dataset statistics through natural language.
MCP server for the Hugging Face Dataset Viewer API. Search datasets, fetch rows, filter data, and more.
npx @cfahlgren1/hf-dataset-mcp
Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
"mcpServers": {
"hf-datasets": {
"command": "npx",
"args": ["-y", "@cfahlgren1/hf-dataset-mcp"],
"env": {
"HF_TOKEN": "hf_..."
}
}
}
}
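If the config file already lists other MCP servers, the entry above has to be merged into the existing `mcpServers` object rather than pasted over it. A minimal Python sketch of that merge follows; `add_hf_server` is a hypothetical helper name, not part of this package, and the token value is a placeholder.

```python
import json
from typing import Optional


def add_hf_server(config: dict, token: Optional[str] = None) -> dict:
    """Return a copy of `config` with the hf-datasets entry added.

    Hypothetical helper: other servers already in the config are kept.
    """
    entry = {"command": "npx", "args": ["-y", "@cfahlgren1/hf-dataset-mcp"]}
    if token:
        # Only needed for private/gated datasets.
        entry["env"] = {"HF_TOKEN": token}
    servers = {**config.get("mcpServers", {}), "hf-datasets": entry}
    return {**config, "mcpServers": servers}


# Example: an existing config that already defines another (made-up) server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_hf_server(existing, token="hf_..."), indent=2))
```

The helper returns a new dict and leaves file I/O to the caller, so the result can be inspected before it is written back to `claude_desktop_config.json`.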
| Variable | Description |
|---|---|
| `HF_TOKEN` | Hugging Face API token (required for private/gated datasets) |
| `HF_DATASETS_SERVER` | Custom Dataset Viewer API URL (default: https://datasets-server.huggingface.co) |
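To make the two variables concrete, here is a rough Python sketch of how a client might turn them into a base URL and request headers. The function name `resolve_settings` is hypothetical, and the `Authorization: Bearer` header is the usual Hugging Face convention, assumed here rather than taken from this server's source.

```python
import os
from typing import Mapping, Optional

# Default taken from the table above.
DEFAULT_SERVER = "https://datasets-server.huggingface.co"


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Derive base URL and headers from environment variables (illustrative)."""
    env = os.environ if env is None else env
    # Fall back to the public Dataset Viewer API when no override is set.
    base_url = (env.get("HF_DATASETS_SERVER") or DEFAULT_SERVER).rstrip("/")
    headers = {}
    token = env.get("HF_TOKEN")
    if token:
        # Assumption: the token is sent as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return {"base_url": base_url, "headers": headers}


print(resolve_settings({}))
print(resolve_settings({"HF_DATASETS_SERVER": "http://localhost:8000/", "HF_TOKEN": "hf_..."}))
```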
search_datasets: Find datasets on the Hugging Face Hub by name, tag, or author.
search_datasets(search?: string, author?: string, filter?: string[], sort?: string, limit?: number)
validate_dataset: Check whether a dataset is accessible and which viewer features are available.
validate_dataset(dataset: string)
list_splits: Get all available configurations and splits for a dataset.
list_splits(dataset: string)
get_dataset_info: Get the schema, metadata, and row counts for a dataset configuration.
get_dataset_info(dataset: string, config: string)
get_rows: Fetch a slice of rows from a dataset split.
get_rows(dataset: string, config: string, split: string, offset?: number, length?: number)
search_dataset: Full-text search within a dataset split using BM25 ranking.
search_dataset(dataset: string, config: string, split: string, query: string, offset?: number, length?: number)
filter_rows: Filter dataset rows using SQL-like WHERE conditions.
filter_rows(dataset: string, config: string, split: string, where: string, orderby?: string, offset?: number, length?: number)
WHERE syntax: Column names in double quotes, strings in single quotes. Supports =, <>, >, <, >=, <=, AND, OR, NOT.
Example: "label"=1 AND "text" LIKE '%hello%'
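The quoting rules above are easy to get wrong when a `where` string is assembled by hand. The Python sketch below builds conditions from parts. The helper names are hypothetical, it covers only the comparison operators listed above, and escaping embedded quotes by doubling them is the standard SQL convention, assumed (not confirmed) to be accepted by the API.

```python
COMPARISON_OPS = {"=", "<>", ">", "<", ">=", "<="}


def quote_column(name: str) -> str:
    """Wrap a column name in double quotes, doubling any embedded ones."""
    return '"' + name.replace('"', '""') + '"'


def quote_value(value) -> str:
    """Render numbers bare and strings in single quotes."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # Assumption: a single quote inside a string is escaped by doubling it.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def condition(column: str, op: str, value) -> str:
    """Build one comparison such as "label"=1."""
    if op not in COMPARISON_OPS:
        raise ValueError(f"unsupported operator: {op}")
    return f"{quote_column(column)}{op}{quote_value(value)}"


where = " AND ".join([condition("label", "=", 1), condition("text", "<>", "it's")])
print(where)  # "label"=1 AND "text"<>'it''s'
```

The resulting string can be passed as the `where` argument of `filter_rows`.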
get_dataset_size: Get row counts and byte sizes for all configs and splits.
get_dataset_size(dataset: string)
list_parquet_files: Get URLs for the dataset's Parquet files for direct download or processing.
list_parquet_files(dataset: string)
get_statistics: Get descriptive statistics for each column in a dataset split.
get_statistics(dataset: string, config: string, split: string)
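For orientation, the tools above line up naturally with the public Dataset Viewer API endpoints. The Python sketch below builds the request URL a tool call would plausibly translate to. The tool-to-endpoint mapping is an assumption based on the endpoint names, not something taken from this server's source, and `build_url` is a hypothetical helper. `search_datasets` is left out because it searches the Hub rather than a single dataset.

```python
from urllib.parse import urlencode

BASE_URL = "https://datasets-server.huggingface.co"

# Assumed mapping from tool name to Dataset Viewer endpoint.
ENDPOINTS = {
    "validate_dataset": "/is-valid",
    "list_splits": "/splits",
    "get_dataset_info": "/info",
    "get_rows": "/rows",
    "search_dataset": "/search",
    "filter_rows": "/filter",
    "get_dataset_size": "/size",
    "list_parquet_files": "/parquet",
    "get_statistics": "/statistics",
}


def build_url(tool: str, **params) -> str:
    """Build the GET URL for a tool, skipping parameters left as None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}{ENDPOINTS[tool]}?{query}"


print(build_url("get_rows", dataset="stanfordnlp/imdb", config="plain_text",
                split="train", offset=0, length=10))
```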
search_datasets(filter: ["task_categories:text-classification"], sort: "downloads", limit: 10)
list_splits(dataset: "stanfordnlp/imdb")
get_dataset_info(dataset: "stanfordnlp/imdb", config: "plain_text")
get_rows(dataset: "stanfordnlp/imdb", config: "plain_text", split: "train", offset: 0, length: 10)
search_dataset(dataset: "stanfordnlp/imdb", config: "plain_text", split: "train", query: "amazing movie")
filter_rows(dataset: "stanfordnlp/imdb", config: "plain_text", split: "train", where: "\"label\"=1", length: 10)
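The calls above are written in shorthand. On the wire, an MCP client sends each one as a JSON-RPC 2.0 `tools/call` request. A minimal Python sketch of that message for the last example follows; `tool_call` is a hypothetical helper, and the transport (stdio framing, initialization handshake) is omitted.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id


def tool_call(name: str, **arguments) -> dict:
    """Build an MCP `tools/call` request for the given tool and arguments."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call(
    "filter_rows",
    dataset="stanfordnlp/imdb",
    config="plain_text",
    split="train",
    where='"label"=1',
    length=10,
)
print(json.dumps(request, indent=2))
```

In normal use the MCP client (for example Claude Desktop) constructs these messages itself, so the sketch is mainly useful for debugging or for scripting the server directly.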
License: MIT
Add this to claude_desktop_config.json and restart Claude Desktop.
{
"mcpServers": {
"hf-dataset-mcp": {
"command": "npx",
"args": ["-y", "@cfahlgren1/hf-dataset-mcp"]
}
}
}