Enables AI assistants to search, explore, and query San Francisco's open data portal through a standardized interface for public datasets. It supports SQL-like querying via the Socrata platform and includes features like fuzzy column matching and schema caching.
A Model Context Protocol (MCP) server that provides LLMs with seamless access to San Francisco's open data portal (DataSF), powered by the Socrata platform.
This MCP server enables AI assistants like Claude to search, explore, and query San Francisco's public datasets through a simple, standardized interface. It handles the complexity of the Socrata API, corrects mistyped column names through fuzzy matching, and caches dataset schemas to avoid repeated schema requests.
search_datasf
Search for datasets by keywords.
Parameters:
query (string, required): Search keywords (1-500 characters)
limit (number, optional): Max results (default: 5, max: 20)
Example:
Search for police incident datasets
list_datasf
Browse available datasets, optionally filtered by category.
Parameters:
category (string, optional): Filter by category
limit (number, optional): Max results (default: 5, max: 20)
Example:
List recent public safety datasets
get_schema
Get the schema (columns and data types) for a specific dataset.
Parameters:
dataset_id (string, required): Dataset 4x4 ID (format: xxxx-xxxx)
Example:
Get the schema for dataset wg3w-h783
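The parameter constraints listed for these tools (the xxxx-xxxx dataset ID format and the character limits) can be checked before any request is sent. The project itself validates input with Zod; the sketch below uses a plain regular expression only to stay dependency-free. The helper names are hypothetical, and the assumption that a 4x4 ID consists of lowercase letters and digits is not stated in this README.

```typescript
// Assumption: a 4x4 ID is two groups of four lowercase letters or digits.
const DATASET_ID_PATTERN = /^[a-z0-9]{4}-[a-z0-9]{4}$/;

// Limits taken from the tool descriptions above.
const SEARCH_QUERY_MAX = 500;
const SOQL_MAX = 4000;

// Hypothetical helper: true when the ID matches the xxxx-xxxx shape.
function isValidDatasetId(id: string): boolean {
  return DATASET_ID_PATTERN.test(id);
}

// Hypothetical helper: true when the text has between 1 and max characters.
function isWithinLength(text: string, max: number): boolean {
  return text.length >= 1 && text.length <= max;
}

// Example checks for a search_datasf query and a query_datasf SoQL string.
const searchOk = isWithinLength("police incidents", SEARCH_QUERY_MAX);
const soqlOk = isWithinLength("SELECT incident_category LIMIT 10", SOQL_MAX);
```

Rejecting malformed input early keeps obviously invalid IDs from reaching the Socrata API at all.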
query_datasf
Execute a SoQL (Socrata Query Language) query against a dataset.
Parameters:
dataset_id (string, required): Dataset 4x4 ID
soql (string, required): SoQL query (1-4000 characters)
auto_correct (boolean, optional): Enable column name correction (default: true)
Example:
Query dataset wg3w-h783: SELECT incident_category, COUNT(*) GROUP BY incident_category LIMIT 10
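To illustrate what auto_correct might do with a mistyped column name, here is a minimal sketch of fuzzy column matching based on Levenshtein edit distance. This is an assumption about the technique, not the project's actual fuzzyMatcher.ts; the function names and the distance threshold of 2 are hypothetical.

```typescript
// Classic dynamic-programming edit distance between two strings.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,       // deletion
        curr[j - 1] + 1,   // insertion
        prev[j - 1] + cost // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the schema column closest to `name`, or null when nothing is
// within `maxDistance` edits (hypothetical threshold).
function closestColumn(
  name: string,
  columns: string[],
  maxDistance = 2
): string | null {
  let best: string | null = null;
  let bestDistance = maxDistance + 1;
  for (const column of columns) {
    const d = levenshtein(name.toLowerCase(), column.toLowerCase());
    if (d < bestDistance) {
      best = column;
      bestDistance = d;
    }
  }
  return best;
}
```

With columns taken from get_schema, a call such as closestColumn("incident_categry", columns) would suggest incident_category, while a name with no near match returns null so the query can fail with a clear error instead of a wrong guess.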
If you want to run or modify the server locally:
git clone https://github.com/fwextensions/datasf-mcp.git
cd datasf-mcp
npm install
npm start
The server uses tsx to run TypeScript directly without a build step.
For the MCP Inspector, you'll need to use the local installation:
# First, clone and install locally
git clone https://github.com/fwextensions/datasf-mcp.git
cd datasf-mcp
npm install
# Then run the inspector
npx -y @modelcontextprotocol/inspector tsx src/index.ts
In the inspector UI, use:
Command: tsx
Arguments: src/index.ts (or absolute path if running from outside the directory)
The easiest way to use the server is directly from GitHub using npx:
{
  "mcpServers": {
    "datasf": {
      "command": "npx",
      "args": ["-y", "github:fwextensions/datasf-mcp"],
      "env": {
        "SOCRATA_APP_TOKEN": "your-optional-token"
      }
    }
  }
}
This will automatically download and run the latest version from GitHub without any manual installation.
Alternatively, clone and install locally:
git clone https://github.com/fwextensions/datasf-mcp.git
cd datasf-mcp
npm install
Then use the absolute path in your MCP configuration (see below).
Add to your Claude Desktop config file:
Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
Option 1: Using npx (recommended)
{
  "mcpServers": {
    "datasf": {
      "command": "npx",
      "args": ["-y", "github:fwextensions/datasf-mcp"],
      "env": {
        "SOCRATA_APP_TOKEN": "your-optional-token"
      }
    }
  }
}
Option 2: Using local installation
{
  "mcpServers": {
    "datasf": {
      "command": "npx",
      "args": ["tsx", "/absolute/path/to/datasf-mcp/src/index.ts"],
      "env": {
        "SOCRATA_APP_TOKEN": "your-optional-token"
      }
    }
  }
}
Important: Replace /absolute/path/to/datasf-mcp with the actual full path to where you cloned this project.
Create or edit .kiro/settings/mcp.json:
Option 1: Using npx from GitHub (recommended)
{
  "mcpServers": {
    "datasf": {
      "command": "npx",
      "args": ["-y", "github:fwextensions/datasf-mcp"],
      "env": {
        "SOCRATA_APP_TOKEN": "your-optional-token"
      },
      "disabled": false,
      "autoApprove": []
    }
  }
}
Option 2: Using local installation
{
  "mcpServers": {
    "datasf": {
      "command": "npx",
      "args": ["tsx", "src/index.ts"],
      "env": {
        "SOCRATA_APP_TOKEN": "your-optional-token"
      },
      "disabled": false,
      "autoApprove": []
    }
  }
}
The server works without authentication for public data, but an App Token (set through the SOCRATA_APP_TOKEN environment variable) increases rate limits.
datasf-mcp-server/
├── src/
│   ├── index.ts          # MCP server entry point
│   ├── socrataClient.ts  # Socrata API client
│   ├── validator.ts      # Input validation with Zod
│   ├── fuzzyMatcher.ts   # Column name auto-correction
│   ├── cache.ts          # Schema caching
│   ├── errorHandler.ts   # Error handling utilities
│   └── __tests__/
│       └── property/     # Property-based tests
├── dist/                 # Compiled JavaScript output
├── package.json
└── tsconfig.json
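The tree above lists cache.ts for schema caching. As a rough illustration of the idea, the sketch below keeps schemas in an in-memory Map with a time-to-live. The class name, the TTL approach, and the injectable clock are all assumptions made for this example; the project's actual cache may work differently.

```typescript
// Function returning the current time in milliseconds (injectable for tests).
type Clock = () => number;

interface CacheEntry<T> {
  value: T;
  expiresAt: number;
}

// Hypothetical in-memory cache keyed by dataset ID.
class SchemaCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private ttlMs: number;
  private now: Clock;

  constructor(ttlMs: number, now: Clock = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached schema, or undefined if missing or expired.
  get(datasetId: string): T | undefined {
    const entry = this.entries.get(datasetId);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(datasetId);
      return undefined;
    }
    return entry.value;
  }

  // Stores a schema and stamps it with an expiry time.
  set(datasetId: string, value: T): void {
    this.entries.set(datasetId, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A caller would check get(datasetId) before requesting the schema from Socrata and call set(datasetId, schema) after a successful fetch, so repeated queries against the same dataset reuse one schema lookup.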
npm run build - Compile TypeScript to JavaScript
npm start - Run the compiled server
npm test - Run all tests
npm run test:watch - Run tests in watch mode
npm test
The project uses property-based testing with fast-check to ensure correctness across a wide range of inputs.
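The server follows a modular architecture, with separate modules for the Socrata API client, input validation, column name auto-correction, schema caching, and error handling.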
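Once the server is configured in your MCP client, you can ask questions such as "Search for police incident datasets" or "Get the schema for dataset wg3w-h783".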
The server interacts with three Socrata APIs:
https://api.us.socrata.com/api/catalog/v1 - Dataset search and browsing
https://data.sfgov.org/api/views/{id}.json - Schema retrieval
https://data.sfgov.org/resource/{id}.json - Data querying
The server provides descriptive error messages when a request fails.
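As an illustration of the data-querying endpoint listed above, the following sketch builds a request for a SoQL query. It assumes the whole SoQL string is passed through Socrata's $query parameter and that the optional App Token is sent in the X-App-Token header; the function and type names are hypothetical, and the server's own socrataClient.ts may build requests differently.

```typescript
const RESOURCE_BASE = "https://data.sfgov.org/resource";

interface SocrataRequest {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: builds the URL and headers for a SoQL query.
function buildQueryRequest(
  datasetId: string,
  soql: string,
  appToken?: string
): SocrataRequest {
  // Percent-encode the query so spaces, commas, etc. survive in the URL.
  const url = `${RESOURCE_BASE}/${datasetId}.json?$query=${encodeURIComponent(soql)}`;
  const headers: Record<string, string> = { Accept: "application/json" };
  // The token is optional: public data can be queried without it.
  if (appToken) {
    headers["X-App-Token"] = appToken;
  }
  return { url, headers };
}

const request = buildQueryRequest(
  "wg3w-h783",
  "SELECT incident_category, COUNT(*) GROUP BY incident_category LIMIT 10"
);
```

The resulting url and headers can be handed to any HTTP client, for example fetch(request.url, { headers: request.headers }).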
Contributions are welcome! The project uses TypeScript, Zod for input validation, and fast-check for property-based testing.
License: MIT
Server not starting
Run npm run build first
Tools not showing up in LLM
Rate limiting errors
Set SOCRATA_APP_TOKEN; an App Token increases rate limits
Column name errors in queries
Use get_schema first to see valid column names
Keep auto_correct: true (default) for automatic typo correction
Add this to claude_desktop_config.json and restart Claude Desktop.
{
  "mcpServers": {
    "datasf-mcp-server": {
      "command": "npx",
      "args": []
    }
  }
}