Search federal datasets

When the user asks about U.S. federal government datasets — road networks, trade finance, ocean chemistry, agricultural statistics — or wants to find and download public federal data, reach for the data.gov catalog API. Full-text search across 400,000+ datasets via unauthenticated GET.

search-federal-datasets · v2 · updated 2026-04-16

Agents: This page is a SKILL.md-style capability guide. For JSON, call GET /api/skills/search-federal-datasets. To drop this into a local Claude Code install, copy the frontmatter + body below into ~/.claude/skills/search-federal-datasets/SKILL.md.

When to use this skill

When the user asks whether the U.S. federal government has data on a topic — road networks, trade finance, ocean chemistry, agricultural statistics, county boundaries — or wants to find and download a specific federal dataset. The data.gov catalog indexes over 400,000 datasets from federal agencies via CKAN. This is a discovery layer: it returns metadata and download URLs, not the data files themselves. For questions about a specific known dataset's full record, use package_show (see Fallbacks). For the actual data, follow the resources[].url links to each agency's servers.

Your best first call

curl "https://catalog.data.gov/api/3/action/package_search?q=county+roads+shapefile&rows=3"

No auth. No key. The q parameter does full-text search across titles, descriptions, and tags. Always include rows: without it you get the default 10 results out of a catalog of 400,000+ datasets, with no guarantee that the best match is among them. The count field in the response tells you total matches; results contains only the rows you asked for.

Key fields in the response:

result.count — total matching datasets, not the rows returned. Use this to tell the user "N datasets match" before listing details.
result.results[].name — the dataset slug, frozen at creation. Use this slug in package_show for full metadata.
result.results[].title — human-readable title. Watch for slugs that diverge from titles: the EXIM Bank slug says thru-12-31-2022 but the title says "Thru 09/30/2025" — the dataset was updated, the slug was not.
result.results[].resources[].url — the download link for the actual data file, hosted on the publishing agency's server, not on data.gov.
result.results[].resources[].format — file format (CSV, ZIP, PDF, etc.).

Fallbacks (when the best first call isn't enough)

You already know the dataset slug → package_show?id=<slug> returns the full metadata record — all resources, maintainer contact, access statistics (tracking_summary), and modification dates. Use when the user asks "tell me everything about dataset X."
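You want to browse by thematic group → group_list returns the slugs of the 7 groups (agriculture, climate, energy, local, maritime, ocean, older-adults-health). The list here names the themes; the actual slugs may carry numeric suffixes (see Pitfalls), so copy them from the response. Most datasets are not assigned to any group, so package_search?q= covers far more ground than group filtering.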

Pitfalls

This API is a metadata layer, not a data server. Every resources[].url points to a file on the publishing agency's own infrastructure. Those files may have moved, changed format, or require agency-specific auth since the metadata was last updated.
Dataset slugs are frozen at creation. The name field never changes even when the title is updated. Always resolve slugs via package_search, not by guessing from titles.
package_search with no q or fq parameter is a firehose. It returns all 400,000+ datasets 10 rows at a time. Always pass a meaningful query.
Group slugs have unpredictable numeric suffixes (agriculture8571, climate5434). You cannot construct them from English words. Call group_list before using group names in a fq=groups: filter.

One-line summary for the user

I can search the data.gov catalog for U.S. federal datasets by topic and return their download URLs — but this API provides metadata only, not the actual data files, which live on each agency's own servers.

APIs this skill uses

United States Open Government · primary · verified

US Government Open Data API powered by CKAN. Provides metadata search and discovery for over 400,000 datasets from federal agencies. This API returns dataset metadata (titles, descriptions, tags, resources) but not the actual data files.

Generated from

United States Open Government · tutorial · Getting Started with the data.gov Catalog API

SKILL.md source (frontmatter + body)

---
name: search-federal-datasets
description: When the user asks about U.S. federal government datasets — road networks, trade finance, ocean chemistry, agricultural statistics — or wants to find and download public federal data, reach for the data.gov catalog API. Full-text search across 400,000+ datasets via unauthenticated GET.
---

## When to use this skill

When the user asks whether the U.S. federal government has data on a topic — road networks, trade finance, ocean chemistry, agricultural statistics, county boundaries — or wants to find and download a specific federal dataset. The data.gov catalog indexes over 400,000 datasets from federal agencies via CKAN. This is a discovery layer: it returns metadata and download URLs, not the data files themselves. For questions about a specific known dataset's full record, use `package_show` (see Fallbacks). For the actual data, follow the `resources[].url` links to each agency's servers.

## Your best first call

```bash
curl "https://catalog.data.gov/api/3/action/package_search?q=county+roads+shapefile&rows=3"
```

No auth. No key. The `q` parameter does full-text search across titles, descriptions, and tags. Always include `rows`: without it you get the default 10 results out of a catalog of 400,000+ datasets, with no guarantee that the best match is among them. The `count` field in the response tells you total matches; `results` contains only the rows you asked for.
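
If you need more than one page of results, a small URL builder keeps `q` and `rows` explicit. The sketch below is illustrative only: `search_url` is a hypothetical helper, and the `start` offset is the paging parameter from stock CKAN, assumed here to behave the same way on catalog.data.gov.

```bash
# Hypothetical helper: build a package_search URL from a free-text query.
# Spaces become "+", matching the query style used above. Other reserved
# characters are NOT escaped, so keep queries to plain words.
search_url() (
  q=$(printf '%s' "$1" | tr ' ' '+')
  rows=${2:-10}   # 10 mirrors the default page size when rows is omitted
  start=${3:-0}   # offset of the first row (assumed stock CKAN paging)
  printf 'https://catalog.data.gov/api/3/action/package_search?q=%s&rows=%s&start=%s\n' \
    "$q" "$rows" "$start"
)
```

Usage: `curl -s "$(search_url 'county roads shapefile' 3)"` fetches the first three matches; passing `3` as the third argument would request the next three.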

Key fields in the response:

- `result.count` — total matching datasets, not the rows returned. Use this to tell the user "N datasets match" before listing details.
- `result.results[].name` — the dataset slug, frozen at creation. Use this slug in `package_show` for full metadata.
- `result.results[].title` — human-readable title. Watch for slugs that diverge from titles: the EXIM Bank slug says `thru-12-31-2022` but the title says "Thru 09/30/2025" — the dataset was updated, the slug was not.
- `result.results[].resources[].url` — the download link for the actual data file, hosted on the publishing agency's server, not on data.gov.
- `result.results[].resources[].format` — file format (CSV, ZIP, PDF, etc.).
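
To pull just these fields out of a response, one option is a `jq` filter. This is a minimal sketch that assumes `jq` is installed; `summarize_search` is a hypothetical helper name, not part of the API.

```bash
# Hypothetical helper: read a package_search response on stdin and print
# the total match count, then one tab-separated line per resource:
# slug, title, format, url. Datasets with no resources print no lines.
summarize_search() {
  jq -r '
    .result as $r
    | "matches: \($r.count)",
      ($r.results[]
       | . as $d
       | $d.resources[]
       | [$d.name, $d.title, .format, .url]
       | @tsv)
  '
}
```

Usage: pipe the first call into it, for example `curl -s "<package_search URL>" | summarize_search`. The first output line reports `count`, so you can state how many datasets match before listing any of them.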

## Fallbacks (when the best first call isn't enough)

- **You already know the dataset slug** → `package_show?id=<slug>` returns the full metadata record — all resources, maintainer contact, access statistics (`tracking_summary`), and modification dates. Use when the user asks "tell me everything about dataset X."
- **You want to browse by thematic group** → `group_list` returns the slugs of the 7 groups (agriculture, climate, energy, local, maritime, ocean, older-adults-health). The list here names the themes; the actual slugs may carry numeric suffixes (see Pitfalls), so copy them from the response. Most datasets are not assigned to any group, so `package_search?q=` covers far more ground than group filtering.
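
Both fallbacks are plain GETs against the same base path as the first call. The sketch below only builds the URLs and adds one `jq` filter; the helper names are hypothetical, `jq` is assumed to be installed, and the `.result.resources[]` path assumes the standard CKAN `package_show` response shape.

```bash
API=https://catalog.data.gov/api/3/action

# Hypothetical helpers that only build URLs; pass the result to curl.
show_url()   { printf '%s/package_show?id=%s\n' "$API" "$1"; }
groups_url() { printf '%s/group_list\n' "$API"; }

# Print one "format<TAB>url" line per resource in a package_show
# response read from stdin.
list_resources() {
  jq -r '.result.resources[] | [.format, .url] | @tsv'
}
```

Usage: `curl -s "$(show_url <slug>)" | list_resources` lists every download link for a known dataset, and `curl -s "$(groups_url)"` returns the group slugs to copy from.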

## Pitfalls

- **This API is a metadata layer, not a data server.** Every `resources[].url` points to a file on the publishing agency's own infrastructure. Those files may have moved, changed format, or require agency-specific auth since the metadata was last updated.
- **Dataset slugs are frozen at creation.** The `name` field never changes even when the title is updated. Always resolve slugs via `package_search`, not by guessing from titles.
- **`package_search` with no `q` or `fq` parameter is a firehose.** It returns all 400,000+ datasets 10 rows at a time. Always pass a meaningful query.
- **Group slugs have unpredictable numeric suffixes** (`agriculture8571`, `climate5434`). You cannot construct them from English words. Call `group_list` before using group names in a `fq=groups:` filter.
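
Two of these pitfalls can be guarded against mechanically. The sketch below is one possible approach, not a prescribed one: both helper names are hypothetical, and the status check assumes the agency server answers HEAD requests, which not every server does.

```bash
# Hypothetical helper: build a group-filtered search URL. The slug must
# be copied from a group_list response, numeric suffix included.
group_search_url() (
  printf 'https://catalog.data.gov/api/3/action/package_search?fq=groups:%s&rows=%s\n' \
    "$1" "${2:-10}"
)

# Hypothetical helper: print the final HTTP status of a resource URL
# without downloading the body. -I sends HEAD, -L follows redirects.
# curl prints 000 when the connection itself fails.
resource_status() {
  curl -sIL -o /dev/null -w '%{http_code}\n' "$1"
}
```

Usage: run `resource_status` on a `resources[].url` before handing it to the user. Anything other than `200` is a hint that the file may have moved or needs agency-specific access, so report that rather than promising a working download.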

## One-line summary for the user

I can search the data.gov catalog for U.S. federal datasets by topic and return their download URLs — but this API provides metadata only, not the actual data files, which live on each agency's own servers.
