Search federal datasets
When the user asks about U.S. federal government datasets — road networks, trade finance, ocean chemistry, agricultural statistics — or wants to find and download public federal data, reach for the data.gov catalog API. Full-text search across 400,000+ datasets via unauthenticated GET.
search-federal-datasets
· v2
· updated 2026-04-16
When to use this skill
When the user asks whether the U.S. federal government has data on a topic — road networks, trade finance, ocean chemistry, agricultural statistics, county boundaries — or wants to find and download a specific federal dataset. The data.gov catalog indexes over 400,000 datasets from federal agencies via CKAN. This is a discovery layer: it returns metadata and download URLs, not the data files themselves. For questions about a specific known dataset's full record, use package_show (see Fallbacks). For the actual data, follow the resources[].url links to each agency's servers.
Your best first call
curl "https://catalog.data.gov/api/3/action/package_search?q=county+roads+shapefile&rows=3"
No auth. No key. The q parameter does full-text search across titles, descriptions, and tags. Always include rows — without it you get 10 results from a 400,000+ catalog with no guarantee of relevance ordering. The count field in the response tells you total matches; results contains only the rows you asked for.
Key fields in the response:
result.count — total matching datasets, not the rows returned. Use this to tell the user "N datasets match" before listing details.
result.results[].name — the dataset slug, frozen at creation. Use this slug in package_show for full metadata.
result.results[].title — human-readable title. Watch for slugs that diverge from titles: the EXIM Bank slug says thru-12-31-2022 but the title says "Thru 09/30/2025" — the dataset was updated, the slug was not.
result.results[].resources[].url — the download link for the actual data file, hosted on the publishing agency's server, not on data.gov.
result.results[].resources[].format — file format (CSV, ZIP, PDF, etc.).
Fallbacks (when the best first call isn't enough)
- You already know the dataset slug → package_show?id=<slug> returns the full metadata record: all resources, maintainer contact, access statistics (tracking_summary), and modification dates. Use when the user asks "tell me everything about dataset X."
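- You want to browse by thematic group → group_list returns the 7 group slugs, one per topic (agriculture, climate, energy, local, maritime, ocean, older-adults-health). The topic names here are shorthand: the actual slugs can carry numeric suffixes (for example agriculture8571), so copy them from the group_list output. Most datasets are not assigned to any group, so package_search?q= covers far more ground than group filtering.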
Pitfalls
- This API is a metadata layer, not a data server. Every resources[].url points to a file on the publishing agency's own infrastructure. Those files may have moved, changed format, or require agency-specific auth since the metadata was last updated.
- Dataset slugs are frozen at creation. The name field never changes even when the title is updated. Always resolve slugs via package_search, not by guessing from titles.
- package_search with no q or fq parameter is a firehose. It returns all 400,000+ datasets 10 rows at a time. Always pass a meaningful query.
- Group slugs have unpredictable numeric suffixes (agriculture8571, climate5434). You cannot construct them from English words. Call group_list before using group names in a fq=groups: filter.
One-line summary for the user
I can search the data.gov catalog for U.S. federal datasets by topic and return their download URLs — but this API provides metadata only, not the actual data files, which live on each agency's own servers.
SKILL.md source (frontmatter + body)
---
name: search-federal-datasets
description: When the user asks about U.S. federal government datasets — road networks, trade finance, ocean chemistry, agricultural statistics — or wants to find and download public federal data, reach for the data.gov catalog API. Full-text search across 400,000+ datasets via unauthenticated GET.
---
## When to use this skill
When the user asks whether the U.S. federal government has data on a topic — road networks, trade finance, ocean chemistry, agricultural statistics, county boundaries — or wants to find and download a specific federal dataset. The data.gov catalog indexes over 400,000 datasets from federal agencies via CKAN. This is a discovery layer: it returns metadata and download URLs, not the data files themselves. For questions about a specific known dataset's full record, use `package_show` (see Fallbacks). For the actual data, follow the `resources[].url` links to each agency's servers.
## Your best first call
```bash
curl "https://catalog.data.gov/api/3/action/package_search?q=county+roads+shapefile&rows=3"
```
No auth. No key. The `q` parameter does full-text search across titles, descriptions, and tags. Always include `rows` — without it you get 10 results from a 400,000+ catalog with no guarantee of relevance ordering. The `count` field in the response tells you total matches; `results` contains only the rows you asked for.
Key fields in the response:
- `result.count` — total matching datasets, not the rows returned. Use this to tell the user "N datasets match" before listing details.
- `result.results[].name` — the dataset slug, frozen at creation. Use this slug in `package_show` for full metadata.
- `result.results[].title` — human-readable title. Watch for slugs that diverge from titles: the EXIM Bank slug says `thru-12-31-2022` but the title says "Thru 09/30/2025" — the dataset was updated, the slug was not.
- `result.results[].resources[].url` — the download link for the actual data file, hosted on the publishing agency's server, not on data.gov.
- `result.results[].resources[].format` — file format (CSV, ZIP, PDF, etc.).
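
The field walk above can be sketched in Python. This is a minimal illustration, not an official client: the helper names (`build_search_url`, `fetch_json`, `summarize`) are hypothetical, and `SAMPLE` is an invented placeholder payload that only mirrors the response shape described here, not real catalog data.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://catalog.data.gov/api/3/action"


def build_search_url(query: str, rows: int = 3) -> str:
    """Build a package_search URL that always carries an explicit rows value."""
    if not query.strip():
        # An empty query would page through the whole catalog.
        raise ValueError("pass a non-empty query")
    return f"{BASE}/package_search?{urlencode({'q': query, 'rows': rows})}"


def fetch_json(url: str) -> dict:
    """Unauthenticated GET; returns the decoded JSON body."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def summarize(payload: dict) -> dict:
    """Extract the key fields from a decoded package_search response."""
    result = payload["result"]
    datasets = []
    for pkg in result["results"]:
        datasets.append({
            "slug": pkg["name"],    # frozen at creation
            "title": pkg["title"],  # may have drifted from the slug
            "downloads": [(r.get("format", ""), r["url"])
                          for r in pkg.get("resources", [])],
        })
    # count is the total number of matches, not len(results)
    return {"total": result["count"], "datasets": datasets}


# Invented placeholder payload (not real catalog data), shaped like the
# fields listed above: 128 total matches, one row returned.
SAMPLE = {
    "result": {
        "count": 128,
        "results": [{
            "name": "example-dataset-slug",
            "title": "Example Dataset Title",
            "resources": [{"format": "CSV",
                           "url": "https://agency.example/data.csv"}],
        }],
    }
}

summary = summarize(SAMPLE)
print(f"{summary['total']} datasets match; showing {len(summary['datasets'])}")
```

Against the live API, `summarize(fetch_json(build_search_url("county roads shapefile")))` would follow the same path. Keeping `total` separate from the returned rows makes it easy to report "N datasets match" before listing details.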
## Fallbacks (when the best first call isn't enough)
- **You already know the dataset slug** → `package_show?id=<slug>` returns the full metadata record — all resources, maintainer contact, access statistics (`tracking_summary`), and modification dates. Use when the user asks "tell me everything about dataset X."
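- **You want to browse by thematic group** → `group_list` returns the 7 group slugs, one per topic (agriculture, climate, energy, local, maritime, ocean, older-adults-health). The topic names here are shorthand: the actual slugs can carry numeric suffixes (for example `agriculture8571`), so copy them from the `group_list` output. Most datasets are not assigned to any group, so `package_search?q=` covers far more ground than group filtering.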
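
The `package_show` fallback might look like the sketch below. The helper names are hypothetical. The `maintainer` and `metadata_modified` keys are assumptions drawn from the standard CKAN dataset schema and are read with `.get()` in case a record omits them; `tracking_summary` is the field named above. `SAMPLE_SHOW` is an invented placeholder record.

```python
from urllib.parse import urlencode

BASE = "https://catalog.data.gov/api/3/action"


def package_show_url(slug: str) -> str:
    """Build the package_show URL for a known dataset slug."""
    return f"{BASE}/package_show?{urlencode({'id': slug})}"


def describe(payload: dict) -> dict:
    """Flatten a decoded package_show response into the parts users ask about."""
    pkg = payload["result"]  # package_show returns one dataset, not a list
    return {
        "slug": pkg["name"],
        "title": pkg["title"],
        "maintainer": pkg.get("maintainer"),        # assumed CKAN field
        "modified": pkg.get("metadata_modified"),   # assumed CKAN field
        "tracking": pkg.get("tracking_summary"),
        "resources": [(r.get("format", ""), r["url"])
                      for r in pkg.get("resources", [])],
    }


# Invented placeholder record (not real catalog data).
SAMPLE_SHOW = {
    "result": {
        "name": "example-dataset-slug",
        "title": "Example Dataset Title",
        "resources": [{"format": "ZIP",
                       "url": "https://agency.example/data.zip"}],
    }
}

print(package_show_url("example-dataset-slug"))
print(describe(SAMPLE_SHOW)["resources"])
```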
## Pitfalls
- **This API is a metadata layer, not a data server.** Every `resources[].url` points to a file on the publishing agency's own infrastructure. Those files may have moved, changed format, or require agency-specific auth since the metadata was last updated.
- **Dataset slugs are frozen at creation.** The `name` field never changes even when the title is updated. Always resolve slugs via `package_search`, not by guessing from titles.
- **`package_search` with no `q` or `fq` parameter is a firehose.** It returns all 400,000+ datasets 10 rows at a time. Always pass a meaningful query.
- **Group slugs have unpredictable numeric suffixes** (`agriculture8571`, `climate5434`). You cannot construct them from English words. Call `group_list` before using group names in a `fq=groups:` filter.
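
To illustrate the group-slug pitfall, here is a hedged sketch that resolves a topic word against the output of `group_list` instead of guessing the slug. The helper names are hypothetical, and the matching rule (strip trailing digits, then compare) is an assumption based only on the two example slugs above.

```python
from urllib.parse import urlencode

BASE = "https://catalog.data.gov/api/3/action"


def resolve_group(slugs: list[str], topic: str) -> str:
    """Return the one slug whose name, minus trailing digits, equals topic."""
    # Assumption: the unpredictable suffix is purely numeric and trailing.
    matches = [s for s in slugs if s.rstrip("0123456789") == topic]
    if len(matches) != 1:
        raise LookupError(f"expected one group for {topic!r}, found {matches}")
    return matches[0]


def group_filter_url(query: str, group_slug: str, rows: int = 3) -> str:
    """Build a package_search URL restricted to one group via fq=groups:."""
    params = {"q": query, "fq": f"groups:{group_slug}", "rows": rows}
    return f"{BASE}/package_search?{urlencode(params)}"


# The two slugs quoted in the pitfall above, standing in for the decoded
# "result" list of a group_list call.
slugs = ["agriculture8571", "climate5434"]

slug = resolve_group(slugs, "agriculture")
print(group_filter_url("soil", slug))
```

Raising on zero or multiple matches, rather than picking the closest one, keeps a mistyped topic from silently filtering on the wrong group.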
## One-line summary for the user
I can search the data.gov catalog for U.S. federal datasets by topic and return their download URLs — but this API provides metadata only, not the actual data files, which live on each agency's own servers.