Link text entities to DBpedia

When the user wants to identify and link named entities — people, places, organizations — in unstructured text to DBpedia URIs, reach for DBpedia Spotlight. Disambiguates entity mentions with confidence scores via one unauthenticated GET.

link-text-entities-to-dbpedia · v1 · updated 2026-04-16

Agents: This page is a SKILL.md-style capability guide. For JSON, call GET /api/skills/link-text-entities-to-dbpedia. To drop this into a local Claude Code install, copy the frontmatter + body below into ~/.claude/skills/link-text-entities-to-dbpedia/SKILL.md.

When to use this skill

When the user has unstructured text and wants to identify named entities — people, places, organizations — and link each to a canonical DBpedia (Wikipedia-backed) URI with a disambiguation score. Use this when you need the API to decide which entity a mention refers to. For auditing the disambiguation reasoning before committing to a link, use /candidates instead. For raw surface-form extraction with no linking, this is the wrong skill — /spot does that but its output is noisy with common nouns.

Your best first call

curl -H "Accept: application/json" \
  "https://api.dbpedia-spotlight.org/en/annotate?text=Apple+is+a+technology+company+in+California&confidence=0.5"

No auth. No key. The Accept: application/json header is required — the API returns XML by default. Pass text (the prose to annotate) and confidence (threshold 0–1; 0.5 is a reasonable starting point). Lower confidence catches more entities at the cost of false positives.

The response Resources array contains one object per detected entity. The fields an agent uses:

@URI — canonical DBpedia URI (e.g. http://dbpedia.org/resource/Apple_Inc.)
@surfaceForm — the text span matched (e.g. "Apple")
@types — type hierarchy from Wikidata and DBpedia (e.g. DBpedia:Company,DBpedia:Organisation)
@similarityScore — disambiguation confidence, 0–1
@percentageOfSecondRank — how close the runner-up candidate was; near-zero means unambiguous, values above 0.1 signal ambiguity worth investigating
@support — Wikipedia article count mentioning this entity; higher support means higher prior probability, which is how the model resolves "Paris" to Paris, France rather than Paris, Texas
@offset — character position in the input text

Fallbacks (when the best call isn't enough)

Need to audit why the API chose a particular entity → /candidates returns prior, contextual, and final scores per surface form without committing to a single URI — useful when @percentageOfSecondRank is high and you want to see the alternatives.
Only need candidate surface forms, no linking → /spot identifies potential entity mentions via dictionary lookup. Output is noisy — common nouns like "city" get flagged — so use it only when piping to your own downstream NER model.

Pitfalls

The API returns XML by default. Always send Accept: application/json, or you'll get XML back with no warning.
/annotate uses a different response shape from /candidates and /spot. /annotate returns a top-level Resources array; the other two nest results under annotation.surfaceForm. A single parser will break on one or the other.
Every JSON key carries an @ prefix (@URI, @surfaceForm, @offset). The prefix appears to mirror the attributes of Spotlight's XML output rather than any JSON-LD keyword. Access these as obj["@URI"], not obj.uri.
The public api.dbpedia-spotlight.org endpoint is shared infrastructure. Under load it becomes slow or unresponsive; the project ships self-hosted Docker images for production use.

One-line summary for the user

I can identify named entities in your text — people, places, organizations — and link each to a DBpedia URI with a disambiguation confidence score, via an unauthenticated GET to DBpedia Spotlight.

APIs this skill uses

DBpedia Spotlight API · primary · verified

DBpedia Spotlight is a tool for automatically annotating mentions of DBpedia resources in text, providing a solution for linking unstructured information sources to the Linked Open Data cloud through DBpedia.

Generated from

DBpedia Spotlight API · tutorial · Getting Started with DBpedia Spotlight

SKILL.md source (frontmatter + body)

---
name: link-text-entities-to-dbpedia
description: When the user wants to identify and link named entities — people, places, organizations — in unstructured text to DBpedia URIs, reach for DBpedia Spotlight. Disambiguates entity mentions with confidence scores via one unauthenticated GET.
---

## When to use this skill

When the user has unstructured text and wants to identify named entities — people, places, organizations — and link each to a canonical DBpedia (Wikipedia-backed) URI with a disambiguation score. Use this when you need the API to decide *which* entity a mention refers to. For auditing the disambiguation reasoning before committing to a link, use `/candidates` instead. For raw surface-form extraction with no linking, this is the wrong skill — `/spot` does that but its output is noisy with common nouns.

## Your best first call

```bash
curl -H "Accept: application/json" \
  "https://api.dbpedia-spotlight.org/en/annotate?text=Apple+is+a+technology+company+in+California&confidence=0.5"
```

No auth. No key. The `Accept: application/json` header is required — the API returns XML by default. Pass `text` (the prose to annotate) and `confidence` (threshold 0–1; 0.5 is a reasonable starting point). Lower `confidence` catches more entities at the cost of false positives.
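
The same call can be assembled programmatically. What follows is a minimal Python sketch, not an official client: `build_annotate_request` and `DEFAULT_ANNOTATE_URL` are hypothetical names, and the default URL assumes the public English endpoint. It only builds the request, so the encoding of `text` can be inspected before anything is sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed default: the public English endpoint. Swap in your own host or
# language code if your deployment differs.
DEFAULT_ANNOTATE_URL = "https://api.dbpedia-spotlight.org/en/annotate"


def build_annotate_request(text: str, confidence: float = 0.5,
                           base_url: str = DEFAULT_ANNOTATE_URL) -> Request:
    """Build (but do not send) a GET request for the annotate endpoint."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # urlencode percent-escapes reserved characters and turns spaces into '+'.
    query = urlencode({"text": text, "confidence": confidence})
    # Without this header the service does not answer in JSON.
    return Request(f"{base_url}?{query}",
                   headers={"Accept": "application/json"})
```

Passing the result to `urllib.request.urlopen(req, timeout=10)` and decoding the body with `json.load` would perform the actual call; the timeout value is an arbitrary choice.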

The response `Resources` array contains one object per detected entity. The fields an agent uses:

- `@URI` — canonical DBpedia URI (e.g. `http://dbpedia.org/resource/Apple_Inc.`)
- `@surfaceForm` — the text span matched (e.g. "Apple")
- `@types` — type hierarchy from Wikidata and DBpedia (e.g. `DBpedia:Company,DBpedia:Organisation`)
- `@similarityScore` — disambiguation confidence, 0–1
- `@percentageOfSecondRank` — how close the runner-up candidate was; near-zero means unambiguous, values above 0.1 signal ambiguity worth investigating
- `@support` — Wikipedia article count mentioning this entity; higher support means higher prior probability, which is how the model resolves "Paris" to Paris, France rather than Paris, Texas
- `@offset` — character position in the input text
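
The field list above can be turned into typed records. This is a sketch under stated assumptions: `LinkedEntity`, `parse_resources`, and `SAMPLE_PAYLOAD` are hypothetical names, the sample values are hand-written for illustration rather than captured from the API, and the code assumes numeric fields may arrive as strings and that `@types` is a comma-separated string.

```python
from dataclasses import dataclass
from typing import Any

AMBIGUITY_THRESHOLD = 0.1  # rule of thumb taken from the field list above


@dataclass
class LinkedEntity:
    uri: str
    surface_form: str
    types: list[str]
    similarity: float
    second_rank: float
    support: int
    offset: int

    @property
    def ambiguous(self) -> bool:
        # A runner-up scoring close to the winner suggests a shaky link.
        return self.second_rank > AMBIGUITY_THRESHOLD


def parse_resources(payload: dict[str, Any]) -> list[LinkedEntity]:
    """Convert an /annotate JSON payload into typed records."""
    entities = []
    # Assumption: "Resources" can be missing when nothing clears the threshold.
    for res in payload.get("Resources", []):
        raw_types = res.get("@types", "")
        entities.append(LinkedEntity(
            uri=res["@URI"],
            surface_form=res["@surfaceForm"],
            types=[t for t in raw_types.split(",") if t],
            # float()/int() accept both strings and numbers.
            similarity=float(res["@similarityScore"]),
            second_rank=float(res["@percentageOfSecondRank"]),
            support=int(res["@support"]),
            offset=int(res["@offset"]),
        ))
    return entities


# Illustrative payload only; the numbers are made up.
SAMPLE_PAYLOAD = {
    "Resources": [{
        "@URI": "http://dbpedia.org/resource/Apple_Inc.",
        "@surfaceForm": "Apple",
        "@types": "DBpedia:Company,DBpedia:Organisation",
        "@similarityScore": "0.99",
        "@percentageOfSecondRank": "0.25",
        "@support": "1000",
        "@offset": "0",
    }]
}
```

Calling `parse_resources(SAMPLE_PAYLOAD)` yields one record whose `ambiguous` property is true, because the made-up second-rank ratio of 0.25 exceeds the 0.1 threshold; such mentions are the ones worth re-checking with `/candidates`.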

## Fallbacks (when the best call isn't enough)

- **Need to audit why the API chose a particular entity** → `/candidates` returns prior, contextual, and final scores per surface form without committing to a single URI — useful when `@percentageOfSecondRank` is high and you want to see the alternatives.
- **Only need candidate surface forms, no linking** → `/spot` identifies potential entity mentions via dictionary lookup. Output is noisy — common nouns like "city" get flagged — so use it only when piping to your own downstream NER model.

## Pitfalls

- The API returns XML by default. Always send `Accept: application/json`, or you'll get XML back with no warning.
- `/annotate` and `/candidates`/`/spot` use incompatible response shapes. `/annotate` returns a top-level `Resources` array; the other two nest results under `annotation.surfaceForm`. A single parser will break on one or the other.
- Every JSON key carries an `@` prefix (`@URI`, `@surfaceForm`, `@offset`). The prefix appears to mirror the attributes of Spotlight's XML output rather than any JSON-LD keyword. Access these as `obj["@URI"]`, not `obj.uri`.
- The public `api.dbpedia-spotlight.org` endpoint is shared infrastructure. Under load it becomes slow or unresponsive; the project ships self-hosted Docker images for production use.
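
One way to cope with the two response shapes is a small normalizing step in front of any endpoint-specific parsing. The sketch below is illustrative: `extract_items` is a hypothetical helper, the shapes are the ones described in the pitfalls above, and the handling of a lone object in place of a list is an assumption about how single results may be serialized.

```python
from typing import Any


def extract_items(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the per-mention objects regardless of the endpoint used."""
    if "Resources" in payload:
        # /annotate shape: top-level array.
        items = payload["Resources"]
    elif "annotation" in payload:
        # /candidates and /spot shape: nested under annotation.surfaceForm.
        items = payload["annotation"].get("surfaceForm", [])
    else:
        # Neither key present: treat as "nothing found".
        return []
    # Assumption: a single result may arrive as a bare object, not a list.
    return [items] if isinstance(items, dict) else list(items)
```

Keeping this step separate means the `@`-prefixed keys inside each item can still be read by endpoint-specific code, since the items themselves differ between `/annotate` and the other two endpoints.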

## One-line summary for the user

I can identify named entities in your text — people, places, organizations — and link each to a DBpedia URI with a disambiguation confidence score, via an unauthenticated GET to DBpedia Spotlight.
