Return the text snippets that most closely match the query. Text snippets are excerpts of approximately 500 words, drawn from a paper's title, abstract, and body text, but excluding figure captions and the bibliography.
Results are ordered by rank, highest first, and each one includes some basic data about the paper the snippet was found in.
Examples:
<ul>
<li><code>https://api.semanticscholar.org/graph/v1/snippet/search?query=The literature graph is a property graph with directed edges&limit=1</code></li>
<ul>
<li>Returns a single snippet that is the highest ranked match.</li>
<li>Each snippet has text, snippetKind, section, annotation data, and score, along with the following data about the paper it comes from: corpusId, title, authors, and openAccessInfo.</li>
</ul>
</ul>
<br>
Limitations:
<ul>
<li>You must include a query.</li>
<li>If you don't set a limit, it will automatically return 10 results.</li>
<li>The max limit allowed is 1000.</li>
</ul>
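As a minimal sketch, the request shown above could be assembled in Python with the standard library alone. The helper name `build_snippet_search_url` is hypothetical and not part of the API; the endpoint URL is the one from the example above, and the checks only mirror the documented limitations on the client side.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.semanticscholar.org/graph/v1/snippet/search"
MAX_LIMIT = 1000  # documented maximum for the limit parameter


def build_snippet_search_url(query, limit=None, **filters):
    """Build a snippet search URL (hypothetical helper, not part of the API)."""
    if not query:
        raise ValueError("query is required")
    if limit is not None and limit > MAX_LIMIT:
        raise ValueError(f"limit must be <= {MAX_LIMIT}")
    params = {"query": query}
    if limit is not None:
        params["limit"] = limit  # when omitted, the API defaults to 10 results
    params.update(filters)  # any other documented filter, e.g. year="2019"
    return f"{BASE_URL}?{urlencode(params)}"


url = build_snippet_search_url(
    "The literature graph is a property graph with directed edges", limit=1
)
print(url)
```

`urlencode` percent-encodes the query text, so spaces and punctuation in the query are safe to pass as-is.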
Parameters (11)
authors(string, query, optional)
Restricts results to papers with authors matching the given names, formatted as a comma-separated list (<code>...?authors=name1,name2,...</code>).
The search criteria are 'fuzzy', so matches that are <em>close</em> will also return results.
<br><br>
Example: <code>galileo,kepler</code> will return papers that include <em>both</em> an author similar to "galileo" <em>and</em> an author similar to "kepler" as co-authors.
This query will also match fuzzy variations like 'keppler' and 'Kepler' (default max 'edit distance' is 2).
<strong>Important:</strong> Multiple author names are combined with AND logic, meaning results must include <em>all</em> specified authors.
Adding more authors will narrow your results, not expand them.
To search for papers by <em>any</em> of several authors (OR logic), perform separate searches for each author name.
By default, at most <code>10</code> author filters are allowed; supplying more than 10 returns an HTTP 400 (Bad Request) error.
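The AND/OR distinction above can be sketched as follows. The helper names `authors_param` and `or_search_params` are hypothetical; the sketch only prepares request parameters and leaves sending the requests and merging the results to the caller.

```python
MAX_AUTHORS = 10  # documented default maximum number of author filters


def authors_param(names):
    """Join author names into one value; the API combines them with AND logic."""
    names = [name.strip() for name in names if name.strip()]
    if len(names) > MAX_AUTHORS:
        raise ValueError(f"at most {MAX_AUTHORS} author filters are allowed")
    return ",".join(names)


def or_search_params(query, names):
    """Return one parameter dict per author, to emulate OR logic client-side."""
    return [{"query": query, "authors": authors_param([name])} for name in names]


# AND: a single request whose results must include both authors.
print(authors_param(["galileo", "kepler"]))

# OR: two separate requests; the caller merges the results.
for params in or_search_params("planetary motion", ["galileo", "kepler"]):
    print(params)
```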
fields(string, query, optional)
A comma-separated list of the fields to be returned with each snippet element.
Paper info and the score are currently always returned. What you can specify using this <code>fields</code> param is which fields under the 'snippet' section (see the response schema) will be returned.
Examples:
<ul>
<li><code>fields=snippet.text</code>: you'll get just the <code>text</code> field in the snippet section</li>
<li><code>fields=snippet.text,snippet.snippetKind</code>: you'll get just the <code>text</code> and <code>snippetKind</code> fields in the snippet section</li>
<li><code>fields=snippet.annotations.sentences</code>: you'll get just the sentence annotations in the snippet section</li>
</ul>
In general, you can use periods to identify nested fields (as in the examples above).
Not all fields in the response schema can be identified using this <code>fields</code> param though.
E.g. you can't pick what you get within <code>snippet.snippetOffset</code>: you either get the snippet offset with all of its fields, or you don't get it at all.
You also can't provide <code>paper</code> or <code>score</code> or anything under <code>paper</code>, since those are always provided.
If you attempt to identify a field that's not supported, you'll get an error with the relevant field name. E.g.
<code>Unrecognized or unsupported fields: [paper]</code>
If you don't specify the fields param, you'll get a default set of fields in the snippet section. These are the default fields:
- <code>snippet.text</code>
- <code>snippet.snippetKind</code>
- <code>snippet.section</code>
- <code>snippet.snippetOffset</code> (including nested <code>start</code> and <code>end</code>)
- <code>snippet.annotations.refMentions</code> (including nested <code>start</code>, <code>end</code>, and <code>matchedPaperCorpusId</code> for each element)
- <code>snippet.annotations.sentences</code> (including nested <code>start</code> and <code>end</code> for each element)
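The rules above can be mirrored in a small client-side check before a request is sent. This is a sketch under assumptions: `fields_param` is a hypothetical helper, its error messages are its own rather than the server's, and the server's validation may be stricter than what is shown here.

```python
# Top-level response keys that are always returned, so they cannot be requested.
ALWAYS_RETURNED = {"paper", "score"}
# Fields that are returned whole; their nested parts cannot be picked individually.
ALL_OR_NOTHING = {"snippet.snippetOffset"}


def fields_param(*fields):
    """Join field paths into a fields value after basic client-side checks."""
    for field in fields:
        if field.split(".")[0] in ALWAYS_RETURNED:
            raise ValueError(f"{field!r} is always returned and cannot be requested")
        for parent in ALL_OR_NOTHING:
            if field.startswith(parent + "."):
                raise ValueError(f"{field!r} cannot be selected; request {parent!r}")
    return ",".join(fields)


print(fields_param("snippet.text", "snippet.snippetKind"))
```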
fieldsOfStudy(string, query, optional)
Restricts results to papers in the given fields of study, formatted as a comma-separated list. The available fields are:
<ul>
<li>Computer Science</li>
<li>Medicine</li>
<li>Chemistry</li>
<li>Biology</li>
<li>Materials Science</li>
<li>Physics</li>
<li>Geology</li>
<li>Psychology</li>
<li>Art</li>
<li>History</li>
<li>Geography</li>
<li>Sociology</li>
<li>Business</li>
<li>Political Science</li>
<li>Economics</li>
<li>Philosophy</li>
<li>Mathematics</li>
<li>Engineering</li>
<li>Environmental Science</li>
<li>Agricultural and Food Sciences</li>
<li>Education</li>
<li>Law</li>
<li>Linguistics</li>
</ul>
Example: <code>Physics,Mathematics</code> will return papers with either Physics or Mathematics in their list of fields-of-study.
insertedBefore(string, query, optional)
Restricts results to snippets from papers inserted into the index before the provided date (papers inserted on the provided date itself are excluded).
Acceptable formats: YYYY-MM-DD, YYYY-MM, YYYY
limit(integer, query, optional, default: 10)
The maximum number of results to return.<br>
Must be <= 1000
minCitationCount(string, query, optional)
Restricts results to papers with at least the given number of citations.
<br>
<br>
Example:
<code>minCitationCount=200</code>
paperIds(string, query, optional)
Restricts results to snippets from specific papers. To specify papers, provide a comma-separated list of their IDs. You can provide up to approximately 100 IDs.
The following types of IDs are supported:
<ul>
<li><code><sha></code> - a Semantic Scholar ID, e.g. <code>649def34f8be52c8b66281af98ae884c09aef38b</code></li>
<li><code>CorpusId:<id></code> - a Semantic Scholar numerical ID, e.g. <code>CorpusId:215416146</code></li>
<li><code>DOI:<doi></code> - a <a href="http://doi.org">Digital Object Identifier</a>,
e.g. <code>DOI:10.18653/v1/N18-3011</code></li>
<li><code>ARXIV:<id></code> - <a href="https://arxiv.org/">arXiv.org</a>, e.g. <code>ARXIV:2106.15928</code></li>
<li><code>MAG:<id></code> - Microsoft Academic Graph, e.g. <code>MAG:112218234</code></li>
<li><code>ACL:<id></code> - Association for Computational Linguistics, e.g. <code>ACL:W12-3903</code></li>
<li><code>PMID:<id></code> - PubMed/Medline, e.g. <code>PMID:19872477</code></li>
<li><code>PMCID:<id></code> - PubMed Central, e.g. <code>PMCID:2323736</code></li>
<li><code>URL:<url></code> - URL from one of the sites listed below, e.g. <code>URL:https://arxiv.org/abs/2106.15928v1</code></li>
</ul>
URLs are recognized from the following sites:
<ul>
<li><a href="https://www.semanticscholar.org/">semanticscholar.org</a></li>
<li><a href="https://arxiv.org/">arxiv.org</a></li>
<li><a href="https://www.aclweb.org">aclweb.org</a></li>
<li><a href="https://www.acm.org/">acm.org</a></li>
<li><a href="https://www.biorxiv.org/">biorxiv.org</a></li>
</ul>
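A minimal sketch of assembling a <code>paperIds</code> value from mixed ID types follows. The helper `paper_ids_param` is hypothetical, and two of its checks are assumptions rather than documented behavior: that prefixes must match the spelling listed above exactly, and that a bare Semantic Scholar ID is 40 hexadecimal characters, as in the example.

```python
import re

# Prefixes listed above; exact-case matching is an assumption of this sketch.
SUPPORTED_PREFIXES = {"CorpusId", "DOI", "ARXIV", "MAG", "ACL", "PMID", "PMCID", "URL"}
# Assumption: a bare Semantic Scholar ID is 40 hex characters, as in the example.
SHA_PATTERN = re.compile(r"[0-9a-f]{40}")


def paper_ids_param(ids):
    """Join paper IDs into a comma-separated value after a basic format check."""
    ids = list(ids)
    for paper_id in ids:
        prefix, sep, rest = paper_id.partition(":")
        if sep:
            # Prefixed ID such as DOI:... or URL:https://...
            if prefix not in SUPPORTED_PREFIXES or not rest:
                raise ValueError(f"unsupported paper ID: {paper_id!r}")
        elif not SHA_PATTERN.fullmatch(paper_id):
            raise ValueError(f"unsupported paper ID: {paper_id!r}")
    return ",".join(ids)


print(paper_ids_param([
    "649def34f8be52c8b66281af98ae884c09aef38b",
    "CorpusId:215416146",
    "URL:https://arxiv.org/abs/2106.15928v1",
]))
```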
publicationDateOrYear(string, query, optional)
Restricts results to the given range of publication dates or years (inclusive). Accepts the format <code><startDate>:<endDate></code> with each date in <code>YYYY-MM-DD</code> format.
<br>
<br>
Each term is optional, allowing for specific dates, fixed ranges, or open-ended ranges. In addition, prefixes are supported as a shorthand, e.g. <code>2020-06</code> matches all dates in June 2020.
<br>
<br>
Specific dates are not known for all papers, so some records returned with this filter will have a <code>null</code> value for <code>publicationDate</code>. <code>year</code>, however, will always be present.
Records without a specific publication date are treated as if published on January 1st of their publication year.
<br>
<br>
Examples:
<ul>
<li><code>2019-03-05</code> on March 5th, 2019</li>
<li><code>2019-03</code> during March 2019</li>
<li><code>2019</code> during 2019</li>
<li><code>2016-03-05:2020-06-06</code> as early as March 5th, 2016 or as late as June 6th, 2020</li>
<li><code>1981-08-25:</code> on or after August 25th, 1981</li>
<li><code>:2015-01</code> before or on January 31st, 2015</li>
<li><code>2015:2020</code> between January 1st, 2015 and December 31st, 2020</li>
</ul>
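To make the range semantics above concrete, the following sketch expands a <code>publicationDateOrYear</code> value into the inclusive start and end dates it covers. The function names are hypothetical and the code only illustrates the documented examples; it is not the service's own parser.

```python
import calendar
from datetime import date


def _expand(term, last):
    """Expand YYYY, YYYY-MM, or YYYY-MM-DD to the first or last day it covers."""
    parts = [int(part) for part in term.split("-")]
    year = parts[0]
    month = parts[1] if len(parts) > 1 else (12 if last else 1)
    if len(parts) > 2:
        day = parts[2]
    else:
        day = calendar.monthrange(year, month)[1] if last else 1
    return date(year, month, day)


def parse_date_range(value):
    """Return (start, end) as inclusive dates; None marks an open-ended side."""
    start, sep, end = value.partition(":")
    if not sep:
        end = start  # a single term covers its own whole period
    return (
        _expand(start, last=False) if start else None,
        _expand(end, last=True) if end else None,
    )


for example in ("2019-03-05", "2019-03", "2019", "1981-08-25:", ":2015-01", "2015:2020"):
    print(example, parse_date_range(example))
```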
query(string, query, required)
A plain-text search query string.
* No special query syntax is supported.
venue(string, query, optional)
Restricts results to papers published in the given venues, formatted as a comma-separated list. <br><br>
A venue may also be given as its ISO4 abbreviation.
Examples include:
<ul>
<li>Nature</li>
<li>New England Journal of Medicine</li>
<li>Radiology</li>
<li>N. Engl. J. Med.</li>
</ul>
Example: <code>Nature,Radiology</code> will return papers from venues Nature and/or Radiology.
year(string, query, optional)
Restricts results to the given publication year or range of years (inclusive).
<br>
<br>
Examples:
<ul>
<li><code>2019</code> in 2019</li>
<li><code>2016-2020</code> as early as 2016 or as late as 2020</li>
<li><code>2010-</code> during or after 2010</li>
<li><code>-2015</code> before or during 2015</li>
</ul>
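The <code>year</code> syntax uses a hyphen rather than a colon as the range separator. A minimal sketch of how the examples above map to inclusive bounds follows, with `parse_year_range` as a hypothetical helper name:

```python
def parse_year_range(value):
    """Return (first_year, last_year), inclusive; None marks an open-ended side."""
    start, sep, end = value.partition("-")
    if not sep:
        end = start  # a single year is both the first and the last year
    return (int(start) if start else None, int(end) if end else None)


for example in ("2019", "2016-2020", "2010-", "-2015"):
    print(example, parse_year_range(example))
```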
# Search for snippets matching a plain-text query (default limit of 10 results).
import zingu_apis
api = zingu_apis.api("semanticscholar")
result = api.fetch("snippet/search", query="machine learning")
for item in result:
    print(item)

# Restrict results to one field of study and return at most 5 snippets.
import zingu_apis
api = zingu_apis.api("semanticscholar")
result = api.fetch("snippet/search", query="natural language processing", fieldsOfStudy="Computer Science", limit=5)
for item in result:
    print(item)

# Restrict results to a given author and publication year.
import zingu_apis
api = zingu_apis.api("semanticscholar")
result = api.fetch("snippet/search", query="transformer architecture", authors="Ashish Vaswani", year=2017)
for item in result:
    print(item)