You have access to 10 academic paper databases through their REST APIs. Your job is to figure out which database(s) best serve the user's query, call them, and return the results.
Understand the query -- What is the user looking for? A specific paper by DOI? Papers on a topic? An author's publications? Open access PDFs? Full text? This determines which database(s) to hit.
Select database(s) -- Use the database selection guide below. Many queries benefit from hitting multiple databases -- for example, searching PubMed for papers and then checking Unpaywall for open access copies.
Read the reference file -- Each database has a reference file in references/ with endpoint details, query formats, and example calls. Read the relevant file(s) before making API calls.
Make the API call(s) -- See the Making API Calls section below for which HTTP fetch tool to use on your platform.
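Return results -- Always return the list of databases queried and the results from each one, using the response structure described below.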
Match the user's intent to the right database(s).
| User is asking about... | Primary database(s) | Also consider |
|---|---|---|
| Papers on a biomedical topic | PubMed | Semantic Scholar, OpenAlex |
| Full text of a biomedical article | PMC | CORE |
| Biology preprints | bioRxiv | Semantic Scholar, OpenAlex |
| Health/medical preprints | medRxiv | Semantic Scholar, OpenAlex |
| Physics, math, or CS preprints | arXiv | Semantic Scholar, OpenAlex |
| Papers across all fields | OpenAlex | Semantic Scholar, Crossref |
| A specific paper by DOI | Crossref | Unpaywall, Semantic Scholar |
| Open access PDF for a paper | Unpaywall | CORE, PMC |
| Citation graph (who cites whom) | Semantic Scholar | OpenAlex |
| Author's publications | Semantic Scholar | OpenAlex |
| Paper recommendations | Semantic Scholar | -- |
| Full text (any field) | CORE | PMC (biomedical only) |
| Journal/publisher metadata | Crossref | OpenAlex |
| Funder information | Crossref | OpenAlex |
| Convert between PMID/PMCID/DOI | PMC (ID Converter) | Crossref |
| Recent preprints by date | bioRxiv, medRxiv | arXiv |
| User is asking about... | Databases to query |
|---|---|
| Everything about a paper (metadata + citations + OA) | Crossref + Semantic Scholar + Unpaywall |
| Comprehensive literature search | PubMed + OpenAlex + Semantic Scholar |
| Find and read a paper | PubMed (find) + Unpaywall (OA link) + PMC or CORE (full text) |
| Preprint and its published version | bioRxiv/medRxiv + Crossref |
| Author overview with citation metrics | Semantic Scholar + OpenAlex |
When a query spans multiple needs (e.g., "find papers about CRISPR and get me the PDFs"), query the relevant databases in parallel.
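As a rough illustration of querying several databases in parallel, the sketch below runs one callable per database on a thread pool and collects either the result or an error string per database. The function name `query_in_parallel` and the two stand-in fetchers are hypothetical; real fetchers would wrap whatever HTTP tool is available.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def query_in_parallel(fetchers: Dict[str, Callable[[], str]]) -> Dict[str, str]:
    """Run one zero-argument fetcher per database concurrently.

    Returns a dict mapping database name to the fetcher's result, or to an
    "ERROR: ..." string so one failing database does not hide the others.
    """
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(fetchers))) as pool:
        futures = {name: pool.submit(fn) for name, fn in fetchers.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:  # keep going if one database fails
                results[name] = f"ERROR: {exc}"
    return results


# Hypothetical stand-ins for real HTTP calls.
def fake_pubmed() -> str:
    return '{"source": "pubmed"}'


def fake_unpaywall() -> str:
    raise TimeoutError("no response")


if __name__ == "__main__":
    print(query_in_parallel({"PubMed": fake_pubmed, "Unpaywall": fake_unpaywall}))
```

Reporting failures per database, rather than raising on the first one, keeps partial results usable when only one source is down.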
Different databases use different identifier systems. If a query fails, the identifier format may be wrong.
| Identifier | Format | Example | Used by |
|---|---|---|---|
| DOI | 10.xxxx/xxxxx | 10.1038/nature12373 | All databases |
| PMID | Integer | 34567890 | PubMed, PMC, Semantic Scholar |
| PMCID | PMC + digits | PMC7029759 | PMC, Europe PMC |
| arXiv ID | YYMM.NNNNN | 2103.15348 | arXiv, Semantic Scholar |
| OpenAlex ID | W + digits | W2741809807 | OpenAlex |
| Semantic Scholar ID | 40-char hex | 649def34f8be... | Semantic Scholar |
| ORCID | 0000-XXXX-XXXX-XXXX | 0000-0001-6187-6610 | OpenAlex, Crossref |
| ISSN | XXXX-XXXX | 0028-0836 | Crossref, OpenAlex |
Cross-referencing IDs: Semantic Scholar accepts DOI, PMID, PMCID, and arXiv ID via prefixes (e.g., DOI:10.1038/nature12373, PMID:34567890, ARXIV:2103.15348). OpenAlex accepts DOI and PMID via prefixes (doi:10.1038/..., pmid:34567890). Use the PMC ID Converter to translate between PMID, PMCID, and DOI.
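The prefix rules above can be sketched as a small classifier plus two formatters. The regular expressions are simplified assumptions based on the formats in the identifier table, not validators from any of the APIs, and the helper names are hypothetical. The Semantic Scholar helper covers only the three prefixes shown in the examples above; PMCID is left out because the text does not show its exact prefixed form.

```python
import re
from typing import Optional

# Order matters: the bare-integer PMID pattern is checked last.
_PATTERNS = [
    ("DOI", re.compile(r"^10\.\d{4,9}/\S+$")),
    ("PMCID", re.compile(r"^PMC\d+$")),
    ("ARXIV", re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")),
    ("PMID", re.compile(r"^\d+$")),
]


def classify_identifier(raw: str) -> Optional[str]:
    """Guess the identifier type from its shape, or return None."""
    value = raw.strip()
    for kind, pattern in _PATTERNS:
        if pattern.match(value):
            return kind
    return None


def semantic_scholar_id(raw: str) -> Optional[str]:
    """Prefix an identifier for Semantic Scholar (DOI, PMID, ARXIV only)."""
    kind = classify_identifier(raw)
    if kind in ("DOI", "PMID", "ARXIV"):
        return f"{kind}:{raw.strip()}"
    return None


def openalex_id(raw: str) -> Optional[str]:
    """Prefix an identifier for OpenAlex (doi and pmid only)."""
    kind = classify_identifier(raw)
    if kind in ("DOI", "PMID"):
        return f"{kind.lower()}:{raw.strip()}"
    return None


if __name__ == "__main__":
    for example in ("10.1038/nature12373", "34567890", "2103.15348"):
        print(example, semantic_scholar_id(example), openalex_id(example))
```

Returning `None` for an unsupported type makes it easy to route the identifier through the PMC ID Converter first instead of sending a malformed lookup.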
Most of these databases are fully open. A few benefit from API keys for higher rate limits.
| Database | Env Variable | Required? | Registration |
|---|---|---|---|
| NCBI (PubMed, PMC) | NCBI_API_KEY | No (3 req/s without, 10 with) | https://www.ncbi.nlm.nih.gov/account/settings/ |
| CORE | CORE_API_KEY | Yes for full text | https://core.ac.uk/services/api |
| Semantic Scholar | S2_API_KEY | No (shared pool without) | https://www.semanticscholar.org/product/api#api-key-form |
| OpenAlex | OPENALEX_API_KEY | Recommended | https://openalex.org/settings/api |
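One possible way to resolve the variables in the table above is sketched here: look in the process environment first, then in a `.env` file in the working directory. The lookup order, the helper names, and the simple `KEY=value` parsing are assumptions for illustration; lines using `export` or multi-line values are not handled.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def read_dotenv(path: Path) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def find_api_key(name: str, directory: Optional[Path] = None) -> Optional[str]:
    """Return the key from the environment, else from .env, else None."""
    value = os.environ.get(name)
    if value:
        return value
    env_file = (directory or Path.cwd()) / ".env"
    return read_dotenv(env_file).get(name)


if __name__ == "__main__":
    for variable in ("NCBI_API_KEY", "CORE_API_KEY", "S2_API_KEY", "OPENALEX_API_KEY"):
        print(variable, "set" if find_api_key(variable) else "not set")
```

A missing key returns `None` rather than raising, since most of these databases still work without one.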
| Database | Notes |
|---|---|
| bioRxiv / medRxiv | No auth, no documented rate limits |
| arXiv | No auth, max 1 request per 3 seconds |
| Crossref | No auth; add mailto param for polite pool (2x rate limit) |
| Unpaywall | No auth; requires email parameter |
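The documented limits (NCBI at 3 requests per second without a key and 10 with one, arXiv at one request per 3 seconds) could be respected with a small per-database throttle such as the sketch below. The class and function names are hypothetical, and an interval of 0.0 only means no limit is encoded here, not that the service is unlimited.

```python
import time
from typing import Callable, Dict


def min_interval_seconds(database: str, has_key: bool = False) -> float:
    """Minimum spacing between requests, from the limits stated above."""
    if database in ("PubMed", "PMC"):
        return 1 / 10 if has_key else 1 / 3
    if database == "arXiv":
        return 3.0
    return 0.0  # no limit encoded in this sketch


class Throttle:
    """Sleeps just long enough to keep each database under its interval."""

    def __init__(
        self,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._clock = clock
        self._sleep = sleep
        self._last: Dict[str, float] = {}

    def wait(self, database: str, has_key: bool = False) -> float:
        """Block until a request is allowed; return the seconds slept."""
        now = self._clock()
        last = self._last.get(database)
        delay = 0.0
        if last is not None:
            delay = max(0.0, last + min_interval_seconds(database, has_key) - now)
        if delay > 0:
            self._sleep(delay)
        self._last[database] = now + delay
        return delay


if __name__ == "__main__":
    throttle = Throttle()
    throttle.wait("arXiv")
    print("second arXiv call waited", throttle.wait("arXiv"), "seconds")
```

Injecting the clock and sleep functions keeps the throttle testable without real delays.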
Look for API keys in the environment (e.g., $NCBI_API_KEY) and in .env -- check .env in the current working directory.
Use your environment's HTTP fetch tool to call REST endpoints:
| Platform | HTTP Fetch Tool | Fallback |
|---|---|---|
| Claude Code | WebFetch | curl via Bash |
| Gemini CLI | web_fetch | curl via shell |
| Windsurf | read_url_content | curl via terminal |
| Cursor | No dedicated fetch tool | curl via run_terminal_cmd |
| Codex CLI | No dedicated fetch tool | curl via shell |
| Cline | No dedicated fetch tool | curl via execute_command |
If the fetch tool fails, fall back to curl via whatever shell tool is available.
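When assembling request URLs for the fetch tool or curl, the contact parameters noted for Crossref (`mailto`) and Unpaywall (`email`) can be attached as query strings. The sketch below assumes the public hosts `api.crossref.org` and `api.unpaywall.org` and DOI-in-path lookups; the hosts and paths are not stated in this document, so the reference files remain the authority on exact endpoints. The helper names are hypothetical.

```python
from urllib.parse import quote, urlencode


def crossref_work_url(doi: str, mailto: str) -> str:
    """Build a Crossref DOI lookup URL with the polite-pool mailto param."""
    path = quote(doi, safe="/")  # keep the slash inside the DOI
    query = urlencode({"mailto": mailto})
    return f"https://api.crossref.org/works/{path}?{query}"


def unpaywall_url(doi: str, email: str) -> str:
    """Build an Unpaywall DOI lookup URL with the required email param."""
    path = quote(doi, safe="/")
    query = urlencode({"email": email})
    return f"https://api.unpaywall.org/v2/{path}?{query}"


if __name__ == "__main__":
    print(crossref_work_url("10.1038/nature12373", "you@example.org"))
    print(unpaywall_url("10.1038/nature12373", "you@example.org"))
```

Encoding the query with `urlencode` avoids malformed URLs when the address contains characters such as `+` or `@`.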
Fetch with curl and extract the relevant fields. Consider piping through a simple parser if available.
Include the mailto parameter or email for the polite/fast pool.
Structure your response like this:
## Databases Queried
- **PubMed** -- esearch + esummary for "CRISPR gene therapy"
- **Unpaywall** -- DOI lookup for 10.1038/...
## Results
### PubMed
[raw JSON response or formatted results]
### Unpaywall
[raw JSON response]
If results are very large, present the most relevant portion and note that more data is available. But default to showing the full raw JSON -- the user asked for it.
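The response structure above could be rendered by a small helper like the following sketch. The function name and argument shapes are hypothetical; string payloads are passed through untouched as raw responses, and anything else is serialized as indented JSON.

```python
import json
from typing import Dict, List, Tuple


def format_response(queries: List[Tuple[str, str]], results: Dict[str, object]) -> str:
    """Render the 'Databases Queried' and 'Results' sections as text.

    queries: (database, description of the call) pairs, in call order.
    results: database name -> raw response string or parsed JSON object.
    """
    lines = ["## Databases Queried"]
    for database, description in queries:
        lines.append(f"- **{database}** -- {description}")
    lines.append("## Results")
    for database, _ in queries:
        lines.append(f"### {database}")
        payload = results.get(database)
        if isinstance(payload, str):
            lines.append(payload)  # raw response, shown as received
        else:
            lines.append(json.dumps(payload, indent=2))
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        format_response(
            [("PubMed", 'esearch + esummary for "CRISPR gene therapy"')],
            {"PubMed": {"count": 2}},
        )
    )
```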
Read the relevant reference file before making any API call.
| Database | Reference File | What it covers |
|---|---|---|
| PubMed | references/pubmed.md | 37M+ biomedical citations, abstracts, MeSH terms |
| PMC | references/pmc.md | 10M+ full-text biomedical articles (JATS XML), ID conversion |
| Database | Reference File | What it covers |
|---|---|---|
| bioRxiv | references/biorxiv.md | Biology preprints (browse by date/DOI, no keyword search) |
| medRxiv | references/medrxiv.md | Health sciences preprints (browse by date/DOI, no keyword search) |
| arXiv | references/arxiv.md | Physics, math, CS, biology, economics preprints (keyword search, Atom XML) |
| Database | Reference File | What it covers |
|---|---|---|
| OpenAlex | references/openalex.md | 250M+ works, authors, institutions, topics, citation data |
| Crossref | references/crossref.md | 150M+ DOI metadata, journals, funders, references |
| Semantic Scholar | references/semantic-scholar.md | 200M+ papers, citation graphs, AI-generated TLDRs, recommendations |
| Database | Reference File | What it covers |
|---|---|---|
| CORE | references/core.md | 37M+ full texts from OA repositories worldwide |
| Unpaywall | references/unpaywall.md | OA status and PDF links for any DOI |