DEV Community

Alex Spinov

PubMed Has a Free API — Search 36M+ Medical Papers Programmatically

I was building a health-tech prototype last week. Needed medical research data. Expected paywalls everywhere.

Then I found PubMed's E-utilities API. 36 million biomedical papers. Free. No API key. No signup.

What Is PubMed?

PubMed is the U.S. National Library of Medicine's database — the world's largest collection of biomedical literature. It's run by NIH (National Institutes of Health), and they provide free programmatic access to everything.

If you work with health data, drug research, clinical trials, or biomedical NLP — this is your goldmine.

Your First API Call (Zero Setup)

curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=artificial+intelligence&retmode=json&retmax=3"

That returns 364,000+ results for "artificial intelligence" in biomedical literature.
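If you'd rather stay in Python from the start, here is the same call with a minimal parse of the response. This is a sketch: `parse_esearch` is just a helper name I made up, and note that esearch returns the count as a string, so it needs an `int()` cast.

```python
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def parse_esearch(payload: dict) -> tuple[int, list[str]]:
    """Extract the total hit count and the PMID list from an esearch JSON payload.

    esearch reports the count as a string, so we cast it to int here.
    """
    result = payload["esearchresult"]
    return int(result["count"]), result["idlist"]

if __name__ == "__main__":
    # Same query as the curl call above
    try:
        resp = requests.get(ESEARCH, params={
            "db": "pubmed",
            "term": "artificial intelligence",
            "retmode": "json",
            "retmax": 3,
        }, timeout=10)
        total, pmids = parse_esearch(resp.json())
        print(f"{total} hits; first PMIDs: {pmids}")
    except requests.RequestException as exc:
        print(f"request failed: {exc}")
```

The PMIDs in `idlist` are what you feed into the esummary and efetch endpoints below.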

Full Python Example: Search and Fetch Papers

import requests

# Step 1: Search for papers
search_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
search_params = {
    "db": "pubmed",
    "term": "machine learning cancer diagnosis",
    "retmode": "json",
    "retmax": 5,
    "sort": "relevance"
}

search = requests.get(search_url, params=search_params).json()
ids = search["esearchresult"]["idlist"]
total = search["esearchresult"]["count"]
print(f"Found {total} papers. Fetching top {len(ids)}...")

# Step 2: Fetch paper details
fetch_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"
fetch_params = {
    "db": "pubmed",
    "id": ",".join(ids),
    "retmode": "json"
}

details = requests.get(fetch_url, params=fetch_params).json()["result"]

for pmid in ids:
    paper = details[pmid]
    title = paper["title"]
    authors = ", ".join(a["name"] for a in paper.get("authors", [])[:3])
    journal = paper.get("source", "Unknown")
    date = paper.get("pubdate", "Unknown")
    print(f"\n[{pmid}] {title}")
    print(f"  Authors: {authors}")
    print(f"  Journal: {journal} | Date: {date}")

Output:

Found 45,231 papers. Fetching top 5...

[39187234] Machine Learning in Cancer Diagnosis: Current State and Future
  Authors: Smith J, Chen L, Kumar R
  Journal: Nature Reviews | Date: 2024 Mar

5 Things You Can Build

1. Drug Research Tracker

# Track publications about a specific drug
drugs = ["ozempic", "metformin", "ivermectin"]
for drug in drugs:
    r = requests.get(search_url, params={
        "db": "pubmed",
        "term": drug,
        "retmode": "json",
        "retmax": 0  # we only need the count, not the IDs
    })
    count = r.json()["esearchresult"]["count"]
    print(f"{drug}: {count} papers")

2. Clinical Trial Monitor

# Find recent clinical trials
params = {
    "db": "pubmed",
    "term": "clinical trial[pt] AND 2024[dp] AND diabetes",
    "retmode": "json",
    "retmax": 10
}
trials = requests.get(search_url, params=params).json()
print(f"Diabetes clinical trials in 2024: {trials['esearchresult']['count']}")

3. Author Publication Tracker

# Find all papers by a specific author
params = {
    "db": "pubmed",
    "term": "Hinton GE[author]",
    "retmode": "json"
}
result = requests.get(search_url, params=params).json()
print(f"Geoffrey Hinton: {result['esearchresult']['count']} papers in PubMed")

4. Research Trend Analyzer

# Publications per year for a topic
for year in range(2019, 2026):
    params = {
        "db": "pubmed",
        "term": f"large language model AND {year}[dp]",
        "retmode": "json"
    }
    count = requests.get(search_url, params=params).json()["esearchresult"]["count"]
    print(f"{year}: {count} LLM papers")

5. Abstract Downloader for NLP Training

# Get abstracts in plain text for NLP/ML training
fetch_abstract_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
params = {
    "db": "pubmed",
    "id": ",".join(ids),  # PMIDs from the search in the first example
    "rettype": "abstract",
    "retmode": "text"
}
abstracts = requests.get(fetch_abstract_url, params=params).text
print(abstracts[:500])

PubMed vs Other Medical APIs

| Feature | PubMed (E-utilities) | Semantic Scholar | Google Scholar | Scopus |
|---|---|---|---|---|
| API key | ❌ Not required | ✅ Required | No API | ✅ Required |
| Papers | 36M+ biomedical | 200M+ all fields | Unknown | 84M+ |
| Free | ✅ Completely | ✅ Basic tier | N/A | ❌ Paid |
| Abstracts | ✅ Yes | ✅ Yes | ❌ No | ✅ Yes |
| MeSH terms | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Clinical trials | ✅ Filterable | ❌ No | ❌ No | Limited |

Pro Tips

  1. Use MeSH terms for precise medical queries: "diabetes mellitus"[MeSH] is more accurate than just diabetes
  2. Add your email as &email=you@example.com — NCBI recommends it so they can contact you about problems instead of blocking your IP
  3. Rate limit: 3 requests/second without an API key, 10/second with a free key from NCBI
  4. Get a free API key at NCBI — optional but recommended
  5. Use retstart for pagination: &retstart=10&retmax=10 for page 2
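Tips 3 and 5 combine naturally into a polite pagination helper. A sketch under my own naming (`page_params` and `fetch_pmids` are hypothetical helpers, not E-utilities names); the sleep keeps an anonymous client under the 3 requests/second limit:

```python
import time
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def page_params(term: str, page: int, page_size: int = 10) -> dict:
    """Build esearch params for one page of results (page is 0-indexed)."""
    return {
        "db": "pubmed",
        "term": term,
        "retmode": "json",
        "retstart": page * page_size,
        "retmax": page_size,
    }

def fetch_pmids(term: str, pages: int = 3, page_size: int = 10) -> list[str]:
    """Collect PMIDs across several pages, pausing between requests."""
    pmids: list[str] = []
    for page in range(pages):
        r = requests.get(ESEARCH, params=page_params(term, page, page_size))
        pmids += r.json()["esearchresult"]["idlist"]
        time.sleep(0.34)  # ~3 requests/second without an API key
    return pmids

# Usage (note the MeSH term from tip 1):
# fetch_pmids('"diabetes mellitus"[MeSH]', pages=2)
```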

Combine with OpenAlex

PubMed is biomedical-only. For broader academic search (CS, physics, social science), check out my OpenAlex tutorial — 250M+ papers, also free.


What medical data would you extract from 36M papers? Share your use case in the comments.

More free API tutorials: My API series on Dev.to

Need custom data extraction? 77 scrapers on Apify | Contact


More from me: 10 Dev Tools I Use Daily | 77 Scrapers on a Schedule | 150+ Free APIs
Also: Neon Free Postgres | Vercel Free API | Hetzner 4x More Server
NEW: I Ran an AI Agent for 16 Days — What Actually Works

Need web scraping or data extraction? I've built 88 production scrapers. Email spinov001@gmail.com — quote in 2 hours. Or try my ready-made Apify actors — no code needed.
