Open-source engine for fetching academic papers at scale. Built to make research accessible.
Academic research shouldn't be locked behind paywalls and clunky interfaces. This tool lets you programmatically collect paper data from open sources - for literature reviews, research projects, or building your own tools.
Supported sources:
- OpenAlex - 250M+ scholarly works, completely free, no API key needed
The engine is modular: each source is a separate adapter. Want support for another database? Open an issue: github.com/jowwalker77/get-papers-engine/issues
Potential additions:
- Semantic Scholar
- arXiv
- PubMed
- bioRxiv/medRxiv
- CORE
- Unpaywall
Requires Bun:
curl -fsSL https://bun.sh/install | bash
Clone and install:
git clone https://github.com/jowwalker77/get-papers-engine.git
cd get-papers-engine
bun install
Create a script (e.g. my-search.ts):
import { createPapersEngine } from "./src"
import { Effect } from "effect"

// Create an engine backed by a local SQLite file
const engine = createPapersEngine({
  dbPath: "./my-papers.db",
})

const program = Effect.gen(function* () {
  // Run database migrations before the first fetch
  yield* engine.migrate()

  // Fetch papers matching the query and filters
  const result = yield* engine.fetch({
    query: "your research topic",
    fromYear: 2020,
    minCitations: 10,
    maxPapers: 100,
  })
  console.log(`Fetched ${result.imported} papers`)
})

Effect.runPromise(program)
Run:
bun run my-search.ts
Papers are saved to SQLite (my-papers.db). Open the file with any SQLite viewer or query it programmatically.
| Option | What it does |
|---|---|
| `query` | Search terms |
| `fromYear` | Papers from this year onwards |
| `minCitations` | Minimum citation count |
| `maxPapers` | How many to fetch |
| `hasAbstract` | Only papers with abstracts |
| `language` | Language code (en, es, zh, etc.) |
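To make the options concrete, here is a minimal sketch of how they could map onto an OpenAlex `/works` request. `FetchOptions` and `buildOpenAlexUrl` are hypothetical names used only for illustration, and the engine's actual adapter may build its requests differently. The filter names (`from_publication_date`, `cited_by_count`, `has_abstract`, `language`) are OpenAlex API filters.

```typescript
// Hypothetical shape mirroring the options table above.
interface FetchOptions {
  query: string
  fromYear?: number
  minCitations?: number
  maxPapers?: number
  hasAbstract?: boolean
  language?: string
}

// Assumption: a single page per request, capped at OpenAlex's per-page limit of 200.
const MAX_PER_PAGE = 200

function buildOpenAlexUrl(options: FetchOptions): string {
  const filters: string[] = []
  if (options.fromYear !== undefined) {
    filters.push(`from_publication_date:${options.fromYear}-01-01`)
  }
  if (options.minCitations !== undefined && options.minCitations > 0) {
    // Uses the ">" form of the filter, so subtract one to keep the bound inclusive.
    filters.push(`cited_by_count:>${options.minCitations - 1}`)
  }
  if (options.hasAbstract) filters.push("has_abstract:true")
  if (options.language) filters.push(`language:${options.language}`)

  const params = new URLSearchParams({ search: options.query })
  if (filters.length > 0) params.set("filter", filters.join(","))
  params.set("per-page", String(Math.min(options.maxPapers ?? 25, MAX_PER_PAGE)))
  return `https://api.openalex.org/works?${params.toString()}`
}
```

Fetching more than one page of results would need pagination on top of this, which the sketch leaves out.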
Each paper includes:
- Title, abstract, authors
- Publication date
- DOI
- Citation count
- Open access PDF link (when available)
- Paper type (article, review, etc.)
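The fields above could be modelled as a record type. The sketch below is an assumption about shape, not the engine's actual schema: `Paper`, `OpenAlexWork`, `toPaper`, and `abstractFromInvertedIndex` are hypothetical names. The OpenAlex field names on the input side (`abstract_inverted_index`, `cited_by_count`, `authorships`, `best_oa_location`) come from the OpenAlex works API, which delivers abstracts as an inverted index rather than plain text.

```typescript
// Hypothetical record type; field names are illustrative only.
interface Paper {
  title: string
  abstract: string | null
  authors: string[]
  publicationDate: string | null // ISO date, e.g. "2021-03-15"
  doi: string | null
  citationCount: number
  pdfUrl: string | null // open access PDF link, when available
  type: string // e.g. "article", "review"
}

// Subset of an OpenAlex work object that this sketch reads.
interface OpenAlexWork {
  title: string | null
  doi: string | null
  publication_date: string | null
  cited_by_count: number
  type: string
  authorships: { author: { display_name: string } }[]
  abstract_inverted_index: Record<string, number[]> | null
  best_oa_location: { pdf_url: string | null } | null
}

// Rebuild plain text from an inverted index (word -> positions where it occurs).
function abstractFromInvertedIndex(index: Record<string, number[]> | null): string | null {
  if (index === null) return null
  const words: string[] = []
  for (const [word, positions] of Object.entries(index)) {
    for (const position of positions) words[position] = word
  }
  // Drop any gaps in case the positions are not contiguous.
  return words.filter((word) => word !== undefined).join(" ")
}

function toPaper(work: OpenAlexWork): Paper {
  return {
    title: work.title ?? "(untitled)",
    abstract: abstractFromInvertedIndex(work.abstract_inverted_index),
    authors: work.authorships.map((entry) => entry.author.display_name),
    publicationDate: work.publication_date,
    doi: work.doi,
    citationCount: work.cited_by_count,
    pdfUrl: work.best_oa_location?.pdf_url ?? null,
    type: work.type,
  }
}
```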
Find related papers using AI embeddings. Everything runs locally, so no API keys are needed. Inside an Effect.gen block:
yield* engine.embedAll()
const similar = yield* engine.similar("your research question")
The model downloads once (~23MB) on first use.
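For intuition about what an embedding search does, here is a minimal sketch that ranks papers by cosine similarity between vectors. This README does not say which model or similarity metric the engine uses, so treat cosine similarity as an assumption; `EmbeddedPaper`, `cosineSimilarity`, and `mostSimilar` are hypothetical names, and the vectors would normally come from the embedding model rather than being written by hand.

```typescript
// Cosine similarity between two vectors of equal dimension.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same dimension")
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Hypothetical shape: a paper title paired with its precomputed embedding.
interface EmbeddedPaper {
  title: string
  embedding: number[]
}

// Score every paper against the query vector and keep the best matches.
function mostSimilar(query: number[], papers: EmbeddedPaper[], limit = 5) {
  return papers
    .map((paper) => ({ title: paper.title, score: cosineSimilarity(query, paper.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
}
```

A linear scan like this is fine for a few thousand papers; larger collections usually call for an approximate nearest-neighbour index.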
Pull requests welcome. To add a new paper source:
- Create adapter in src/sources/
- Follow the OpenAlex pattern
- Open PR
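As a rough idea of what an adapter involves, the sketch below splits a source into a URL builder and a response parser. This is a hypothetical contract, not the interface in src/sources/: `SourceAdapter`, `PaperRecord`, `exampleAdapter`, and the `api.example.org` endpoint are all made up for illustration, and the real adapters are built on Effect rather than plain functions. Check the OpenAlex adapter for the actual pattern before writing one.

```typescript
// Hypothetical common record that every source is converted into.
interface PaperRecord {
  title: string
  doi: string | null
}

// Hypothetical adapter contract; the real one may differ.
interface SourceAdapter {
  readonly name: string
  // Build the request URL for a search.
  buildUrl(query: string, limit: number): string
  // Convert the source's raw JSON response into common records.
  parse(body: unknown): PaperRecord[]
}

// Toy adapter for an imaginary JSON API, standing in for a real source.
const exampleAdapter: SourceAdapter = {
  name: "example",
  buildUrl(query, limit) {
    const params = new URLSearchParams({ q: query, limit: String(limit) })
    return `https://api.example.org/papers?${params.toString()}`
  },
  parse(body) {
    if (typeof body !== "object" || body === null) return []
    const items = (body as { items?: { title?: string; doi?: string }[] }).items ?? []
    return items
      .filter((item) => typeof item.title === "string")
      .map((item) => ({ title: item.title as string, doi: item.doi ?? null }))
  },
}

// Registry keyed by name, so callers can look up whichever sources are enabled.
const adapters = new Map<string, SourceAdapter>([[exampleAdapter.name, exampleAdapter]])
```

Keeping URL building and parsing free of network calls makes each adapter easy to unit test against saved responses.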
License: MIT
Questions or source requests: @jowwalker77