total open corpus #6

@petermr

To get the total size of the open corpus I ran:

$ getpapers -q "quantum chemistry" -n
info: Searching using eupmc API
info: Running in no-execute mode, so nothing will be downloaded
info: Found 22234 open access results
warn: This version of getpapers wasn't built with this version of the EuPMC api in mind
warn: getpapers EuPMCVersion: 5.3.2 vs. 6.1 reported by api

This shows that we have a lot of papers: about 22,000 open access results. Since this is EuropePMC there will be a bioscience bias, e.g. proteins, nucleic acids, bioactive small molecules/drugs, and very little on materials.

FWIW the total including closed is:

$ getpapers -q "quantum chemistry" -n -a
info: Searching using eupmc API
info: Running in no-execute mode, so nothing will be downloaded
info: Found 95872 results
warn: This version of getpapers wasn't built with this version of the EuPMC api in mind
warn: getpapers EuPMCVersion: 5.3.2 vs. 6.1 reported by api

The "OA" papers are those specifically labelled as open access and available in XML. Other papers may be freely readable elsewhere (Unpaywall can help find them), but they will likely be in PDF.
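
For scale, the two counts reported above can be compared directly. The snippet below is only a rough sketch: it pulls the number out of the "Found N results" log lines quoted in this issue and computes the open access share. The helper name `found_count` and the regular expression are my own assumptions for illustration; they are not part of getpapers.

```python
import re

# Log lines copied verbatim from the two getpapers runs quoted in this issue.
OPEN_LOG = "info: Found 22234 open access results"
ALL_LOG = "info: Found 95872 results"


def found_count(log_text: str) -> int:
    """Return the hit count from a 'Found N ... results' log line.

    Hypothetical helper: the pattern is inferred from the log output
    shown in this issue, not from any documented getpapers format.
    """
    match = re.search(r"Found (\d+) (?:open access )?results", log_text)
    if match is None:
        raise ValueError("no 'Found N results' line in log text")
    return int(match.group(1))


open_count = found_count(OPEN_LOG)
total_count = found_count(ALL_LOG)
open_share = open_count / total_count

print(f"{open_count} of {total_count} results are open access ({open_share:.1%})")
# prints: 22234 of 95872 results are open access (23.2%)
```

So a little under a quarter of the "quantum chemistry" hits are in the open subset that getpapers can fetch as XML.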
