Description
Version 0.5.1
Documentation on Parse MEDLINE XML in README differs a bit from the medline_parser script.
Readme: `delete`: boolean if `False` means paper got updated so you might have two XMLs for the same paper. You can delete the record of deleted paper because it got updated.
Script: An iterator of dictionary containing information about articles in NLM format (see `parse_article_info`). Articles that have been deleted will be added with no information other than the field `delete` being `True`.
I'm somewhat confused: the README seems to indicate that `delete = False` means the paper was updated, while the script says `delete = True` means the paper was deleted.
These don't seem like natural opposites. Doesn't an update mean that the previous version of the paper was deleted?
Readme for reference:
MEDLINE XML has a different XML format than PubMed Open Access. The structure of XML files can be found in MEDLINE/PubMed DTD [here](https://www.nlm.nih.gov/databases/dtd/). You can use the function `parse_medline_xml` to parse that format. This function will return list of dictionaries, where each element contains:
- `pmid`: PubMed ID
- `pmc`: PubMed Central ID
- `doi`: DOI
- `other_id`: Other IDs found, each separated by `;`
- `title`: title of the article
- `abstract`: abstract of the article
- `authors`: authors, each separated by `;`
- `mesh_terms`: list of MeSH terms with corresponding MeSH ID, each separated by `;` e.g. `'D000161:Acoustic Stimulation; D000328:Adult; ...`
- `publication_types`: list of publication type list each separated by `;` e.g. `'D016428:Journal Article'`
- `keywords`: list of keywords, each separated by `;`
- `chemical_list`: list of chemical terms, each separated by `;`
- `pubdate`: Publication date. Defaults to year information only.
- `journal`: journal of the given paper
- `medline_ta`: this is abbreviation of the journal name
- `nlm_unique_id`: NLM unique identification
- `issn_linking`: ISSN linkage, typically use to link with Web of Science dataset
- `country`: Country extracted from journal information field
- `reference`: string of PMID each separated by `;` or list of references made to the article
- `delete`: boolean if `False` means paper got updated so you might have two XMLs for the same paper. You can delete the record of deleted paper because it got updated.
- `languages`: list of languages, separated by `;`
- `vernacular_title`: vernacular title. Defaults to empty string whenever non-available.
Grateful for any clarification, as I've had some duplication issues.
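To make the question concrete, here is a minimal sketch of how deduplication might look if the script docstring is the correct reading, i.e. `delete == True` marks a deleted article and a later record with the same `pmid` supersedes an earlier one. Both of those semantics are assumptions, not confirmed behaviour. The helper name `drop_deleted_and_duplicates` and the sample records are hypothetical, and the sketch also assumes that a deleted-article record still carries its `pmid`. It works on plain dictionaries shaped like the parser output rather than calling the library.

```python
def drop_deleted_and_duplicates(records):
    """Keep at most one record per PMID.

    Assumed semantics (not confirmed by the README):
    - a record with delete == True removes any earlier record with that PMID
    - a later record with the same PMID replaces an earlier one
    """
    latest = {}
    for rec in records:
        pmid = rec.get("pmid")
        if rec.get("delete"):
            # Deleted-article stub: forget anything seen so far for this PMID.
            latest.pop(pmid, None)
        else:
            # New or updated article: the most recent version wins.
            latest[pmid] = rec
    return list(latest.values())


# Hypothetical records in the order they would be read from the XML files.
records = [
    {"pmid": "1", "title": "first version", "delete": False},
    {"pmid": "2", "title": "other paper", "delete": False},
    {"pmid": "1", "title": "revised version", "delete": False},
    {"pmid": "2", "delete": True},
]

cleaned = drop_deleted_and_duplicates(records)
```

If the README reading is the right one instead, the `if rec.get("delete")` branch would need to change, which is why the meaning of the flag matters here.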