forked from datalogism/SciLEx
Open
Labels
enhancement (New feature or request), enrichment (Post-collection data enrichment)
Description
After collection, optionally fetch hIndex and citationCount for the first and last authors of each paper using the Semantic Scholar /author/batch endpoint (up to 1000 authors per request). Append first_author_hindex and last_author_hindex columns to the CSV. Optionally, weight author impact in the relevance score.
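A minimal sketch of the batching step described above, not the final implementation. The endpoint URL, the `{"ids": [...]}` request body, and the assumption that the response is a list in request order with `null` for unknown authors are all assumptions to verify against the Semantic Scholar documentation. The names `chunked`, `post_batch`, and `fetch_author_metrics` are hypothetical.

```python
import json
import urllib.request
from typing import Callable, Dict, Iterable, Iterator, List, Optional

# Assumed URL of the Academic Graph author batch endpoint.
BATCH_URL = "https://api.semanticscholar.org/graph/v1/author/batch"
MAX_BATCH = 1000  # per-request limit stated in this issue
FIELDS = "hIndex,citationCount"


def chunked(ids: List[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` author IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def post_batch(ids: List[str]) -> List[Optional[dict]]:
    """POST one batch of author IDs (assumed request and response shape)."""
    request = urllib.request.Request(
        f"{BATCH_URL}?fields={FIELDS}",
        data=json.dumps({"ids": ids}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def fetch_author_metrics(
    author_ids: Iterable[str],
    post: Callable[[List[str]], List[Optional[dict]]] = post_batch,
) -> Dict[str, dict]:
    """Map each author ID to its record, skipping authors that were not found."""
    unique = list(dict.fromkeys(author_ids))  # de-duplicate, keep order
    metrics: Dict[str, dict] = {}
    for batch in chunked(unique):
        # Assumes the response preserves the order of the requested IDs.
        for author_id, record in zip(batch, post(batch)):
            if record is not None:
                metrics[author_id] = record
    return metrics
```

Passing the HTTP call in as `post` keeps the batching logic testable offline; in SciLEx it would presumably be replaced by the existing Semantic Scholar client, with its own rate limiting and API key handling.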
Justification
The Semantic Scholar Academic Graph API exposes hIndex as a named field on author objects (confirmed fields: authorId, name, affiliations, citationCount, hIndex, paperCount). Author reputation is a standard quality signal in systematic reviews. SciLEx currently stores author names but discards all author-level metadata. Implementation reuses the existing Semantic Scholar client infrastructure.
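Given author records carrying the hIndex field listed above, the two new columns could be derived roughly as follows. This is a sketch under assumptions: the `author_ids` column (semicolon-separated Semantic Scholar author IDs, in author order) is hypothetical, since SciLEx currently stores author names only, and `author_hindex` and `enrich_rows` are illustrative names.

```python
from typing import Dict, List, Optional


def author_hindex(author_id: Optional[str], metrics: Dict[str, dict]) -> str:
    """Return the author's hIndex as a string, or "" when it is unavailable."""
    record = metrics.get(author_id) if author_id else None
    if not record or record.get("hIndex") is None:
        return ""
    return str(record["hIndex"])


def enrich_rows(rows: List[dict], metrics: Dict[str, dict]) -> List[dict]:
    """Add first_author_hindex and last_author_hindex to each CSV row dict."""
    enriched = []
    for row in rows:
        # Hypothetical column: "id1;id2;id3" in author order.
        ids = [a for a in row.get("author_ids", "").split(";") if a]
        first = ids[0] if ids else None
        last = ids[-1] if ids else None
        enriched.append({
            **row,
            "first_author_hindex": author_hindex(first, metrics),
            "last_author_hindex": author_hindex(last, metrics),
        })
    return enriched
```

Leaving the cell empty rather than writing 0 keeps "unknown author" distinguishable from "h-index of zero" in the CSV.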
Affected files
- scilex/author_enrichment.py (new)
- scilex/crawlers/collectors/semantic_scholar.py