-
Notifications
You must be signed in to change notification settings - Fork 31
Open
Description
Loader should include some normalization routine to handle dates in different formats.
Expected Behavior
Such a normalization routine would be called for each date field in the record ensuring that the data fit the schema, like "2017 Sep 1" -> "2017-09-01", "2017-Sep-1" -> "2017-09-01", "2017 Sep-Oct" -> "2017", "01.09.2017" -> "2017-09-01"
Current Behavior
I have to admit, I do not know to what extent it is already implemented in hepcrawl. In the harvesting-kit each publisher program has its own normalization code. At DESY we have a hand-written function which tries to catch most the cases.
Context
We will have to write a lot of spiders. It would save time, if we could just map the date-fields without thinking about the format.
Metadata
Metadata
Assignees
Labels
No labels