videos: required transform rules / setup: separate rdm and videos #215
9807eb8 to 8f1ed9e
ntarocco
left a comment
👏🏼
Can you add a test for the digitized field?
def format_contributors(json_data):
    """
    Same contributors could be both in tag 700 and 906.
I would keep the 906 with a different role, see https://cds.cern.ch/help/admin/howto-marc?ln=en
906 RESPONSIBLE PERSON / REFEREE (R) [CERN]
@zzacharo @jrcastro2 It looks like we should first redefine the final list of roles for authors. There does not seem to be a good one for this.
In our meeting we decided that tag 906 holds the event speakers; it also comes from the Indico event. And if you look at this record, 700 and 906 have exactly the same people.
I think it would be nice to discuss this particular case and align on the role. We were wondering how the event speakers are different from contribution speakers in our context. Maybe we take this IRL?
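To make the open question concrete, here is a minimal sketch of merging people who appear in both tag 700 and tag 906 into a single entry that carries both roles. It is not the code in this PR: the function name `merge_contributors`, the dict shape, and the `"Speaker"` role are assumptions for illustration only, since the role for tag 906 has not been agreed yet.

```python
def merge_contributors(tag_700, tag_906, role_906="Speaker"):
    """Merge two contributor lists, keeping one entry per person.

    Hypothetical helper: each input item is assumed to be a dict with a
    "name" key and an optional "role" key. `role_906` is a placeholder
    for whatever role is finally chosen for tag 906.
    """
    merged = {}

    def add(person, role):
        name = person["name"].strip()
        # Compare names case-insensitively so the same person is not duplicated.
        entry = merged.setdefault(name.lower(), {"name": name, "roles": []})
        if role and role not in entry["roles"]:
            entry["roles"].append(role)

    for person in tag_700:
        add(person, person.get("role"))
    for person in tag_906:
        add(person, role_906)

    # Dicts preserve insertion order, so tag 700 entries come first.
    return list(merged.values())


authors = [{"name": "Doe, Jane", "role": "Author"}]
speakers = [{"name": "Doe, Jane"}, {"name": "Roe, Rick"}]
result = merge_contributors(authors, speakers)
```

With this shape, a person listed under both tags keeps a single entry, and swapping the placeholder role only requires changing `role_906`.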
        return None
    if len(date_str) < 10 or len(date_str) > 13:
        # Too short/long to have the full date info, some values only have the year!
        return None
Does this mean that we don't validate the date in such cases?
Possible date formats I have found:
- 2008-03-11T14:00:00
- 27 Nov 1998
With the new solution, date_str is validated if it matches one of them.
We also have these kinds of values:
- 23 - 27 Nov 1998
- 1993
In these cases, if we don't have the full date (month and day), it is not validated. Also, if we have a date range (23 - 27 Nov 1998), the parser returns the last date, 27 Nov 1998.
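The behaviour described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the parser in this PR: the name `parse_full_date`, the format list, and the range regex are hypothetical, and `%b` relies on an English (C) locale for month abbreviations such as "Nov".

```python
import re
from datetime import datetime

# Formats reported in the review thread; illustrative, not exhaustive.
KNOWN_FORMATS = ("%Y-%m-%dT%H:%M:%S", "%d %b %Y")

# Matches a day range such as "23 - 27 Nov 1998" and captures the last date.
DAY_RANGE = re.compile(r"^\d{1,2}\s*-\s*(\d{1,2}\s+\w+\s+\d{4})$")


def parse_full_date(date_str):
    """Return the date as YYYY-MM-DD, or None if it is not a full date."""
    if not date_str:
        return None
    value = date_str.strip()

    # For a date range, keep only the last date.
    range_match = DAY_RANGE.match(value)
    if range_match:
        value = range_match.group(1)

    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue

    # Year-only values such as "1993" fall through and are not validated.
    return None
```

Trying each known format explicitly, instead of checking the string length, keeps the accepted inputs easy to read and to extend when new formats turn up in the records.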
Another potential solution is to introduce EDTF date support, as in RDM.
cc01e2e to 0c93eaa
0c93eaa to 4170b15
88c62d3 to 64bce48
cds_migrator_kit/videos/weblecture_migration/transform/xml_processing/quality/contributors.py
cds_migrator_kit/videos/weblecture_migration/transform/xml_processing/rules/video_lecture.py
cds_migrator_kit/videos/weblecture_migration/transform/transform.py
64bce48 to 86825b2
816d1bb to 39d0591
39d0591 to 21e0ba2
zzacharo
left a comment
We can make the tests a bit cleaner in the next PR! Great job 🚀
closes CERNDocumentServer/cds-videos#2000
- Separated setup for rdm and videos
  - migration_config files
- Separated tests for rdm and videos
  - tests.yml updated to have 2 jobs for testing rdm and videos separately
  - run-tests.sh updated to handle rdm and videos tests
- Added runner module since it's common for rdm/videos