@zubeydecivelek zubeydecivelek commented Feb 6, 2025

closes CERNDocumentServer/cds-videos#2000

  • Copyright headers updated to 2025
  • Contributor affiliation added as an array of strings
  • Removed zenodo leftovers
  • VideoLecture model ignore_keys ordered
  • Added transform rules for videos required metadata
  • Tests for transform rules

Separated setup for rdm and videos

  • Separate installation for rdm and videos
  • Entry points updated (dynamically registering according to installation)
  • Separate migration_config files

Separated tests for rdm and videos

  • Two test folders, so that rdm and videos tests run separately
  • tests.yml updated to have two jobs that test rdm and videos separately
  • run-tests.sh updated to handle rdm and videos tests

Added a runner module, since it is common to rdm and videos

@zubeydecivelek zubeydecivelek force-pushed the videos-transform branch 4 times, most recently from 9807eb8 to 8f1ed9e Compare February 7, 2025 13:33

@ntarocco ntarocco left a comment

👏🏼

Can you add a test for the digitized field?


def format_contributors(json_data):
    """
    Same contributors could be both in tag 700 and 906.
Contributor

I would keep the 906 with a different role, see https://cds.cern.ch/help/admin/howto-marc?ln=en
906 RESPONSIBLE PERSON / REFEREE (R) [CERN]

@zzacharo @jrcastro2 Looks like we should first re-define the final list of roles for authors. It does not look like there is a good one for this.

Contributor Author

In our meeting we decided that tag 906 holds the event speakers; it also comes from the Indico event. If you look at this record, tags 700 and 906 list exactly the same people.

Contributor

I think it would be nice to discuss this particular case and align on the role. We were wondering how the event speakers are different from contribution speakers in our context. Maybe we take this IRL?
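
For reference, a minimal sketch of the deduplication this thread is about, assuming each contributor is a dict with a "name" key and that the first occurrence (tag 700) wins. The function name `merge_contributors`, the dict shape, and the "Speaker" role are hypothetical and are not taken from this PR; the role question above is left open.

```python
def merge_contributors(tag_700, tag_906):
    """Merge two contributor lists, dropping people already seen by name.

    Assumption: names are compared case-insensitively after trimming
    whitespace, and the entry from tag 700 is kept when both tags match.
    """
    merged = []
    seen = set()
    for contributor in [*tag_700, *tag_906]:
        key = contributor.get("name", "").strip().lower()
        if not key or key in seen:
            # Skip entries without a name and people already collected.
            continue
        seen.add(key)
        merged.append(contributor)
    return merged


if __name__ == "__main__":
    from_700 = [{"name": "Doe, Jane", "role": "Speaker"}]
    from_906 = [
        {"name": "doe, jane ", "role": "Speaker"},
        {"name": "Roe, Richard", "role": "Speaker"},
    ]
    print(merge_contributors(from_700, from_906))
```

If 906 ends up with its own role, the same loop could keep both entries by adding the role to the key.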

        return None
    if len(date_str) < 10 or len(date_str) > 13:
        # Too short/long to have the full date info, some values only have the year!
        return None
Contributor

Does this mean that we don't validate the date in such cases?

Contributor Author

@zubeydecivelek zubeydecivelek Feb 10, 2025

Possible date formats I have found:

  • 2008-03-11T14:00:00
  • 27 Nov 1998

With the new solution, date_str is validated if it matches one of these formats.

We also have these kinds of values:

  • 23 - 27 Nov 1998
  • 1993

If the full date (month and day) is missing, as in 1993, the value is not validated. If the value is a date range (23 - 27 Nov 1998), the parser returns the last date, 27 Nov 1998.
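
The behaviour described above could be sketched roughly as follows. This is an illustration using only the standard library, not the code in this PR; the name `parse_full_date` and the `KNOWN_FORMATS` tuple are hypothetical, and the two formats are the ones listed in this comment.

```python
from datetime import datetime

# Hypothetical list of accepted formats, taken from the examples in this thread.
# Note: "%b" matches month abbreviations in the current locale (English assumed).
KNOWN_FORMATS = ("%Y-%m-%dT%H:%M:%S", "%d %b %Y")


def parse_full_date(date_str):
    """Return an ISO date (YYYY-MM-DD), or None if no full date can be parsed."""
    if not date_str:
        return None
    candidate = date_str.strip()
    # For a range such as "23 - 27 Nov 1998", keep only the last date.
    if " - " in candidate:
        candidate = candidate.rsplit(" - ", 1)[1].strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(candidate, fmt).date().isoformat()
        except ValueError:
            continue
    # Year-only values such as "1993" match no format and are not validated.
    return None


if __name__ == "__main__":
    for value in ("2008-03-11T14:00:00", "27 Nov 1998", "23 - 27 Nov 1998", "1993"):
        print(value, "->", parse_full_date(value))
```

Matching against explicit formats replaces the length check, so a value is rejected because it does not parse rather than because of its length.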

Contributor

Another potential solution is to introduce EDTF date support, like in RDM.

@zubeydecivelek zubeydecivelek force-pushed the videos-transform branch 2 times, most recently from cc01e2e to 0c93eaa Compare February 12, 2025 15:58
@zubeydecivelek zubeydecivelek changed the title from "videos: migration required transform rules" to "videos: required transform rules / setup: separate rdm and videos" Feb 12, 2025
@zubeydecivelek zubeydecivelek force-pushed the videos-transform branch 2 times, most recently from 88c62d3 to 64bce48 Compare February 12, 2025 16:35


@zzacharo zzacharo left a comment

We can make the tests a bit cleaner in the next PR! Great job 🚀

@zzacharo zzacharo merged commit 4390364 into CERNDocumentServer:master Feb 18, 2025
2 checks passed
Development

Successfully merging this pull request may close these issues.

Implement required migration rules

4 participants