Update metadata integration to oemetadata v2 #1344
base: dev
…already part of the metadata specification
…ed to v2, old metadata is not fully supported anymore (<v1.5)
- Add settings module to manage general settings and specific input values which are important for the metadata integration
- Update to latest OMI ... using local Python 3.10 venv for now
Currently, I have to work with a local venv using Python 3.10 as …
After working on this for some time, I came to the conclusion that this code should later be moved to …

The implementation went in the direction of an OemetadataBuilder tool. It implements the oemetadata v2 structure using classes that manage the metadata of a single resource (dataset) and of a full datapackage (all datasets in a collection). It relies on …

Devs can create YAML files, which are collected in the new overlays module. They should be stored in a new directory named after the dataset to keep things tidy. The template functionality will soon be available in …

It can be added to the tasks of a dataset, which triggers an instance of the ResourceBuilder and the OemetadataBuilder as a step in the DAG pipeline definition that collects all resources.

Each table can still use the SQL comment on the table to store the metadata. We could also change this approach and store oemetadata in a JSONB column as part of a new model with an FK relation to the table resource.

Another step would be publishing to the OEP, but that is for another PR.
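The builder idea described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `DatapackageBuilder`, the `apply_overlay` and `add` methods, the `deep_merge` helper, and all field values are hypothetical, and the overlay is shown as an already-parsed dict standing in for a YAML file from the overlays module.

```python
from __future__ import annotations

import copy
import json
from dataclasses import dataclass, field


def deep_merge(base: dict, overlay: dict) -> dict:
    """Return a copy of `base` with `overlay` merged in (nested dicts merge recursively)."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


@dataclass
class ResourceBuilder:
    """Hypothetical builder for the metadata of one resource (one table)."""

    name: str
    metadata: dict = field(default_factory=dict)

    def apply_overlay(self, overlay: dict) -> ResourceBuilder:
        # Overlay values win over the base metadata.
        self.metadata = deep_merge(self.metadata, overlay)
        return self

    def build(self) -> dict:
        # The resource name set on the builder always takes precedence.
        return {**copy.deepcopy(self.metadata), "name": self.name}


@dataclass
class DatapackageBuilder:
    """Hypothetical builder that collects all resources of one dataset."""

    name: str
    resources: list[ResourceBuilder] = field(default_factory=list)

    def add(self, resource: ResourceBuilder) -> DatapackageBuilder:
        self.resources.append(resource)
        return self

    def build(self) -> dict:
        return {"name": self.name, "resources": [r.build() for r in self.resources]}


# Made-up base metadata and overlay, for illustration only.
base = {"title": "Example table", "context": {"title": "demo project"}}
overlay = {"context": {"contact": "someone@example.org"}}

resource = ResourceBuilder("demo.example_table", base).apply_overlay(overlay)
package = DatapackageBuilder("demo_package").add(resource).build()
print(json.dumps(package, indent=2))
```

Keeping the overlay merge separate from the builders means the same helper could be reused whether the overlay comes from a YAML file, a template, or a task argument.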
- Formatting - Line too long
…ation as we can do this in a cleaner way
…mplementation a bit more stable until it is torn down
- Builder to generate, create and customize oemetadata for egon-data datasets
- Implement draft builder tool -> will be enhanced further by using mixins to reduce class complexity
- The oemetadataBuilder will become a more complex module and be split into a resource and a datapackage builder
FYI @CarlosEpia I went further down the road to update egon-data to Python 3.10. Locally I can now run the pipeline, but some tasks are failing, so I assume more updates are needed:
This is my current env setup:
$> uv pip list
Using Python 3.10.12 environment at: .venv_py310
affine 2.4.0
…ge with multiple resources
…proach -> becomes resource builder to create metadata for single data resources
- It now expects a Python dict or JSON string without extra quoting
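As a rough sketch of what "a Python dict or JSON string without extra quoting" could mean in practice, the helper below accepts both input forms and rejects strings that were wrapped in additional quotes. The function name `load_metadata` is hypothetical and not taken from this PR.

```python
import json
from typing import Union


def load_metadata(metadata: Union[dict, str]) -> dict:
    """Return metadata as a dict, accepting either a dict or a plain JSON string."""
    if isinstance(metadata, dict):
        return metadata
    if isinstance(metadata, str):
        # A string wrapped in extra single quotes is not valid JSON and raises
        # json.JSONDecodeError (a subclass of ValueError) here.
        parsed = json.loads(metadata)
        if not isinstance(parsed, dict):
            # A double-encoded string decodes to str, not dict.
            raise TypeError("metadata JSON must decode to an object")
        return parsed
    raise TypeError(f"unsupported metadata type: {type(metadata).__name__}")
```

For example, `load_metadata('{"name": "demo"}')` returns a dict, while `load_metadata("'{\"name\": \"demo\"}'")` fails because of the surrounding single quotes.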
… submit comment functionality
- Also update path that reads the static oemetadata.json files stored in the metadata/results directory
- Update metadataVersion to at least v1.5.2, omi does not support older versions
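A minimum-version check like the one implied by the last bullet could look roughly like this. It is only a sketch: the helper names are hypothetical, and the version string formats used in the examples (`OEP-1.5.2`, `OEMetadata-2.0.4`) are assumptions about how metadataVersion values are written.

```python
import re
from typing import Tuple

# Oldest metadata version still supported, per the commit message above.
MIN_VERSION = (1, 5, 2)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a string such as 'OEP-1.5.2'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"no version number found in {version!r}")
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return int(major), int(minor), int(patch or 0)


def is_supported(version: str) -> bool:
    """Return True if the version is at least MIN_VERSION."""
    return parse_version(version) >= MIN_VERSION
```

Comparing integer tuples avoids the pitfalls of comparing version strings lexicographically, for example "1.10.0" sorting before "1.5.2".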
add testing / zensus dags to gitignore
…ch works for the new metadata module and new omi functionality
… to the existing egon-data metadata JSON files
- remove string wrappers required for older versions of omi (used to parse metadata)
Fixes #1177, adds #1305, and reworks the current metadata implementation and Airflow pipeline integration.
to get started this is a larger PR ;)
Before merging into dev-branch, please make sure that
- CHANGELOG.rst was updated.
- … black and isort.
- Dataset-version is updated when existing datasets are adjusted.
- … test mode.
- … Everything mode.
Closes #1177
Closes #1305