A validation tool for KG edges and nodes in KGX format.
Install the validator:
pip install matrix-validator
Run the validator:
matrix-validator python \
--report-dir tmp/ \
--edges abc_edges.tsv \
--nodes abc_nodes.tsv
The first argument selects the validator implementation. The currently available options are "pandera", "python" and "polars".
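To compare the implementations on the same input, the invocation above can be assembled once per validator name. The following is a minimal sketch, not part of the package: `build_command` is a hypothetical helper, and it assumes that all three validators accept the same `--report-dir`, `--edges` and `--nodes` flags shown for `python`.

```python
import shlex

# Validator names listed in this README.
VALIDATORS = ("pandera", "python", "polars")


def build_command(validator, nodes, edges, report_dir):
    """Return the argv list for one matrix-validator run (hypothetical helper)."""
    if validator not in VALIDATORS:
        raise ValueError(f"unknown validator: {validator!r}")
    # Assumption: every validator takes the flags shown for "python" above.
    return [
        "matrix-validator", validator,
        "--report-dir", report_dir,
        "--edges", edges,
        "--nodes", nodes,
    ]


if __name__ == "__main__":
    for name in VALIDATORS:
        # Print each command line; pass the list to subprocess.run() to execute it.
        print(shlex.join(build_command(name, "abc_nodes.tsv", "abc_edges.tsv", "tmp/")))
```

Building the command as a list rather than a single string avoids quoting problems when file names contain spaces.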
- Make sure you have poetry installed.
- Run `make install` to install the poetry environment.
- Run `make run_small_tests` to see if it worked.
The tool is currently divided into the following files (basic layout):
- `src/matrix_validator/cli.py` contains all CLI methods (click-based) and should not contain any code other than CLI boilerplate (in particular no IO).
- `src/matrix_validator/validator.py` contains the abstract validation class.
- `src/matrix_validator/datamodels.py` contains the edge and node schemas.
- `src/matrix_validator/util.py` contains any utility methods that we might need.
- We are currently experimenting with a number of different implementations:
  - `src/matrix_validator/validator_polars.py`: A very efficient pure polars implementation.
  - `src/matrix_validator/validator_purepython.py`: A pure python implementation.
  - `src/matrix_validator/validator_schema.py`: A schema-based validation approach based on LinkML-generated pandera schemas.
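The split between an abstract validation class and interchangeable implementations might be sketched as follows. This is an illustration only: the class names, the `validate` method and the returned list of messages are hypothetical and do not reflect the actual API in `validator.py`. The check covers only the core KGX columns (`id` and `category` for nodes; `subject`, `predicate` and `object` for edges), and the real schemas may require more.

```python
import csv
import io
from abc import ABC, abstractmethod

# Core KGX columns; the real schemas in datamodels.py may require more.
NODE_COLUMNS = {"id", "category"}
EDGE_COLUMNS = {"subject", "predicate", "object"}


class Validator(ABC):
    """Hypothetical base class: each implementation provides validate()."""

    @abstractmethod
    def validate(self, nodes_tsv, edges_tsv):
        """Return a list of human-readable violation messages."""


class PurePythonValidator(Validator):
    """Hypothetical pure-python implementation that checks TSV headers only."""

    def validate(self, nodes_tsv, edges_tsv):
        violations = []
        violations += self._missing_columns("nodes", nodes_tsv, NODE_COLUMNS)
        violations += self._missing_columns("edges", edges_tsv, EDGE_COLUMNS)
        return violations

    @staticmethod
    def _missing_columns(label, handle, required):
        # Read only the header row of the tab-separated file.
        header = next(csv.reader(handle, delimiter="\t"), [])
        missing = sorted(required - set(header))
        return [f"{label}: missing column {name!r}" for name in missing]


if __name__ == "__main__":
    nodes = io.StringIO("id\tname\nCHEBI:1\taspirin\n")
    edges = io.StringIO("subject\tpredicate\tobject\n")
    print(PurePythonValidator().validate(nodes, edges))
    # prints ["nodes: missing column 'category'"]
```

With this shape, a polars or pandera variant would subclass the same base class and could be swapped in without changing the CLI layer.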
This cookiecutter project was developed from the monarch-project-template template and will be kept up-to-date using cruft.