Add testing workflow and dependencies by lispandfound · Pull Request #9 · ucgmsim/nzcvm_data

lispandfound · 2026-01-04T23:58:25Z

Adds a testing workflow to the data repository and properly specifies dependencies. Also:

Fixes the Dunedin smoothing boundary to make the tests pass. Probably should have a more elegant fix.
Style fixes the registry file.
Removes the registry path to the missing file "grisborne_basement.png".
Deletes the registry entry to the broken Chow model.

The Tests

Summary of NZCVM Registry Validation Tests

This test suite ensures the integrity of the New Zealand Community Velocity Model (NZCVM) by validating both the metadata registry and the underlying geophysical data files.

Registry Metadata Validation

Schema Enforcement: Uses the schema library to verify the structure of nzcvm_registry.yaml, ensuring valid Unix paths, Python identifiers for submodels, and correctly formatted URLs.
Dynamic Test Generation: Uses pytest_generate_tests to automatically create individual test cases for every entry defined in the registry.
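
As a rough illustration of the two points above, the sketch below uses only the standard library as a stand-in for the schema library, so it is not the code in tests/test_registry.py. The entry keys (`name`, `path`, `url`), the `REGISTRY` contents, and the rule that paths are relative are all assumptions made for the example. `pytest_generate_tests` and `metafunc.parametrize` are the real pytest hook and method.

```python
from urllib.parse import urlparse


def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one (hypothetical) registry entry."""
    problems = []
    # Submodel names must be usable as Python identifiers.
    name = entry.get("name", "")
    if not name.isidentifier():
        problems.append(f"bad identifier: {name!r}")
    # Assumption: paths are relative and use forward slashes only.
    path = entry.get("path", "")
    if not path or path.startswith("/") or "\\" in path:
        problems.append(f"bad path: {path!r}")
    # URLs need an http(s) scheme and a host.
    url = urlparse(entry.get("url", ""))
    if url.scheme not in ("http", "https") or not url.netloc:
        problems.append(f"bad url: {entry.get('url')!r}")
    return problems


# Hypothetical registry contents; the real data lives in nzcvm_registry.yaml.
REGISTRY = {
    "submodel": [
        {"name": "example_1d", "path": "vm1d/example.txt", "url": "https://example.org/model"},
        {"name": "2bad-name", "path": "C:\\data\\x", "url": "example.org/nope"},
    ]
}


def pytest_generate_tests(metafunc):
    """pytest hook: create one test case per registry entry."""
    if "submodel" in metafunc.fixturenames:
        entries = REGISTRY["submodel"]
        metafunc.parametrize("submodel", entries, ids=[e["name"] for e in entries])
```

With this hook in a test module or conftest.py, any test function that takes a `submodel` argument runs once per entry, and the entry name appears in the test ID.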

Tomography Model Tests

Verifies the high-resolution 3D velocity models stored in HDF5 format:

File Integrity: Ensures paths exist and files are valid, non-empty HDF5 containers.
Metadata Alignment: Cross-references elevation levels in the YAML registry against the actual datasets inside the HDF5 file.
Structural Consistency: Validates that the dimensions of data arrays ($V_p$, $V_s$, and $\rho$) match the (Latitude $\times$ Longitude) grid dimensions.
Geospatial Logic:
- Latitudes must be between $-90$ and $90$.
- Longitudes must be between $0$ and $185$.
- Coordinates must be strictly monotonic (ascending or descending).
Physical Constraints: Ensures no NaN values exist and data falls within predefined bounds:
- $V_p$: $0$ to $11.0$ km/s
- $V_s$: $0$ to $7.0$ km/s
- $\rho$ (Density): $0$ to $5.0$ g/cm³
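
A minimal sketch of the geospatial and physical checks listed above, using plain Python lists in place of the HDF5 datasets. The function names are hypothetical and the real tests may be organised differently; the numeric bounds are the ones quoted in this description.

```python
import math

# Bounds quoted in the PR description (the commit log notes they are not physically derived).
QUALITY_BOUNDS = {"vp": (0.0, 11.0), "vs": (0.0, 7.0), "rho": (0.0, 5.0)}


def is_strictly_monotonic(values) -> bool:
    """True if values are strictly ascending or strictly descending."""
    pairs = list(zip(values, values[1:]))
    return all(a < b for a, b in pairs) or all(a > b for a, b in pairs)


def grid_problems(lat, lon) -> list[str]:
    """Check coordinate ranges and ordering for one elevation level."""
    problems = []
    if not all(-90 <= v <= 90 for v in lat):
        problems.append("latitude out of range")
    if not all(0 <= v <= 185 for v in lon):
        problems.append("longitude out of range")
    if not is_strictly_monotonic(lat):
        problems.append("latitude not monotonic")
    if not is_strictly_monotonic(lon):
        problems.append("longitude not monotonic")
    return problems


def quality_problems(name, grid, n_lat, n_lon) -> list[str]:
    """Check shape, NaNs and value bounds for a 2D quality grid (list of rows)."""
    low, high = QUALITY_BOUNDS[name]
    problems = []
    if len(grid) != n_lat or any(len(row) != n_lon for row in grid):
        problems.append(f"{name} shape mismatch")
    flat = [v for row in grid for v in row]
    # NaN compares false against everything, so test for it before the bounds.
    if any(math.isnan(v) for v in flat):
        problems.append(f"{name} contains NaN")
    elif not all(low <= v <= high for v in flat):
        problems.append(f"{name} outside [{low}, {high}]")
    return problems
```

Accepting either sort direction matches the later commits in this PR ("lat monotonicity", "lon monotonicity"), which relax the check to ascending or descending order.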

Basin & Surface Validation

Checks the geometry and depth surfaces of sedimentary basins:

Boundary Geometry: Uses shapely to verify that basin boundaries (GeoJSON) are valid, non-empty, and composed of closed Polygons.
Surface Coverage: Verifies that the HDF5 elevation surfaces spatially contain the entire basin boundary.
Smoothing Transitions: Ensures that "smoothing boundaries" are correctly nested within the primary basin boundaries.
Surface Data Quality: Validates that surface elevations are realistic (between $\pm 10{,}000$ m) and contain no missing data.
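
The real tests use shapely for these checks. Purely to show the logic, here is a standard-library sketch with hypothetical helper names. Two simplifications are assumptions of the example: surface coverage is approximated by a bounding-box comparison, and the smoothing check follows the commit log, which says the smoothing boundary must lie on the basin boundary line rather than merely inside the polygon.

```python
def is_closed_ring(ring) -> bool:
    """A polygon ring needs at least 4 points and must end where it starts."""
    return len(ring) >= 4 and tuple(ring[0]) == tuple(ring[-1])


def surface_covers(lat, lon, ring) -> bool:
    """Bounding-box approximation: the surface grid spans every boundary vertex."""
    return all(
        min(lon) <= x <= max(lon) and min(lat) <= y <= max(lat) for x, y in ring
    )


def on_segment(p, a, b, tol=1e-9) -> bool:
    """True if point p lies on the segment from a to b, within a tolerance."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    # Zero cross product means p is collinear with a and b.
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if abs(cross) > tol:
        return False
    # The projection of p onto a-b must fall between the two endpoints.
    dot = (px - ax) * (bx - ax) + (py - ay) * (by - ay)
    return -tol <= dot <= (bx - ax) ** 2 + (by - ay) ** 2 + tol


def smoothing_on_boundary(points, ring) -> bool:
    """Every smoothing point must sit on an edge of the basin boundary ring."""
    edges = list(zip(ring, ring[1:]))
    return all(any(on_segment(p, a, b) for a, b in edges) for p in points)
```

Coordinates are (longitude, latitude) pairs, as in GeoJSON. A point strictly inside the polygon fails `smoothing_on_boundary`, which is the distinction the "improve boundary geometry assertions" commit draws.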

Vs30 & Submodel Tests

Vs30 Verification: Validates near-surface velocity data, ensuring values are within $0$ to $2000$ m/s.
1D Velocity Models (vm1d): Parses 1D text-based model files (DEF HST format) to verify:
- Correct header identification.
- Positive layer thicknesses.
- Valid seismic quality factors ($Q_p$, $Q_s$).
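
A sketch of the 1D model checks, under stated assumptions: the column order (`vp vs rho qp qs thickness`) and the rule that "valid" quality factors means strictly positive are guesses made for illustration, not taken from the repository. `parse_vm1d` and `row_problems` are hypothetical names. The `line.strip().split()` parsing follows the review suggestion recorded in this PR.

```python
# Assumed column order; check against the real vm1d files before relying on it.
COLUMNS = ("vp", "vs", "rho", "qp", "qs", "thickness")


def parse_vm1d(text: str) -> list[dict]:
    """Parse a 1D model: a 'DEF HST' header followed by whitespace-separated rows."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0].strip() != "DEF HST":
        raise ValueError("missing 'DEF HST' header")
    rows = []
    for index, line in enumerate(lines[1:], start=1):
        fields = line.strip().split()
        if len(fields) != len(COLUMNS):
            raise ValueError(f"row {index}: expected {len(COLUMNS)} columns")
        rows.append(dict(zip(COLUMNS, map(float, fields))))
    return rows


def row_problems(row: dict) -> list[str]:
    """Check one parsed layer for the properties listed above."""
    problems = []
    if row["thickness"] <= 0:
        problems.append("non-positive thickness")
    if row["qp"] <= 0 or row["qs"] <= 0:
        problems.append("non-positive quality factor")
    return problems
```

Reporting problems per row pairs naturally with subtests (see the "subtests for rows" commit), so one bad layer does not hide failures in the others.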

test: update QUALITY_BOUNDS values for vp and vs in test_registry
- Increase upper bound for vp from 10.0 to 11.0
- Increase upper bound for vs from 6.0 to 6.5
- Add note clarifying values are not physically derived

test: remove longitude spacing tolerance check from tomography geo gridpoints test

chore(pyproject): add empty types optional dep

tests: update quality bounds to ayushi's values

ci: add github actions

test: add comprehensive basin, vs30, and submodel validation tests
- Add Basin, Vs30, and Submodel TypedDicts for schema validation
- Parametrize tests for basin, vs30, and submodel entries from registry
- Add tests for basin boundaries, surfaces, smoothing, and containment
- Add tests for vs30 file existence, HDF5 validity, and gridpoint checks
- Add tests for submodel data existence and content validation
- Improve test coverage and robustness for registry-driven datasets

test: refactor registry tests for clarity and remove unused code
- Remove duplicate and unused imports in test_registry.py
- Decorate test_nzcvm_registry_schema with @no_type_check for type checking bypass
- Eliminate redundant and commented-out test code for basin smoothing boundaries
- Remove unused variables and streamline vs30 tests for readability

deps: add shapely dependency

ci: add all extras to ty check

ci: add yamllint config

ci: fix pytest action

ci: use cached LFS checkout

build: require Python 3.11 or higher in pyproject.toml

test: improve boundary geometry assertions in basin smoothing test
- Check now asserts the basin *boundary* contains the smoothing boundary, instead of the basin *polygon* (which includes the interior).

ci: use big-data runner group for extra space

ci: use ubuntu latest but remove unused junk

gemini-code-assist · 2026-01-04T23:58:43Z

Summary of Changes

Hello @lispandfound, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness of the NZCVM data repository by integrating a new testing framework and formalizing dependency management. The changes ensure that the geophysical data and its metadata registry adhere to defined standards and are free from common errors, improving overall data reliability and maintainability. It also includes several targeted fixes and stylistic improvements to the registry file itself.

Highlights

Testing Workflow: A comprehensive testing workflow has been added to the data repository to ensure the integrity and consistency of the New Zealand Community Velocity Model (NZCVM) data.
Dependency Management: Project dependencies are now properly specified using a new 'pyproject.toml' file, including dedicated optional dependencies for testing and development.
Registry Cleanup and Style Fixes: The 'nzcvm_registry.yaml' file has undergone significant cleanup, including reformatting, removal of outdated entries, correction of file paths, and general style improvements for readability.
Dunedin Smoothing Boundary Fix: The Dunedin smoothing boundary definition has been adjusted to resolve issues, likely to ensure tests pass as mentioned in the PR description.

Ignored Files

Ignored by pattern: .github/workflows/** (4)
- .github/workflows/pytest.yml
- .github/workflows/ruff.yml
- .github/workflows/types.yml
- .github/workflows/yamllint.yml

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a new pyproject.toml file to manage project dependencies and metadata, and a .yamllint configuration file to enforce a 200-character line length. The nzcvm_registry.yaml file has been updated by reformatting elev arrays for readability, removing two tomography entries (CHOW2020_EP2020_MIX, EP2025), standardizing basement file naming to lowercase 'basement' across multiple basin entries, removing an image from Gisborne_v21p7, and correcting various indentation and spacing issues throughout the basin and submodel sections. Additionally, a block of coordinate data was removed from regional/Dunedin/Dunedin_smoothing.txt. The review comments highlight several issues in tests/test_registry.py: the attrs library needs to be added as a test dependency in pyproject.toml, the wiki_images path validation logic was incorrect and needed to be relative to the repository root, and the test_submodel_data_is_valid test was improved to include rho validation and use pytest-subtests for better error reporting. Further suggestions included making latitude and longitude checks in test_surface_geo_gridpoints more robust by allowing both strictly ascending or descending order, and using line.strip().split() for parsing in read_smoothing_boundary and parse_submodel_data for improved robustness.

tests/test_registry.py

lispandfound · 2026-01-05T02:01:19Z

@sungeunbae type checking does work but fails in the installation step because of version resolution issues with numba.

lispandfound added 7 commits January 5, 2026 12:19

fix(registry): correct basement file paths and remove invalid image entry

89353c3

- Update Canterbury basement file paths to use correct lowercase naming
- Fix spacing in Wellington basement file path
- Remove invalid Gisborne basement image reference from registry

fix(registry): remove broken Chow tomography

c69d838

fix(Dunedin): fix Dunedin smoothing boundary

52b0ab1

add missing dependencies

175442b

deps: add deptry to dev dependencies

51a1458

tests(registry): update tests to check ascending or descending

91e0ef1

gemini-code-assist bot reviewed Jan 5, 2026

lispandfound and others added 10 commits January 5, 2026 13:02

fix: do not use attr for dataclass

e81d173

deps: migrate tools to optional dependencies

48a28db

tests(registry): subtests for rows.

39f3574

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

tests(registry): lat monotonicity

4a4298e

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

tests(registry): lon monotonicity

693f718

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

tests(registry): use split function

f083aa3

Merge branch 'ci-testing-branch' of github.com:ucgmsim/nzcvm_data into ci-testing-branch

9771e49

fix(tomo_boundary): ruff + type checking + numpydoc

f724adb

fix(types): resolve type checking complaints

fd18a72

fix: attribution and python ident checks

0e0e964

lispandfound requested a review from sungeunbae January 5, 2026 02:00

tests(registry): use more robust url validator

f991ebc

Catches invalid DOI links

sungeunbae approved these changes Jan 6, 2026

lispandfound added 2 commits January 7, 2026 11:27

deps: add uv.lock to make tests pass

9edc10c

use default cache dependency glob

5247601

lispandfound merged commit e895787 into main Jan 6, 2026
4 checks passed

lispandfound deleted the ci-testing-branch branch January 6, 2026 22:31

lispandfound merged 20 commits into main from ci-testing-branch

2 participants
