Contributor

@halungge halungge commented Sep 6, 2024

Decompose (global) grid file:

  • uses pymetis to decompose the global grid (cells) into n patches
  • after decomposition, halos for all dimensions (cell, edge, vertex) are constructed. Halo construction is done in an ICON-like fashion: the halos consist of 2 cell levels (one upward-pointing and one downward-pointing line) and the corresponding vertices and edges on these lines.
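
A minimal sketch of the idea, not the code in this PR: given a cell-to-cell connectivity and an owner rank per cell (the kind of membership list a graph partitioner returns), halo levels can be collected by repeatedly stepping to neighbors that are not yet part of the patch. The names `halo_cells`, `c2c`, `owner` and the `SKIP` marker are hypothetical, and plain edge adjacency is used here as a simplification; it is not necessarily how ICON or this PR define a halo line.

```python
from collections.abc import Sequence

SKIP = -1  # hypothetical marker for a missing neighbor


def halo_cells(
    c2c: Sequence[Sequence[int]],
    owner: Sequence[int],
    rank: int,
    num_levels: int = 2,
) -> list[list[int]]:
    """Return one sorted list of halo cell indices per level for `rank`."""
    owned = {c for c, r in enumerate(owner) if r == rank}
    seen = set(owned)
    frontier = owned
    levels = []
    for _ in range(num_levels):
        # Cells adjacent to the previous level that are not yet in the patch.
        new = {
            n
            for c in frontier
            for n in c2c[c]
            if n != SKIP and n not in seen
        }
        levels.append(sorted(new))
        seen |= new
        frontier = new
    return levels


# A strip of 6 cells, 0-1-2-3-4-5, split into two patches of 3 cells each.
c2c = [[SKIP, 1], [0, 2], [1, 3], [2, 4], [3, 5], [4, SKIP]]
owner = [0, 0, 0, 1, 1, 1]
halo_rank0 = halo_cells(c2c, owner, rank=0)
halo_rank1 = halo_cells(c2c, owner, rank=1)
```

The halo vertices and edges would then be derived from the cells on each level; that step is left out of the sketch.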

Omissions:

  • LAM grids need to be investigated further:

    • tests comparing decomposed vs. single_node computation are only run on the global grid.
    • for the LAM grids, ICON reorders arrays to arrange halo points on the first boundary layers together with the boundary layers; it should be investigated whether that is essential in the model.
    • This PR takes this into account only in the computation of the start_index and end_index, not in the halo construction.
  • the number of halo lines (in terms of cells) is hardcoded to 2; that could be made a parameter.

  • Not sure it all runs on GPU correctly... most probably there are some numpy/cupy issues to fix.

halungge added 30 commits July 3, 2024 22:31
extract conventional ownership assignment
pass connectivities explicitly in order to avoid sparse matrix 0 - based index business. (preliminary fix)
- read data fields from grid file (only selected indices)
fix imports in grid_manager
move test to for local grid to test_grid_manager.py
@halungge halungge marked this pull request as ready for review December 16, 2025 21:39
@halungge halungge requested review from havogt and msimberg December 16, 2025 21:39
transformation: IndexTransformation,
grid_file: pathlib.Path | str,
config: v_grid.VerticalGridConfig, # TODO(halungge): remove to separate vertical and horizontal grid
num_levels: int,
Contributor Author

We should get rid of num_levels here. It is only here because we have it in the BaseGrid, which historically comes from the fact that we had it in the SimpleMesh.


# TODO(msimberg): Compute these in GridGeometry once FieldProviders can produce scalars.
# This will also allow easier handling once grids are distributed.
# TODO (@halungge): use global reduction in geometry.py
Contributor Author

@halungge halungge Dec 16, 2025

I think we should not read them from the grid file, as they are not set in all grid files. Rather, move the computation of the fields below (edge_length, cell_area, ...) back to geometry.py and compute the means there by global reduction.

That could also simplify this GlobalGridParams struct a lot.
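
A rough sketch of what "means by global reduction" could look like, assuming each rank only contributes its owned entries so that halo copies are not counted twice. The helper names and the owned mask are hypothetical, and the final combine step stands in for a sum all-reduce across ranks; it is simulated here with a plain list.

```python
def local_sum_and_count(values, owned_mask):
    """Partial sum and count over owned entries only; halo entries are skipped."""
    owned = [v for v, is_owned in zip(values, owned_mask) if is_owned]
    return sum(owned), len(owned)


def global_mean(partials):
    """Combine per-rank (sum, count) pairs, as a sum-reduction would."""
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


# Two hypothetical ranks; the last entry of each array is a halo copy
# of a value owned by the other rank.
rank0 = local_sum_and_count([1.0, 2.0, 3.0, 4.0], [True, True, True, False])
rank1 = local_sum_and_count([4.0, 5.0, 3.0], [True, True, False])
mean_cell_area = global_mean([rank0, rank1])
```

Reducing the sum and the count separately, rather than averaging per-rank means, keeps the result independent of how many cells each patch owns.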



def _refinement_level_placed_with_halo(domain: h_grid.Domain) -> int:
"""There is a speciality in the setup of the ICON halos: generally halo points are located at the end of the arrays after all
Contributor Author

The numbers used here were really determined by reverse engineering ICON. There is no reference to them in any documentation afaik. ICON places these not boundary-halo points there; the reason is a mystery to me, and I think it is worth trying to run without that.

In any case, it is only relevant for LAM models.

github-actions bot commented Jan 2, 2026

Mandatory Tests

Please make sure you run these tests via comment before you merge!

  • cscs-ci run default

Optional Tests

To run benchmarks you can use:

  • cscs-ci run benchmark-bencher

To run tests and benchmarks with the DaCe backend you can use:

  • cscs-ci run dace

To run test levels ignored by the default test suite (mostly simple datatest for static fields computations) you can use:

  • cscs-ci run extra

For more detailed information please look at CI in the EXCLAIM universe.

backend: gtx_typing.Backend | None,
grid: test_defs.GridDescription,
) -> None:
if test_utils.is_dace(backend):
Contributor Author

Unclear to me what happens here: the test passes when run in isolation, but fails when all tests in the module are run. In fact, it makes other tests fail as well. test_halo_neighbor_access_e2c shows the same behavior. I don't know whether this is a matter of the halo setup (maybe some ordering is still incorrect?) or an issue with dace caching. Candidates to be investigated:

  • both stencil functions used in these tests use direct neighborhood access.
  • the connectivities do have skip values
  • caching? I am not entirely sure what is cached when by which backend...
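
To illustrate why skip values deserve a look (without claiming this is the cause of the failure), here is a small sketch of an edge-to-cell neighbor sum. The `SKIP = -1` marker, the toy `e2c` table and both function names are assumptions for the example. In plain Python indexing, an unguarded `-1` does not fail; it silently reads the last entry.

```python
SKIP = -1  # assumed skip-value marker for a missing neighbor


def sum_over_e2c(cell_field, e2c):
    """Sum the cell values next to each edge, ignoring skipped neighbors."""
    return [
        sum(cell_field[c] for c in cells if c != SKIP)
        for cells in e2c
    ]


def naive_sum_over_e2c(cell_field, e2c):
    """Same loop without the guard: index -1 silently reads the last cell."""
    return [sum(cell_field[c] for c in cells) for cells in e2c]


cell_field = [10.0, 20.0, 30.0]
# Edges 0 and 3 are boundary edges with only one neighboring cell.
e2c = [[0, SKIP], [0, 1], [1, 2], [2, SKIP]]
guarded = sum_over_e2c(cell_field, e2c)
naive = naive_sum_over_e2c(cell_field, e2c)
```

The guarded and naive results differ only on the edges that carry a skip value, which is the kind of symptom that would depend on array ordering.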

backend: gtx_typing.Backend | None,
grid: test_defs.GridDescription,
) -> None:
if test_utils.is_dace(backend):
Contributor Author

See above

Contributor Author

halungge commented Jan 3, 2026

cscs-ci run default
