183 add dask parallelism for cube single field imaging#206
183 add dask parallelism for cube single field imaging#206Jan-Willem wants to merge 773 commits intomainfrom
Conversation
Merged the latest changes from 45-fit_gaussian, modified it to return the max PSF sidelobe. This is now returned in the return dictionary from the deconvolver. The stopcode is not yet populated, this is intentional. Once the stopcodes are finalized and documented for iteration control, it will be straightforward to add it to the deconvolver.
The previous change added an extra return value to the main lobe fitting routine, and psf_gaussian_fit was not updated to reflect that.
There is still a single failing test, but fixed a few tests that were failing due to the added return value from the find_main_lobe function.
When peak intensity was zero, there was a separate return call that returned 3 values (rather than the updated 4 values). This has been fixed, and all tests now pass.
Fixed a handful of bugs in the deconvolve() function, related to moving the per-plane iteration outside of hogbom_clean and into deconvolve() where it it more appropriate. Added tests for the deconvolve() function, but at the moment they are failing. The hogbom clean function and python bindings need to be refactored to accept only 2D arrays, and move all the cube slicing to be outside the actual deconvolution part. This will be done in a subsequent commit.
The hogbom clean C++ implementation and associated Python bindings have been refactored to remove the iteration over polarization. This is the initial version of the refactor, and has not been tested for correctness etc. The deconvolve() function will do all the iteration, and the lower level deconvolution functions will only accept a single plane to run the deconvolution on.
6 implement vlbi fringe fit
The refactor to pull out the iteration over all planes into the deconvolve() function is complete. Hogbom clean now expects single plane numpy arrays (rather than xarray cubelets). The unit tests have been updated to reflect this change, and have been verified to work correctly.
Added a few tests to check the return dictionary structure, indexing, and consistency. Fixed some bugs with the masks in deconvolve().
Uses the deconvolve() function now, with `algorithm='hogbom'` passed in to it. Some minor fixes for bugs that surfaced while fixing the notebook.
There was a problem hiding this comment.
Pull request overview
Adds Dask-parallel cube imaging for the “single field” workflow by introducing new distributed entrypoints and node tasks, plus utilities to pre-create Zarr outputs and to chunk work based on available threads/memory.
Changes:
- Introduces a new distributed cube-imaging driver (
image_cube_single_field) that builds and executes a GraphViper map/reduce workflow and writes per-chunk results to Zarr. - Adds low-level Zarr utilities (
create_empty_data_variables_on_disk) and expands data-partitioning docs for choosing chunking based on cluster resources. - Refactors selection-param validation and imaging-weight calculation to work with
ProcessingSetIteratorrather than aDataTree.
Reviewed changes
Copilot reviewed 19 out of 22 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/core/imaging/test_imaging_weights.py | Updates test import path for imaging-weights task module. |
| src/astroviper/utils/io.py | Adds helper to create empty Zarr arrays for imaging data variables. |
| src/astroviper/utils/data_partitioning.py | Adds/expands docstrings for thread info, factorization, and chunking logic. |
| src/astroviper/utils/check_params.py | Adds ProcessingSetIterator-based selection checking and iterator reset. |
| src/astroviper/task/imaging/calculate_imaging_weights.py | Refactors imaging-weights calculation to accept a PS iterator + image dataset. |
| src/astroviper/task/imaging/add_uv_sampling_grid.py | Renames/adjusts UV sampling gridding tasks and data-group defaults. |
| src/astroviper/task/imaging/add_visibility_grid.py | Renames/adjusts visibility gridding tasks and data-group defaults. |
| src/astroviper/task/imaging/make_image_single_field.py | Adds a single-field node task for chunk processing and Zarr writes. |
| src/astroviper/task/imaging/make_image_mosaic.py | Adds a mosaic node task for chunk processing and Zarr writes. |
| src/astroviper/task/imaging/image_cube_single_field_node_task.py | Adds node task implementation for cube imaging per frequency chunk. |
| src/astroviper/task/init.py | Introduces astroviper.task package initializer/version exposure. |
| src/astroviper/distributed/imaging/image_cube_single_field.py | Adds distributed entrypoint for cube single-field imaging + map/reduce wiring. |
| src/astroviper/distributed/imaging/cube_imaging_niter0.py | Switches node-task import to new task namespace. |
| src/astroviper/distributed/imaging/feather.py | Moves feather_core import inside function (lazy import). |
| src/astroviper/distributed/imaging/utils/init.py | Removes exports of legacy node tasks from distributed utils. |
| src/astroviper/core/imaging/imaging_weighting/grid_imaging_weights.py | Updates gridding API to pass n_uv/delta_lm directly. |
| src/astroviper/core/imaging/imaging_utils/gcf_prolate_spheroidal.py | Adds 1D correcting-image helper for prolate-spheroidal gridder. |
| src/astroviper/core/imaging/fft_normalize_prolate_spheriodal_gridder.py | Adds memory-aware per-plane IFFT normalization pipeline. |
| src/astroviper/core/imaging/fft_norm_img_xds_v2.py | Adds a memory-efficient IFFT+crop path and normalization refactor. |
| src/astroviper/core/imaging/init.py | Stops re-exporting some gridder helpers from the core imaging namespace. |
| docs/tutorials/cube_single_field_imaging.ipynb | Adds a tutorial notebook demonstrating cube single-field imaging. |
Comments suppressed due to low confidence (5)
src/astroviper/task/imaging/calculate_imaging_weights.py:151
ms_xdt.data_groups[...] = ...is inconsistent with the rest of the codebase, which stores data groups underxds.attrs["data_groups"](e.g.,core/visibility_manipulation/phase_shift.py:117). Ifms_xdtis an xarrayDataTreenode (as implied bycheck_sel_params_ps_iterusingms_xdt.ds), this will raiseAttributeError. Write toms_xdt.ds.attrs["data_groups"][...](orms_xdt.attrs[...]if appropriate) instead.
src/astroviper/task/imaging/calculate_imaging_weights.py:42- The function signature/docstring/return type are out of sync after switching from
ps_xdttops_iter+img_xds: the type annotation still references returningDataTree, and the Examples section still shows agrid_params=argument (which no longer exists). Updating the type hints/docstring to match the actual API will prevent confusion for callers.
src/astroviper/task/imaging/add_visibility_grid.py:145 - Avoid
print()in library/task code (it will spam worker logs and bypass the project logger). Uselogger.debug(...)(or remove entirely) so output is controllable by log level and handler configuration.
src/astroviper/task/imaging/add_visibility_grid.py:230 - This data-group assignment is duplicated twice in a row; the second block is redundant and can be removed to avoid confusion.
src/astroviper/task/imaging/calculate_imaging_weights.py:12 - There are a few unused/duplicate imports at the top (
from tracemalloc import startis unused;numpyandxarrayare imported multiple times;check_grid_paramsis imported but not used). Cleaning these up will reduce lint noise and speed module import slightly on workers.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| 3 * n_pixels_single_frequency * bytes_in_dtype["complex64"] / (1024**3) | ||
| + 3 * n_pixels_single_frequency * bytes_in_dtype["float32"] / (1024**3) | ||
| ) | ||
| else: | ||
| memory_singleton_chunk = fudge_factor * ( | ||
| 3 * n_pixels_single_frequency * bytes_in_dtype["complex128"] / (1024**3) | ||
| + 3 * n_pixels_single_frequency * bytes_in_dtype["float64"] / (1024**3) |
| from graphviper.graph_tools import generate_dask_workflow, generate_airflow_workflow | ||
| from graphviper.graph_tools import map, reduce | ||
| from astroviper.distributed.imaging.utils import make_image_mosaic | ||
| from astroviper.task.imaging import make_image_mosaic |
| freq_chan: np.ndarray, | ||
| grid_parms: dict, | ||
| # grid_parms: dict, | ||
| n_uv: list, | ||
| delta_lm: list, | ||
| ): |
| 3 * n_pixels_single_frequency * bytes_in_dtype["complex64"] / (1024**3) | ||
| + 3 * n_pixels_single_frequency * bytes_in_dtype["float32"] / (1024**3) | ||
| ) | ||
| else: | ||
| memory_singleton_chunk = fudge_factor * ( | ||
| 3 * n_pixels_single_frequency * bytes_in_dtype["complex128"] / (1024**3) | ||
| + 3 * n_pixels_single_frequency * bytes_in_dtype["float64"] / (1024**3) |
| @@ -0,0 +1,16 @@ | |||
| # src/astroviper/__init__.py | |||
| fill_value=np.nan, | ||
| ) | ||
|
|
| pass | ||
| if data_variable_definitions == "imaging" and double_precision: | ||
| data_variable_definitions = imaging_data_variables_and_dims_double_precision | ||
| elif data_variable_definitions == "imaging" and not double_precision: | ||
| data_variable_definitions = imaging_data_variables_and_dims_single_precision |
… of https://github.com/casangi/astroviper into 183-add-dask-parallelism-for-cube-single-field-imaging
… of https://github.com/casangi/astroviper into 183-add-dask-parallelism-for-cube-single-field-imaging
… of https://github.com/casangi/astroviper into 183-add-dask-parallelism-for-cube-single-field-imaging
… of https://github.com/casangi/astroviper into 183-add-dask-parallelism-for-cube-single-field-imaging
|
Jan-Willem Steeb seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
… of https://github.com/casangi/astroviper into 183-add-dask-parallelism-for-cube-single-field-imaging
No description provided.