56 changes: 52 additions & 4 deletions src/xradio/testing/README.md
@@ -8,14 +8,62 @@ The testing module includes the following submodules:

<!-- **`assertions (TBD)`**: Functions for validating data structures and schemas
- **`fixtures (TBD)`**: Reusable test fixtures for setting up test data -->
- **`image`**: Utilities for downloading, generating, and validating image test data
- **`measurement_set`**: Utilities for generating, checking, and manipulating MeasurementSet test data
- **`_utils (TBD)`**: Private testing utilities

The `measurement_set` submodule contains several Python modules that provide specific functionality:
### `image` submodule

- **`io.py`**: I/O helpers to download test MeasurementSets (non-casacore dependent)
- **`checker.py`**: Validation functions to check MSv4 data structures against expected specifications
- **`msv2_io.py`**: MSv2-specific I/O operations and test data generators (casacore-dependent)
Framework-agnostic helpers (no pytest dependency) for image unit tests, ASV
benchmarks, and third-party projects that use `xradio.image`.

#### `io.py`

| Function | Signature | Purpose |
|---|---|---|
| `download_image` | `(fname, directory=".")→Path` | Download an image asset to disk without opening it. Mirrors `download_measurement_set`. |
| `download_and_open_image` | `(fname, directory=".")→xr.Dataset` | Download an image asset and return it as an opened `xr.Dataset`. |
| `remove_path` | `(path)→None` | Delete a file or directory tree. No-op when the path does not exist. |

#### `generators.py`

| Function | Signature | Purpose |
|---|---|---|
| `make_beam_fit_params` | `(xds)→xr.DataArray` | Build a synthetic `BEAM_FIT_PARAMS` DataArray from an open image dataset. Shape is derived from the `time`, `frequency`, and `polarization` dimensions. |
| `create_empty_test_image` | `(factory, do_sky_coords=None)→xr.Dataset` | Call any `make_empty_*` factory (`make_empty_sky_image`, `make_empty_aperture_image`, `make_empty_lmuv_image`) with a canonical set of test coordinates. |
| `scale_data_for_int16` | `(data)→np.ndarray` | Clip and cast a float array to the int16 range (NaN→0, clip to `[-32768, 32767]`, cast). Supports `create_bzero_bscale_fits`. |
| `create_bzero_bscale_fits` | `(outname, source_fits, bzero, bscale)→None` | Write a FITS file with explicit `BSCALE`/`BZERO` headers for guard testing. Reads pixel data from `source_fits`, scales it via `scale_data_for_int16`, and writes to `outname`. |
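
To make the `scale_data_for_int16` row concrete, here is a minimal pure-Python sketch of the same NaN→0, clip, cast sequence, followed by the standard FITS relation `physical = BZERO + BSCALE * stored` that a reader applies to the written integers. The names `to_int16_range` and `physical_value` are hypothetical stand-ins for illustration only; the real helper operates on NumPy arrays.

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_range(values):
    """Pure-Python stand-in (hypothetical) for the NaN->0, clip, cast steps."""
    out = []
    for v in values:
        if math.isnan(v):
            v = 0.0  # NaN becomes zero
        v = max(INT16_MIN, min(INT16_MAX, v))  # clip to the int16 range
        out.append(int(v))  # truncate toward zero, as a float-to-int cast does
    return out


def physical_value(stored, bscale, bzero):
    """FITS convention: physical = BZERO + BSCALE * stored."""
    return bzero + bscale * stored


stored = to_int16_range([1.9, float("nan"), 1e6, -1e6])
print(stored)
print([physical_value(s, bscale=2.0, bzero=100.0) for s in stored])
```

Values outside the int16 range saturate at the limits rather than wrapping, which is what makes the resulting file a predictable input for BSCALE/BZERO guard tests.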

#### `assertions.py`

| Function | Signature | Purpose |
|---|---|---|
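| `normalize_image_coords_for_compare` | `(coords, factor=180*60/π, direction_key="direction0", spectral_key="spectral2", direction_units=None, vel_unit="km/s")→None` | Scale the direction `cdelt` and `crval` entries of a CASA coordinate dict (by default from radians to arcminutes) and set the spectral velocity unit, so a round-tripped CASA image can be compared with the original. Modifies `coords` in place. |
| `assert_image_block_equal` | `(xds, output_path, selection, zarr=False, do_sky_coords=True)→None` | Write `xds` to `output_path`, reload the block given by `selection`, and assert equality with `xds.isel(**selection)` via `assert_xarray_datasets_equal`. Add any extra variables (e.g. `BEAM_FIT_PARAMS`) to `xds` before calling. |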
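
The default `factor` is the radians-to-arcminutes conversion, 180·60/π ≈ 3437.75. The sketch below is a minimal illustration that uses plain Python lists as a hypothetical stand-in for the coordinate dict (the real dict comes from casacore). It applies the same in-place update to a cell size of π/180/60 rad, the value used by `create_empty_test_image`. The function name `normalize_sketch` is an assumption for illustration, not part of the package.

```python
import math

RAD_TO_ARCMIN = 180 * 60 / math.pi  # the default `factor`


def normalize_sketch(coords, factor=RAD_TO_ARCMIN):
    """Hypothetical list-based stand-in for normalize_image_coords_for_compare."""
    direction = coords["direction0"]
    direction["cdelt"] = [v * factor for v in direction["cdelt"]]
    direction["crval"] = [v * factor for v in direction["crval"]]
    direction["units"] = ["'", "'"]  # arcminutes
    coords["spectral2"]["velUnit"] = "km/s"


one_arcmin = math.pi / 180 / 60  # one arcminute expressed in radians
coords = {
    "direction0": {
        "cdelt": [-one_arcmin, one_arcmin],
        "crval": [0.2, -0.5],
        "units": ["rad", "rad"],
    },
    "spectral2": {"velUnit": "m/s"},
}
normalize_sketch(coords)
print(coords["direction0"]["cdelt"])
```

After the call, the cell sizes are approximately ∓1 arcminute, and the dict passed in has been changed rather than copied.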

All nine public names are re-exported from the package's `__init__.py`:

```python
from xradio.testing.image import (
    download_image,
    download_and_open_image,
    remove_path,
    make_beam_fit_params,
    create_empty_test_image,
    scale_data_for_int16,
    create_bzero_bscale_fits,
    normalize_image_coords_for_compare,
    assert_image_block_equal,
)
```

### `measurement_set` submodule

| Module | Purpose |
|---|---|
| `io.py` | I/O helpers to download test MeasurementSets (non-casacore dependent) |
| `checker.py` | Validation functions to check MSv4 data structures against expected specifications |
| `msv2_io.py` | MSv2-specific I/O operations and test data generators (casacore-dependent) |

These are Python modules and their functions are exported through the package's `__init__.py` for convenient access.

5 changes: 4 additions & 1 deletion src/xradio/testing/__init__.py
@@ -1,9 +1,12 @@
from xradio.testing.assertions import (
from .assertions import (
    assert_attrs_dicts_equal,
    assert_xarray_datasets_equal,
)
from . import image

__all__ = [
    "assert_attrs_dicts_equal",
    "assert_xarray_datasets_equal",
    # image sub-package (imported so external projects can use it)
    "image",
]
51 changes: 51 additions & 0 deletions src/xradio/testing/image/__init__.py
@@ -0,0 +1,51 @@
"""
Test utilities for xradio image functionality.

Usable in pytest, ASV benchmarks, and any other framework – no pytest
dependency is introduced by importing this package.

Examples
--------
>>> from xradio.testing.image import (
...     download_image,
...     download_and_open_image,
...     remove_path,
...     make_beam_fit_params,
...     create_empty_test_image,
...     create_bzero_bscale_fits,
...     scale_data_for_int16,
...     normalize_image_coords_for_compare,
...     assert_image_block_equal,
... )
"""

__all__ = [
    # IO helpers
    "download_image",
    "download_and_open_image",
    "remove_path",
    # Generators
    "make_beam_fit_params",
    "create_empty_test_image",
    "create_bzero_bscale_fits",
    "scale_data_for_int16",
    # Assertions / comparators
    "normalize_image_coords_for_compare",
    "assert_image_block_equal",
]

from xradio.testing.image.assertions import (
    assert_image_block_equal,
    normalize_image_coords_for_compare,
)
from xradio.testing.image.generators import (
    create_bzero_bscale_fits,
    create_empty_test_image,
    make_beam_fit_params,
    scale_data_for_int16,
)
from xradio.testing.image.io import (
    download_and_open_image,
    download_image,
    remove_path,
)
129 changes: 129 additions & 0 deletions src/xradio/testing/image/assertions.py
@@ -0,0 +1,129 @@
"""Image-specific assertion and comparison helpers.

All functions raise ``AssertionError`` on failure and are framework-agnostic,
so they work equally in pytest, unittest, and ASV benchmarks.
"""

from __future__ import annotations

from typing import Dict

import numpy as np
import xarray as xr


def normalize_image_coords_for_compare(
    coords: dict,
    factor: float = 180 * 60 / np.pi,
    direction_key: str = "direction0",
    spectral_key: str = "spectral2",
    direction_units: list[str] | None = None,
    vel_unit: str = "km/s",
) -> None:
    """Normalise a CASA image coordinate dict for round-trip comparison.

    When an image is written from an ``xr.Dataset`` and re-opened by
    casacore, direction coordinate values are stored in radians whereas
    the original CASA image stores them in arcminutes. This function
    converts the direction entries in *coords* by multiplying by *factor*
    (default: rad → arcmin) and sets the spectral velocity unit so the
    two dicts can be compared with
    :func:`~xradio.testing.assert_attrs_dicts_equal`.

    Modifies *coords* **in place**.

    Parameters
    ----------
    coords : dict
        Coordinate dict returned by ``casacore.images.image.info()["coordinates"]``
        or ``casacore.tables.table.getkeywords()["coords"]``.
    factor : float, optional
        Multiplicative scale applied to ``cdelt`` and ``crval`` of the
        direction sub-dict. Defaults to ``180 * 60 / π`` (radians → arcminutes).
    direction_key : str, optional
        Key of the direction coordinate entry in *coords*.
        Defaults to ``"direction0"``.
    spectral_key : str, optional
        Key of the spectral coordinate entry in *coords*.
        Defaults to ``"spectral2"``.
    direction_units : list of str or None, optional
        Unit strings written into the direction sub-dict after scaling.
        Defaults to ``["'", "'"]`` (arcminutes) when *None*.
    vel_unit : str, optional
        Velocity unit string written into ``coords[spectral_key]["velUnit"]``.
        Defaults to ``"km/s"``.
    """
    if direction_units is None:
        direction_units = ["'", "'"]
    direction = coords[direction_key]
    direction["cdelt"] *= factor
    direction["crval"] *= factor
    direction["units"] = direction_units
    coords[spectral_key]["velUnit"] = vel_unit


def assert_image_block_equal(
Collaborator

This method is implicitly for l,m (not u,v aperture) images only. Suggest renaming it to assert_sky_image_block_equal.

Contributor Author

@dmehring, the original code in test_image.py, lines 410:415, is:

    @staticmethod
    def _normalize_coords_for_compare(coords, factor):
        direction = coords["direction0"]
        direction["cdelt"] *= factor
        direction["crval"] *= factor
        direction["units"] = ["'", "'"]
        coords["spectral2"]["velUnit"] = "km/s"

Therefore the name given in the refactoring follows the intention of the original code. Do you still think we should rename it to assert_sky_image_block_equal?

Contributor Author

Kept the same name, but generalized both functions. If renaming is still preferable, let me know and I will apply it.

    xds: xr.Dataset,
    output_path: str,
    selection: Dict[str, slice],
    zarr: bool = False,
    do_sky_coords: bool = True,
) -> None:
    """Write an image, reload a spatial block, and assert equality with the
    corresponding slice of the original dataset.

    Workflow
    --------
    1. Write *xds* to *output_path*.
    2. Load the region specified by *selection* from the written image via
       :func:`~xradio.image.load_image`.
    3. Compute the equivalent slice of *xds* with ``isel``.
    4. Assert equality using
       :func:`~xradio.testing.assert_xarray_datasets_equal`.

    Parameters
    ----------
    xds : xr.Dataset
        Full image dataset to write and slice. Augment the dataset with
        any extra data variables (e.g. ``BEAM_FIT_PARAMS``) *before* calling
        this function if you want them included in the comparison.
    output_path : str
        Destination path for the written image. The path is overwritten if
        it already exists.
    selection : dict of str to slice
        Mapping of dimension name to ``slice`` that defines the block to load
        and compare. Every slice end must not exceed the corresponding
        dimension size in *xds*.
    zarr : bool, optional
        If *True* write in zarr format; otherwise write as a CASA image.
        Defaults to *False*.
    do_sky_coords : bool, optional
        Forwarded to :func:`~xradio.image.load_image` as ``do_sky_coords``.
        Defaults to *True*.

    Raises
    ------
    ValueError
        If any slice in *selection* exceeds the size of the corresponding
        dimension in *xds*.
    """
    from xradio.image import load_image, write_image
    from xradio.testing import assert_xarray_datasets_equal

    bad_dims = []
    for dim, slc in selection.items():
        size = xds.sizes.get(dim, 0)
        stop = slc.stop if slc.stop is not None else size
        if stop > size:
            bad_dims.append(f"{dim}: slice stop {stop} > size {size}")
    if bad_dims:
        raise ValueError(
            "assert_image_block_equal: selection exceeds dataset dimensions — "
            + ", ".join(bad_dims)
        )

    write_image(xds, output_path, out_format="zarr" if zarr else "casa", overwrite=True)

    loaded = load_image(output_path, selection, do_sky_coords=do_sky_coords)
    true_xds = xds.isel(**selection)
    assert_xarray_datasets_equal(loaded, true_xds)
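
The bounds check at the top of `assert_image_block_equal` can be tried in isolation. The sketch below lifts that loop into a hypothetical helper, `oversized_slices`, which takes a plain `dict` of dimension sizes in place of `xds.sizes`. It is an illustration of the guard, not a function provided by the package.

```python
def oversized_slices(sizes, selection):
    """Return one message per slice whose stop exceeds the dimension size."""
    problems = []
    for dim, slc in selection.items():
        size = sizes.get(dim, 0)  # unknown dimensions count as size 0
        stop = slc.stop if slc.stop is not None else size
        if stop > size:
            problems.append(f"{dim}: slice stop {stop} > size {size}")
    return problems


sizes = {"l": 10, "m": 10, "frequency": 2}
ok = oversized_slices(sizes, {"l": slice(2, 7), "m": slice(None)})
bad = oversized_slices(sizes, {"l": slice(2, 12), "frequency": slice(0, 3)})
print(ok)
print(bad)
```

An open-ended slice (`stop=None`) always passes, while a dimension missing from the dataset is treated as size 0, so any positive stop on it is reported.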
132 changes: 132 additions & 0 deletions src/xradio/testing/image/generators.py
@@ -0,0 +1,132 @@
"""Test-data generators for image tests.

All functions are framework-agnostic and can be used in pytest, ASV
benchmarks, or any other harness that imports ``xradio.testing.image``.
"""

from __future__ import annotations

import numpy as np
import xarray as xr


def make_beam_fit_params(xds: xr.Dataset) -> xr.DataArray:
    """Build a ``BEAM_FIT_PARAMS`` DataArray from an image Dataset.

    Creates a synthetic beam-parameter array whose shape is derived from
    the *time*, *frequency*, and *polarization* dimensions of *xds*.
    The first time–channel–polarization entry is set to 2.0; all others
    are 1.0.

    Parameters
    ----------
    xds : xr.Dataset
        An open image dataset with ``time``, ``frequency``, and
        ``polarization`` dimensions.

    Returns
    -------
    xr.DataArray
        Array with dims ``["time", "frequency", "polarization",
        "beam_params_label"]`` and coords inherited from *xds*.
    """
    shape = (
        xds.sizes["time"],
        xds.sizes["frequency"],
        xds.sizes["polarization"],
        3,
    )
    ary = np.ones(shape, dtype=np.float32)
    ary[0, 0, 0, :] = 2.0
    return xr.DataArray(
        data=ary,
        dims=["time", "frequency", "polarization", "beam_params_label"],
        coords={
            "time": xds.time,
            "frequency": xds.frequency,
            "polarization": xds.polarization,
            "beam_params_label": ["major", "minor", "pa"],
        },
    )


def create_empty_test_image(factory, do_sky_coords=None) -> xr.Dataset:
    """Call a ``make_empty_*`` factory with canonical test arguments.

    Provides a single set of standard test coordinates so every empty-image
    factory can be exercised with the same call.

    Parameters
    ----------
    factory : callable
        One of ``make_empty_sky_image``, ``make_empty_aperture_image``, or
        ``make_empty_lmuv_image``.
    do_sky_coords : bool or None, optional
        Forwarded as ``do_sky_coords`` keyword argument when not *None*.

    Returns
    -------
    xr.Dataset
        The empty image dataset produced by *factory*.
    """
    args = [
        [0.2, -0.5],  # phase_center
        [10, 10],  # image_size
        [np.pi / 180 / 60, np.pi / 180 / 60],  # cell_size
        [1.412e9, 1.413e9],  # frequency
        ["I", "Q", "U"],  # polarization
        [54000.1],  # time
    ]
    kwargs = {} if do_sky_coords is None else {"do_sky_coords": do_sky_coords}
    return factory(*args, **kwargs)
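
The `kwargs` line in `create_empty_test_image` means that `do_sky_coords=None` leaves the decision to the factory's own default instead of passing `None` through. A minimal sketch of that forwarding pattern follows. `fake_factory` and `call_factory` are hypothetical stand-ins, and the parameter names of the real `make_empty_*` factories are assumed here, not taken from their signatures.

```python
import math


def fake_factory(phase_center, image_size, cell_size, frequency,
                 polarization, time, do_sky_coords="factory default"):
    """Hypothetical stand-in for a make_empty_* factory: echoes what it got."""
    return {"image_size": image_size, "do_sky_coords": do_sky_coords}


def call_factory(factory, do_sky_coords=None):
    """Same forwarding pattern as create_empty_test_image."""
    arcmin = math.pi / 180 / 60
    args = [
        [0.2, -0.5],         # phase_center
        [10, 10],            # image_size
        [arcmin, arcmin],    # cell_size
        [1.412e9, 1.413e9],  # frequency
        ["I", "Q", "U"],     # polarization
        [54000.1],           # time
    ]
    # Only pass the keyword when the caller chose a value explicitly.
    kwargs = {} if do_sky_coords is None else {"do_sky_coords": do_sky_coords}
    return factory(*args, **kwargs)


default_result = call_factory(fake_factory)
explicit_result = call_factory(fake_factory, do_sky_coords=False)
print(default_result["do_sky_coords"], explicit_result["do_sky_coords"])
```

With this pattern, an explicit `False` still reaches the factory, whereas omitting the argument keeps whatever default the factory defines.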


def scale_data_for_int16(data: np.ndarray) -> np.ndarray:
    """Scale a float array to the int16 range for FITS BSCALE/BZERO testing.

    Replaces NaNs with zero, clips to ``[-32768, 32767]``, and casts to
    ``int16``.

    Parameters
    ----------
    data : np.ndarray
        Input floating-point array.

    Returns
    -------
    np.ndarray
        A new array of dtype ``int16``.
    """
    data = np.nan_to_num(data, nan=0.0)
    data = np.clip(data, -32768, 32767)
    return data.astype(np.int16)


def create_bzero_bscale_fits(
    outname: str, source_fits: str, bzero: float, bscale: float
) -> None:
    """Write a FITS file with explicit BSCALE/BZERO headers for guard testing.

    Reads pixel data from *source_fits*, scales it to the int16 range via
    :func:`scale_data_for_int16`, and writes a new FITS primary HDU to
    *outname* with the given BSCALE and BZERO header keywords.

    Parameters
    ----------
    outname : str
        Destination FITS file path.
    source_fits : str
        Source FITS file whose pixel data is used as the basis.
    bzero : float
        Value written to the ``BZERO`` header keyword.
    bscale : float
        Value written to the ``BSCALE`` header keyword.
    """
    from astropy.io import fits

    with fits.open(source_fits) as hdulist:
        data = scale_data_for_int16(hdulist[0].data)
    hdu = fits.PrimaryHDU(data=data)
    hdu.header["BSCALE"] = bscale
    hdu.header["BZERO"] = bzero
    hdu.writeto(outname, overwrite=True)