DataSetBlock constructor takes some params with padding applied but also takes in the padding value #645

@yousefmoazzam

The padding applied to the block data is provided to the constructor:

def __init__(
    self,
    data: np.ndarray,
    aux_data: AuxiliaryData,
    slicing_dim: Literal[0, 1, 2] = 0,
    block_start: int = 0,
    chunk_start: int = 0,
    global_shape: Optional[Tuple[int, int, int]] = None,
    chunk_shape: Optional[Tuple[int, int, int]] = None,
    padding: Tuple[int, int] = (0, 0),
):

However, some of the other index-related values are expected to already have the padding included in them, as the docstring for the padding parameter notes:

padding: Tuple[int, int]
    Padding information - holds the number of padded slices before and after the core area of the block,
    in slicing dimension. If not given, no padding is assumed.
    Note that the padding information should be added to the data's shape, i.e. block_start, chunk_start,
    chunk_shape, and the data's shape includes the padded slices. And therefore block_start or chunk_start
    may have negative values of up to -padding[0]. The global_shape is not adapted for padding.

It's confusing that the caller needs to know which values to pad and which to leave unpadded. The DataSetBlock constructor essentially forces the caller to know which values it expects to be padded, which feels like an unnecessary burden on the caller.
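
To make that burden concrete, here is a minimal sketch of the bookkeeping a caller ends up doing today. The helper name `padded_args_for_current_constructor` is hypothetical and not part of the codebase. The arithmetic assumes a literal reading of the docstring above, in which block_start and chunk_start are each shifted back by padding[0] and chunk_shape grows by padding[0] + padding[1] along the slicing dimension; the real conventions may differ in detail.

```python
from typing import Dict, Tuple, Union


def padded_args_for_current_constructor(
    core_block_start: int,
    core_chunk_start: int,
    core_chunk_shape: Tuple[int, int, int],
    padding: Tuple[int, int],
    slicing_dim: int = 0,
) -> Dict[str, Union[int, Tuple[int, ...]]]:
    """Hypothetical caller-side helper that applies the padding convention by hand.

    Assumption (from the docstring): block_start, chunk_start and chunk_shape
    include the padded slices, while global_shape is left untouched.
    """
    before, after = padding

    # chunk_shape grows along the slicing dimension only.
    chunk_shape = list(core_chunk_shape)
    chunk_shape[slicing_dim] += before + after

    return {
        # Both start indices move back by the leading padding, so they
        # can become negative (down to -padding[0]).
        "block_start": core_block_start - before,
        "chunk_start": core_chunk_start - before,
        "chunk_shape": tuple(chunk_shape),
        # The padding itself is still passed on as well.
        "padding": padding,
    }
```

The caller has to remember that global_shape is the one value that must not be adjusted, and the data array it passes must already contain the padded slices.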

A simpler and less confusing approach would be for the caller to pass:

  • the unpadded values
  • the padding

and for the DataSetBlock constructor to apply the padding to whichever values need it in order to work correctly. Callers would then no longer have to know these details, which are probably internal and shouldn't be their concern.
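
A rough sketch of what that could look like, using a hypothetical `UnpaddedBlockSpec` class as a stand-in for the index-related constructor arguments (it is not the real DataSetBlock, and it leaves out data and aux_data). It assumes the padded values are derived by shifting the start indices back by padding[0] and growing chunk_shape by padding[0] + padding[1] along the slicing dimension, which is my reading of the current docstring rather than a confirmed description of the internals.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class UnpaddedBlockSpec:
    """Hypothetical: the caller passes unpadded values plus the padding."""

    block_start: int
    chunk_start: int
    chunk_shape: Tuple[int, int, int]
    padding: Tuple[int, int] = (0, 0)
    slicing_dim: int = 0

    def __post_init__(self) -> None:
        # Inputs are unpadded, so negative values are never valid here.
        if self.block_start < 0 or self.chunk_start < 0:
            raise ValueError("unpadded start indices must be non-negative")
        if min(self.padding) < 0:
            raise ValueError("padding must be non-negative")

    # The padded values are derived internally, in one place.
    @property
    def padded_block_start(self) -> int:
        return self.block_start - self.padding[0]

    @property
    def padded_chunk_start(self) -> int:
        return self.chunk_start - self.padding[0]

    @property
    def padded_chunk_shape(self) -> Tuple[int, int, int]:
        shape = list(self.chunk_shape)
        shape[self.slicing_dim] += self.padding[0] + self.padding[1]
        return (shape[0], shape[1], shape[2])


spec = UnpaddedBlockSpec(
    block_start=4, chunk_start=0, chunk_shape=(10, 20, 30), padding=(2, 3)
)
print(spec.padded_block_start, spec.padded_chunk_start, spec.padded_chunk_shape)
```

With this shape of API the rule about which values carry padding lives in one class, and a caller passing a negative start index is rejected up front instead of silently being interpreted as padded.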
