Unexpected behavior when reading some v3 arrays under certain conditions #47

@melissalinkert

While adding NGFF 0.5/Zarr v3 support to raw2ometiff, I am seeing unexpected behavior in one specific test, which writes Zarr v3 data using an odd chunk size and then reads it back in 32x32 tiles.

The test in question is https://github.com/glencoesoftware/raw2ometiff/pull/148/files#diff-808adb1ac51eccf983c3644c8a5e41c964fe1c8650c555ad7104a4a22eb4d040R508, but I can reproduce the same behavior with zarr-java alone using this test code:

$ cat OddTest.java
import java.nio.ByteBuffer;
import dev.zarr.zarrjava.store.FilesystemStore;
import dev.zarr.zarrjava.v3.Array;
import dev.zarr.zarrjava.v3.DataType;

public class OddTest {
  public static void main(String[] args) throws Exception {
    int imageWidth = 52;
    int imageHeight = 1;
    int chunkWidth = 17;
    Array input = Array.create(
            new FilesystemStore(args[0]).resolve("0"),
            Array.metadataBuilder()
                    .withShape(imageHeight, imageWidth)
                    .withDataType(DataType.UINT8)
                    .withChunkShape(imageHeight, chunkWidth)
                    .withFillValue(0)
                    .build()
    );

    byte[] buf = new byte[imageWidth];
    for (int i=0; i<buf.length; i++) {
      buf[i] = (byte) i;
    }
    ByteBuffer bytes = ByteBuffer.wrap(buf);
    ucar.ma2.Array data = ucar.ma2.Array.factory(ucar.ma2.DataType.BYTE, new int[]{imageHeight, imageWidth}, bytes);
    input.write(new long[] {0, 0}, data);

    long[] readPosition = new long[] {0, 0};
    int[] readSize = new int[] {1, 32};
    for (int tile=0; tile<buf.length; tile+=readSize[1]) {
      readPosition[1] = tile;
      readSize[1] = (int) Math.min(readSize[1], buf.length - readPosition[1]);
      ucar.ma2.Array readTile = input.read(readPosition, readSize);
      for (int i=0; i<readSize[1]; i++) {
        byte pixel = readTile.getByte(i);
        if (pixel != buf[tile + i]) {
          System.out.println("pixel #" + (tile + i) + " read " + pixel + ", wrote " + buf[tile + i]);
          System.out.println("  grid position (X) = " + readPosition[1]);
          System.out.println("  index in current tile = " + i);
        }
      }
    }
  }
}
$ java OddTest oddtest.zarr
pixel #36 read 0, wrote 36
  grid position (X) = 32
  index in current tile = 4
pixel #37 read 0, wrote 37
  grid position (X) = 32
  index in current tile = 5
pixel #38 read 0, wrote 38
  grid position (X) = 32
  index in current tile = 6
pixel #39 read 0, wrote 39
  grid position (X) = 32
  index in current tile = 7
pixel #40 read 0, wrote 40
  grid position (X) = 32
  index in current tile = 8
pixel #41 read 0, wrote 41
  grid position (X) = 32
  index in current tile = 9
pixel #42 read 0, wrote 42
  grid position (X) = 32
  index in current tile = 10
pixel #43 read 0, wrote 43
  grid position (X) = 32
  index in current tile = 11
pixel #44 read 0, wrote 44
  grid position (X) = 32
  index in current tile = 12
pixel #45 read 0, wrote 45
  grid position (X) = 32
  index in current tile = 13
pixel #46 read 0, wrote 46
  grid position (X) = 32
  index in current tile = 14
pixel #47 read 0, wrote 47
  grid position (X) = 32
  index in current tile = 15
pixel #48 read 0, wrote 48
  grid position (X) = 32
  index in current tile = 16
pixel #49 read 0, wrote 49
  grid position (X) = 32
  index in current tile = 17
pixel #50 read 0, wrote 50
  grid position (X) = 32
  index in current tile = 18

The data that is written to the array appears to be correct, which I can confirm by looking at each of the chunk files:

$ xxd oddtest.zarr/0/c/0/0 
00000000: 0001 0203 0405 0607 0809 0a0b 0c0d 0e0f  ................
00000010: 10                                       .
$ xxd oddtest.zarr/0/c/0/1
00000000: 1112 1314 1516 1718 191a 1b1c 1d1e 1f20  ............... 
00000010: 21                                       !
$ xxd oddtest.zarr/0/c/0/2
00000000: 2223 2425 2627 2829 2a2b 2c2d 2e2f 3031  "#$%&'()*+,-./01
00000010: 32                                       2
$ xxd oddtest.zarr/0/c/0/3
00000000: 3300 0000 0000 0000 0000 0000 0000 0000  3...............
00000010: 00                                       .
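
For comparison with the xxd output above, here is a minimal sketch that derives the bytes each chunk should contain from the test parameters alone. It assumes the chunks are stored as raw, uncompressed bytes (which the 17-byte files suggest) and that the partial edge chunk is padded with the fill value 0. The class and method names are hypothetical and do not come from zarr-java.

```java
public class ExpectedChunks {
  // Hypothetical helper: bytes expected in chunk `chunkIndex` of a 1-row uint8
  // array where pixel i holds the value i. Positions past the image edge keep
  // the fill value 0.
  static byte[] expectedChunk(int chunkIndex, int chunkWidth, int imageWidth) {
    byte[] chunk = new byte[chunkWidth]; // zero-initialized, matching fill value 0
    int start = chunkIndex * chunkWidth;
    for (int i = 0; i < chunkWidth && start + i < imageWidth; i++) {
      chunk[i] = (byte) (start + i);
    }
    return chunk;
  }

  public static void main(String[] args) {
    int imageWidth = 52;
    int chunkWidth = 17;
    int chunkCount = (imageWidth + chunkWidth - 1) / chunkWidth; // ceiling division
    for (int c = 0; c < chunkCount; c++) {
      StringBuilder hex = new StringBuilder();
      for (byte b : expectedChunk(c, chunkWidth, imageWidth)) {
        hex.append(String.format("%02x", b));
      }
      // Print in the same order as the chunk files c/0/0 .. c/0/3
      System.out.println("c/0/" + c + ": " + hex);
    }
  }
}
```

The hex strings printed for each chunk can be checked by eye against the corresponding xxd dump.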

I see the same behavior with both zarr-java 0.0.9 and a snapshot built from the latest commit on main (93b10bb). If I change chunkWidth on line 10 of the above OddTest.java to 15 or 16 and re-run, then all pixel values are read correctly.
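
To help narrow this down, here is a small sketch of the index arithmetic only: it reports which chunks each tile read overlaps for chunk widths 15, 16, and 17. It is not zarr-java's implementation, and the class and method names are hypothetical.

```java
public class ChunkOverlap {
  // Hypothetical helper: first and last chunk index touched by a read of
  // `length` pixels starting at `start`, for a given chunk width.
  static int[] overlappingChunks(long start, int length, int chunkWidth) {
    int first = (int) (start / chunkWidth);
    int last = (int) ((start + length - 1) / chunkWidth);
    return new int[] {first, last};
  }

  public static void main(String[] args) {
    int imageWidth = 52;
    int tileWidth = 32;
    for (int chunkWidth : new int[] {15, 16, 17}) {
      System.out.println("chunkWidth = " + chunkWidth);
      // Same tiling as the read loop in OddTest.java
      for (int tile = 0; tile < imageWidth; tile += tileWidth) {
        int length = Math.min(tileWidth, imageWidth - tile);
        int[] range = overlappingChunks(tile, length, chunkWidth);
        System.out.println("  read [" + tile + ", " + (tile + length)
            + ") touches chunks " + range[0] + " to " + range[1]);
      }
    }
  }
}
```

With chunkWidth = 17, the second read, [32, 52), touches chunks 1 to 3. The mismatched pixels (36 to 50) all lie in chunk 2, which holds pixels 34 to 50.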

I am definitely open to the possibility that there is something subtly wrong with how I have implemented the tile reading test, in which case any ideas would be greatly appreciated.
