Empty section handling #50

@DanW97

I've noticed that PyVista writes every possible section (Verts, Lines, Strips, Polys) to PolyData files, even when a section is empty, and I can't see a sensible way to prevent this.

As an example, this is a dataset with the following attributes: NumberOfPoints="6" NumberOfVerts="6" NumberOfLines="0" NumberOfStrips="0" NumberOfPolys="0"

In binary:

      <Verts>
        <DataArray type="Int64" Name="connectivity" format="binary" RangeMin="0" RangeMax="5">
          AQAAAACAAAAwAAAAGAAAAA==eJxjYIAARijNBKWZoTQLlGaF0gABSAAQ
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="binary" RangeMin="1" RangeMax="6">
          AQAAAACAAAAwAAAAGAAAAA==eJxjZIAAJijNDKVZoDQrlGaD0gAB8AAW
        </DataArray>
      </Verts>
      <Lines>
        <DataArray type="Int64" Name="connectivity" format="binary" RangeMin="1e+299" RangeMax="-1e+299">
          AAAAAACAAAAAAAAA
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="binary" RangeMin="1e+299" RangeMax="-1e+299">
          AAAAAACAAAAAAAAA
        </DataArray>
      </Lines>
      <Strips>
        <DataArray type="Int64" Name="connectivity" format="binary" RangeMin="1e+299" RangeMax="-1e+299">
          AAAAAACAAAAAAAAA
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="binary" RangeMin="1e+299" RangeMax="-1e+299">
          AAAAAACAAAAAAAAA
        </DataArray>
      </Strips>
      <Polys>
        <DataArray type="Int64" Name="connectivity" format="binary" RangeMin="1e+299" RangeMax="-1e+299">
          AAAAAACAAAAAAAAA
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="binary" RangeMin="1e+299" RangeMax="-1e+299">
          AAAAAACAAAAAAAAA
        </DataArray>
      </Polys>

And in ASCII, where the same sections are visibly empty (I'm not certain what the AAAAAACAAAAAAAAA entries in the binary file mean):

      <Verts>
        <DataArray type="Int64" Name="connectivity" format="ascii" RangeMin="0" RangeMax="5">
          0 1 2 3 4 5
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="ascii" RangeMin="1" RangeMax="6">
          1 2 3 4 5 6
        </DataArray>
      </Verts>
      <Lines>
        <DataArray type="Int64" Name="connectivity" format="ascii" RangeMin="1e+299"
          RangeMax="-1e+299">
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="ascii" RangeMin="1e+299" RangeMax="-1e+299">
        </DataArray>
      </Lines>
      <Strips>
        <DataArray type="Int64" Name="connectivity" format="ascii" RangeMin="1e+299"
          RangeMax="-1e+299">
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="ascii" RangeMin="1e+299" RangeMax="-1e+299">
        </DataArray>
      </Strips>
      <Polys>
        <DataArray type="Int64" Name="connectivity" format="ascii" RangeMin="1e+299"
          RangeMax="-1e+299">
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="ascii" RangeMin="1e+299" RangeMax="-1e+299">
        </DataArray>
      </Polys>
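
On the AAAAAACAAAAAAAAA entries in the binary snippet: decoding the base64 by hand suggests they are a compressed-data header that announces zero blocks. The sketch below checks this, assuming the file uses header_type="UInt32" and little-endian byte order (neither attribute appears in the snippets above). The names decode_base64, parse_header and CompressedHeader are made up for illustration and are not part of the crate.

```rust
/// Decode standard base64 into bytes, stopping at the first `=` padding
/// character. Returns None on a character outside the base64 alphabet.
fn decode_base64(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buf: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of valid bits currently in `buf`
    for c in input.bytes() {
        let v = match c {
            b'A'..=b'Z' => c - b'A',
            b'a'..=b'z' => c - b'a' + 26,
            b'0'..=b'9' => c - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            b'=' => break, // padding: no more data in this chunk
            _ => return None,
        };
        buf = (buf << 6) | u32::from(v);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1 << bits) - 1; // keep only the leftover low bits
        }
    }
    Some(out)
}

/// Header of a compressed data array.
/// ASSUMPTION: UInt32 header words, little-endian.
#[derive(Debug, PartialEq)]
struct CompressedHeader {
    num_blocks: u32,
    block_size: u32,
    last_block_size: u32,
    compressed_sizes: Vec<u32>, // one entry per block
}

/// Parse the three fixed header words plus `num_blocks` compressed sizes.
/// Returns None if the input is too short for what the header announces.
fn parse_header(bytes: &[u8]) -> Option<CompressedHeader> {
    let words: Vec<u32> = bytes
        .chunks_exact(4)
        .map(|w| u32::from_le_bytes([w[0], w[1], w[2], w[3]]))
        .collect();
    if words.len() < 3 {
        return None;
    }
    let num_blocks = words[0];
    let end = 3usize.checked_add(num_blocks as usize)?;
    let compressed_sizes = words.get(3..end)?.to_vec();
    Some(CompressedHeader {
        num_blocks,
        block_size: words[1],
        last_block_size: words[2],
        compressed_sizes,
    })
}
```

Under those assumptions, AAAAAACAAAAAAAAA parses to num_blocks = 0, block_size = 32768, last_block_size = 0 with no compressed sizes, so there is no payload to decompress. The Verts header AQAAAACAAAAwAAAAGAAAAA== parses to one block with last_block_size = 48 (six Int64 values) and a compressed size of 24 bytes.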

Running in debug mode, I see that line 1900 of xml.rs attempts a subtraction that overflows. In release mode the overflow is silent, which doesn't seem like a good thing to me, so I'm wondering what the best approach to handling this case is. I've added a couple of lines to xml.rs that make decompress() return early with a vector containing a single u8 set to 0. As far as I can tell, no other code needs modification. In quick testing with the example files that the snippets above come from, it appears to work as intended: the output is, for instance, polys: Some(XML { connectivity: [], offsets: [] }).
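
A minimal sketch of the kind of guard described here, assuming the overflow comes from computing num_blocks - 1 when the header reports zero blocks (the expression at line 1900 is not quoted in this issue, so this is a guess). The helper uncompressed_len is hypothetical and is not the code in xml.rs. It also assumes that a last_block_size of 0 means there is no partial block.

```rust
/// Total uncompressed length implied by a compressed-array header.
/// Hypothetical helper for illustration only.
///
/// Returns None when the header is inconsistent or the arithmetic would
/// overflow, instead of panicking (debug) or wrapping (release).
fn uncompressed_len(
    num_blocks: usize,
    block_size: usize,
    last_block_size: usize,
) -> Option<usize> {
    // ASSUMPTION: last_block_size == 0 means every block is a full block.
    // This also covers the empty-section case (num_blocks == 0).
    if last_block_size == 0 {
        return num_blocks.checked_mul(block_size);
    }
    // A partial block exists, so there must be at least one block.
    // With plain `num_blocks - 1`, num_blocks == 0 panics in debug builds
    // and wraps to usize::MAX in release builds.
    let full_blocks = num_blocks.checked_sub(1)?;
    full_blocks
        .checked_mul(block_size)?
        .checked_add(last_block_size)
}
```

With the header values decoded from the empty sections (0 blocks, block size 32768, last block size 0) this yields Some(0), so a caller could return an empty buffer without touching the decompressor. Using the checked_* methods keeps debug and release builds behaving the same way.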

My reasoning is that when a read is requested, we should read everything that we correctly can and return the exact contents of the file (unless invalid data is present). Full disclosure: I'm in two minds about whether this should be an error or whether it should be left to the user to deal with empty sections. I'm leaning towards the latter, because an empty section doesn't necessarily seem like invalid data to me.
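
If empty sections are returned as-is, a user who would rather not see them can filter them out after the read. The sketch below shows one way to do that. Cells and Topology are simplified stand-ins invented for this example and are not the crate's actual types.

```rust
/// Simplified stand-in for one topology section (hypothetical type).
#[derive(Debug, Default, Clone, PartialEq)]
struct Cells {
    connectivity: Vec<u64>,
    offsets: Vec<u64>,
}

/// Simplified stand-in for the four PolyData sections (hypothetical type).
#[derive(Debug, Default, PartialEq)]
struct Topology {
    verts: Option<Cells>,
    lines: Option<Cells>,
    strips: Option<Cells>,
    polys: Option<Cells>,
}

/// Turn `Some(empty section)` into `None`; leave everything else untouched.
fn drop_empty(section: Option<Cells>) -> Option<Cells> {
    section.filter(|c| !(c.connectivity.is_empty() && c.offsets.is_empty()))
}

impl Topology {
    /// User-side clean-up: the reader reports exactly what the file holds,
    /// and the caller decides whether empty sections matter.
    fn without_empty_sections(self) -> Topology {
        Topology {
            verts: drop_empty(self.verts),
            lines: drop_empty(self.lines),
            strips: drop_empty(self.strips),
            polys: drop_empty(self.polys),
        }
    }
}
```

Applied to the dataset above, this would keep verts and turn lines, strips and polys into None. The reader stays faithful to the file and the decision to drop empty sections stays with the caller.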

I'd be interested to hear your thoughts on how this kind of case should be handled.
