@@ -94,6 +94,37 @@ Bounding box is defined as the thrift struct below in the representation of
9494min/max value pair of coordinates from each axis. Note that X and Y Values are
9595always present. Z and M are omitted for 2D geospatial instances.
9696
97+ Writers should follow the guidelines below when calculating bounding boxes in
98+ the presence of invalid values. An invalid geospatial value refers to any of
99+ the following: `NaN`, `null`, `does not exist` (e.g., LINESTRING EMPTY), or
100+ `out of bounds` (e.g., `x < -180` or `x > 180` for `GEOGRAPHY` types):
101+
102+ * X and Y: Skip any value where X or Y is invalid and continue processing the
103+ remaining X or Y values. Do not produce a bounding box if all X or all Y
104+ values are invalid.
105+
106+ * Z: Skip any Z value that is invalid and continue processing the remaining
107+ Z values. Omit Z from the bounding box if all Z values are invalid.
108+
109+ * M: Skip any M value that is invalid and continue processing the remaining
110+ M values. Omit M from the bounding box if all M values are invalid.
111+
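The writer guidelines above can be sketched roughly as follows. This is a minimal illustration, not the reference implementation: `is_valid` and `compute_bbox` are hypothetical names, coordinates are assumed to arrive as `(x, y, z, m)` tuples with `None` for an absent dimension, each axis is assumed to be filtered independently (one possible reading of the X and Y rule), and antimeridian wraparound is ignored.

```python
import math

def is_valid(value, bounds=None):
    """Hypothetical helper: True if value is not None, not NaN, and within bounds (if given)."""
    if value is None or math.isnan(value):
        return False
    if bounds is not None:
        lo, hi = bounds
        return lo <= value <= hi
    return True

def compute_bbox(coords, geography=False):
    """Hypothetical helper: per-axis (min, max) over valid values, or None if X or Y has none.

    coords is an iterable of (x, y, z, m) tuples. Assumption: each axis is
    filtered on its own, so an invalid X does not discard the Y of the same
    coordinate. Wraparound (xmin > xmax) is not handled in this sketch.
    """
    # Range checks apply only to GEOGRAPHY types in this sketch.
    axis_bounds = {"x": (-180.0, 180.0), "y": (-90.0, 90.0)} if geography else {}
    ranges = {}
    for coord in coords:
        for axis, value in zip("xyzm", coord):
            if not is_valid(value, axis_bounds.get(axis)):
                continue  # skip the invalid value, keep processing the rest
            lo, hi = ranges.get(axis, (value, value))
            ranges[axis] = (min(lo, value), max(hi, value))
    if "x" not in ranges or "y" not in ranges:
        return None  # all X or all Y values were invalid: no bounding box
    return ranges  # Z and M keys are simply absent when no valid value was seen
```

A dictionary is used here only so that an omitted Z or M range is represented by a missing key, mirroring the optional fields of the struct.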
112+ Readers should follow the guidelines below when examining bounding boxes:
113+
114+ * No bounding box: No assumptions can be made about the presence or absence
115+ of invalid values. Readers may need to load all individual coordinate
116+ values for validation.
117+
118+ * A bounding box is present:
119+ * X and Y: X and Y of the bounding box must be present. Readers should
120+ proceed using these values.
121+ * Z: If the Z values of the bounding box are missing, readers should make no
122+ assumptions about invalid values and may need to load individual
123+ coordinates for validation.
124+ * M: If the M values of the bounding box are missing, readers should make no
125+ assumptions about invalid values and may need to load individual
126+ coordinates for validation.
127+
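On the reader side, the guidelines above amount to treating a missing bounding box, or a missing Z or M range, as "unknown" rather than "all null". A minimal sketch follows, assuming the bounding box is exposed as a dictionary of per-axis `(min, max)` pairs (a hypothetical representation) and that the query is a simple range filter. `may_match_range` is an illustrative name, and the sketch is meant for Y, Z, and M only because it does not handle X wraparound.

```python
def may_match_range(bbox, axis, query):
    """Hypothetical helper: False only when the statistics prove that no
    value on `axis` can fall inside `query`, given as (lo, hi)."""
    if bbox is None or axis not in bbox:
        # No bounding box, or this dimension was omitted: make no
        # assumptions and fall back to reading the coordinates.
        return True
    lo, hi = bbox[axis]
    q_lo, q_hi = query
    # Two closed intervals overlap when each starts before the other ends.
    return lo <= q_hi and q_lo <= hi
```

Returning `True` for missing statistics is the conservative choice: the reader loses a pruning opportunity but never skips data it cannot reason about.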
97128For the X values only, xmin may be greater than xmax. In this case, an object
98129in this bounding box may match if it contains an X such that `x >= xmin` OR
99130`x <= xmax`. This wraparound occurs only when the corresponding bounding box
@@ -104,19 +135,6 @@ crosses the antimeridian line. In geographic terminology, the concepts of `xmin`
104135For `GEOGRAPHY` types, X and Y values are restricted to the canonical ranges of
105136[-180, 180] for X and [-90, 90] for Y.
106137
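The X wraparound rule described above can be illustrated with a small containment check. This is only a sketch: `x_in_bbox` is a hypothetical name, and it tests a single X value rather than a full geometry.

```python
def x_in_bbox(x, xmin, xmax):
    """Hypothetical helper: test one X value against a possibly wrapping X range."""
    if xmin <= xmax:
        # Ordinary case: a single closed interval.
        return xmin <= x <= xmax
    # Wraparound case (box crosses the antimeridian): the range is the
    # union of [xmin, 180] and [-180, xmax].
    return x >= xmin or x <= xmax
```

For example, with `xmin = 170` and `xmax = -170`, both `175` and `-175` fall inside the box while `0` does not.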
107- To produce ` GeospatialStatistics ` , writers must omit zmin and zmax if and
108- only if there are zero non-NaN Z values in the column chunk, and must omit mmin
109- and mmax if and only if there are zero non-NaN M values. The bounding box must
110- be omitted entirely if and only if there are zero non-NaN X values or zero
111- non-NaN Y values in the column chunk. If Z or M values are missing, the writer
112- may still include a bounding box using only the available dimensions.
113-
114- Readers may interpret the absence of a bounding box, zmin/zmax, or mmin/mmax as
115- an indication that all corresponding values are null, and may use this
116- information to skip data during predicate evaluation. For example, a reader may
117- skip a row group if the bounding box is absent, indicating that all X and Y
118- coordinates are null.
119-
120138``` thrift
121139struct BoundingBox {
122140 1: required double xmin;