PARQUET-2417: Add statistics support to geometry logical type #2971
wgtmac merged 101 commits into apache:master from zhangfengcdt:feature-apache-parquet-2417-geospatial
This PR is copied from #1379.
…e spherical edge is specified.
…apache-parquet-2417-geospatial
wgtmac left a comment
Thanks for the update! I have left some comments. I think we are reaching the finish line!
Review comment on the test `public void testEPSG4326BasicReadWriteGeometryValue() throws Exception`:
Thanks for adding these tests!
I think we are missing tests for the following cases:
- verify geometry type metadata is well preserved.
- verify all kinds of geometry stats are preserved, including bbox, covering and geometry types.
- verify geo stats in the column index have been generated.
I can do these later.
Thanks! I will take a look after I am back from vacation.
- add valid flag to BoundingBox impl
- check geostatistics validity when writing to parquet
- add tests for mix of valid and invalid geometry update
Tested the following cases after the changes:
- Merging valid stats with null stats results in invalid stats with null components
- Merging null stats with valid stats also results in invalid stats with null components
- Merging valid stats with partially null stats (null bounding box) results in the expected behavior where only the bounding box becomes null
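For readers following the thread, the merge cases listed above can be sketched roughly as below. This is only an illustration under stated assumptions: `GeoStatsMergeSketch`, `Box`, `Stats`, `merge` and `isValid` are hypothetical names, not the classes added in this PR, the sketch tracks only a 2D bounding box and a set of geometry type codes, and treating stats as valid when at least one component is present is an assumption made here.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the merge cases; not the parquet-java implementation.
public class GeoStatsMergeSketch {

  /** Hypothetical 2D bounding box (assumption: the real class tracks more dimensions). */
  static final class Box {
    final double xMin, xMax, yMin, yMax;

    Box(double xMin, double xMax, double yMin, double yMax) {
      this.xMin = xMin;
      this.xMax = xMax;
      this.yMin = yMin;
      this.yMax = yMax;
    }

    /** Smallest box that covers both this box and the other one. */
    Box union(Box other) {
      return new Box(
          Math.min(xMin, other.xMin), Math.max(xMax, other.xMax),
          Math.min(yMin, other.yMin), Math.max(yMax, other.yMax));
    }
  }

  /** Hypothetical stats holder; a null component means "unknown, omit it". */
  static final class Stats {
    final Box bbox;
    final Set<Integer> geometryTypes;

    Stats(Box bbox, Set<Integer> geometryTypes) {
      this.bbox = bbox;
      this.geometryTypes = geometryTypes;
    }

    /** Assumption: stats are usable if at least one component survived. */
    boolean isValid() {
      return bbox != null || geometryTypes != null;
    }
  }

  /** Merges two stats objects; a null operand invalidates every component. */
  static Stats merge(Stats a, Stats b) {
    if (a == null || b == null) {
      return new Stats(null, null);
    }
    // A component stays known only if it is known on both sides.
    Box box = (a.bbox == null || b.bbox == null) ? null : a.bbox.union(b.bbox);
    Set<Integer> types = null;
    if (a.geometryTypes != null && b.geometryTypes != null) {
      types = new HashSet<>(a.geometryTypes);
      types.addAll(b.geometryTypes);
    }
    return new Stats(box, types);
  }
}
```

In this sketch the order of the operands does not matter, which matches the first two cases, and a missing bounding box on one side drops only the bounding box, which matches the third.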
@wgtmac can you review again? Thank you!
…ry, omit it from stats"
@wgtmac Thank you for the very helpful review comments! I believe I’ve addressed them all and have also added and updated some tests. Could you please take another look and let me know if I missed anything or if further changes are needed? Thanks again!
wgtmac left a comment
Thanks for the quick update! Now it generally looks great.
Thanks for the review! The fixes should be all in now.
I just merged it. Thanks @zhangfengcdt for working on this and everyone for the review!
This PR implements the geo types that are introduced in the new parquet specification.
apache/parquet-format@94b9d63