generated from isamplesorg/python_template
Open
Description
GeoParquet is a compressed spatial data format that is convenient for consumers and is becoming widely supported.
The task here is to enable GeoParquet as an export file format for iSamples.
Tooling for creating GeoParquet is still changing, but the following approach worked for me (there are likely optimizations that could be made).
- Retrieve the records as JSON Lines
- Load the JSON Lines records into geopandas
- Export from geopandas to GeoParquet
This worked for me (I could not determine whether it requires loading the entire dataset into memory for processing, which may be an issue if this runs on the server):
import pandas as pd
import geopandas as gpd

src = "smithsonian"
json_src = f"{src}.jsonl"
# Read the JSON Lines records into a DataFrame.
with open(json_src, "r") as json_file:
    df = pd.read_json(json_file, lines=True)
# Build point geometries from the longitude/latitude columns (WGS 84).
gdf = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(
        df.producedBy_samplingSite_location_longitude,
        df.producedBy_samplingSite_location_latitude,
    ),
    crs="EPSG:4326",
)
# Write the GeoDataFrame out as GeoParquet.
gdf.to_parquet(f"{src}_geo.parquet")
I think dependencies were:
pip install pandas
pip install geopandas
pip install geoarrow-pyarrow geoarrow-pandas