014 - Allowing Downloads Of Filtered Data #415
eveleighoj
started this conversation in
Open design proposal
Status
Draft
Intro
The Planning Data platform currently offers two ways to get data: downloading an entire dataset, or using the API to filter down to only the relevant entities. The API is limited to 500 rows, is expected to return in a timely fashion, and only supports json or geojson formats. This presents a problem when we want to provide custom download links, for example for a single organisation's data as a csv.
To facilitate this we are proposing two additional changes:
Detail
This is the new suggested container diagram. It adds the download lambda, which will be responsible for retrieving the filtered data and converting it to the relevant format. The response is expected to be a file download for the user, so it doesn't need to deliver the entire file instantly.
Creation of parquet files
We will need to alter the current baking process that takes place when we create 'flattened' representations of the datasets from the entity data, so that it also produces parquet files. This code is run via the collection task container shown below.
This is reliant on the digital-land-python Python package, so changes will be required across both the digital-land-python and collection-task repos.
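As a rough illustration of the extra baking step, the sketch below rewrites a flattened file as parquet using duckdb. It assumes the flattened representation is a csv file and that duckdb is used for the conversion; neither is confirmed above. The function names and paths are hypothetical and are not part of digital-land-python.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling any single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_parquet_copy_sql(csv_path: str, parquet_path: str) -> str:
    """Build a duckdb COPY statement that rewrites a csv file as parquet."""
    return (
        f"COPY (SELECT * FROM read_csv_auto({sql_literal(csv_path)})) "
        f"TO {sql_literal(parquet_path)} (FORMAT PARQUET)"
    )


def write_parquet(csv_path: str, parquet_path: str) -> None:
    """Run the conversion (requires the third-party duckdb package)."""
    import duckdb  # imported lazily so the SQL builder has no dependencies

    con = duckdb.connect()  # in-memory database, used only as a query engine
    try:
        con.execute(build_parquet_copy_sql(csv_path, parquet_path))
    finally:
        con.close()
```

Keeping the SQL builder separate from the execution step means the statement can be checked in a unit test without touching any files.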
Download Lambda Container
The new container will need to be added. The suggested repo has been created and can be found here. Its aim is to use duckdb to query files in s3.
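A minimal sketch of what the query step inside the lambda might look like, assuming the data is stored as parquet in s3 and that filters are simple column equality checks. The function names, the bucket path and the `organisation-entity` column used in the usage note are illustrative assumptions, not taken from the suggested repo. S3 credential setup is left out.

```python
import csv
import io
import re


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling any single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_filter_query(parquet_path: str, filter_columns: list[str]) -> str:
    """Build a parameterised SELECT over one parquet file."""
    for column in filter_columns:
        # Column names are interpolated, so only allow a conservative pattern.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_-]*", column):
            raise ValueError(f"unexpected column name: {column!r}")
    sql = f"SELECT * FROM read_parquet({sql_literal(parquet_path)})"
    if filter_columns:
        sql += " WHERE " + " AND ".join(f'"{c}" = ?' for c in filter_columns)
    return sql


def rows_to_csv(columns, rows) -> str:
    """Serialise a header row and data rows as csv text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


def fetch_filtered_csv(parquet_path: str, filters: dict[str, str]) -> str:
    """Query the file with duckdb and return csv text.

    Requires the third-party duckdb package. This version builds the whole
    result in memory; a real lambda would more likely stream it.
    """
    import duckdb

    con = duckdb.connect()
    try:
        if parquet_path.startswith("s3://"):
            # The httpfs extension lets duckdb read s3:// paths directly.
            con.execute("INSTALL httpfs")
            con.execute("LOAD httpfs")
        sql = build_filter_query(parquet_path, list(filters))
        cursor = con.execute(sql, list(filters.values()))
        columns = [d[0] for d in cursor.description]
        return rows_to_csv(columns, cursor.fetchall())
    finally:
        con.close()
```

A call such as `fetch_filtered_csv("s3://example-bucket/dataset.parquet", {"organisation-entity": "123"})` would return only that organisation's rows as csv. Filter values are passed as query parameters rather than interpolated into the SQL.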
Alerts & Monitoring
Sentry
Can Sentry monitor lambda functions? This could help identify unhandled problems with the running of the lambda function.
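Sentry's Python SDK does include an AWS Lambda integration, so one possible setup is sketched below. The `SENTRY_DSN` environment variable name and the placeholder handler body are assumptions for illustration; how the DSN would actually be supplied to the lambda is not decided here.

```python
import os


def init_sentry() -> bool:
    """Enable Sentry if a DSN is configured; return whether it was enabled."""
    dsn = os.environ.get("SENTRY_DSN")  # assumed variable name
    if not dsn:
        return False

    # Third-party package, imported only when Sentry is actually configured.
    import sentry_sdk
    from sentry_sdk.integrations.aws_lambda import AwsLambdaIntegration

    sentry_sdk.init(dsn=dsn, integrations=[AwsLambdaIntegration()])
    return True


# Run once at import time so it happens on each lambda cold start.
init_sentry()


def handler(event, context):
    """Placeholder handler; unhandled exceptions raised here would be reported."""
    return {"statusCode": 200, "body": "ok"}
```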
AWS metrics
AWS Alarms
CI & CD
Security
Performance & Scalability
Testing
Duckdb can be difficult to test when combined with s3. This is because the Python API calls the local duckdb instance, which accesses s3 itself, so Python cannot easily mock the objects that duckdb reads.
There are two approaches:
Tests required: