Flatten tool uses a lot of memory and CPU when it's run on an XLSX file containing many blank rows.
To replicate:
- Follow instructions to download the 360 Giving schema: https://docs.opendataservices.coop/projects/flatten-tool/en/latest/usage-360/
- Use this example file: example-grants-with-blank-rows.xlsx
  - This file contains a grant at row 1, and another grant at row 1,000,000 with blank rows in between.
- Run `nice flatten-tool unflatten --schema 360-giving-schema.json --convert-titles --root-list-path='grants' -f xlsx -o out.json example-grants-with-blank-rows.xlsx`
  - probably a good idea to `nice` it in case it uses all your RAM (NOTE: May use 16GiB or more RAM)
- Open your favourite resource monitoring tool and watch the memory usage grow
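
If the attached example file is not to hand, a similar workbook can probably be generated with a few lines of Python. This is only a rough sketch and assumes `openpyxl` is installed. The function name `write_sparse_workbook`, the sheet name, the column titles and the grant values are all illustrative placeholders rather than the contents of the real example file, and it puts a header in row 1 with the first grant in row 2, which may differ from the attached file.

```python
from openpyxl import Workbook  # third-party: pip install openpyxl

# Illustrative column titles only; the real example file may use others.
HEADERS = ["Identifier", "Title", "Currency", "Amount Awarded"]


def write_sparse_workbook(path, last_row=1_000_000):
    """Write a sheet with one grant near the top and one at `last_row`.

    Every row in between is left empty. Returns the sheet's last used row.
    """
    wb = Workbook()
    ws = wb.active
    ws.title = "grants"  # assumed sheet name

    ws.append(HEADERS)  # row 1: header
    ws.append(["360G-example-0001", "First example grant", "GBP", 1000])  # row 2

    # Place the second (made-up) grant far down the sheet.
    second = ["360G-example-0002", "Second example grant", "GBP", 2000]
    for col, value in enumerate(second, start=1):
        ws.cell(row=last_row, column=col, value=value)

    wb.save(path)
    return ws.max_row


# Example use:
# write_sparse_workbook("example-grants-with-blank-rows.xlsx")
```

The saved file should stay small because only the non-empty cells are stored. I have not confirmed that a file generated this way triggers exactly the same behaviour as the attached one.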
We encountered this issue when a file containing many blank lines was uploaded to the 360 Giving DQT: flatten-tool consumed all the memory and brought the server offline.
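
Note that `nice` only lowers the CPU scheduling priority and does not limit memory, so anyone reproducing this may want a hard cap as well. Below is a minimal sketch using Python's standard `resource` and `subprocess` modules. It assumes a Unix-like system, and `RLIMIT_AS` is reliably enforced on Linux but may not be elsewhere. `run_with_memory_cap` is a hypothetical helper name, not part of flatten-tool.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(cmd, max_bytes):
    """Run `cmd` with a capped address space.

    Returns (exit code, peak resident set size among waited-for children).
    """

    def limit_memory():
        # Runs in the child just before exec: cap its virtual address space,
        # so large allocations fail instead of exhausting the machine's RAM.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    completed = subprocess.run(cmd, preexec_fn=limit_memory)

    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return completed.returncode, peak


# Example use (assumes flatten-tool is on PATH and the files exist):
# code, peak = run_with_memory_cap(
#     ["flatten-tool", "unflatten", "--schema", "360-giving-schema.json",
#      "--convert-titles", "--root-list-path=grants", "-f", "xlsx",
#      "-o", "out.json", "example-grants-with-blank-rows.xlsx"],
#     max_bytes=2 * 1024**3,  # 2 GiB, an arbitrary choice
# )
```

With a cap in place the process should fail with a non-zero exit code once it hits the limit, rather than taking the machine down with it.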