Description
Hi there,
I have recently been working on a custom exporting feature in our project. The idea is that a user can export custom (and often rather large) datasets from the GUI as an Excel file. I am able to stream the data directly from the database through knex, modify whatever I need with through2, and finally create an Excel file with xlsx. We are still talking streams here, all the way.
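For context, here is a rough sketch of that pipeline. The table name, the per-row tweak, and the `sheetStream` writable are illustrative placeholders (not the xlsx library's actual API), just to show where the streaming happens:

```js
const through2 = require('through2');
const knex = require('knex')({ client: 'pg', connection: process.env.DATABASE_URL });

// knex can stream query results row by row instead of buffering the full result set
const rowStream = knex('exports').select('*').stream();

// reshape each row on the fly before it is serialized into the worksheet
const reshape = through2.obj(function (row, _enc, done) {
  row.exported_at = new Date().toISOString(); // example per-row modification
  done(null, row);
});

// sheetStream stands in for whatever writable ends up as xl/worksheets/sheet1.xml
rowStream.pipe(reshape).pipe(sheetStream);
```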
However, I then hit a wall when I dug through the library and found out that, under the hood, an .xlsx file is just a zip archive that is created using the archiver library. Now, seemingly, archiver is able to process incoming streams:
```js
archive.append(this.sheetStream, { name: 'xl/worksheets/sheet1.xml' });
```

However, it seems that until the stream is ended, the entry event is not emitted and the whole file is somehow multiplied and buffered in memory (see the stripped-down sketch of the wiring after the list below). This means two things as a result:
- When exporting larger datasets (50+ MB), RAM usage reaches a multiple of the file size, in the low hundreds of megabytes for a single export
- I am unable to stream the file creation directly to the user; they have to wait until all the data is exported and the zip archive is created before the resulting archive can be transferred. This is a bit misleading, but the reason for it is the nature of the Excel file: there is one large file (99.9% of the volume) and a dozen small ones.
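For reference, this is roughly the wiring involved, stripped down to the essentials; `res` and `sheetStream` are placeholders here, and the comments describe what I am observing rather than documented behaviour:

```js
const archiver = require('archiver');

const archive = archiver('zip', { zlib: { level: 6 } });

// ideally the zip bytes would flow straight to the HTTP response as they are produced
archive.pipe(res);

// the worksheet entry is appended while its source stream is still being written to
archive.append(sheetStream, { name: 'xl/worksheets/sheet1.xml' });

archive.on('entry', (entry) => {
  // in practice this only fires once sheetStream has ended,
  // by which point the data has already been buffered in memory
  console.log('entry written:', entry.name);
});

archive.finalize();
```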
My question is simple: is there even a possible solution for this specific case? Obviously, if I could read the zip archive while it is being created (before the stream is ended), that would solve my issue. But I presume that the nature of the whole compression process does not allow for this.