Uploading large files (>=1GB) takes too long to stream to GCS #3
Open
Description
Problem
When a user submits a large file to the manager's endpoint, they have to wait a considerable amount of time before receiving the job's ID.
Solution
- Option 1: Delegate streaming to a background task and return the job's ID immediately. Drawback: although the manager can accept new requests right away, its memory usage will take a heavy hit if multiple background jobs run at the same time.
- Option 2: MapReduce. Distribute the file's chunks across multiple managers. Each manager computes a character frequency table for its assigned chunk and uploads that chunk to the correct bucket. At the end, the master manager collects these tables and merges them into a final table. Drawback: far more complex, with a lot of coordination between managers.
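To make the map and merge steps of Option 2 concrete, here is a minimal single-process sketch in Python. It is not the project's implementation: the function names (`split_into_chunks`, `count_chunk`, `merge_tables`, `frequency_table`) are hypothetical, a thread pool stands in for the separate managers, the GCS upload of each chunk is left out, and the tables count bytes rather than decoded characters.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Split data into consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def count_chunk(chunk: bytes) -> Counter:
    """Map step: build the frequency table for one chunk.

    Iterating over bytes yields integers, so the keys are byte values (0-255).
    In the proposal, this is the work a single manager would do for its chunk.
    """
    return Counter(chunk)


def merge_tables(tables) -> Counter:
    """Reduce step: sum the partial tables into the final table."""
    total = Counter()
    for table in tables:
        total.update(table)  # Counter.update adds counts, it does not overwrite
    return total


def frequency_table(data: bytes, chunk_size: int, workers: int = 4) -> Counter:
    """Count byte frequencies chunk by chunk, then merge the partial results."""
    chunks = split_into_chunks(data, chunk_size)
    # Assumption: worker threads stand in for the managers described above.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_tables = list(pool.map(count_chunk, chunks))
    return merge_tables(partial_tables)
```

Because addition is associative and commutative, the merged table does not depend on how the file is split or on the order in which partial tables arrive. Counting bytes sidesteps one coordination problem: a chunk boundary can fall in the middle of a multi-byte UTF-8 character, so counting decoded characters per chunk would need extra handling at the edges.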