Open
Description
Overarching goal: A user should be able to see the status of Jobs as they run in the background.
Definitions:
A Job is all of the work involved in loading the data from an input CSV into the database.
Objectives:
- Jobs should run in the background and should not force the user to stay on the upload page while they run (currently, the user is forced to wait)
- When a user submits a file for transformation, they should be redirected to a page that lists the running jobs and the stage each one is currently in, along with a link to the Google Cloud Storage folder for each stage that has completed
- This page should update in real time
- The stages are:
  - Save initial data - The input data is saved exactly as it is received
  - Transform initial data - The input data is transformed and split into one file per table to update
  - Augment transformed data - The transformed data is augmented with existing database data so that each row of transformed data matches a row in the database
  - Check augmented data - Each row in the augmented data is checked to see if it already exists in the database. Conflicts are removed
  - Load non-conflicting augmented data
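
One possible way to model the stages and per-job status for the status page, as a minimal sketch only. The names `Stage`, `Job`, `complete_stage`, and `status_row` are hypothetical and not part of the existing codebase, and the folder link is assumed to be a plain string supplied by whatever code writes the stage output.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Stage(Enum):
    """The five stages of a Job, in the order they run."""
    SAVE_INITIAL = "Save initial data"
    TRANSFORM = "Transform initial data"
    AUGMENT = "Augment transformed data"
    CHECK = "Check augmented data"
    LOAD = "Load non-conflicting augmented data"


# Enum iteration follows definition order, so this is the run order.
STAGE_ORDER: List[Stage] = list(Stage)


@dataclass
class Job:
    job_id: str
    # Stage currently running; None once every stage has finished.
    current_stage: Optional[Stage] = STAGE_ORDER[0]
    # Maps each completed stage to the storage folder holding its output.
    completed: Dict[Stage, str] = field(default_factory=dict)

    def complete_stage(self, folder_url: str) -> None:
        """Record the current stage as done and move to the next one."""
        if self.current_stage is None:
            raise ValueError("job has already finished")
        self.completed[self.current_stage] = folder_url
        index = STAGE_ORDER.index(self.current_stage)
        if index + 1 < len(STAGE_ORDER):
            self.current_stage = STAGE_ORDER[index + 1]
        else:
            self.current_stage = None

    def status_row(self) -> dict:
        """Return the data one row of the status page would display."""
        return {
            "job_id": self.job_id,
            "current_stage": self.current_stage.value if self.current_stage else "Done",
            "completed": [
                {"stage": stage.value, "folder": url}
                for stage, url in self.completed.items()
            ],
        }
```

For example, after `job = Job("job-1")` and `job.complete_stage("gs://example-bucket/job-1/save")` (a placeholder bucket path), `job.status_row()` reports "Transform initial data" as the current stage and lists the saved folder under the completed stages. How the page fetches these rows in real time (polling, websockets, or something else) is left open here.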
Negative scenario:
- When a job has an error, the stage where the error occurred should be indicated clearly along with a summary of the error
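
A rough sketch of how the failing stage and an error summary could be captured, assuming each stage is a plain callable that raises on failure. `JobResult` and `run_stages` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class JobResult:
    failed_stage: Optional[str] = None    # name of the stage that raised, if any
    error_summary: Optional[str] = None   # short description to show the user
    last_completed: Optional[str] = None  # last stage that finished successfully


def run_stages(stages: List[Tuple[str, Callable[[], None]]]) -> JobResult:
    """Run stages in order, stopping at the first one that raises."""
    result = JobResult()
    for name, step in stages:
        try:
            step()
        except Exception as exc:
            # Record where the job failed and a one-line summary of why.
            result.failed_stage = name
            result.error_summary = f"{type(exc).__name__}: {exc}"
            return result
        result.last_completed = name
    return result
```

With this shape, the status page can highlight `failed_stage` and show `error_summary` next to it, while stages before the failure still appear as completed.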