Background
At the moment, Cloud Pipeline keeps the full history of runs. In some cases, users do not want to keep failed, stopped, or duplicated jobs.
Approach
- An automated process (separate deployment) runs on a schedule (daily by default) and implements the following logic:
  - Gets all runs from the last start date/time that are younger than N days (default: 30, configurable)
  - Filters runs by run status (configurable), e.g. selects all failed/stopped runs
  - Deletes the following assets:
    - Output data of the run: all paths defined by "output"-type parameters and/or by specific parameter names (configurable)
    - Archives the run itself, so that it does not disappear from billing (configurable, default: false)
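
The logic above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Cloud Pipeline implementation: the `Run` and `CleanupConfig` types, the status names `FAILURE` and `STOPPED`, the parameter layout (name mapped to a type/value pair), and all function names are hypothetical. The sketch only builds a cleanup plan and does not touch any storage or API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Run:
    # Hypothetical run record, not the real Cloud Pipeline model.
    run_id: int
    status: str
    start_date: datetime
    # Assumed layout: parameter name -> (parameter type, value)
    parameters: dict = field(default_factory=dict)


@dataclass
class CleanupConfig:
    # All option names and defaults below are assumptions for illustration.
    max_age_days: int = 30                                   # "N days"
    statuses: frozenset = frozenset({"FAILURE", "STOPPED"})  # statuses to clean up
    output_param_names: frozenset = frozenset()              # extra parameter names
    archive_run: bool = False                                # archive the run itself


def select_runs(runs, config, now):
    """Keep runs younger than N days whose status is configured for cleanup."""
    cutoff = now - timedelta(days=config.max_age_days)
    return [
        run for run in runs
        if run.start_date >= cutoff and run.status in config.statuses
    ]


def output_paths(run, config):
    """Collect paths from "output"-type parameters and configured parameter names."""
    return [
        value
        for name, (param_type, value) in run.parameters.items()
        if param_type == "output" or name in config.output_param_names
    ]


def plan_cleanup(runs, config, now):
    """Return one plan entry per selected run; nothing is deleted here."""
    return [
        {
            "run_id": run.run_id,
            "delete_paths": output_paths(run, config),
            "archive": config.archive_run,
        }
        for run in select_runs(runs, config, now)
    ]
```

Separating planning from deletion keeps the selection rules easy to test and allows a dry-run mode before any data is removed.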