Temporary fix to narrow the scope of cached TaskRuns to prevent OOMs #336

Draft
mshaposhnik wants to merge 3 commits into konflux-ci:main from mshaposhnik:caches_fix

Conversation

@mshaposhnik mshaposhnik commented Nov 7, 2024

We are seeing OOMs in production, with the mpc container consuming over 8 GB of RAM. This is presumably due to the high number of TaskRuns on the cluster (>10,000, all of which we cache).
This PR limits the cached TaskRuns to just those belonging to MPC.

Possible fix for https://issues.redhat.com/browse/KFLUXINFRA-887

Signed-off-by: Max Shaposhnyk <mshaposh@redhat.com>
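
The idea in the description, caching only the TaskRuns that belong to MPC, can be sketched without any Kubernetes dependencies. This is a minimal model, not the PR's actual diff: the `TaskRun` struct, the `FilteredCache` type, and the label key `example.com/managed-by` are all hypothetical stand-ins. If the controller is built on controller-runtime (an assumption; the diff is not shown here), the equivalent is usually a label selector set per object type in the manager's cache options.

```go
package main

import "fmt"

// TaskRun is a simplified, hypothetical stand-in for the Tekton TaskRun type.
type TaskRun struct {
	Name   string
	Labels map[string]string
}

// FilteredCache stores only objects whose labels contain every
// key/value pair in selector.
type FilteredCache struct {
	selector map[string]string
	items    map[string]TaskRun
}

// NewFilteredCache returns an empty cache restricted by selector.
func NewFilteredCache(selector map[string]string) *FilteredCache {
	return &FilteredCache{selector: selector, items: map[string]TaskRun{}}
}

// matches reports whether tr carries every label required by the selector.
func (c *FilteredCache) matches(tr TaskRun) bool {
	for k, want := range c.selector {
		if got, ok := tr.Labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// Add stores tr if it matches the selector and reports whether it was cached.
func (c *FilteredCache) Add(tr TaskRun) bool {
	if !c.matches(tr) {
		return false
	}
	c.items[tr.Name] = tr
	return true
}

// Len returns the number of cached objects.
func (c *FilteredCache) Len() int { return len(c.items) }

func main() {
	// Hypothetical label; the label actually used by the PR is not shown in this thread.
	cache := NewFilteredCache(map[string]string{"example.com/managed-by": "mpc"})
	cache.Add(TaskRun{Name: "mpc-build", Labels: map[string]string{"example.com/managed-by": "mpc"}})
	cache.Add(TaskRun{Name: "unrelated", Labels: map[string]string{"app": "other"}})
	fmt.Println(cache.Len()) // prints 1
}
```

With more than 10,000 TaskRuns on the cluster, memory then scales with the number of matching objects rather than with the cluster total.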
@mshaposhnik changed the title from "Temporary fix to norrow the scope of cached TaskRuns to prevent OOMs" to "Temporary fix to narrow the scope of cached TaskRuns to prevent OOMs" on Nov 7, 2024

@ifireball ifireball left a comment

I'm trying to understand what would be the downside of doing this. Worst case - if we need to reconcile a task that is not in the cache, then we end up with slower reconciliation for it?
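
The "slower reconciliation" scenario in this question can be modeled as a reader that tries the narrowed cache first and falls back to a live read on a miss. This is a sketch under assumptions: `Reader`, `mapReader`, and `FallbackReader` are hypothetical names, and whether the controller's client actually falls back to the API server on a cache miss (rather than reporting the object as not found) is not stated in this thread.

```go
package main

import "fmt"

// Reader is a hypothetical lookup interface shared by the cache and the live API.
type Reader interface {
	Get(name string) (string, bool)
}

// mapReader is an in-memory Reader used here to stand in for both backends.
type mapReader map[string]string

func (m mapReader) Get(name string) (string, bool) {
	v, ok := m[name]
	return v, ok
}

// FallbackReader serves from Cache when possible and otherwise performs a
// (slower) live read, counting how often that happens.
type FallbackReader struct {
	Cache     Reader
	Live      Reader
	LiveReads int
}

func (r *FallbackReader) Get(name string) (string, bool) {
	if v, ok := r.Cache.Get(name); ok {
		return v, true
	}
	r.LiveReads++
	return r.Live.Get(name)
}

func main() {
	live := mapReader{"mpc-build": "Running", "unrelated": "Succeeded"}
	cached := mapReader{"mpc-build": "Running"} // narrowed cache holds MPC objects only
	r := &FallbackReader{Cache: cached, Live: live}

	r.Get("mpc-build") // served from the cache
	r.Get("unrelated") // cache miss, costs one live read
	fmt.Println(r.LiveReads) // prints 1
}
```

Without such a fallback, objects outside the narrowed cache would simply be invisible to the reconciler, which is a different failure mode from slower reconciliation.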

filariow commented Jul 22, 2025

To further reduce memory usage, we could also consider using a TransformFunc to trim resources before storing them in the cache, for example by removing managedFields, ownerReferences, and unneeded annotations.
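
The trimming suggested above can be illustrated with a stdlib-only model. This is a minimal sketch, not the controller's code: `Meta` and `TrimForCache` are hypothetical, and the annotation keys are made up. In client-go, a transform function takes the object as it arrives from the API server and returns what should be stored; recent controller-runtime versions also ship a ready-made `cache.TransformStripManagedFields()` helper for the managedFields part.

```go
package main

import "fmt"

// Meta is a simplified, hypothetical stand-in for Kubernetes object metadata.
type Meta struct {
	Name            string
	ManagedFields   []string
	OwnerReferences []string
	Annotations     map[string]string
}

// TrimForCache returns a copy of m that keeps the name and only the
// annotations listed in keep. ManagedFields and OwnerReferences are dropped.
func TrimForCache(m Meta, keep []string) Meta {
	out := Meta{Name: m.Name}
	for _, k := range keep {
		if v, ok := m.Annotations[k]; ok {
			if out.Annotations == nil {
				out.Annotations = map[string]string{}
			}
			out.Annotations[k] = v
		}
	}
	return out
}

func main() {
	full := Meta{
		Name:            "mpc-build",
		ManagedFields:   []string{"manager-a", "manager-b"},
		OwnerReferences: []string{"PipelineRun/build"},
		Annotations: map[string]string{
			"example.com/needed": "yes",           // hypothetical key the reconciler reads
			"example.com/bulky":  "large payload", // hypothetical key it never reads
		},
	}
	slim := TrimForCache(full, []string{"example.com/needed"})
	fmt.Println(len(slim.ManagedFields), len(slim.OwnerReferences), len(slim.Annotations)) // prints 0 0 1
}
```

One caveat on the design: dropping ownerReferences is only safe if nothing downstream, such as owner-based event mapping, reads them from the cached copy.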


4 participants