-
Notifications
You must be signed in to change notification settings - Fork 0
feat(config): add backoff with jitter to config fetch retries #278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces a retry mechanism with backoff and jitter to the configuration fetching logic, aiming to improve resilience during repeated failures. However, the implementation uses quadratic backoff (n²) rather than the exponential backoff (2^n) described in the PR title and description.
Key Changes:
- Added a new
Backoffutility incommon/backoff.gothat implements quadratic backoff with jitter (not exponential as described) - Refactored
ConfigHandler.fetchLoopinconfig/config.goto retry failed fetches with increasing delays instead of waiting a fixed poll interval - Defined
maxRetryDelayconstant of 2 minutes to cap the maximum backoff duration
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| common/backoff.go | New utility implementing quadratic backoff with jitter; includes Backoff struct with Wait and Reset methods for managing retry delays |
| config/config.go | Integrated backoff into fetchLoop method to retry config fetches with progressive delays on failure, replacing the previous fixed-interval retry approach |
Critical Issue Identified: The refactored fetch loop contains a bug where context cancellation can cause an infinite tight loop, as the loop never checks for context cancellation when continuously failing (comment #6).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| package common | ||
|
|
||
| import ( | ||
| "context" | ||
| "math/rand/v2" | ||
| "time" | ||
| ) | ||
|
|
||
| const waitScale = 10 * time.Millisecond // scale factor for backoff timing | ||
|
|
||
| // Backoff implements an exponential backoff strategy with jitter. | ||
| type Backoff struct { | ||
| n int // number of consecutive failures | ||
| maxWait time.Duration | ||
| } | ||
|
|
||
| func NewBackoff(maxWait time.Duration) *Backoff { | ||
| return &Backoff{ | ||
| maxWait: maxWait, | ||
| } | ||
| } | ||
|
|
||
| // Wait waits for the appropriate backoff duration based on the number of consecutive failures or | ||
| // until the context is done. | ||
| func (b *Backoff) Wait(ctx context.Context) { | ||
| if ctx.Err() != nil { | ||
| return | ||
| } | ||
|
|
||
| b.n++ | ||
| // exponential backoff: waitScale * n^2, capped at maxWait | ||
| wait := waitScale * time.Duration(b.n*b.n) | ||
| wait = min(wait, b.maxWait) | ||
|
|
||
| // add jitter between 80% and 120% of wait time to avoid thundering herd | ||
| jitter := 0.8 + 0.4*rand.Float64() | ||
| wait = time.Duration(float64(wait) * jitter) | ||
| select { | ||
| case <-ctx.Done(): | ||
| case <-time.After(wait): | ||
| } | ||
| } | ||
|
|
||
| // Reset resets the backoff counter. | ||
| func (b *Backoff) Reset() { | ||
| b.n = 0 | ||
| } |
Copilot
AI
Jan 2, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new Backoff utility lacks test coverage. The common package has a pattern of testing utilities (e.g., common/deviceid/deviceid_test.go). Consider adding tests to verify the backoff behavior, including: the quadratic growth of delays, the jitter range (80-120%), the maxWait cap, proper context cancellation handling, and the Reset functionality.
This pull request introduces a quadratic backoff strategy with jitter to the configuration fetching logic, improving resilience against repeated failures and reducing the risk of overwhelming the server during outages. The main changes are the addition of a reusable backoff utility and its integration into the config fetch loop.
Backoff utility implementation:
Backoffstruct incommon/backoff.gothat implements quadratic backoff with jitter, including methods for waiting and resetting the backoff counter.Configuration fetch loop improvements:
ConfigHandler.fetchLoopmethod inconfig/config.go, so that on fetch failures, the system waits with increasing delays (up to a maximum) before retrying, and resets the delay after a successful fetch.maxRetryDelayconstant (2 minutes) inconfig/config.goto cap the maximum backoff duration.