sending err back to fetch to trigger backoff retry#2136

Merged
aswanidutt merged 15 commits into main from aswanidutt/VAULT-38829-Vault-Agent-retrybackoff-config on Mar 4, 2026

Conversation

@aswanidutt
Collaborator

@aswanidutt aswanidutt commented Feb 19, 2026

Issue: A rotating secret that has a rotation_period but returns ttl=0 should not be treated as a rotating secret. Instead, the read should be retried with exponential backoff using the default retry_config.

What Changed:

  • Modified leaseCheckWait() to return (time.Duration, error) instead of just time.Duration
  • Added error for TTL=0 detection
  • When a rotating secret returns ttl=0, the error propagates to trigger the existing retry mechanism
  • Updated VaultReadQuery and VaultWriteQuery to handle and propagate the error
  • Added comprehensive test coverage for TTL=0 scenarios

Fixes: VAULT-38829 (https://hashicorp.atlassian.net/browse/VAULT-38829)

Before this change: the secret was checked for rotation every second.

image

Tested by adding a custom retry config in agent.hcl:


vault {
  address = "http://127.0.0.1:8200"
  
  retry {
    attempts = 0          # Number of retry attempts (0 = unlimited)
  }
}
auto_auth {
  method "token_file" {
    min_backoff = "250ms"    # Minimum backoff time between retries
    max_backoff = "3m"       # Maximum backoff time between retries
    config = {
      token_file_path = "./auto-auth"
    }
  }
}

After the code change: exponential backoff of 250ms, 500ms, 1s, 2s, 4s, 8s, 16s, 32s, 64s, up to a maximum of 5m.

image

@aswanidutt aswanidutt requested a review from tvoran February 24, 2026 15:16
Member

@tvoran tvoran left a comment

This is looking pretty good to me, just some minor suggestions in the test.

aswanidutt and others added 2 commits February 26, 2026 10:58
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
@aswanidutt aswanidutt requested a review from tvoran February 26, 2026 16:59

func TestVaultRenewDuration(t *testing.T) {
renewable := Secret{LeaseDuration: 100, Renewable: true}
renewableDur := leaseCheckWait(&renewable).Seconds()
Member

I'm curious why the .Seconds() was dropped here? That seems to have forced a lot of changes unrelated to the goal of this PR that I'm not sure are necessary?

At this point I'd say either set this back to .Seconds(), or apply the suggestions I made to the rest of TestVaultRenewDuration().

Collaborator Author

@aswanidutt aswanidutt Feb 26, 2026

.Seconds() was dropped because leaseCheckWait now returns two values (a duration and an error), so the method call can no longer be chained directly onto it. I've applied your suggestions to the rest of TestVaultRenewDuration().

@aswanidutt aswanidutt requested a review from tvoran February 27, 2026 16:24
Member

@tvoran tvoran left a comment

Minor suggestion in the test, but otherwise 👍

renewableDur := leaseCheckWait(&renewable).Seconds()
if renewableDur < 16 || renewableDur >= 34 {
t.Fatalf("renewable duration is not within 1/6 to 1/3 of lease duration: %f", renewableDur)
renewableDur, _ := leaseCheckWait(&renewable)
Member

It's probably still a good idea to check the value of err whenever leaseCheckWait() is called, just to be safe.
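
A rough sketch of what that could look like, written as a plain function rather than a real test so it runs on its own. The `leaseCheckWait` here is a hypothetical stand-in that returns a fixed quarter of the lease (the real function is not shown in this thread), and `checkRenewable` is a made-up helper name.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Secret is a hypothetical, trimmed-down stand-in for the real secret type.
type Secret struct {
	LeaseDuration int // TTL in seconds
	Renewable     bool
}

// leaseCheckWait is a hypothetical stand-in: it returns a quarter of the
// lease duration, or an error when the lease duration is not positive.
func leaseCheckWait(s *Secret) (time.Duration, error) {
	if s.LeaseDuration <= 0 {
		return 0, errors.New("lease duration is zero")
	}
	return time.Duration(s.LeaseDuration) * time.Second / 4, nil
}

// checkRenewable mirrors the shape of the test assertion: check err first,
// then call .Seconds() on the unpacked duration and check the range.
func checkRenewable(s *Secret) error {
	dur, err := leaseCheckWait(s)
	if err != nil {
		return fmt.Errorf("unexpected error from leaseCheckWait: %w", err)
	}
	secs := dur.Seconds()
	if secs < 16 || secs >= 34 {
		return fmt.Errorf("renewable duration is not within 1/6 to 1/3 of lease duration: %f", secs)
	}
	return nil
}

func main() {
	fmt.Println(checkRenewable(&Secret{LeaseDuration: 100, Renewable: true}))
	fmt.Println(checkRenewable(&Secret{LeaseDuration: 0, Renewable: true}))
}
```

Checking err before using the duration keeps a failing call from silently passing a zero value into the range assertion.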

@aswanidutt aswanidutt changed the title from "sending err back to fetch to trigger retry" to "sending err back to fetch to trigger backoff retry" on Mar 2, 2026
Collaborator

@Manishakumari-hc Manishakumari-hc left a comment

LGTM

@aswanidutt aswanidutt merged commit 52cd7e5 into main Mar 4, 2026
54 checks passed
@aswanidutt aswanidutt deleted the aswanidutt/VAULT-38829-Vault-Agent-retrybackoff-config branch March 4, 2026 17:33
3 participants