
Key configuration: retry codes whitelist #367

@jetnet

Routine Checklist

  • I have checked for similar issues.
  • I have updated to the latest version.
  • I have read the README and confirmed that the current version does not meet my needs.
  • I understand and am willing to follow up on this issue, assist with testing, and provide feedback.
  • I understand and agree to the above, and I understand that the maintainers have limited time, so issues that do not follow the rules may be ignored or closed directly.

Description

Currently, the framework retries with every available key on any error response from the backend.
Even when the key is valid and no rate limit has been hit, if the client sends a misconfigured request (e.g. a wrong model name or incorrect parameters), gpt-load keeps retrying with other keys using the same invalid payload.

Desired behaviour

The framework should have an additional input field holding a comma-separated (CSV) list of backend response codes. Only responses with these codes would be retried with a different key, e.g.:

  • Section: "Settings" --> "Key Configuration" --> "Retry codes" (Retry on the following backend response codes only): 401, 403, 429
    • 401: Invalid or Missing API Key
    • 403: Forbidden or Insufficient Permissions
    • 429: Too Many Requests
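
To make the request concrete, here is a minimal sketch of how such a whitelist could be parsed and checked. The function names `parseRetryCodes` and `shouldRetryWithNextKey` are hypothetical and chosen for illustration only; they are not taken from the gpt-load codebase, and the actual setting format and retry loop may look different.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseRetryCodes turns a CSV setting such as "401, 403, 429" into a set of
// HTTP status codes. Hypothetical helper, not part of gpt-load.
func parseRetryCodes(csv string) (map[int]struct{}, error) {
	codes := make(map[int]struct{})
	for _, field := range strings.Split(csv, ",") {
		field = strings.TrimSpace(field)
		if field == "" {
			continue // tolerate empty entries and trailing commas
		}
		code, err := strconv.Atoi(field)
		if err != nil {
			return nil, fmt.Errorf("invalid retry code %q: %w", field, err)
		}
		if code < 100 || code > 599 {
			return nil, fmt.Errorf("retry code %d is not a valid HTTP status", code)
		}
		codes[code] = struct{}{}
	}
	return codes, nil
}

// shouldRetryWithNextKey reports whether a backend response with the given
// status should be retried with a different key. Any status outside the
// whitelist would be returned to the client as-is.
func shouldRetryWithNextKey(status int, whitelist map[int]struct{}) bool {
	_, ok := whitelist[status]
	return ok
}
```

With the whitelist `401, 403, 429`, a 400 response caused by a wrong model name would not be in the set, so it could be passed back to the client immediately instead of being replayed against every key.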

    Labels

    enhancement (New feature or request)
