
InternalServerError from OpenAI API is not retried in error handler #24

@asaddcorp123

Description

Problem

When the OpenAI API returns an InternalServerError (via litellm), the proxy's error handler (handle_llm_exception in llm.py) does not recognize the exception as retryable. The error is wrapped as UnknownLLMError and the request fails immediately, instead of being retried like other transient server errors.
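
For context, here is a minimal sketch of how the handler presumably dispatches exceptions today. The RetryConstantError and UnknownLLMError classes are inferred from the behavior described above, not copied from llm.py:

import openai


class RetryConstantError(Exception):
    """Tells the caller to retry the request at a constant interval."""


class UnknownLLMError(Exception):
    """Catch-all for exceptions the handler does not recognize."""


def handle_llm_exception(e: Exception):
    # litellm.InternalServerError is not in this tuple, so it falls through
    # to the catch-all below and the request fails without a retry.
    if isinstance(
        e,
        (
            openai.error.APIError,
            openai.error.TryAgain,
            openai.error.Timeout,
            openai.error.ServiceUnavailableError,
        ),
    ):
        raise RetryConstantError from e
    raise UnknownLLMError from e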

Example

This curl request triggered the problem:

curl -X POST \
  https://api.openai.com/v1/chat/completions \
  -H 'Authorization: Bearer sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxx' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-5-2025-08-07",
    "messages": [
      {
        "role": "user",
        "content": "What is 1 + 1?"
      }
    ],
    "extra_body": {
      "service_tier": "flex"
    }
  }'

Error Trace

litellm.InternalServerError: InternalServerError: OpenAIException - The server had an error while processing your request. Sorry about that!
stack trace: Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/openai/openai.py", line 823, in acompletion
    headers, response = await self.make_openai_chat_completion_request(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/litellm_core_utils/logging_utils.py", line 190, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/openai/openai.py", line 454, in make_openai_chat_completion_request
    raise e
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/openai/openai.py", line 436, in make_openai_chat_completion_request
    await openai_aclient.chat.completions.with_raw_response.create(
  File "/usr/local/lib/python3.11/site-packages/openai/_legacy_response.py", line 381, in wrapped
    return cast(LegacyAPIResponse[R], await func(*args, **kwargs))

Expected

InternalServerError from litellm should be recognized as transient and retried by the proxy, just like openai.error.APIError and similar exceptions.

Suggested Fix

Update handle_llm_exception in llm.py to catch litellm.exceptions.InternalServerError in the same retry logic as other server errors. Example:

import openai
import litellm.exceptions


def handle_llm_exception(e: Exception):
    if isinstance(
        e,
        (
            openai.error.APIError,
            openai.error.TryAgain,
            openai.error.Timeout,
            openai.error.ServiceUnavailableError,
            litellm.exceptions.InternalServerError,  # Add this: treat litellm 500s as retryable
        ),
    ):
        raise RetryConstantError from e
    # ... rest of the function unchanged
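
For completeness, here is a hypothetical illustration of how the completion call could retry on RetryConstantError once the exception is classified correctly. The backoff decorator and its parameters are assumptions for the sketch, not the project's actual retry code:

import backoff
import litellm


@backoff.on_exception(backoff.constant, RetryConstantError, max_tries=3, interval=5)
async def acompletion_with_retries(**kwargs):
    # A litellm.InternalServerError raised here would be re-raised by
    # handle_llm_exception as RetryConstantError, which backoff then retries.
    try:
        return await litellm.acompletion(**kwargs)
    except Exception as e:
        handle_llm_exception(e)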

Labels

bug, error-handling
