
Rate limiting #64

@bastianolea

Description

I've moved from running mall locally with ollama to using models provided by GitHub Copilot, but any request I make gets rate limited almost instantly, making it unusable.

Is there something I need to consider or configure? I'm only trying to summarize 24 rows, yet I hit an 8-second rate limit almost right away, followed by 50-second rate limits before it returns nothing.

options(.mall_chat = ellmer::chat_github(model = "mistral-ai/mistral-medium-2505"))

resumen <- datos |> 
  distinct(variable, tema) |> 
  mall::llm_summarize(tema, max_words = 7, pred_name = "tema_resumen")

It also seems the problem is caused by the requests being performed in parallel. Is there any way to change this?

Error in `mutate()`:
ℹ In argument: `tema_resumen = llm_vec_summarize(x = tema, max_words = max_words,
  additional_prompt = additional_prompt)`.
Caused by error in `req_perform_parallel()`:
! HTTP 429 Too Many Requests.
ℹ Rate limit of 24 per 60s exceeded for UserByMinute. Please wait 0 seconds before retrying.
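For now I've been able to work around it with batching. This is only a sketch, assuming the 24-requests-per-minute limit from the error message holds and that each row produces one request; `datos`, `variable`, and `tema` are from the example above, and the batch size of 20 is an arbitrary margin below the limit.

```r
library(dplyr)

# Workaround sketch: summarize in batches of at most 20 rows, pausing
# 60 seconds between batches so each batch stays under the reported
# 24-requests-per-minute limit.
datos_unicos <- datos |>
  distinct(variable, tema)

batches <- datos_unicos |>
  mutate(.batch = ceiling(row_number() / 20)) |>
  group_split(.batch)

resumen <- bind_rows(lapply(seq_along(batches), function(i) {
  if (i > 1) Sys.sleep(60)  # wait out the rate-limit window
  batches[[i]] |>
    mall::llm_summarize(tema, max_words = 7, pred_name = "tema_resumen")
})) |>
  select(-.batch)
```

This avoids the 429s but is obviously slow, so a built-in way to throttle or disable the parallel requests would still be much better.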

Metadata

Assignees

No one assigned

    Labels

    enhancement (new feature or request)
