feat: add configurable rerank timeout (rerankTimeoutMs) #371
ggzeng wants to merge 1 commit into CortexReach:master
The cross-encoder rerank request timeout was hardcoded at 5 seconds, which is too aggressive for self-hosted rerank services that may need 6-7 seconds to respond. This adds a `rerankTimeoutMs` option to `RetrievalConfig` and raises the default to 10 seconds.
- Add `rerankTimeoutMs` field to `RetrievalConfig` interface
- Use `config.rerankTimeoutMs ?? 10_000` instead of hardcoded 5000
- Backwards compatible: existing configs without the field get 10s default
Hey @ggzeng, thanks for the contribution! Just a heads-up: PR #356 by @jlin53882 was submitted about 11 hours earlier and implements the same feature. One notable difference: your PR raises the default timeout from 5s to 10s, while #356 keeps the existing 5s default. The 10s default is arguably a better choice for self-hosted endpoints, but that's a design decision worth discussing in one place. Could you coordinate with #356? If the higher default is important to you, it might be simplest to suggest that change on #356 rather than maintaining two parallel PRs for the same feature. If your approach differs in a meaningful way I'm missing, happy to hear more! Thanks!
Hi @ggzeng, thank you for taking the time to submit this PR! Making After reviewing both contributions, we're going to close this in favor of #356, which was submitted a bit earlier and covers the same feature more comprehensively: it adds the JSON Schema definition in That said, your PR raises a valid point that #356 doesn't address: the default timeout value. A 10s default may be more appropriate for self-hosted reranker setups where latency is higher. We'd love it if you could leave a comment on #356 suggesting they consider raising the default; that feedback would be genuinely useful there. You can follow along here: #356 Thanks again for the contribution, and we hope to see you around the project!
Summary
The cross-encoder rerank request timeout was hardcoded at 5 seconds, which is too aggressive for self-hosted rerank services (e.g. HuggingFace TEI via Infinity) that may need 6-7 seconds to respond.
Changes
- Add `rerankTimeoutMs` optional field to `RetrievalConfig` interface
- Replace hardcoded `5000` with `config.rerankTimeoutMs ?? 10_000`
Usage
{ "retrieval": { "rerank": "cross-encoder", "rerankTimeoutMs": 20000 } }
Backwards Compatibility
Fully backwards compatible. Existing configs without `rerankTimeoutMs` will get the new 10s default.
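For illustration, here is a minimal sketch of how the fallback described in this PR could behave. It is not the actual diff: the trimmed `RetrievalConfig` shape, the `resolveRerankTimeoutMs` function, and the `withRerankTimeout` helper are hypothetical names introduced here, and only `rerankTimeoutMs` and the `?? 10_000` default come from the PR description.

```typescript
// Hypothetical, trimmed-down shape; the project's real RetrievalConfig has more fields.
interface RetrievalConfig {
  rerank?: string;
  /** Rerank request timeout in milliseconds (optional). */
  rerankTimeoutMs?: number;
}

// Default taken from the PR description (previously a hardcoded 5000).
const DEFAULT_RERANK_TIMEOUT_MS = 10_000;

// `??` falls back only when the field is null or undefined,
// so any explicitly configured number is kept as-is.
function resolveRerankTimeoutMs(config: RetrievalConfig): number {
  return config.rerankTimeoutMs ?? DEFAULT_RERANK_TIMEOUT_MS;
}

// Hypothetical helper: runs a request with an AbortSignal that fires
// once the resolved timeout elapses, and always clears the timer.
async function withRerankTimeout<T>(
  config: RetrievalConfig,
  request: (signal: AbortSignal) => Promise<T>,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(
    () => controller.abort(),
    resolveRerankTimeoutMs(config),
  );
  try {
    return await request(controller.signal);
  } finally {
    clearTimeout(timer);
  }
}
```

With this sketch, a config that omits the field resolves to 10000 ms, while the usage example above resolves to 20000 ms. The `request` callback is expected to pass the signal on to whatever HTTP client performs the rerank call, for example `fetch(url, { signal })`.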