feat: Control maximum concurrent redis requests to avoid pool exhaustion#112

Merged
dolfim-ibm merged 3 commits into main from feat-gated-redis
Mar 25, 2026

@dolfim-ibm
Member

No description provided.

…ion (#6)

* add initial ray_fair orchestrator

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* implementation with ray serve

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix serialization

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* more serialization fixes

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* cannot msgpack the DocumentStream

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* hardening notifier

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* cleanup raydata param and add log level

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* cleanup params and implement object store memory

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add mtls

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* more logging

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* more logging

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* launch all tasks

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* rename params

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update docs

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix creation of redis pools

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix: Watchdog: update the RQ job status to FAILED and remove it from StartedJobRegistry (#107)

* fix: Watchdog: update the RQ job status to FAILED and remove it from StartedJobRegistry

Signed-off-by: Pawel Rein <pawel.rein@prezi.com>

* fix formatter/linter

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Pawel Rein <pawel.rein@prezi.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>

* add metadata for orchestrator

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fixes

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add dispatch state

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* rename workers to actors

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* rename fair_ray to ray

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* more rename

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix dispatch vs running

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add redis manager to the actors

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix running metrics

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix setting running

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* actor cleanup

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* feat: Expose classification filters for picture description (#105)

Preserve legacy picture description filters

Signed-off-by: drk <drukpa1455@gmail.com>

* feat: add on_result_fetched() no-op lifecycle hook to BaseOrchestrator

* feat: add consumed_ttl and on_result_fetched() to RQOrchestrator

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add consumed_ttl and on_result_fetched() to LocalOrchestrator

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add expire_result() to RedisStateManager

This method sets a TTL on an existing result key in Redis, enabling
crash-safe single-use deletion of results after they are fetched.
Implements test-driven development with unit test verification.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add consumed_ttl and on_result_fetched() to RayOrchestrator

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: add connect() guard to expire_result matching peer methods

All 20+ other methods in RedisStateManager check `if not self.redis`
before using the client. expire_result was missing this guard and would
raise RuntimeError if called before connection establishment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ensure no asyncio.task can be GCed early

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* apply re-formatting

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* add ray actor logging to jobkit

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* fix: run RQ Job.fetch/get_status/get_position in thread pool to avoid blocking the event loop

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Ensure control over max ongoing requests per ray replica

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* refactor: rename consumed_ttl back to result_removal_delay

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* upgrade uv.lock

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* move Redis gating and RQ durable status into jobkit

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Pawel Rein <pawel.rein@prezi.com>
Signed-off-by: drk <drukpa1455@gmail.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Paweł Rein <pawel.rein@prezi.com>
Co-authored-by: drk <136856552+drukpa1455@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
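
The gating idea named in the PR title, and in the commits "Ensure control over max ongoing requests per ray replica" and "move Redis gating and RQ durable status into jobkit", can be sketched roughly as follows. This is a minimal illustration that assumes an `asyncio.Semaphore`-based limiter. `RedisGate`, `fake_redis_call`, and `demo` are hypothetical names chosen for the example, not the actual contents of `docling_jobkit/orchestrators/_redis_gate.py`.

```python
import asyncio


class RedisGate:
    """Hypothetical gate that caps how many Redis calls run at the same time."""

    def __init__(self, max_concurrent: int) -> None:
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self.in_flight = 0  # calls currently holding a slot
        self.peak = 0       # highest number of simultaneous calls observed

    async def run(self, call):
        # Callers beyond the limit wait here instead of taking a pool connection.
        async with self._semaphore:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await call()
            finally:
                self.in_flight -= 1


async def fake_redis_call() -> str:
    # Stand-in for a real Redis round trip (assumption: no Redis client is used here).
    await asyncio.sleep(0.01)
    return "ok"


async def demo(requests: int = 20, limit: int = 4) -> tuple[int, int]:
    gate = RedisGate(limit)
    results = await asyncio.gather(
        *(gate.run(fake_redis_call) for _ in range(requests))
    )
    return len(results), gate.peak


if __name__ == "__main__":
    completed, peak = asyncio.run(demo())
    print(f"completed={completed} peak_concurrency={peak}")
```

If the limit is kept at or below the connection pool size, waiting requests queue on the semaphore rather than failing when the pool has no free connection. Whether the merged code uses this exact mechanism is not stated in this conversation.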
@dolfim-ibm dolfim-ibm requested review from cau-git and vku-ibm March 24, 2026 13:48
@mergify

mergify bot commented Mar 24, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
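
To try a title against the rule above before opening a PR, the same pattern can be run with Python's `re` module. This is a small sketch that assumes the `~=` operator behaves like a regular-expression search. The helper name `is_conventional_title` is hypothetical.

```python
import re

# Pattern copied from the merge protection rule above.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional_title(title: str) -> bool:
    """Return True when the title starts with a conventional-commit prefix."""
    return CONVENTIONAL_TITLE.search(title) is not None


if __name__ == "__main__":
    for title in (
        "feat: Control maximum concurrent redis requests to avoid pool exhaustion",
        "fix(rq): update job status",
        "Update docs",
    ):
        print(title, "->", is_conventional_title(title))
```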

cau-git
cau-git previously approved these changes Mar 24, 2026
@github-actions
Contributor

github-actions bot commented Mar 24, 2026

DCO Check Passed

Thanks @dolfim-ibm, all your commits are properly signed off. 🎉

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
@codecov

codecov bot commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 76.61017% with 69 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| docling_jobkit/orchestrators/rq/orchestrator.py | 77.41% | 49 Missing ⚠️ |
| docling_jobkit/orchestrators/ray/orchestrator.py | 61.90% | 16 Missing ⚠️ |
| docling_jobkit/orchestrators/_redis_gate.py | 85.00% | 3 Missing ⚠️ |
| docling_jobkit/orchestrators/ray/config.py | 92.30% | 1 Missing ⚠️ |

@dolfim-ibm dolfim-ibm changed the title feat: Ensure control over max ongoing requests per ray replica feat: Control maximum concurrent redis requests to avoid pool exhaustion Mar 25, 2026
@dolfim-ibm dolfim-ibm merged commit bc27836 into main Mar 25, 2026
10 checks passed
@dolfim-ibm dolfim-ibm deleted the feat-gated-redis branch March 25, 2026 15:05