
Conversation

@adamsaimi (Owner)

Test 9

One potential problem we have with batch processing is that any one slow
item can clog up the whole batch. This PR implements a queueing method
instead, where we keep N queues that each have their own workers.
There's still a chance of individual items backlogging a queue, but we
can increase concurrency here to reduce the chances of that happening.
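
Roughly the shape described above, as an illustrative sketch (FixedQueuePool, submit, and the handler wiring are assumed names, not necessarily this PR's API):

```python
import queue
import threading
from typing import Any, Callable


class FixedQueuePool:
    """N independent queues, each drained by its own worker thread, so
    one slow item only backlogs the queue it was routed to."""

    def __init__(self, num_queues: int, handler: Callable[[Any], None]) -> None:
        self.handler = handler
        self.queues = [queue.Queue() for _ in range(num_queues)]
        self.workers = [
            threading.Thread(target=self._worker_loop, args=(q,), daemon=True)
            for q in self.queues
        ]
        for worker in self.workers:
            worker.start()

    def submit(self, key: str, item: Any) -> None:
        # The same key always routes to the same queue, preserving
        # per-key ordering while spreading load across queues.
        self.queues[hash(key) % len(self.queues)].put(item)

    def _worker_loop(self, q: queue.Queue) -> None:
        while True:
            self.handler(q.get())
            q.task_done()
```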

"""

def __init__(self) -> None:
self.all_offsets: dict[Partition, set[int]] = defaultdict(set)

[Performance] Unbounded Offset Tracking Sets

  • Problem: The OffsetTracker stores every observed and outstanding offset in sets (all_offsets, outstanding) with no size limit, so memory grows without bound under sustained load and can eventually cause OOM errors.
  • Fix: Implement a bounded size or an eviction policy for these sets to prevent unbounded memory growth; one possible shape is sketched below.
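
A minimal sketch of one such bound, assuming a hypothetical max_outstanding cap and a str partition key (the real Partition type is not shown here). Note this version applies backpressure rather than eviction, since dropping tracked offsets would corrupt commit bookkeeping:

```python
from collections import defaultdict
from threading import Condition


class BoundedOffsetTracker:
    """Tracks outstanding offsets per partition, blocking producers
    once a partition exceeds a configurable cap (backpressure)."""

    def __init__(self, max_outstanding: int = 10_000) -> None:
        self.max_outstanding = max_outstanding
        self.all_offsets: dict[str, set[int]] = defaultdict(set)
        self._cond = Condition()

    def add(self, partition: str, offset: int) -> None:
        with self._cond:
            # Block instead of growing without bound; commit() wakes us.
            while len(self.all_offsets[partition]) >= self.max_outstanding:
                self._cond.wait()
            self.all_offsets[partition].add(offset)

    def commit(self, partition: str, offset: int) -> None:
        with self._cond:
            self.all_offsets[partition] = {
                o for o in self.all_offsets[partition] if o > offset
            }
            self._cond.notify_all()
```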

        while not self.shutdown:
            try:
                work_item = self.work_queue.get()
            except queue.ShutDown:

[Bug] Incorrect Queue Shutdown Mechanism

  • Problem: The OrderedQueueWorker catches queue.ShutDown and FixedQueuePool calls Queue.shutdown(), but both were only added in Python 3.13; on earlier interpreters the attribute lookup fails, preventing graceful worker termination and leaking threads.
  • Fix: Replace queue.ShutDown with a sentinel value, or use queue.Empty with a timeout plus the self.shutdown flag; drop the q.shutdown() call unless Python 3.13+ is guaranteed, and ensure workers are joined. A sentinel-based sketch follows below.
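
A minimal sentinel-based sketch that works on any Python version (the handler wiring and names here are illustrative, not the PR's code):

```python
import queue
import threading
from typing import Any, Callable

_SHUTDOWN: Any = object()  # sentinel; any unique object works


class OrderedQueueWorker(threading.Thread):
    """Drains work_queue until it sees the shutdown sentinel."""

    def __init__(self, work_queue: queue.Queue, handler: Callable[[Any], None]) -> None:
        super().__init__(daemon=True)
        self.work_queue = work_queue
        self.handler = handler

    def run(self) -> None:
        while True:
            work_item = self.work_queue.get()
            if work_item is _SHUTDOWN:
                break  # graceful exit; no 3.13-only shutdown API required
            self.handler(work_item)


def shutdown_pool(queues: list, workers: list) -> None:
    # One sentinel per queue guarantees every worker wakes up and exits.
    for q in queues:
        q.put(_SHUTDOWN)
    for w in workers:
        w.join()
```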

    start = max(last_committed + 1, min_offset)

    highest_committable = last_committed
    for offset in range(start, max_offset + 1):

[Performance] Inefficient Linear Scan for Committable Offsets

  • Problem: The get_committable_offsets function scans every offset between last_committed and max_offset, so a large gap turns each call into a long linear walk.
  • Fix: Track contiguous blocks of completed offsets, or keep completed offsets in a structure such as a min-heap so the committable frontier can advance without scanning the gap; see the sketch below.
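
One way to avoid the scan, sketched with a min-heap of completed offsets (CommittableTracker and its methods are assumed names, not the PR's API):

```python
import heapq


class CommittableTracker:
    """Advances the committable frontier in time proportional to the
    offsets actually committed, not the size of the gap."""

    def __init__(self, last_committed: int = -1) -> None:
        self.last_committed = last_committed
        self._completed: list[int] = []  # min-heap of completed offsets

    def mark_completed(self, offset: int) -> None:
        heapq.heappush(self._completed, offset)

    def get_committable_offset(self) -> int:
        # Pop only while the heap front extends the contiguous prefix.
        while self._completed and self._completed[0] == self.last_committed + 1:
            self.last_committed = heapq.heappop(self._completed)
        return self.last_committed
```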

        with self._get_partition_lock(partition):
            self.last_committed[partition] = offset
            # Remove all offsets <= committed offset
            self.all_offsets[partition] = {o for o in self.all_offsets[partition] if o > offset}

[Performance] Inefficient Set Reconstruction for Offset Cleanup

  • Problem: Rebuilding self.all_offsets[partition] by filtering the entire set on every commit is O(n) in the number of tracked offsets, which slows the commit loop when the set is large.
  • Fix: If all_offsets is expected to be large, consider a data structure that supports efficient range-based deletion instead of a full rebuild; a sketch follows below.
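
For example, assuming the third-party sortedcontainers package is acceptable, a SortedList turns the cleanup into a bisect plus one range delete (PartitionOffsets is an illustrative wrapper, not the PR's class):

```python
from sortedcontainers import SortedList  # pip install sortedcontainers


class PartitionOffsets:
    """Keeps offsets sorted so commit-time cleanup is a range delete
    rather than a full-set rebuild."""

    def __init__(self) -> None:
        self.offsets = SortedList()

    def add(self, offset: int) -> None:
        self.offsets.add(offset)  # O(log n) insertion

    def drop_through(self, committed: int) -> None:
        # Delete every offset <= committed in one slice operation.
        idx = self.offsets.bisect_right(committed)
        del self.offsets[:idx]
```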

        self.workers: list[OrderedQueueWorker[T]] = []

        for i in range(num_queues):
            work_queue: queue.Queue[WorkItem[T]] = queue.Queue()

[Performance] Unbounded Queues Cause Resource Exhaustion

  • Problem: queue.Queue() instances created without a maxsize can grow indefinitely, leading to excessive memory consumption and OutOfMemory errors.
  • Fix: Define a reasonable maxsize for the queues to prevent memory exhaustion and provide natural backpressure:

```python
work_queue: queue.Queue[WorkItem[T]] = queue.Queue(maxsize=some_configurable_limit)
```
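
With a bounded queue, Queue.put() blocks once maxsize is reached, so a slow worker applies backpressure to its producers instead of letting memory grow; if blocking is undesirable, put(item, block=False) or a timeout surfaces queue.Full and lets the caller shed or reroute load.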
