
feat: implement equal slot distribution for providers in priority queue #552

Draft
claretnnamocha wants to merge 3 commits into paycrest:main from claretnnamocha:feature/equal-provider-slot-distribution

Conversation


@claretnnamocha claretnnamocha commented Oct 28, 2025

Description

This PR addresses an issue in the priority queue system where providers were not receiving equal opportunities to process orders. Previously, the number of queue slots allocated to each provider was directly proportional to the number of tokens they supported, creating an unfair advantage.

Problem:

  • Provider A with 5 supported tokens received 5 queue slots
  • Provider B with 1 supported token received 1 queue slot
  • Result: Provider A was five times as likely as Provider B to be selected for orders

Solution:
This PR implements a two-pass equal distribution algorithm with deterministic ordering:

  1. First Pass: Collects all valid token entries for each provider and tracks the maximum token count
  2. Second Pass: Allocates equal slots to all providers by cycling through their available tokens in sorted order

Key Changes:

  • Refactored CreatePriorityQueueForBucket() in services/priority_queue.go
  • All providers now receive an equal number of slots (matching the provider with the most tokens)
  • Providers with fewer tokens have their entries cycled proportionally using modulo operation
  • Added deterministic provider ID sorting for reproducible queue ordering (aids debugging and testing)
  • Added a 20-slot maximum cap per provider to prevent excessive queue sizes
  • Enhanced logging to monitor distribution metrics (providers count, slots per provider, total queue size)

Example:

  • Provider A: 5 tokens → 5 slots (each token once)
  • Provider B: 2 tokens → 5 slots (pattern: T1, T2, T1, T2, T1)
  • Provider C: 1 token → 5 slots (same token 5 times)
  • Result: each provider holds 5 of the 15 slots, so every provider has an equal 1-in-3 chance of first position
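The example above can be sketched in Go. This is a simplified model of the two-pass algorithm; the real CreatePriorityQueueForBucket works on database entities and enqueues serialized entries into Redis, so the names and types here are illustrative only:

```go
package main

import (
	"fmt"
	"sort"
)

// maxSlotsPerProvider mirrors the 20-slot cap described in the PR.
const maxSlotsPerProvider = 20

// buildQueue models the two-pass algorithm: pass one collects each
// provider's valid entries and tracks the maximum count; pass two gives
// every provider the same number of slots by cycling its entries with a
// modulo index, iterating providers in sorted ID order for determinism.
func buildQueue(providerTokens map[string][]string) []string {
	// First pass: track the maximum entry count across providers.
	maxEntries := 0
	for _, tokens := range providerTokens {
		if len(tokens) > maxEntries {
			maxEntries = len(tokens)
		}
	}
	if maxEntries == 0 {
		return nil // no valid providers: exit early with an empty queue
	}
	target := maxEntries
	if target > maxSlotsPerProvider {
		target = maxSlotsPerProvider // enforce the per-provider cap
	}

	// Second pass: sorted provider IDs give a reproducible queue order.
	ids := make([]string, 0, len(providerTokens))
	for id := range providerTokens {
		ids = append(ids, id)
	}
	sort.Strings(ids)

	var queue []string
	for _, id := range ids {
		tokens := providerTokens[id]
		if len(tokens) == 0 {
			continue // provider contributed no valid entries
		}
		for i := 0; i < target; i++ {
			queue = append(queue, id+":"+tokens[i%len(tokens)]) // modulo cycling
		}
	}
	return queue
}

func main() {
	queue := buildQueue(map[string][]string{
		"A": {"T1", "T2", "T3", "T4", "T5"},
		"B": {"T1", "T2"},
		"C": {"T1"},
	})
	fmt.Println(len(queue))  // 15 slots, 5 per provider
	fmt.Println(queue[5:10]) // provider B cycles: [B:T1 B:T2 B:T1 B:T2 B:T1]
}
```

Running this prints 15 and the cycled pattern for Provider B, matching the table above.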

Breaking Changes: None - the implementation maintains the existing queue structure and Redis key format

Alternatives Considered:

  • Token-specific queues: Would require major architectural changes and increase Redis key count
  • Manual slot configuration: Adds complexity and maintenance overhead for admins
  • Weighted distribution: Still unfair to providers with fewer tokens

References

Addresses internal discussion regarding fair provider slot allocation in the priority queue system.

Testing

This PR adds comprehensive test coverage with 5 new test cases (754 lines):

  1. TestEqualSlotDistribution: Tests core functionality with 3-token vs 1-token providers

    • Verifies both receive equal slots (3 each)
    • Confirms single-token provider's token is repeated correctly
    • Validates multi-token provider has all unique tokens represented
  2. TestSlotCapAt20: Tests maximum slot limitation

    • Creates provider with 25 tokens
    • Verifies queue is capped at exactly 20 slots
    • Confirms exactly 20 unique tokens are present
  3. TestNoValidProvidersExitEarly: Tests edge case handling

    • Provider with no tokens configured
    • Verifies graceful exit with empty queue
  4. TestTwoProvidersWithDifferentTokenCounts: Real-world scenario test

    • 2-token provider vs 5-token provider
    • Verifies equal distribution (5 slots each)
    • Validates cycling pattern (3+2 distribution for 2-token provider)
  5. TestDeterministicQueueOrder: Tests reproducibility

    • Creates 3 providers and runs queue generation multiple times
    • Verifies identical queue ordering across multiple runs
    • Confirms providers appear in sorted ID order

Manual Testing:
To verify in staging/production:

  1. Check Redis keys with pattern bucket_* to see queue entries
  2. Verify log entries show equal TargetSlotsPerProvider for all providers
  3. Monitor order assignment distribution across providers over time
  4. Confirm queue ordering remains consistent across rebuilds

Environment:

  • Language: Go 1.21+
  • Database: PostgreSQL with Ent ORM
  • Cache: Redis
  • Testing: In-memory SQLite + miniredis
  • This change adds test coverage for new/changed/fixed functionality

Checklist

  • I have added documentation and tests for new/changed/fixed functionality in this PR
  • All active GitHub checks for tests, formatting, and security are passing
  • The correct base branch is being used, if not main

By submitting a PR, I agree to Paycrest's Contributor Code of Conduct and Contribution Guide.

Summary by CodeRabbit

  • Bug Fixes

    • Strengthened authorization validation checks for enhanced security.
  • Refactor

    • Optimized priority queue construction with improved deterministic slot distribution across multiple providers, capping slots at 20 per provider.
    • Enhanced logging for better visibility into queue operations.
  • Tests

    • Expanded comprehensive test coverage for priority queue slot distribution, provider ordering, and edge cases.
    • Improved test reliability with enhanced error handling.

- Refactored CreatePriorityQueueForBucket to use two-pass algorithm
- All providers now receive equal number of queue slots regardless of token count
- Providers with fewer tokens have their tokens cycled proportionally
- Added 20-slot cap per provider to prevent excessive queue sizes
- Added comprehensive logging for distribution metrics
- Added 4 unit tests covering equal distribution, capping, and edge cases

Fixes issue where providers with more tokens dominated the queue

coderabbitai bot commented Oct 28, 2025

Walkthrough

Refactors bucket priority-queue construction to a deterministic two-pass algorithm (collect per-provider token entries, then distribute slots evenly with a 20-slot cap), adds extensive tests for distribution/capping/determinism, tightens middleware authorization checks, and adjusts a test to fail fast on setup error.

Changes

  • Priority Queue Core Logic (services/priority_queue.go): Adds deterministic two-pass construction: pass one collects per-provider valid token entries (providerID, token symbol, rate, min, max) and tracks max entries; enforces a 20-slot cap per provider and early-exits with a warning if no valid providers exist. Pass two sorts providers by ID, computes targetSlotsPerProvider, cycles per-provider entries to enqueue serialized data (modulo cycling), enriches enqueue error logs with ProviderID, and logs final metrics (currency, min/max, provider count, target slots, total queue size). Maintains existing queue rotation behavior and public signatures.
  • Priority Queue Tests (services/priority_queue_test.go): Adds TestEqualSlotDistribution, TestSlotCapAt20, TestNoValidProvidersExitEarly, TestTwoProvidersWithDifferentTokenCounts, and TestDeterministicQueueOrder to validate equal distribution, the 20-slot cap, early exit when no valid providers exist, cycling behavior, and deterministic sorted provider ordering across runs.
  • Authorization Middleware (routers/middleware/auth.go): Tightens authorization checks in OnlySenderMiddleware and OnlyProviderMiddleware by changing the conditional from requiring both key missing AND nil scope to requiring key missing OR nil scope (i.e., `if !ok && scope == nil` becomes `if !ok || scope == nil`).
  • Test Failure Behavior (tasks/tasks_test.go): Replaces a direct error assertion with a conditional that calls t.FailNow() on setup error, ensuring immediate test termination on failure.

Sequence Diagram(s)

sequenceDiagram
    participant Init as Bucket Build Start
    participant Collect as First Pass (Collect)
    participant Distribute as Second Pass (Distribute)
    participant Redis as Redis Queue
    participant Metrics as Metrics Logger

    Init->>Collect: Begin collection of per-provider token entries

    rect rgb(240,248,255)
      Note over Collect: For each provider: gather valid tokens\ntrack per-provider entries, track max entries\ncap entries per provider at 20
      Collect->>Collect: Append serialized entries to providerEntries
      Collect-->>Init: If no valid providers -> log warning & exit
    end

    alt Providers available
      Collect->>Distribute: Sorted provider list + counts
      rect rgb(255,247,235)
        Note over Distribute: Compute targetSlotsPerProvider\nIterate providers deterministically (sorted), cycle tokens using modulo\nSerialize & enqueue entries (include ProviderID)
        Distribute->>Redis: Enqueue requests
        Redis-->>Distribute: Ack / error (logged with ProviderID)
      end
      Distribute->>Metrics: Emit final metrics (currency,min,max,providers,targetSlots,totalQueueSize)
    end

    Metrics-->>Init: Complete

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Pay attention to two-pass logic and edge cases (empty providers, 20-slot cap, integer division/rounding).
  • Verify deterministic provider sorting and exact cycling/index math.
  • Review enqueue error logging additions and metrics fields for correctness.
  • Check middleware conditional change for intended security behavior.
  • Confirm tests are deterministic and not flaky.

Suggested reviewers

  • 5ran6
  • onahprosper

Poem

🐰 I hop through buckets, clean and bright,

I gather tokens, set them right,
I spin them fair in ordered line,
Twenty hops max — each slot defined,
Queues hum calm beneath moonlight. ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 60.00%, which is below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)

  • Title Check ✅ Passed: The title "feat: implement equal slot distribution for providers in priority queue" is clear, specific, and directly summarizes the primary change. It uses conventional commit formatting (feat:) and, while the changeset includes secondary modifications to auth middleware and test utilities, appropriately highlights the central change.
  • Description Check ✅ Passed: The description is comprehensive and well-structured, covering all required template sections: the problem, proposed solution, key changes, and concrete examples; motivation in References; a thorough Testing section detailing the five new test cases, manual testing steps, and environment; and a completed Checklist.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
services/priority_queue.go (1)

307-325: Consider using deterministic iteration for reproducibility.

The second pass iterates over providerTokenEntries (a map), which has non-deterministic iteration order in Go. While providers are intentionally randomized at line 155, the non-deterministic map iteration here is unintentional and could make debugging more difficult.

If reproducible queue ordering is desired (for debugging or testing), consider sorting provider IDs before iteration:

 	// Second pass: distribute slots equally by cycling through each provider's tokens
+	// Sort provider IDs for deterministic iteration order
+	providerIDs := make([]string, 0, len(providerTokenEntries))
+	for providerID := range providerTokenEntries {
+		providerIDs = append(providerIDs, providerID)
+	}
+	sort.Strings(providerIDs)
+
-	for providerID, entries := range providerTokenEntries {
+	for _, providerID := range providerIDs {
+		entries := providerTokenEntries[providerID]
 		numEntries := len(entries)

Note: You'll need to import "sort" at the top of the file.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 793c80d and 5aa4c93.

📒 Files selected for processing (2)
  • services/priority_queue.go (3 hunks)
  • services/priority_queue_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
services/priority_queue.go (2)
utils/logger/logger.go (5)
  • WithFields (76-109)
  • Fields (73-73)
  • Warnf (133-138)
  • Errorf (141-146)
  • Infof (125-130)
storage/redis.go (1)
  • RedisClient (13-13)
services/priority_queue_test.go (5)
ent/providerordertoken/where.go (5)
  • Network (100-102)
  • ID (15-17)
  • IDEQ (20-22)
  • HasProviderWith (556-565)
  • HasCurrencyWith (602-611)
ent/providerprofile/where.go (2)
  • ID (14-16)
  • IDEQ (19-21)
ent/token/where.go (2)
  • ID (14-16)
  • IDEQ (19-21)
utils/test/db.go (4)
  • CreateTestUser (33-62)
  • CreateTestProviderProfile (456-506)
  • AddProviderOrderTokenToProvider (516-575)
  • CreateTestProvisionBucket (578-612)
storage/redis.go (1)
  • RedisClient (13-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-and-test
🔇 Additional comments (8)
services/priority_queue.go (4)

209-211: LGTM! Clear two-pass approach initialization.

The first pass setup is well-structured, collecting provider token entries and tracking the maximum token count across all providers.


241-287: LGTM! First pass correctly collects and validates provider entries.

The collection logic properly filters tokens, validates rates, and tracks the maximum token count per provider. Error handling with contextual logging is appropriate.


290-305: LGTM! Proper cap enforcement and early exit handling.

The 20-slot cap prevents excessive queue size, and the early exit with logging appropriately handles the case when no valid provider entries exist.


327-334: LGTM! Excellent observability improvement.

The final metrics logging provides valuable insights into queue construction, including provider count, slot distribution, and total queue size. This will greatly aid monitoring and debugging.

services/priority_queue_test.go (4)

582-820: LGTM! Comprehensive test for equal slot distribution.

This test thoroughly validates the core feature: equal slot distribution across providers with different token counts. It correctly verifies:

  • Both providers receive equal slots (3 each)
  • Single-token provider cycles its one token
  • Multi-token provider includes all 3 unique tokens

The assertions are clear and cover the key behaviors introduced in this PR.


822-920: LGTM! Validates the 20-slot cap effectively.

This test confirms that even when a provider has 25 tokens, the queue is correctly capped at 20 slots with 20 unique tokens. This validates the cap enforcement logic introduced at lines 290-295 in the main file.


922-975: LGTM! Validates early exit for empty provider scenario.

This test confirms the early exit behavior (lines 297-305 in the main file) when no valid provider entries exist. The queue correctly remains empty, preventing unnecessary operations.


977-1192: LGTM! Excellent validation of cycling behavior with different token counts.

This test thoroughly validates that:

  • Both providers receive equal slots (5 each, matching the max)
  • The 2-token provider correctly cycles through its tokens with modulo logic
  • Token counts follow the expected pattern (3 and 2 occurrences)

The test confirms the core cycling algorithm at line 312 (entries[i%numEntries]) works correctly across different provider configurations.

- Sort provider IDs before iteration to ensure consistent queue order
- Prevents non-deterministic map iteration from affecting queue creation
- Improves debugging experience with predictable queue ordering
- Added TestDeterministicQueueOrder to verify consistent queue generation
- Added containsString helper function for tests
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
services/priority_queue.go (2)

156-158: Initial provider randomization appears ineffective.

The provider slice is randomized here, but the final queue order at line 314 sorts providers by ID for deterministic ordering. This initial shuffle doesn't affect the final queue composition since provider IDs are extracted from map keys and then sorted. Consider removing this randomization to avoid confusion, unless it serves another purpose in the workflow.


210-289: Consider adding context cancellation check for long-running operations.

The first pass iterates through all providers and their tokens, performing database queries and rate calculations. For buckets with many providers/tokens, this could be time-consuming. Consider periodically checking ctx.Err() within the outer loop to allow graceful cancellation.

Example:

 	for _, provider := range providers {
+		if ctx.Err() != nil {
+			logger.WithFields(logger.Fields{
+				"Error": ctx.Err(),
+			}).Warnf("context cancelled during queue construction")
+			return
+		}
 		exists, err := provider.QueryProviderCurrencies().
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5aa4c93 and 761890b.

📒 Files selected for processing (2)
  • services/priority_queue.go (4 hunks)
  • services/priority_queue_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
services/priority_queue.go (2)
utils/logger/logger.go (5)
  • WithFields (76-109)
  • Fields (73-73)
  • Warnf (133-138)
  • Errorf (141-146)
  • Infof (125-130)
storage/redis.go (1)
  • RedisClient (13-13)
services/priority_queue_test.go (2)
utils/test/db.go (4)
  • CreateTestUser (33-62)
  • CreateTestProviderProfile (456-506)
  • AddProviderOrderTokenToProvider (516-575)
  • CreateTestProvisionBucket (578-612)
storage/redis.go (1)
  • RedisClient (13-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-and-test
🔇 Additional comments (1)
services/priority_queue_test.go (1)

582-1324: Excellent test coverage for equal slot distribution!

The five test cases comprehensively validate the new two-pass distribution algorithm:

  • Equal distribution across providers with varying token counts
  • 20-slot cap enforcement
  • Empty queue handling when no valid providers exist
  • Token cycling patterns (e.g., [T1, T2, T1, T2, T1] for a 2-token provider needing 5 slots)
  • Deterministic ordering across multiple queue constructions

The tests are well-structured, clearly documented, and provide strong confidence in the implementation.

- Update condition checks in OnlySenderMiddleware and OnlyProviderMiddleware to use logical OR instead of AND for better validation of scope.
- Ensures proper handling of unauthorized access when the sender or provider is not set.
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
routers/middleware/auth.go (1)

77-101: Consider refactoring the profile assignment logic.

While reviewing the authorization fix, I noticed the profile assignment has a code smell:

if err != nil {
    c.Set("sender", nil)
}
c.Set("sender", senderProfile)

The unconditional c.Set on line 87 overwrites line 84, making the first call redundant. Consider refactoring to:

if err == nil {
    c.Set("sender", senderProfile)
}

Same pattern exists for provider profile (lines 95-100).

Apply this diff:

 if scope == "sender" || senderAndProvider {
     senderProfile, err := storage.Client.SenderProfile.
         Query().
         Where(senderprofile.HasUserWith(user.IDEQ(userUUID))).
         WithOrderTokens().
         Only(c)
-    if err != nil {
-        c.Set("sender", nil)
-    }
-
-    c.Set("sender", senderProfile)
+    if err == nil {
+        c.Set("sender", senderProfile)
+    }
 }

 if scope == "provider" || senderAndProvider {
     providerProfile, err := storage.Client.ProviderProfile.
         Query().
         Where(providerprofile.HasUserWith(user.IDEQ(userUUID))).
         Only(c)
-    if err != nil && !senderAndProvider {
-        fmt.Println("error", err)
-        c.Set("provider", nil)
-    }
-
-    c.Set("provider", providerProfile)
+    if err == nil {
+        c.Set("provider", providerProfile)
+    }
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 761890b and 312031e.

📒 Files selected for processing (2)
  • routers/middleware/auth.go (2 hunks)
  • tasks/tasks_test.go (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-and-test
🔇 Additional comments (1)
tasks/tasks_test.go (1)

133-135: LGTM: Improved fail-fast behavior for setup errors.

This correctly ensures the test terminates immediately if setup fails, preventing cascading failures and misleading error messages from tests running with invalid state.

 	scope, ok := c.Get("sender")

-	if !ok && scope == nil {
+	if !ok || scope == nil {

⚠️ Potential issue | 🔴 Critical

Critical security fix: Correctly rejects nil scope.

The change from && to || fixes a vulnerability where requests could pass authorization if the context key existed but contained a nil value. This can occur when JWTMiddleware fails to fetch a sender profile (lines 77-88) and sets the key to nil. The old logic would incorrectly allow such requests through.

The new logic correctly requires both key existence AND a non-nil value.

🤖 Prompt for AI Agents
In routers/middleware/auth.go around line 468, the authorization check must
reject requests when the context key is missing OR the stored scope is nil;
replace the previous combined-check so the condition reads to return/deny when
!ok || scope == nil. Ensure the logic uses a short-circuit OR to abort
authorization immediately when either the key is absent or its value is nil, and
keep the subsequent error/return behavior unchanged.

 	scope, ok := c.Get("provider")

-	if !ok && scope == nil {
+	if !ok || scope == nil {

⚠️ Potential issue | 🔴 Critical

Critical security fix: Correctly rejects nil scope.

Same fix as applied at line 468 in OnlySenderMiddleware, here for OnlyProviderMiddleware. This correctly prevents authorization bypass when the provider profile lookup fails and the context key is set to nil (lines 90-101 in JWTMiddleware).

🤖 Prompt for AI Agents
In routers/middleware/auth.go around line 481, the nil scope check must be
tightened to prevent authorization bypass; update the conditional to reject when
the context key is missing OR the retrieved scope is nil (same fix applied at
line 468 for OnlySenderMiddleware) and ensure the handler returns the
unauthorized response immediately when that condition is true.
