feat: implement equal slot distribution for providers in priority queue #552

claretnnamocha wants to merge 3 commits into paycrest:main
Conversation
- Refactored CreatePriorityQueueForBucket to use a two-pass algorithm
- All providers now receive an equal number of queue slots regardless of token count
- Providers with fewer tokens have their tokens cycled proportionally
- Added a 20-slot cap per provider to prevent excessive queue sizes
- Added comprehensive logging for distribution metrics
- Added 4 unit tests covering equal distribution, capping, and edge cases

Fixes issue where providers with more tokens dominated the queue
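The two-pass scheme described above can be sketched as a self-contained Go program. The function, constant, and key format below are illustrative only, not the actual identifiers used in `services/priority_queue.go`:

```go
package main

import (
	"fmt"
	"sort"
)

// maxSlotsPerProvider mirrors the 20-slot cap described in the PR;
// the constant name is hypothetical.
const maxSlotsPerProvider = 20

// distributeSlots sketches the two-pass algorithm: first find the largest
// token count across providers (capped at 20), then give every provider that
// many slots, cycling its own tokens with modulo indexing.
func distributeSlots(providerTokens map[string][]string) []string {
	// First pass: find the maximum number of tokens any provider has.
	target := 0
	for _, tokens := range providerTokens {
		if len(tokens) > target {
			target = len(tokens)
		}
	}
	if target > maxSlotsPerProvider {
		target = maxSlotsPerProvider
	}
	if target == 0 {
		return nil // no valid providers: exit early
	}

	// Sort provider IDs so Go's random map iteration order
	// cannot change the resulting queue.
	providerIDs := make([]string, 0, len(providerTokens))
	for id := range providerTokens {
		providerIDs = append(providerIDs, id)
	}
	sort.Strings(providerIDs)

	// Second pass: every provider gets exactly `target` slots,
	// cycling its tokens via modulo indexing.
	var queue []string
	for _, id := range providerIDs {
		tokens := providerTokens[id]
		for i := 0; i < target; i++ {
			queue = append(queue, id+":"+tokens[i%len(tokens)])
		}
	}
	return queue
}

func main() {
	queue := distributeSlots(map[string][]string{
		"providerA": {"USDT", "USDC", "DAI"},
		"providerB": {"USDT"},
	})
	fmt.Println(queue) // [providerA:USDT providerA:USDC providerA:DAI providerB:USDT providerB:USDT providerB:USDT]
}
```

Both providers end up with three slots: providerA's three tokens appear once each, while providerB's single token is cycled three times.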
Walkthrough

Refactors bucket priority-queue construction to a deterministic two-pass algorithm (collect per-provider token entries, then distribute slots evenly with a 20-slot cap), adds extensive tests for distribution/capping/determinism, tightens middleware authorization checks, and adjusts a test to fail fast on setup error.

Changes
Sequence Diagram(s)

sequenceDiagram
participant Init as Bucket Build Start
participant Collect as First Pass (Collect)
participant Distribute as Second Pass (Distribute)
participant Redis as Redis Queue
participant Metrics as Metrics Logger
Init->>Collect: Begin collection of per-provider token entries
rect rgb(240,248,255)
Note over Collect: For each provider: gather valid tokens\ntrack per-provider entries, track max entries\ncap entries per provider at 20
Collect->>Collect: Append serialized entries to providerEntries
Collect-->>Init: If no valid providers -> log warning & exit
end
alt Providers available
Collect->>Distribute: Sorted provider list + counts
rect rgb(255,247,235)
Note over Distribute: Compute targetSlotsPerProvider\nIterate providers deterministically (sorted), cycle tokens using modulo\nSerialize & enqueue entries (include ProviderID)
Distribute->>Redis: Enqueue requests
Redis-->>Distribute: Ack / error (logged with ProviderID)
end
Distribute->>Metrics: Emit final metrics (currency,min,max,providers,targetSlots,totalQueueSize)
end
Metrics-->>Init: Complete
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 0
🧹 Nitpick comments (1)
services/priority_queue.go (1)

307-325: Consider using deterministic iteration for reproducibility.

The second pass iterates over providerTokenEntries (a map), which has non-deterministic iteration order in Go. While providers are intentionally randomized at line 155, the non-deterministic map iteration here is unintentional and could make debugging more difficult. If reproducible queue ordering is desired (for debugging or testing), consider sorting provider IDs before iteration:

```diff
  // Second pass: distribute slots equally by cycling through each provider's tokens
+ // Sort provider IDs for deterministic iteration order
+ providerIDs := make([]string, 0, len(providerTokenEntries))
+ for providerID := range providerTokenEntries {
+ 	providerIDs = append(providerIDs, providerID)
+ }
+ sort.Strings(providerIDs)
+
- for providerID, entries := range providerTokenEntries {
+ for _, providerID := range providerIDs {
+ 	entries := providerTokenEntries[providerID]
  	numEntries := len(entries)
```

Note: You'll need to import "sort" at the top of the file.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- services/priority_queue.go (3 hunks)
- services/priority_queue_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
services/priority_queue.go (2)
- utils/logger/logger.go (5): WithFields (76-109), Fields (73-73), Warnf (133-138), Errorf (141-146), Infof (125-130)
- storage/redis.go (1): RedisClient (13-13)
services/priority_queue_test.go (5)
- ent/providerordertoken/where.go (5): Network (100-102), ID (15-17), IDEQ (20-22), HasProviderWith (556-565), HasCurrencyWith (602-611)
- ent/providerprofile/where.go (2): ID (14-16), IDEQ (19-21)
- ent/token/where.go (2): ID (14-16), IDEQ (19-21)
- utils/test/db.go (4): CreateTestUser (33-62), CreateTestProviderProfile (456-506), AddProviderOrderTokenToProvider (516-575), CreateTestProvisionBucket (578-612)
- storage/redis.go (1): RedisClient (13-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-and-test
🔇 Additional comments (8)
services/priority_queue.go (4)
209-211: LGTM! Clear two-pass approach initialization.

The first pass setup is well-structured, collecting provider token entries and tracking the maximum token count across all providers.

241-287: LGTM! First pass correctly collects and validates provider entries.

The collection logic properly filters tokens, validates rates, and tracks the maximum token count per provider. Error handling with contextual logging is appropriate.

290-305: LGTM! Proper cap enforcement and early exit handling.

The 20-slot cap prevents excessive queue size, and the early exit with logging appropriately handles the case when no valid provider entries exist.

327-334: LGTM! Excellent observability improvement.

The final metrics logging provides valuable insights into queue construction, including provider count, slot distribution, and total queue size. This will greatly aid monitoring and debugging.
services/priority_queue_test.go (4)
582-820: LGTM! Comprehensive test for equal slot distribution.

This test thoroughly validates the core feature: equal slot distribution across providers with different token counts. It correctly verifies:
- Both providers receive equal slots (3 each)
- Single-token provider cycles its one token
- Multi-token provider includes all 3 unique tokens
The assertions are clear and cover the key behaviors introduced in this PR.
822-920: LGTM! Validates the 20-slot cap effectively.

This test confirms that even when a provider has 25 tokens, the queue is correctly capped at 20 slots with 20 unique tokens. This validates the cap enforcement logic introduced at lines 290-295 in the main file.

922-975: LGTM! Validates early exit for empty provider scenario.

This test confirms the early exit behavior (lines 297-305 in the main file) when no valid provider entries exist. The queue correctly remains empty, preventing unnecessary operations.

977-1192: LGTM! Excellent validation of cycling behavior with different token counts.

This test thoroughly validates that:
- Both providers receive equal slots (5 each, matching the max)
- The 2-token provider correctly cycles through its tokens with modulo logic
- Token counts follow the expected pattern (3 and 2 occurrences)
The test confirms the core cycling algorithm at line 312 (entries[i%numEntries]) works correctly across different provider configurations.
- Sort provider IDs before iteration to ensure consistent queue order
- Prevents non-deterministic map iteration from affecting queue creation
- Improves debugging experience with predictable queue ordering
- Added TestDeterministicQueueOrder to verify consistent queue generation
- Added containsString helper function for tests
Actionable comments posted: 0
🧹 Nitpick comments (2)
services/priority_queue.go (2)

156-158: Initial provider randomization appears ineffective.

The provider slice is randomized here, but the final queue order at line 314 sorts providers by ID for deterministic ordering. This initial shuffle doesn't affect the final queue composition since provider IDs are extracted from map keys and then sorted. Consider removing this randomization to avoid confusion, unless it serves another purpose in the workflow.

210-289: Consider adding context cancellation check for long-running operations.

The first pass iterates through all providers and their tokens, performing database queries and rate calculations. For buckets with many providers/tokens, this could be time-consuming. Consider periodically checking ctx.Err() within the outer loop to allow graceful cancellation. Example:

```diff
  for _, provider := range providers {
+ 	if ctx.Err() != nil {
+ 		logger.WithFields(logger.Fields{
+ 			"Error": ctx.Err(),
+ 		}).Warnf("context cancelled during queue construction")
+ 		return
+ 	}
  	exists, err := provider.QueryProviderCurrencies().
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- services/priority_queue.go (4 hunks)
- services/priority_queue_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
services/priority_queue.go (2)
- utils/logger/logger.go (5): WithFields (76-109), Fields (73-73), Warnf (133-138), Errorf (141-146), Infof (125-130)
- storage/redis.go (1): RedisClient (13-13)
services/priority_queue_test.go (2)
- utils/test/db.go (4): CreateTestUser (33-62), CreateTestProviderProfile (456-506), AddProviderOrderTokenToProvider (516-575), CreateTestProvisionBucket (578-612)
- storage/redis.go (1): RedisClient (13-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-and-test
🔇 Additional comments (1)
services/priority_queue_test.go (1)
582-1324: Excellent test coverage for equal slot distribution!

The five test cases comprehensively validate the new two-pass distribution algorithm:
- Equal distribution across providers with varying token counts
- 20-slot cap enforcement
- Empty queue handling when no valid providers exist
- Token cycling patterns (e.g., [T1, T2, T1, T2, T1] for a 2-token provider needing 5 slots)
- Deterministic ordering across multiple queue constructions
The tests are well-structured, clearly documented, and provide strong confidence in the implementation.
- Update condition checks in OnlySenderMiddleware and OnlyProviderMiddleware to use logical OR instead of AND for better validation of scope.
- Ensures proper handling of unauthorized access when the sender or provider is not set.
Actionable comments posted: 2
🧹 Nitpick comments (1)
routers/middleware/auth.go (1)
77-101: Consider refactoring the profile assignment logic.

While reviewing the authorization fix, I noticed the profile assignment has a code smell:

```go
if err != nil {
	c.Set("sender", nil)
}

c.Set("sender", senderProfile)
```

The unconditional c.Set on line 87 overwrites line 84, making the first call redundant. Consider refactoring to:

```go
if err == nil {
	c.Set("sender", senderProfile)
}
```

The same pattern exists for the provider profile (lines 95-100). Apply this diff:

```diff
  if scope == "sender" || senderAndProvider {
  	senderProfile, err := storage.Client.SenderProfile.
  		Query().
  		Where(senderprofile.HasUserWith(user.IDEQ(userUUID))).
  		WithOrderTokens().
  		Only(c)
- 	if err != nil {
- 		c.Set("sender", nil)
- 	}
-
- 	c.Set("sender", senderProfile)
+ 	if err == nil {
+ 		c.Set("sender", senderProfile)
+ 	}
  }

  if scope == "provider" || senderAndProvider {
  	providerProfile, err := storage.Client.ProviderProfile.
  		Query().
  		Where(providerprofile.HasUserWith(user.IDEQ(userUUID))).
  		Only(c)
- 	if err != nil && !senderAndProvider {
- 		fmt.Println("error", err)
- 		c.Set("provider", nil)
- 	}
-
- 	c.Set("provider", providerProfile)
+ 	if err == nil {
+ 		c.Set("provider", providerProfile)
+ 	}
  }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- routers/middleware/auth.go (2 hunks)
- tasks/tasks_test.go (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-and-test
🔇 Additional comments (1)
tasks/tasks_test.go (1)
133-135: LGTM: Improved fail-fast behavior for setup errors.

This correctly ensures the test terminates immediately if setup fails, preventing cascading failures and misleading error messages from tests running with invalid state.
```diff
  scope, ok := c.Get("sender")

- if !ok && scope == nil {
+ if !ok || scope == nil {
```
Critical security fix: Correctly rejects nil scope.
The change from && to || fixes a vulnerability where requests could pass authorization if the context key existed but contained a nil value. This can occur when JWTMiddleware fails to fetch a sender profile (lines 77-88) and sets the key to nil. The old logic would incorrectly allow such requests through.
The new logic correctly requires both key existence AND a non-nil value.
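The difference between the two conditions can be demonstrated without gin by simulating c.Get's (value, exists) return shape. The helper below is illustrative, not the middleware's actual code:

```go
package main

import "fmt"

// get mimics gin's c.Get: it returns the stored value and whether the key
// exists. A key can exist with a nil value, e.g. when JWTMiddleware fails
// to fetch the profile and sets the key to nil anyway.
func get(ctx map[string]any, key string) (any, bool) {
	v, ok := ctx[key]
	return v, ok
}

// authorized applies the fixed check: reject when the key is missing OR
// the stored value is nil. The old `!ok && scope == nil` condition only
// rejected when BOTH were true, so a present-but-nil key slipped through.
func authorized(ctx map[string]any) bool {
	scope, ok := get(ctx, "sender")
	if !ok || scope == nil {
		return false
	}
	return true
}

func main() {
	fmt.Println(authorized(map[string]any{}))                      // false: key absent
	fmt.Println(authorized(map[string]any{"sender": nil}))         // false: nil value (old logic allowed this)
	fmt.Println(authorized(map[string]any{"sender": "profile-x"})) // true
}
```

The middle case is the vulnerability: with `!ok && scope == nil`, ok is true, so the whole condition is false and the nil-scope request would have been authorized.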
🤖 Prompt for AI Agents
In routers/middleware/auth.go around line 468, the authorization check must
reject requests when the context key is missing OR the stored scope is nil;
replace the previous combined-check so the condition reads to return/deny when
!ok || scope == nil. Ensure the logic uses a short-circuit OR to abort
authorization immediately when either the key is absent or its value is nil, and
keep the subsequent error/return behavior unchanged.
```diff
  scope, ok := c.Get("provider")

- if !ok && scope == nil {
+ if !ok || scope == nil {
```
Critical security fix: Correctly rejects nil scope.
Same fix as line 468 for OnlySenderMiddleware. This correctly prevents authorization bypass when the provider profile lookup fails and the context key is set to nil (lines 90-101 in JWTMiddleware).
🤖 Prompt for AI Agents
In routers/middleware/auth.go around line 481, the nil scope check must be
tightened to prevent authorization bypass; update the conditional to reject when
the context key is missing OR the retrieved scope is nil (same fix applied at
line 468 for OnlySenderMiddleware) and ensure the handler returns the
unauthorized response immediately when that condition is true.
Force-pushed from 88f28e7 to 5d8eb49
Description
This PR addresses an issue in the priority queue system where providers were not receiving equal opportunities to process orders. Previously, the number of queue slots allocated to each provider was directly proportional to the number of tokens they supported, creating an unfair advantage.
Problem:
Solution:
This PR implements a two-pass equal distribution algorithm with deterministic ordering:
Key Changes:
- CreatePriorityQueueForBucket() in services/priority_queue.go

Example:
Breaking Changes: None - the implementation maintains the existing queue structure and Redis key format
Alternatives Considered:
References
Addresses internal discussion regarding fair provider slot allocation in the priority queue system.
Testing
This PR adds comprehensive test coverage with 5 new test cases (754 lines):
- TestEqualSlotDistribution: Tests core functionality with 3-token vs 1-token providers
- TestSlotCapAt20: Tests maximum slot limitation
- TestNoValidProvidersExitEarly: Tests edge case handling
- TestTwoProvidersWithDifferentTokenCounts: Real-world scenario test
- TestDeterministicQueueOrder: Tests reproducibility
Manual Testing:
To verify in staging/production:
- bucket_* to see queue entries
- TargetSlotsPerProvider for all providers

Environment:
Language: Go 1.21+
Database: PostgreSQL with Ent ORM
Cache: Redis
Testing: In-memory SQLite + miniredis
This change adds test coverage for new/changed/fixed functionality
Checklist
main

By submitting a PR, I agree to Paycrest's Contributor Code of Conduct and Contribution Guide.
Summary by CodeRabbit
Bug Fixes
Refactor
Tests