Add write-rate-limit. #359
Conversation
Walkthrough

This PR adds write-rate limiting to the connector system by introducing a `WriteRateLimit` configuration option, threading it from CLI flags through runner settings into connector configurations, and using a `rate.Limiter` in base.go to throttle batched write operations. It also introduces the `WriterClosedErr` sentinel, adds a `HandlerError` method to the `ParallelWriterConnector` interface, and improves error handling for context cancellation.
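For orientation, here is a minimal, self-contained sketch of the throttling pattern described above, using `golang.org/x/time/rate`. The settings struct, field names, and batch type below are illustrative stand-ins, not the actual dsync types.

```go
package main

import (
	"context"
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

// connectorSettings stands in for the connector configuration; only the
// rate-limit field matters for this sketch.
type connectorSettings struct {
	WriteRateLimit int // documents per second; 0 disables throttling
}

func main() {
	settings := connectorSettings{WriteRateLimit: 100}

	// A nil limiter means "no throttling", mirroring the optional CLI flag.
	var limiter *rate.Limiter
	if settings.WriteRateLimit > 0 {
		limiter = rate.NewLimiter(rate.Limit(settings.WriteRateLimit), settings.WriteRateLimit)
	}

	ctx := context.Background()
	batch := make([]string, 50) // stand-in for one batch of data messages

	start := time.Now()
	for i := 0; i < 4; i++ {
		if limiter != nil {
			// Reserve one token per document before performing the batched write.
			if err := limiter.WaitN(ctx, len(batch)); err != nil {
				fmt.Println("wait aborted:", err)
				return
			}
		}
		// ... perform the batched write here ...
	}
	fmt.Printf("wrote 4 batches of %d in %v at %d docs/s\n",
		len(batch), time.Since(start).Round(time.Millisecond), settings.WriteRateLimit)
}
```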
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
connectors/common/parallel_writer.go (1)
28-28: Note the interface change to ParallelWriterConnector.

Adding the `HandlerError(error) error` method to the `ParallelWriterConnector` interface is a breaking change. However, this appears to be an internal interface within the connectors package, so the impact should be limited. If this interface has external implementations, ensure all implementations are updated to include the new method. Run this script to verify:
```bash
#!/bin/bash
# Find all implementations of ParallelWriterConnector
ast-grep --pattern $'type $_ struct { $$$ } func ($_ *$_) HandleBarrierMessage($$$) error { $$$ }'

# Look for missing HandlerError implementations
rg -nP 'func \(\w+ \*\w+\) HandleBarrierMessage' -C5 | rg -v 'HandlerError'
```
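If an out-of-tree implementation does exist, adding the new method is typically a small pass-through. The type below is hypothetical and only illustrates the required signature.

```go
package example

// myConnector is a hypothetical external implementation of
// ParallelWriterConnector; its other methods are omitted for brevity.
type myConnector struct{}

// HandlerError satisfies the new interface method by letting the connector
// inspect errors surfaced by the parallel writer. This sketch passes them
// through unchanged; a real connector might translate or suppress some.
func (c *myConnector) HandlerError(err error) error {
	return err
}
```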
🧹 Nitpick comments (1)
connectors/common/base.go (1)
1061-1064: Note the conservative burst size choice.

The limiter is initialized with a burst size equal to the rate limit (`rate.NewLimiter(rate.Limit(settings.WriteRateLimit), settings.WriteRateLimit)`). This means the token bucket can hold exactly one second's worth of operations. While valid, this conservative choice means large batches will always wait proportionally, even at startup. Consider whether a larger burst (e.g., `2 * settings.WriteRateLimit`) would better accommodate initial bursts while still maintaining the average rate.

If you want to allow a larger initial burst, apply this change:
```diff
 var limiter *rate.Limiter
 if settings.WriteRateLimit > 0 {
-	limiter = rate.NewLimiter(rate.Limit(settings.WriteRateLimit), settings.WriteRateLimit)
+	limiter = rate.NewLimiter(rate.Limit(settings.WriteRateLimit), settings.WriteRateLimit*2)
 }
```

Also applies to: 1071-1071
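To illustrate the trade-off with made-up numbers: the burst value caps the largest request the token bucket can admit at once, and `WaitN` fails outright if asked for more tokens than the burst, which is also why the flag's usage text requires the limit to exceed any batch size. A small sketch:

```go
package main

import (
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

func main() {
	// Both limiters allow 100 ops/s on average; only the burst differs.
	limA := rate.NewLimiter(rate.Limit(100), 100) // burst == rate limit
	limB := rate.NewLimiter(rate.Limit(100), 200) // burst == 2x rate limit

	// A batch of 150 can never be admitted in one shot by limA, because its
	// bucket tops out at 100 tokens; limB admits it from a full bucket.
	fmt.Println(limA.AllowN(time.Now(), 150)) // false
	fmt.Println(limB.AllowN(time.Now(), 150)) // true
}
```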
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- connectors/common/base.go (8 hunks)
- connectors/common/parallel_writer.go (10 hunks)
- internal/app/app.go (1 hunks)
- internal/app/options/flags.go (1 hunks)
- internal/app/options/options.go (2 hunks)
- runners/local/runner.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
connectors/common/base.go (2)
- connectors/dynamodb/stream/helper.go (1): `Limiter` (23-25)
- connectors/common/parallel_writer.go (1): `WriterClosedErr` (23-23)
connectors/common/parallel_writer.go (1)
- protocol/iface/transport.go (3): `MutationType_InsertBatch` (39-39), `DataMessage` (16-34), `MutationType_Ignore` (42-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (11)
internal/app/options/options.go (1)
46-46: LGTM! The `WriteRateLimit` field addition and its wiring from the CLI context are straightforward and consistent with the other options.
Also applies to: 83-83
internal/app/app.go (1)
246-246: LGTM! The `WriteRateLimit` wiring into `RunnerLocalSettings` is correct and consistent with other configuration fields.
runners/local/runner.go (2)
313-313: LGTM! Adding `r.dst.Interrupt(r.activeFlowID)` to `GracefulShutdown` ensures the destination flow is properly interrupted during shutdown, which is especially important when rate limiting is active, to prevent hanging write operations.
83-83: The review comment is incorrect. While `WriteRateLimit` is passed to both source and destination connectors via `connectorSettings`, the rate limiter is only checked within `ProcessDataMessages()`, which is exclusively a write operation. Source connectors only invoke read operations (`StartReadToChannel`, `RequestCreateReadPlan`, etc.) and never call `ProcessDataMessages()`. Therefore, the limiter does not throttle reads from source connectors.

Likely an incorrect or invalid review comment.
connectors/common/base.go (4)
55-55: LGTM! Adding `WriteRateLimit` to `ConnectorSettings` properly threads the configuration through to connectors.
1082-1086: LGTM! The rate limiter is correctly applied before all write operations:
- Uses `limiter.WaitN` to reserve tokens for batch operations
- Counts individual messages/documents via `len(msgs)` or `len(dataMsg.DataBatch)`
- Uses `c.flowCtx` for proper cancellation support
- Returns errors immediately for graceful shutdown
This implementation correctly enforces the per-second rate limit on the destination connector.
Also applies to: 1099-1103, 1135-1139
951-953: LGTM! Properly checks for the `WriterClosedErr` sentinel to detect when the writer has been closed, allowing the write loop to break cleanly without logging spurious errors.

Also applies to: 961-963
37-37: No action required. The project is already using `golang.org/x/time` v0.12.0 (a recent, stable version) and the codebase properly implements rate limiting with `WaitN` and context cancellation support across multiple connectors.

connectors/common/parallel_writer.go (3)
23-23: LGTM! Introducing `WriterClosedErr` as a sentinel error is good practice. It allows callers to distinguish between writer shutdown and other errors using `errors.Is()`, enabling cleaner error handling during graceful shutdown.
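As a reference for the pattern (not the exact dsync code), a sentinel declared with `errors.New` can be matched with `errors.Is` so the write loop exits quietly on shutdown; the loop below is a hypothetical caller.

```go
package writer

import "errors"

// WriterClosedErr-style sentinel; the actual declaration in the PR may differ.
var WriterClosedErr = errors.New("writer closed")

// drainWrites is a hypothetical write loop that treats the sentinel as a
// clean stop rather than as a failure worth logging.
func drainWrites(write func() error) error {
	for {
		if err := write(); err != nil {
			if errors.Is(err, WriterClosedErr) {
				return nil // writer shut down cleanly; exit quietly
			}
			return err // a real error: propagate it
		}
	}
}
```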
250-252: LGTM! The enhanced context cancellation handling is excellent:
- Detects `context.Canceled` errors using `errors.Is()`
- Returns immediately without logging spurious errors
- Allows graceful shutdown without error spam
- Consistent pattern applied across all write paths

This properly integrates with the rate limiter, which may return `context.Canceled` when `WaitN` is interrupted during shutdown.

Also applies to: 262-264, 300-302, 318-320, 336-338, 352-354
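A hedged sketch of that write-path pattern, assuming the limiter's `WaitN` is called with a cancellable flow context; the function and batch type are placeholders.

```go
package writer

import (
	"context"
	"errors"

	"golang.org/x/time/rate"
)

// writeWithLimit gates a batch on the limiter and treats cancellation during
// shutdown as a quiet early return instead of a loggable failure.
func writeWithLimit(ctx context.Context, limiter *rate.Limiter, batch [][]byte) error {
	if limiter != nil {
		if err := limiter.WaitN(ctx, len(batch)); err != nil {
			if errors.Is(err, context.Canceled) {
				// Shutdown in progress: skip logging and stop this write path.
				return nil
			}
			return err
		}
	}
	// ... perform the actual batched write here ...
	return nil
}
```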
167-167: LGTM! Returning `WriterClosedErr` when the context is done while attempting to send messages is appropriate. This allows callers to distinguish between clean shutdown scenarios and actual errors, reducing spurious error logging.

Also applies to: 379-379, 386-386
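A sketch of that send path under stated assumptions (the writer type, channel, and message type are placeholders): when the context is done, the sentinel is returned instead of a bare context error.

```go
package writer

import (
	"context"
	"errors"
)

// WriterClosedErr mirrors the sentinel discussed above; the actual
// definition lives in parallel_writer.go.
var WriterClosedErr = errors.New("writer closed")

// parallelWriter is a placeholder for the real writer type; queue carries
// pending messages to the worker goroutines.
type parallelWriter struct {
	queue chan []byte
}

// enqueue hands a message to a worker, or reports WriterClosedErr when the
// context is already done, so callers can tell shutdown from failure.
func (w *parallelWriter) enqueue(ctx context.Context, msg []byte) error {
	select {
	case w.queue <- msg:
		return nil
	case <-ctx.Done():
		return WriterClosedErr
	}
}
```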
```go
altsrc.NewIntFlag(&cli.IntFlag{
	Name:  "write-rate-limit",
	Usage: "If set, overall per second rate limit on the destination. If set, must be larger than any batch sizes.",
}),
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Find the flags.go file and examine the relevant lines
find . -name "flags.go" -type f | head -5
# Also look for batch size references
rg -l "batch" --type go | head -10
```

Repository: adiom-data/dsync
Length of output: 391
🏁 Script executed:
```bash
#!/bin/bash
# Search for batch size flags and write-rate-limit references
echo "=== Searching for batch size and rate limit flags ==="
rg -n "batch|rate-limit|WriterMaxBatchSize" internal/app/options/flags.go | head -30
echo -e "\n=== Look for write-rate-limit flag specifically ==="
rg -n "write-rate-limit" internal/app/options/flags.go -A 3 -B 1
echo -e "\n=== Search for validation logic ==="
rg -n "validate|Validate" internal/app/options/ -A 2 | head -40
echo -e "\n=== Check if there's flag validation in the app ==="
rg -n "func.*Before|flag.*validation" internal/app/ --type go | head -20
```

Repository: adiom-data/dsync
Length of output: 788
🏁 Script executed:
```bash
#!/bin/bash
# Look at the GetFlagsAndBeforeFunc function to see what validation happens
echo "=== GetFlagsAndBeforeFunc function ==="
sed -n '42,200p' internal/app/options/flags.go | head -100
echo -e "\n=== Look for writer-batch-size flag ==="
rg -n "writer-batch-size" internal/app/options/flags.go -A 3 -B 1
echo -e "\n=== Search for validation that uses write-rate-limit and batch sizes together ==="
rg -n "write.*rate.*limit|batch.*size" internal/app/ --type go -i
echo -e "\n=== Check options.go for any validation ==="
rg -n "validate|Validate|rate.*limit" internal/app/options/options.go -A 2
```

Repository: adiom-data/dsync
Length of output: 6189
🏁 Script executed:
```bash
#!/bin/bash
# Check how WriterMaxBatchSize and WriteRateLimit are used in app.go
echo "=== How WriterMaxBatchSize and WriteRateLimit are used in app.go ==="
rg -n "WriterMaxBatchSize|WriteRateLimit" internal/app/app.go -B 2 -A 2
echo -e "\n=== Look at the full context around line 242-246 in app.go ==="
sed -n '230,260p' internal/app/app.go
echo -e "\n=== Search for any validation or constraint checks anywhere ==="
rg -n "must be larger|rate.*limit.*batch|batch.*rate" . --type go -i
echo -e "\n=== Check if there's validation in the Before function ==="
rg -n "BeforeFunc" internal/app/options/flags.go -A 50 | head -60
```

Repository: adiom-data/dsync
Length of output: 4885
🏁 Script executed:
```bash
#!/bin/bash
# Find where the BeforeFunc is returned and what it does
echo "=== Find the end of GetFlagsAndBeforeFunc to see BeforeFunc ==="
rg -n "return.*flags|BeforeFunc" internal/app/options/flags.go | tail -20
echo -e "\n=== Get the last part of the file to see BeforeFunc implementation ==="
tail -100 internal/app/options/flags.go
echo -e "\n=== Check if options.go has validation after parsing flags ==="
cat internal/app/options/options.go
```

Repository: adiom-data/dsync
Length of output: 5680
🏁 Script executed:
```bash
#!/bin/bash
# Let's verify the exact state of the flags and see if there are any runtime checks
echo "=== Check how WriterMaxBatchSize is used at runtime ==="
rg -n "WriterMaxBatchSize" . --type go -B 1 -A 1 | head -40
echo -e "\n=== Check how WriteRateLimit is used at runtime ==="
rg -n "WriteRateLimit" . --type go -B 1 -A 1 | head -40
echo -e "\n=== Check for any validation in the runner or flow logic ==="
rg -n "batch.*rate|rate.*batch|WriteRateLimit.*Batch|WriterMaxBatchSize.*Rate" . --type go -i
```

Repository: adiom-data/dsync
Length of output: 4840
🏁 Script executed:
```bash
#!/bin/bash
# Let's see the complete usage of WriteRateLimit when the limiter is created
echo "=== Full context of rate limiter creation ==="
sed -n '1055,1075p' ./connectors/common/base.go
echo -e "\n=== Check if there's any validation between flags and limiter creation ==="
rg -n "WriteRateLimit\|WriterMaxBatchSize" ./runners/local/runner.go -B 3 -A 3
```

Repository: adiom-data/dsync
Length of output: 794
🏁 Script executed:
```bash
#!/bin/bash
# Let's check the Before function more carefully for any validation
echo "=== Complete Before function ==="
sed -n '191,205p' ./internal/app/options/flags.go
echo -e "\n=== Check if there's validation in app.go before passing to runner ==="
sed -n '200,250p' ./internal/app/app.go | head -50
echo -e "\n=== Check for any Action handlers on write-rate-limit flag ==="
rg -n "write-rate-limit" ./internal/app/options/flags.go -B 5 -A 10
```

Repository: adiom-data/dsync
Length of output: 3144
Add validation to enforce write-rate-limit constraint against batch sizes.
The flag usage text states "must be larger than any batch sizes," but there is no validation enforcing this constraint. The Before function (lines 193-198) only validates flag combinations for progress/logfile. Add validation to check that write-rate-limit exceeds writer-batch-size when both are set, preventing misconfiguration that could cause confusing runtime behavior.
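A sketch of the suggested check, assuming a urfave/cli v2-style Before hook (the real flag wiring in dsync may differ); the flag names come from this review:

```go
package options

import (
	"fmt"

	"github.com/urfave/cli/v2"
)

// validateWriteRateLimit rejects configurations where the per-second rate
// limit could never admit a full write batch.
func validateWriteRateLimit(c *cli.Context) error {
	rateLimit := c.Int("write-rate-limit")
	batchSize := c.Int("writer-batch-size")
	if rateLimit > 0 && batchSize > 0 && rateLimit <= batchSize {
		return fmt.Errorf("write-rate-limit (%d) must be larger than writer-batch-size (%d)", rateLimit, batchSize)
	}
	return nil
}
```

Wiring a check like this into the existing Before function would reject the misconfiguration at startup, matching the flag's documented constraint.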
🤖 Prompt for AI Agents
In internal/app/options/flags.go around lines 187-190 (and add logic into the
Before function near lines 193-198), the flag usage claims write-rate-limit
"must be larger than any batch sizes" but no validation enforces that; add a
check in the Before hook that if both write-rate-limit and writer-batch-size are
set (non-zero) then write-rate-limit > writer-batch-size, and return a
descriptive error (e.g. "write-rate-limit must be larger than
writer-batch-size") when the constraint is violated so misconfiguration is
rejected at startup.
Summary by CodeRabbit
Release Notes
New Features
- New `write-rate-limit` CLI option to cap the per-second write rate on the destination. The limit must exceed configured batch sizes.

Improvements