We've had a good run with Sqewer, with 5 years of operation and billions of jobs processed. There were good times and bad times, but it seems we have a chance of meaningfully upgrading Sqewer into its next phase. I believe this is necessary so that the various services we run can benefit from more SQS features than before, and so we can realize some cost savings.
I propose replacing both the current .submit! API and the configuration API.
Configuration API
The proposed configuration API looks like this:
```ruby
Sqewer.configure do |config|
  config.sqs_endpoint_url = some_url # defaults to the SQS_QUEUE_URL environment variable value
  config.serializer = MyCustomSerializer.new # defaults to Sqewer::Serializer.default
  config.logger = SomeLogger.new # defaults to Sqewer::Logger.default
  config.middleware_handlers = [some_middleware, another_middleware, yet_another_middleware]
  config.worker = MyCustomizedWorker.new # defaults to Sqewer::Worker.default
end
```

Reasoning: currently most of our services define a custom Sqewer worker class. The customizations we apply in those custom classes are pretty much in scope for Sqewer itself, though. I think moving them to stable configuration options, configurable from one place, is appropriate.
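For illustration, the configuration holder behind this API could be sketched roughly as follows. This is not a proposed implementation, just a minimal shape showing the semantics (the `Configuration` class name and the memoized `Sqewer.config` accessor are assumptions):

```ruby
# Minimal sketch of a configuration holder for the proposed API.
# Class and accessor names are illustrative, not final.
module Sqewer
  class Configuration
    attr_accessor :sqs_endpoint_url, :serializer, :logger,
                  :middleware_handlers, :worker

    def initialize
      # Defaults mirror the proposal above
      @sqs_endpoint_url = ENV["SQS_QUEUE_URL"]
      @middleware_handlers = []
    end
  end

  # Memoized global configuration object
  def self.config
    @config ||= Configuration.new
  end

  def self.configure
    yield config
    config
  end
end

Sqewer.configure do |config|
  config.sqs_endpoint_url = "https://sqs.example.com/123/queue"
end
```

The point of the memoized global is that workers, serializers and middleware all read from one place instead of each service subclassing the worker.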
Submission (message sending) API
Currently we have two ways of sending messages with Sqewer. One is Sqewer.submit!, which guarantees that your message is delivered to the queue immediately, and fails if that doesn't succeed. The other is the Executor object passed to every job's run() method, which also responds to .submit! with the same arguments. When an Executor is used, we batch the messages generated during a single run of a job. This was done so that if a job forward-spools descendant jobs, they are delivered to the queue at once, saving money thanks to batching. Bare Sqewer.submit! did not use batching, for fear of "dropping jobs" if a process terminated while some jobs were still buffered and unsent.
I propose changing the submission API to prioritize batching over guaranteed delivery to the queue, removing the Executor objects entirely. Rather than offering submit! on an Executor and submit! on Sqewer itself with different semantics, we would provide submit! and submit_immediately! on Sqewer itself.
Sqewer.submit_immediately! would ensure that the jobs passed to it get serialized together (and if any job fails to serialize none get delivered to SQS) and then deliver them to SQS in batches before returning.
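The "serialize everything first, then deliver" order could look like the sketch below. The deliver_batch callable stands in for the actual SQS client call, and the batch size of 10 matches the SQS SendMessageBatch limit; everything else is illustrative:

```ruby
# Sketch of submit_immediately!: serialize ALL jobs up front, so that a
# job that fails to serialize raises before anything reaches SQS.
# `deliver_batch` is a stand-in for the real SQS SendMessageBatch call.
def submit_immediately!(*jobs, serializer:, deliver_batch:)
  # If any serialize call raises, we bail out here with nothing sent
  messages = jobs.map { |job| serializer.serialize(job) }
  # SendMessageBatch accepts at most 10 messages per request
  messages.each_slice(10) { |batch| deliver_batch.call(batch) }
  nil
end
```

Because serialization happens in one pass before any network call, the "all or nothing" property falls out naturally rather than needing rollback logic.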
Sqewer.submit! would first serialize all jobs, if any of the jobs fails to serialize it would raise an exception before buffering them for sending. Once the jobs are converted into messages ready for sending, they will be added to an in-memory Queue. This queue would be emptied by a Thread running in the background at regular intervals - for instance once every 2 seconds. All messages picked up after this fixed interval would be delivered to SQS in batches.
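A rough sketch of that buffered path, under the stated assumptions (the BufferedSubmitter name, the deliver_batch stand-in for the SQS client, and the interval value are all illustrative, not a committed design):

```ruby
require "thread"

# Sketch of the buffered submit! path: serialized messages go into an
# in-memory Queue, drained by a background thread on a fixed interval.
class BufferedSubmitter
  def initialize(deliver_batch:, flush_interval: 2)
    @queue = Queue.new
    @deliver_batch = deliver_batch
    # Background thread empties the buffer once per interval
    @flusher = Thread.new do
      loop do
        sleep(flush_interval)
        flush
      end
    end
  end

  # Serialize up front so a bad job raises here, at the call site,
  # before anything is buffered for background delivery.
  def submit!(*jobs, serializer:)
    messages = jobs.map { |job| serializer.serialize(job) }
    messages.each { |message| @queue << message }
    nil
  end

  # Drain everything currently buffered; send in SQS-sized batches of 10
  def flush
    batch = []
    begin
      batch << @queue.pop(true) while true
    rescue ThreadError
      # non-blocking pop raised: the queue is drained
    end
    batch.each_slice(10) { |slice| @deliver_batch.call(slice) }
  end
end
```

Note that serialization failures still surface synchronously in submit!; only the delivery itself is deferred, which is exactly the trade-off the proposal describes.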
The reasoning is as follows:
- Even though we prioritised message delivery over batching, we still had a bug where delivery and/or serialization could fail silently.
- Our ActiveJob adapter, which currently powers the main site, does not batch at all, because ActiveJob does not provide a "transaction" semantic for buffering jobs during the execution of another job - or during the execution of any other Rails unit of work, for that matter. Using buffering for submit! would immediately enable batching for ActiveJob, allowing for substantial savings.
- Delivery can be wrapped in a separate error-tracking transaction, in all cases.
- The choice is placed clearly with the developer/consumer: opt into less efficient batching with a delivery guarantee via submit_immediately!, or accept the sane default of batching with deferred failure via plain submit!.
Very interested in your thoughts about this!