Added Redis Sorted Function and Redis Stream Functions for Job Scheduling #187

@Surajvatsya

New Atlas Scheduler:

Problem:

We currently use a scheduler to schedule search request batches. These jobs have the following properties:

They are short-lived, and retries should be immediate.
The gap between scheduled time and pickup time should be at most 1 sec, and as low as possible.
There can be different kinds of jobs with different priority levels, so a new implementation should leave room for that as well (not an immediate requirement).

Current Solution:

The current solution uses the DB as a queue, storing jobs with a scheduled time. The executors perform the following actions:
Query the DB with the where clause scheduledTime < now and shardId = x.
Take a lock on each job before starting execution.

We added sharding to this DB: in layman's terms, instead of the DB being used as a single queue, it is now treated as multiple queues. Each queue is assigned to an executor randomly, using a global counter and the shardId in the fetch-jobs DB query.

Limitations:
The producer needs to know how many executors are running so that it can assign shardIds accordingly.
DB queries have to be spaced X sec apart so as not to flood the DB. As executors are added, the load on the DB grows, with multiple loopers each querying the DB every X sec.
This X sec interval itself introduces a delay.
Adding more executors is not dynamic; we have to scale manually.
In pod recycle/restart cases, two pods can end up with the same shardId, and some shardIds may not be allocated to any pod.
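
To make the trade-off around X concrete, here is a small back-of-the-envelope sketch. The function name and the sample numbers are hypothetical, and the average-delay figure assumes job due times are spread uniformly between polls.

```python
def polling_cost(executors, interval_sec):
    # Each executor issues one fetch query per polling interval.
    queries_per_sec = executors / interval_sec
    # A job that becomes due just after a poll waits almost a full interval.
    worst_case_delay_sec = interval_sec
    # Assumption: due times are uniformly distributed within the interval.
    average_delay_sec = interval_sec / 2
    return queries_per_sec, worst_case_delay_sec, average_delay_sec


# Hypothetical numbers: 10 executors polling every 0.5 sec.
print(polling_cost(10, 0.5))
```

Shrinking X lowers the pickup delay but raises DB load proportionally, and adding executors raises it further, which is the tension the limitations above describe.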

Proposed solution - Using Redis Streams as a Scheduler
Actors involved:
Job Creator (App Servers): Registers a job that has to be done later at some point.
Job Producer: Picks the jobs scheduled for the current time and passes them on for execution in batches.
Job Executor: Executes the jobs produced by the producer.
Two Redis Streams (RS1, RS2).

Role of Redis Stream:
A Redis Stream lets us push data with an ID, which can then be used to look up entries within a range of IDs.
Using this property, the Job Creator can add jobs to RS1 with id = scheduledTime. The Job Producer can then query them using the range query - <now-1s>. The Job Producer pushes the returned entries into a second Redis Stream (RS2), from which the Job Executors pop them.
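
The flow between RS1 and RS2 could be sketched roughly as follows. This is an illustration, not the implementation from this issue: it assumes a redis-py style client (xadd, xrange, xdel, xreadgroup, xack), millisecond timestamps as IDs, an upper range bound passed in by the caller, and a consumer group on RS2 for the executors. All function names are hypothetical.

```python
RS1 = "RS1"  # jobs waiting for their scheduled time
RS2 = "RS2"  # jobs ready for execution


def create_job(client, scheduled_ms, payload):
    # Job Creator: use the scheduled time as the entry ID.
    # "<ms>-*" lets Redis pick the sequence part (needs Redis 7.0+).
    return client.xadd(RS1, payload, id=f"{scheduled_ms}-*")


def produce_due_jobs(client, upper_ms):
    # Job Producer: range query from the smallest ID ("-") up to upper_ms,
    # then move each returned entry from RS1 to RS2.
    due = client.xrange(RS1, min="-", max=str(upper_ms))
    for entry_id, fields in due:
        client.xadd(RS2, fields)     # RS2 gets an auto-generated ID
        client.xdel(RS1, entry_id)   # not atomic with the xadd above
    return len(due)


def execute_next(client, group, consumer, handler, block_ms=1000):
    # Job Executor: read one new entry through a consumer group, then ack it.
    # Assumes the group exists, e.g. created once with
    # client.xgroup_create(RS2, group, id="0", mkstream=True).
    response = client.xreadgroup(group, consumer, {RS2: ">"}, count=1, block=block_ms)
    for _stream, entries in response or []:
        for entry_id, fields in entries:
            handler(fields)
            client.xack(RS2, group, entry_id)


# Usage (assumes the redis package and a running server):
#   import redis
#   client = redis.Redis(decode_responses=True)
#   create_job(client, 1700000000000, {"job": "search-batch"})
#   produce_due_jobs(client, 1700000000500)
```

One caveat worth checking: Redis rejects an XADD whose explicit ID is not greater than the stream's current top ID, so using scheduledTime as the ID only works if jobs are added in scheduled-time order. A sorted set scored by scheduledTime, as the issue title hints, does not have that constraint. The move from RS1 to RS2 above is also not atomic, so a producer crash between the two calls could duplicate a job.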
