Parallelism issue of dispatch when one actor blocks thread #104

@svetlyak40wt

Description

By default, the shared dispatcher uses 4 threads. If some actor's RECEIVE function takes significant time and there are many messages in its queue, the other dispatcher threads block while trying to acquire that actor's message box lock. Because of this, if another actor also uses the shared dispatcher, messages to this second actor are not processed until all messages to the first actor have been processed.

To demonstrate the issue, let's create a system with two actors:

  • answerer takes 10 seconds to process each message;
  • checker returns immediately.

(uiop:define-package #:test-sento-parallelism
  (:use #:cl))
(in-package #:test-sento-parallelism)


(defparameter *system* (sento.actor-system:make-actor-system))

(defparameter *answerer*
  (sento.actor-context:actor-of *system* :name "answerer"
                                         :receive
                                         (lambda (msg)
                                           (log:debug "Thinking on ~A" msg)
                                           (sleep 10)
                                           (log:debug "Replying to ~A" msg)
                                           (let ((output (format nil "Hello ~a" msg)))
                                             (sento.actor:reply output)))))


(defparameter *checker*
  (sento.actor-context:actor-of *system* :name "checker"
                                         :receive
                                         (lambda (msg)
                                           (log:debug "Checking ~A" msg)
                                           (let ((output (format nil "Checked ~a" msg)))
                                             (sento.actor:reply output)))))


(defun run-test (&key (num-messages 4))
  (loop for i from 1 upto num-messages
        do (sento.actor:ask *answerer* i)
        finally (sento.actor:ask *checker* 100500)))

Results of this code:

TEST-SENTO-PARALLELISM> (run-test)
<DEBUG> [2025-02-01T11:18:48.985165Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Thinking on 1
NIL
<DEBUG> [2025-02-01T11:18:58.985431Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 1
<DEBUG> [2025-02-01T11:18:58.985617Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Thinking on 2
<DEBUG> [2025-02-01T11:19:08.985794Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 2
<DEBUG> [2025-02-01T11:19:08.985937Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Checking 100500
<DEBUG> [2025-02-01T11:19:08.986015Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Thinking on 3
<DEBUG> [2025-02-01T11:19:18.986221Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 3
<DEBUG> [2025-02-01T11:19:18.986371Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Thinking on 4
<DEBUG> [2025-02-01T11:19:28.986530Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 4

Here the Checking 100500 message appears only after the second message to answerer has been processed, 20 seconds after the test was run.

Another run shows the same 20-second delay:

TEST-SENTO-PARALLELISM> (run-test)
NIL
<DEBUG> [2025-02-01T11:20:12.193907Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Thinking on 1
<DEBUG> [2025-02-01T11:20:22.194082Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 1
<DEBUG> [2025-02-01T11:20:22.194221Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Thinking on 2
<DEBUG> [2025-02-01T11:20:32.194417Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 2
<DEBUG> [2025-02-01T11:20:32.194572Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Checking 100500
<DEBUG> [2025-02-01T11:20:32.194647Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Thinking on 3
<DEBUG> [2025-02-01T11:20:42.194823Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 3
<DEBUG> [2025-02-01T11:20:42.195007Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Thinking on 4
<DEBUG> [2025-02-01T11:20:52.195192Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 4

If we send more messages to the first actor, the message to the second actor is delayed even longer:

TEST-SENTO-PARALLELISM> (run-test :num-messages 20)
NIL
<DEBUG> [2025-02-01T11:24:15.766732Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Thinking on 1
<DEBUG> [2025-02-01T11:24:25.766886Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 1
<DEBUG> [2025-02-01T11:24:25.767037Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Thinking on 2
<DEBUG> [2025-02-01T11:24:35.767161Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 2
<DEBUG> [2025-02-01T11:24:35.767294Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Thinking on 3
<DEBUG> [2025-02-01T11:24:45.767407Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 3
...
<DEBUG> [2025-02-01T11:27:25.771623Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 19
<DEBUG> [2025-02-01T11:27:25.771745Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Thinking on 20
<DEBUG> [2025-02-01T11:27:35.771863Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Replying to 20
<DEBUG> [2025-02-01T11:27:35.772013Z] test-sento-parallelism test-sento-parallelism.lisp (top level form) Checking 100500

The message to the second actor was processed about 3 minutes 20 seconds after the test started. During all this time, three of the four threads were just waiting for the lock on the first actor's message box.
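To make the delay explicit, the time-of-day parts of the log timestamps can be subtracted. This is only a small sketch: `hms-to-seconds` and `checker-delay` are hypothetical helper names, not part of Sento, and fractions of a second are ignored.

```lisp
;; Hypothetical helper, not part of Sento: converts an "HH:MM:SS" string
;; (the time-of-day part of a log timestamp) to seconds since midnight.
(defun hms-to-seconds (hms)
  (+ (* 3600 (parse-integer hms :start 0 :end 2))
     (* 60 (parse-integer hms :start 3 :end 5))
     (parse-integer hms :start 6 :end 8)))

;; Seconds between the first "Thinking on 1" line (START) and the
;; "Checking 100500" line (END) of one run.
(defun checker-delay (start end)
  (- (hms-to-seconds end) (hms-to-seconds start)))

;; 20-message run: first log line vs. the checker's log line.
(checker-delay "11:24:15" "11:27:35") ; => 200
```

200 seconds is exactly 20 messages at 10 seconds each, so in that run the checker message waited for the whole answerer queue to drain.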

Expected behaviour

Only one thread of the shared dispatcher is used to process messages of a single message-box/dp. Other threads are able to process messages of other actors immediately.

In the situation modelled in the test code above, I expect that one thread will start processing the first message to the answerer actor, a second thread will start processing the message to the checker actor, and two more threads will remain available for processing messages to any other actors if they are created.
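Until the shared dispatcher behaves this way, one possible workaround is to take the slow actor off the shared dispatcher. The sketch below assumes that `actor-of` accepts `:dispatcher :pinned`, which should give the actor its own thread. It has not been checked against this test case, so treat it as an assumption rather than a confirmed fix. It also assumes Quicklisp is available for loading Sento.

```lisp
;; Assumption: Quicklisp is installed; Sento pulls in log4cl (the LOG package).
(ql:quickload "sento")

(defparameter *system* (sento.actor-system:make-actor-system))

;; Same answerer as in the test code, but with :dispatcher :pinned
;; (assumed option) so that its slow RECEIVE runs on a dedicated thread
;; instead of occupying the shared dispatcher's workers.
(defparameter *answerer*
  (sento.actor-context:actor-of *system*
                                :name "answerer"
                                :dispatcher :pinned
                                :receive
                                (lambda (msg)
                                  (log:debug "Thinking on ~A" msg)
                                  (sleep 10)
                                  (log:debug "Replying to ~A" msg)
                                  (sento.actor:reply (format nil "Hello ~a" msg)))))
```

This only isolates one known slow actor. It does not change how the shared dispatcher handles several messages queued for the same actor, which is what this issue is about.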
