[Bug]: deadlock in worker cleanup leads to zombie workers #493

@parkan

Description

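Still investigating, but we are observing thousands of deadlocked worker cleanup calls. For example, the following sessions are all blocked behind the same backend, which is running the same cleanup query: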

 blocked_pid | blocking_pid | blocked_user | blocking_user |    blocking_duration    |                  blocked_query                  |                 blocking_query
-------------+--------------+--------------+---------------+-------------------------+-------------------------------------------------+-------------------------------------------------
      193590 |       190674 | singularity  | singularity   | 3 days 13:38:47.888695  | DELETE FROM "workers" WHERE last_heartbeat < $1 | DELETE FROM "workers" WHERE last_heartbeat < $1
      184265 |       190674 | singularity  | singularity   | 3 days 13:38:47.888695  | DELETE FROM "workers" WHERE last_heartbeat < $1 | DELETE FROM "workers" WHERE last_heartbeat < $1
      216642 |       190674 | singularity  | singularity   | 3 days 13:38:47.888695  | DELETE FROM "workers" WHERE last_heartbeat < $1 | DELETE FROM "workers" WHERE last_heartbeat < $1
      187480 |       190674 | singularity  | singularity   | 3 days 13:38:47.888695  | DELETE FROM "workers" WHERE last_heartbeat < $1 | DELETE FROM "workers" WHERE last_heartbeat < $1
      191185 |       190674 | singularity  | singularity   | 3 days 13:38:47.888695  | DELETE FROM "workers" WHERE last_heartbeat < $1 | DELETE FROM "workers" WHERE last_heartbeat < $1
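
For anyone trying to confirm this on their own instance, output of this shape can be produced with a query along the following lines. This is a sketch, not necessarily the query used for the table above: it assumes PostgreSQL 9.6 or later (for `pg_blocking_pids()`), and the definition of `blocking_duration` as time since the blocking session's transaction started is an assumption.

```sql
-- Diagnostic sketch: list sessions waiting on a lock held by another session.
-- Column aliases mirror the table above; pg_blocking_pids() needs PostgreSQL 9.6+.
SELECT
    blocked.pid                 AS blocked_pid,
    blocking.pid                AS blocking_pid,
    blocked.usename             AS blocked_user,
    blocking.usename            AS blocking_user,
    -- Assumption: duration is measured from the blocker's transaction start.
    now() - blocking.xact_start AS blocking_duration,
    blocked.query               AS blocked_query,
    blocking.query              AS blocking_query
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY (pg_blocking_pids(blocked.pid))
ORDER BY blocking_duration DESC;
```

Note that `pg_stat_activity.query` holds the most recent statement of a session, so for a blocker that is idle inside an open transaction the `blocking_query` column shows the last statement it ran rather than one that is still executing. Adding `blocking.state` to the select list helps tell the two cases apart.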

Steps to Reproduce

Appears to happen when a large number of workers is enabled, especially when they are spread across multiple parent processes.

Version

tbd

Operating System

Linux

Database Backend

PostgreSQL

Additional context

No response

Labels

bug, triage
