sysbox-fs enters deadlock #981

@okhowang

I use sysbox with Docker on a CentOS-like Linux distribution.
Kernel version: 6.6
sysbox version: 0.6.7 (I use the package from https://github.com/karellen/karellen-sysbox)
Docker version: 28.4.0

Sometimes sysbox-fs becomes unresponsive.

I found that requests to sysbox-fs were stuck on css.Lock(),
and one ContainerPreRegister goroutine was stuck on srv.InitWait() in CreateFuseServer,
but all goroutines created by CreateFuseServer were waiting in fuse.(*Conn).ReadRequest.

I noticed that fuseServer.Run may return early, leaving s.initDone unsignaled.
It may have failed at fuse.Mount, but I could not identify a matching entry in this log.
The sysbox-fs log looks like this:

time="2025-12-30 10:55:21" level=info msg="Container pre-registration completed: id = f94491638433"
time="2025-12-30 10:55:21" level=info msg="Container registration completed: id = 8338ff2129b4, initPid = 1531739, uid:gid = 100000:100000"
time="2025-12-30 10:55:21" level=info msg="Container unregistration completed: id = 82d627f3d5a9"
time="2025-12-30 10:55:21" level=info msg="Container registration completed: id = f94491638433, initPid = 1531787, uid:gid = 100000:100000"
time="2025-12-30 10:55:21" level=info msg="Container pre-registration completed: id = 1fb490132bd6"
time="2025-12-30 10:55:21" level=info msg="Container registration completed: id = 1fb490132bd6, initPid = 1532300, uid:gid = 100000:100000"
time="2025-12-30 10:55:21" level=info msg="Container unregistration completed: id = 8338ff2129b4"
time="2025-12-30 10:55:21" level=info msg="Container unregistration completed: id = f94491638433"
time="2025-12-30 10:55:22" level=info msg="Container unregistration completed: id = 1fb490132bd6"
time="2025-12-30 10:55:22" level=info msg="Container pre-registration completed: id = c14fe60b247c"
time="2025-12-30 10:55:22" level=info msg="Container registration completed: id = c14fe60b247c, initPid = 1533088, uid:gid = 100000:100000"
time="2025-12-30 10:55:22" level=info msg="Container unregistration completed: id = a067c8a45c4a"
time="2025-12-30 10:55:23" level=info msg="Container unregistration completed: id = e95e41ad806b"
time="2025-12-30 10:55:23" level=info msg="Container unregistration completed: id = c14fe60b247c"
time="2025-12-30 10:55:24" level=info msg="Container pre-registration completed: id = e02075b10cfc"
time="2025-12-30 10:55:24" level=info msg="Container unregistration completed: id = ec7307f20f10"
time="2025-12-30 10:55:24" level=info msg="Container registration completed: id = e02075b10cfc, initPid = 1534324, uid:gid = 100000:100000"
time="2025-12-30 10:55:25" level=info msg="Container unregistration completed: id = e35979d7acbb"
time="2025-12-30 10:55:25" level=info msg="Container pre-registration completed: id = 782ff1a076b2"
time="2025-12-30 10:55:26" level=info msg="Container registration completed: id = 782ff1a076b2, initPid = 1536552, uid:gid = 100000:100000"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1536924"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1536931"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1536934"
time="2025-12-30 10:55:26" level=info msg="Container pre-registration completed: id = 53d32a403562"
time="2025-12-30 10:55:26" level=info msg="Container unregistration completed: id = 782ff1a076b2"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537014"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537016"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537018"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537020"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537025"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537131"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537134"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537136"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537140"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537148"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537154"
time="2025-12-30 10:55:26" level=info msg="Container registration completed: id = 53d32a403562, initPid = 1537084, uid:gid = 100000:100000"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537242"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537256"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537263"
time="2025-12-30 10:55:26" level=warning msg="Sysbox-fs first child process error status: exit status 1, pid: 1537266"

dlv core output of grs -group userloc:

github.com/nestybox/sysbox-fs/fuse/server.go:213 in github.com/nestybox/sysbox-fs/fuse.(*FuseServerService).CreateFuseServer
          Goroutine 3640530 - User: github.com/nestybox/sysbox-fs/fuse/server.go:213 github.com/nestybox/sysbox-fs/fuse.(*FuseServerService).CreateFuseServer (0x5fd685) [unknown wait reason 14 516748745301316]
        Total: 1

internal/sync/mutex.go:70 in sync.(*RWMutex).Lock
          Goroutine 3645228 - User: internal/sync/mutex.go:70 sync.(*RWMutex).Lock (0x49f932) [unknown wait reason 22 516767446953500]
          Goroutine 3673629 - User: internal/sync/mutex.go:70 sync.(*RWMutex).Lock (0x49f932) [unknown wait reason 22 516956251790091]
          Goroutine 3642287 - User: internal/sync/mutex.go:70 sync.(*RWMutex).Lock (0x49f932) [unknown wait reason 22 516748745301316]
          Goroutine 3698300 - User: internal/sync/mutex.go:70 sync.(*RWMutex).Lock (0x49f932) [unknown wait reason 22 517145373088660]
          Goroutine 3656936 - User: internal/sync/mutex.go:70 sync.(*RWMutex).Lock (0x49f932) [unknown wait reason 22 516819640709316]
        Total: 23

I also found this log entry:

time="2025-12-30 10:55:35" level=error msg="fusermount: waitid: no child processes"
