This repository was archived by the owner on Mar 30, 2023. It is now read-only.

Persistent buffers using 100% of available space causes the job to fail #113

@jsteel44

We can use 99% of the available space (strictly, 100% minus one unit of granularity) without problems. However, when we create a buffer that consumes the remaining space, it appears to be created successfully, but squeue then reports:

             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                20     debug create-p    test2 PD       0:00      1 (burst_buffer/datawarp: setup: panic: runtime error: index out of range [recovered]
        panic: runtime error: index out of range

goroutine 1 [running]:
main.main.func1()
        /home/circleci/data-acc/cmd/dacctl/main.go:187 +0xb9
panic(0xb0d7c0, 0x12394c0)
        /usr/local/go/src/runtime/panic.go:522 +0x1b5
github.com/RSE-Cambridge/data-acc/internal/pkg/dacctl/workflow_impl.sessionFacade.doAllocationAndWriteSession(0xce3440, 0xc000176d00, 0xcdf780, 0xc000179560, 0xce0e80, 0xc00017c750, 0xcccf60, 0x126c0c8, 0x7ffff8268d30, 0x2, ...)
        /home/circleci/data-acc/internal/pkg/dacctl/workflow_impl/session.go:166 +0x6ee
github.com/RSE-Cambridge/data-acc/internal/pkg/dacctl/workflow_impl.sessionFacade.CreateSession.func1(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/circleci/data-acc/internal/pkg/dacctl/workflow_impl/session.go:110 +0xfb
github.com/RSE-Cambridge/data-acc/internal/pkg/dacctl/workflow_impl.sessionFacade.submitJob(0xce3440, 0xc000176d00, 0xcdf780, 0xc000179560, 0xce0e80, 0xc00017c750, 0xcccf60, 0x126c0c8, 0x7ffff8268d30, 0x2, ...)
        /home/circleci/data-acc/internal/pkg/dacctl/workflow_impl/session.go:53 +0x3e8
github.com/RSE-Cambridge/data-acc/internal/pkg/dacctl/workflow_impl.sessionFacade.CreateSession(0xce3440, 0xc000176d00, 0xcdf780, 0xc000179560, 0xce0e80, 0xc00017c750, 0xcccf60, 0x126c0c8, 0x7ffff8268d30, 0x2, ...)
        /home/circleci/data-acc/internal/pkg/dacctl/workflow_impl/session.go:107 +0x1b4
github.com/RSE-Cambridge/data-acc/internal/pkg/dacctl/actions_impl.(*dacctlActions).CreatePerJobBuffer(0xc000179580, 0xcdcac0, 0xc0000e71e0, 0xc000179580, 0x0)
        /home/circleci/data-acc/internal/pkg/dacctl/actions_impl/job.go:93 +0x6f5
main.setup(0xc0000e71e0, 0x0, 0x0)
        /home/circleci/data-acc/cmd/dacctl/actions.go:92 +0xa3
github.com/urfave/cli.HandleAction(0xade2a0, 0xc15140, 0xc0000e71e0, 0xc0000e71e0, 0x0)
        /go/pkg/mod/github.com/urfave/cli@v1.21.0/app.go:514 +0xbe
github.com/urfave/cli.Command.Run(0xbe38a2, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc0a8db, 0x47, 0x0, ...)
        /go/pkg/mod/github.com/urfave/cli@v1.21.0/command.go:171 +0x4d2
github.com/urfave/cli.(*App).Run(0xc0000dc540, 0xc0000b4000, 0xe, 0xf, 0x0, 0x0)
        /go/pkg/mod/github.com/urfave/cli@v1.21.0/app.go:265 +0x733
main.runCli(0xc0000b4000, 0xf, 0xf, 0xbe21f8, 0x1)
        /home/circleci/data-acc/cmd/dacctl/main.go:172 +0x1255
main.main()
        /home/circleci/data-acc/cmd/dacctl/main.go:194 +0x1f1
)

At this point the buffers themselves look normal, and FreeSpace is now 0 as expected:

Name=datawarp DefaultPool=default Granularity=1600GiB TotalSpace=24000GiB FreeSpace=0 UsedSpace=24000GiB
  Flags=EnablePersistent,PrivateData
  StageInTimeout=3600 StageOutTimeout=3600 ValidateTimeout=5 OtherTimeout=3600
  GetSysState=/usr/local/bin/dacctl
  GetSysStatus=/usr/local/bin/dacctl
  Allocated Buffers:
    Name=small CreateTime=2019-10-02T14:24:55 Pool=default Size=3200GiB State=allocated UserID=test2(1002)
    Name=full CreateTime=2019-10-02T14:23:14 Pool=default Size=20800GiB State=allocated UserID=test2(1002)
  Per User Buffer Use:
    UserID=test2(1002) Used=24000GiB
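
For reference, the sizes in the status output can be checked against the granularity with a small Go sketch. It assumes each buffer occupies a whole number of granularity units, rounded up; `unitsFor` and the `buffer` type are hypothetical names for illustration and are not taken from the data-acc code. It only confirms that the two buffers account for every unit in the pool and does not reproduce the panic.

```go
package main

import "fmt"

// buffer mirrors one "Allocated Buffers" entry from the status output above.
type buffer struct {
	name    string
	sizeGiB int
}

// unitsFor returns the number of whole granularity units needed for sizeGiB,
// rounding up. The rounding rule is an assumption made for illustration.
func unitsFor(sizeGiB, granularityGiB int) int {
	return (sizeGiB + granularityGiB - 1) / granularityGiB
}

func main() {
	const granularityGiB = 1600 // Granularity=1600GiB
	const totalGiB = 24000      // TotalSpace=24000GiB

	buffers := []buffer{
		{name: "full", sizeGiB: 20800},
		{name: "small", sizeGiB: 3200},
	}

	totalUnits := totalGiB / granularityGiB
	usedUnits := 0
	for _, b := range buffers {
		u := unitsFor(b.sizeGiB, granularityGiB)
		usedUnits += u
		fmt.Printf("%s: %d GiB = %d units\n", b.name, b.sizeGiB, u)
	}
	fmt.Printf("total units: %d, used: %d, free: %d\n",
		totalUnits, usedUnits, totalUnits-usedUnits)
}
```

With these figures the pool holds 15 units, of which "full" takes 13 and "small" takes 2, so the failure occurs exactly when the last free unit is allocated.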

Labels: bug
