Description
sampleCount was 0 at this division (Line 304 in ac0b664):
*averageUsage = sum/sampleCount;
The kubelet crashed with SIGFPE, which on x86 is raised for an integer "divide error":
Sep 06 16:51:50 foobar kernel: NVRM: Xid (PCI:0000:db:00): 95, pid=7101, name=kubelet, Uncontained: FBHUB. RST: Yes, D-RST: No
Sep 06 16:51:51 foobar kernel: NVRM: Xid (PCI:0000:db:00): 95, pid=2362883, name=python, Ch 0000000a
Sep 06 16:51:51 foobar kernel: NVRM: Xid (PCI:0000:db:00): 95, pid=7101, name=kubelet, Uncontained: FBHUB. RST: Yes, D-RST: No
Sep 06 16:51:51 foobar kernel: NVRM: Xid (PCI:0000:db:00): 95, pid=2362883, name=python, Ch 0000000a
Sep 06 16:51:51 foobar kernel: NVRM: Xid (PCI:0000:db:00): 95, pid=7101, name=kubelet, Uncontained: PCIE. RST: Yes, D-RST: No
Sep 06 16:51:51 foobar kernel: NVRM: Xid (PCI:0000:db:00): 95, pid=2362883, name=python, Ch 0000000a
Sep 06 16:51:51 foobar kubelet[7101]: fatal error: unexpected signal during runtime execution
Sep 06 16:51:51 foobar kernel: NVRM: _kgspProcessRpcEvent: Unexpected RPC event from GPU7: 0x4c (GSP_RM_CONTROL)
Sep 06 16:51:51 foobar kubelet[7101]: [signal SIGFPE: floating-point exception code=0x1 addr=0x7ff06af196fc pc=0x7ff06af196fc]
Sep 06 16:51:51 foobar kubelet[7101]: runtime stack:
Sep 06 16:51:51 foobar kubelet[7101]: runtime.throw(0x4ba98c4, 0x2a)
Sep 06 16:51:51 foobar kubelet[7101]: /usr/local/go/src/runtime/panic.go:1117 +0x72
Sep 06 16:51:51 foobar kubelet[7101]: runtime.sigpanic()
Sep 06 16:51:51 foobar kubelet[7101]: /usr/local/go/src/runtime/signal_unix.go:718 +0x2e5
Sep 06 16:51:51 foobar kubelet[7101]: goroutine 14100013 [syscall]:
Sep 06 16:51:51 foobar kubelet[7101]: runtime.cgocall(0x3ad6530, 0xc0015368f8, 0x186408826e500)
Sep 06 16:51:51 foobar kubelet[7101]: /usr/local/go/src/runtime/cgocall.go:154 +0x5b fp=0xc0015368c8 sp=0xc001536890 pc=0x40707b
Sep 06 16:51:51 foobar kubelet[7101]: k8s.io/kubernetes/vendor/github.com/mindprince/gonvml._Cfunc_nvmlDeviceGetAverageUsage(0x7ff06b4df8a8, 0xc000000001, 0x63e1e0e4c5955, 0xc00578>
Sep 06 16:51:51 foobar kubelet[7101]: _cgo_gotypes.go:100 +0x48 fp=0xc0015368f8 sp=0xc0015368c8 pc=0x1f15348
Sep 06 16:51:51 foobar kubelet[7101]: k8s.io/kubernetes/vendor/github.com/mindprince/gonvml.Device.AverageGPUUtilization.func1(0x7ff06b4df8a8, 0x63e1e0e4c5955, 0xc005789b00, 0xffff>
Sep 06 16:51:51 foobar kubelet[7101]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/mindprince/gonvml/bindings.go:477 +0x6f fp=0xc>
Sep 06 16:51:51 foobar kubelet[7101]: k8s.io/kubernetes/vendor/github.com/mindprince/gonvml.Device.AverageGPUUtilization(0x7ff06b4df8a8, 0x2540be400, 0x166f200000, 0x0, 0x0)
Sep 06 16:51:51 foobar kubelet[7101]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/mindprince/gonvml/bindings.go:477 +0xf4 fp=0xc>
Sep 06 16:51:51 foobar kubelet[7101]: k8s.io/kubernetes/vendor/github.com/google/cadvisor/accelerators.(*nvidiaCollector).UpdateStats(0xc0080c1c20, 0xc00c33b800, 0x1860a9fe1415b, 0>
Sep 06 16:51:51 foobar kubelet[7101]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/google/cadvisor/accelerators/nvidia.go:260 +0x>
Sep 06 16:51:51 foobar kubelet[7101]: k8s.io/kubernetes/vendor/github.com/google/cadvisor/manager.(*containerData).updateStats(0xc0064d6b40, 0xc2271b56b01531ab, 0x1860a9fd870c3)
Sep 06 16:51:51 foobar kubelet[7101]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/google/cadvisor/manager/container.go:688 +0x9f>
Sep 06 16:51:51 foobar kubelet[7101]: k8s.io/kubernetes/vendor/github.com/google/cadvisor/manager.(*containerData).housekeepingTick(0xc0064d6b40, 0xc008b10300, 0x5f5e100, 0xc000d14>
Sep 06 16:51:51 foobar kubelet[7101]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/google/cadvisor/manager/container.go:583 +0x15>
Sep 06 16:51:51 foobar kubelet[7101]: k8s.io/kubernetes/vendor/github.com/google/cadvisor/manager.(*containerData).housekeeping(0xc0064d6b40)
Sep 06 16:51:51 foobar kubelet[7101]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/google/cadvisor/manager/container.go:531 +0x28>
Sep 06 16:51:51 foobar kubelet[7101]: runtime.goexit()
Sep 06 16:51:51 foobar kubelet[7101]: /usr/local/go/src/runtime/asm_amd64.s:1371 +0x1 fp=0xc001537fe0 sp=0xc001537fd8 pc=0x475f21
Sep 06 16:51:51 foobar kubelet[7101]: created by k8s.io/kubernetes/vendor/github.com/google/cadvisor/manager.(*containerData).Start
Sep 06 16:51:51 foobar kubelet[7101]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/google/cadvisor/manager/container.go:119 +0x3f
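One possible guard for the division quoted at the top, as a minimal sketch only. The function name `average_usage`, the `avg_status` codes, and the sample array layout are assumptions made for illustration; this is not the gonvml code, which reads its samples from NVML. The idea is to return an error when no samples were collected instead of dividing by zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch; these are not NVML return codes. */
typedef enum { AVG_OK = 0, AVG_NO_SAMPLES = 1 } avg_status;

/*
 * Averages sampleCount values from samples into *averageUsage.
 * Returns AVG_NO_SAMPLES and leaves *averageUsage untouched when
 * sampleCount is 0, so the integer division below can never trap.
 */
avg_status average_usage(const unsigned int *samples,
                         unsigned int sampleCount,
                         unsigned int *averageUsage) {
    unsigned long long sum = 0;

    assert(averageUsage != NULL);

    if (sampleCount == 0) {
        return AVG_NO_SAMPLES; /* no samples: report it rather than divide */
    }

    for (unsigned int i = 0; i < sampleCount; i++) {
        sum += samples[i];
    }

    *averageUsage = (unsigned int)(sum / sampleCount);
    return AVG_OK;
}
```

With a check like this, a caller would see an error value for the affected GPU rather than the whole process receiving SIGFPE.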