Summary
When urunc exits while the Pod's network namespace remains, previously created tap*_urunc devices can persist in the namespace in a NO-CARRIER / DOWN state. Current startup treats the mere existence of a TAP as an active unikernel and refuses to create a new TAP, causing network setup to fail.
Impact
- urunc Pod restart/retry can fail to configure networking because leftover TAPs block creation of a fresh TAP.
- Only affects TAPs created by urunc (naming pattern tap*_urunc). Must avoid deleting other CNI/user interfaces.
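To illustrate the scoping constraint, here is a minimal shell sketch of a name filter that accepts only the tap*_urunc pattern. The function name is_urunc_tap and the sample interface names other than tap0_urunc are hypothetical; this is not urunc's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only for names matching the urunc TAP
# pattern tap*_urunc, so other CNI/user interfaces are never selected.
is_urunc_tap() {
  case "$1" in
    tap*_urunc) return 0 ;;
    *)          return 1 ;;
  esac
}

# Sample names (illustrative only).
for name in lo eth0 tap0_urunc tap1_urunc tap0 cali1234; do
  if is_urunc_tap "$name"; then
    echo "$name: urunc TAP"
  else
    echo "$name: leave untouched"
  fi
done
```

Matching on the full pattern, rather than on a tap prefix alone, keeps TAP devices created by other tools out of scope.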
Steps to reproduce
- Deploy the test Pod
Apply the test manifest and wait for the Pod to reach Running:
kubectl apply -f nginx-urunc.yaml
kubectl get pods
output:
deployment.apps/nginx-urunc created
service/nginx-urunc created
NAME READY STATUS RESTARTS AGE
nginx-urunc-67f8694dd6-874rc 1/1 Running 0 55s
- Locate the QEMU process for the urunc Pod
List QEMU processes on the host and record the PID of the Pod’s QEMU instance (PID 1374168 in this example):
ps aux | grep qemu
output:
root 1374168 35.0 0.0 840108 85048 ? Ssl 05:28 0:00 /usr/bin/qemu-system-x86_64 ... -net tap,ifname=tap0_urunc ...
- Inspect network interfaces inside the Pod netns
Enter the QEMU network namespace and list interfaces:
nsenter -t 1374168 -n ip link
output:
1: lo: <LOOPBACK,UP,LOWER_UP> ...
2: eth0@if288: <BROADCAST,MULTICAST,UP,LOWER_UP> ...
3: tap0_urunc: <BROADCAST,MULTICAST,UP,LOWER_UP> ...
At this point:
eth0 is UP, LOWER_UP
tap0_urunc is UP, LOWER_UP
The Pod is functioning normally
- Force-kill the QEMU process (simulate a crash)
kill -9 1374168
Check Pod status:
kubectl get pods
output:
NAME READY STATUS RESTARTS AGE
nginx-urunc-67f8694dd6-874rc 0/1 Error 0 65s
Kubernetes then automatically restarts the Pod:
NAME READY STATUS RESTARTS AGE
nginx-urunc-67f8694dd6-874rc 1/1 Running 1 (6s ago) 69s
- Locate the new QEMU process after restart
List QEMU processes again and record the new PID (PID 1374761 in this example):
ps aux | grep qemu
output:
root 1374761 1.2 0.0 838364 83976 ? Ssl 05:29 0:00 /usr/bin/qemu-system-x86_64 ...
- Inspect interfaces in the new QEMU ns
nsenter -t 1374761 -n ip link
output:
1: lo: <LOOPBACK,UP,LOWER_UP> ...
2: eth0@if288: <BROADCAST,MULTICAST,UP,LOWER_UP> ...
3: tap0_urunc: <NO-CARRIER,BROADCAST,MULTICAST,UP> ... state DOWN
Observed state after restart:
eth0 remains UP, LOWER_UP
tap0_urunc still exists but is now NO-CARRIER and state DOWN
- Observe urunc error logs
Relevant log messages from urunc:
Failed to setup network :unsupported operation: can't spawn multiple unikernels in the same network namespace
The Pod restart succeeds at the Kubernetes level, but network setup inside urunc fails due to the presence of the pre-existing tap0_urunc device.
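The `ip link` output in the steps above suggests a way to tell a leftover TAP from one in use: the leftover device carries the NO-CARRIER flag. The sketch below filters `ip link` style output for urunc TAPs with that flag. The function name stale_urunc_taps is hypothetical, and treating NO-CARRIER as "no process attached" is an assumption drawn from the observations in this issue, not a confirmed urunc check.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ip link` style lines on stdin and print the
# names of tap*_urunc devices whose flags include NO-CARRIER.
# Assumption: NO-CARRIER on a urunc TAP means no hypervisor holds it open.
stale_urunc_taps() {
  awk '{
    name = $2
    sub(/:$/, "", name)    # drop the trailing colon
    sub(/@.*$/, "", name)  # drop a peer suffix such as @if288
    if (name ~ /^tap.*_urunc$/ && $3 ~ /NO-CARRIER/) print name
  }'
}

# Sample input taken from the post-restart output in this issue.
stale_urunc_taps <<'EOF'
1: lo: <LOOPBACK,UP,LOWER_UP> ...
2: eth0@if288: <BROADCAST,MULTICAST,UP,LOWER_UP> ...
3: tap0_urunc: <NO-CARRIER,BROADCAST,MULTICAST,UP> ... state DOWN
EOF
```

On a live host, the input could come from `nsenter -t <PID> -n ip -o link` instead of the here-document.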
Reason
In Kubernetes setups, when a Pod is restarted, the network namespace (created by the pause container) remains active, so the tap0_urunc device still exists. Therefore, when urunc (re)creates the container, it finds the existing tap0_urunc device and does not recreate it.
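As a possible manual workaround while the namespace persists, the leftover device could be removed by hand before the next start. The sketch below only prints the delete commands (a dry run) for names matching tap*_urunc. The function name plan_tap_cleanup is hypothetical, and whether deletion is the right fix inside urunc is an assumption, not something this issue confirms.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: read interface names on stdin and print,
# without executing, a delete command for each tap*_urunc device.
# $1 is the PID of a process inside the Pod's network namespace.
plan_tap_cleanup() {
  pid="$1"
  while IFS= read -r name; do
    case "$name" in
      tap*_urunc)
        printf 'nsenter -t %s -n ip link delete dev %s\n' "$pid" "$name"
        ;;
    esac
  done
}

# PID 1374761 is the restarted QEMU process from the steps above.
printf '%s\n' lo eth0 tap0_urunc | plan_tap_cleanup 1374761
```

Printing the commands instead of running them lets an operator review exactly which devices would be removed.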