Description
We are trying to use lixs to replace oxenstored, which cannot cope with many VMs being restored and destroyed in parallel.
Everything works fine: the VMs are functional, and the performance of lixs and of the whole system is much better than with oxenstored.
However, when we issue an "xl destroy", the VIF interface linked to the VM is not deleted and xl complains.
Below is the debug output of xl:
libxl: debug: libxl_domain.c:1040:libxl_domain_destroy: Domain 6:ao 0x56156ae580f0: create: how=(nil) callback=(nil) poller=0x56156ae549b0
libxl: debug: libxl_dm.c:3237:libxl__destroy_device_model: Domain 6:Didn't find dm UID; destroying by pid
libxl: debug: libxl_dm.c:3106:kill_device_model: Device Model signaled
libxl: debug: libxl_event.c:639:libxl__ev_xswatch_register: watch w=0x56156ae5f190 wpath=/local/domain/0/backend/vif/6/0/state token=3/0: register slotnum=3
libxl: debug: libxl_domain.c:1049:libxl_domain_destroy: Domain 6:ao 0x56156ae580f0: inprogress: poller=0x56156ae549b0, flags=i
libxl: debug: libxl_event.c:576:watchfd_callback: watch w=0x56156ae5f190 wpath=/local/domain/0/backend/vif/6/0/state token=3/0: event epath=/local/domain/0/backend/vif/6/0/state
libxl: debug: libxl_event.c:881:devstate_callback: backend /local/domain/0/backend/vif/6/0/state wanted state 6 still waiting state 5
libxl: debug: libxl_linux.c:235:libxl__get_hotplug_script_info: Domain 6:backend_kind 3, no need to execute scripts
libxl: debug: libxl_device.c:1176:device_hotplug: Domain 6:No hotplug script to execute
libxl: debug: libxl_event.c:689:libxl__ev_xswatch_deregister: watch w=0x56156ae5e340: deregister unregistered
libxl: debug: libxl_linux.c:235:libxl__get_hotplug_script_info: Domain 6:backend_kind 3, no need to execute scripts
libxl: debug: libxl_device.c:1176:device_hotplug: Domain 6:No hotplug script to execute
libxl: debug: libxl_event.c:689:libxl__ev_xswatch_deregister: watch w=0x56156ae5e750: deregister unregistered
libxl: debug: libxl_linux.c:235:libxl__get_hotplug_script_info: Domain 6:backend_kind 6, no need to execute scripts
libxl: debug: libxl_device.c:1176:device_hotplug: Domain 6:No hotplug script to execute
libxl: debug: libxl_event.c:689:libxl__ev_xswatch_deregister: watch w=0x56156ae5d550: deregister unregistered
libxl: debug: libxl_aoutils.c:88:xswait_timeout_callback: backend /local/domain/0/backend/vif/6/0/state (hoping for state change to 6): xswait timeout (path=/local/domain/0/backend/vif/6/0/state)
libxl: debug: libxl_event.c:676:libxl__ev_xswatch_deregister: watch w=0x56156ae5f190 wpath=/local/domain/0/backend/vif/6/0/state token=3/0: deregister slotnum=3
libxl: debug: libxl_event.c:865:devstate_callback: backend /local/domain/0/backend/vif/6/0/state wanted state 6 timed out
libxl: debug: libxl_event.c:689:libxl__ev_xswatch_deregister: watch w=0x56156ae5f190: deregister unregistered
libxl: debug: libxl_device.c:1090:device_backend_callback: Domain 6:calling device_backend_cleanup
libxl: debug: libxl_event.c:689:libxl__ev_xswatch_deregister: watch w=0x56156ae5f190: deregister unregistered
libxl: error: libxl_device.c:1105:device_backend_callback: Domain 6:unable to remove device with path /local/domain/0/backend/vif/6/0
libxl: debug: libxl_event.c:689:libxl__ev_xswatch_deregister: watch w=0x56156ae5f290: deregister unregistered
libxl: error: libxl_domain.c:1290:devices_destroy_cb: Domain 6:libxl__devices_destroy failed
libxl: debug: libxl_domain.c:1355:devices_destroy_cb: Domain 6:Forked pid 21164 for destroy of domain
libxl: debug: libxl_event.c:1893:libxl__ao_complete: ao 0x56156ae580f0: complete, rc=0
libxl: debug: libxl_event.c:1862:libxl__ao__destroy: ao 0x56156ae580f0: destroy
The issue seems related to this line:
libxl: debug: libxl_event.c:881:devstate_callback: backend /local/domain/0/backend/vif/6/0/state wanted state 6 still waiting state 5
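For reference, the numbers in that message are XenbusState codes as defined in Xen's public io/xenbus.h header: 5 is Closing and 6 is Closed, so libxl is waiting for the backend to finish closing. A small Python sketch that decodes such a line (the helper name `describe_wait` is made up for illustration, and the regular expression is based only on the message quoted above):

```python
import re
from enum import IntEnum

class XenbusState(IntEnum):
    """XenbusState values from Xen's public io/xenbus.h."""
    UNKNOWN = 0
    INITIALISING = 1
    INIT_WAIT = 2
    INITIALISED = 3
    CONNECTED = 4
    CLOSING = 5
    CLOSED = 6
    RECONFIGURING = 7
    RECONFIGURED = 8

# Pattern inferred from the devstate_callback message quoted above.
_WAIT_RE = re.compile(
    r"backend (\S+) wanted state (\d+) still waiting state (\d+)"
)

def describe_wait(line):
    """Return (path, wanted_state_name, current_state_name) or None."""
    m = _WAIT_RE.search(line)
    if m is None:
        return None
    path, wanted, current = m.group(1), int(m.group(2)), int(m.group(3))
    return path, XenbusState(wanted).name, XenbusState(current).name
```

Applied to the line above, this reports that the backend is in CLOSING while libxl wants CLOSED.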
Looking at the lixs debug log, or using strace, I cannot see a "write" of state 6 on the socket.
We can see the write of state 5, and that operation is successful:
INFO [S177] > { type = 11, req_id = 0, tx_id = 210, len = 40, msg = "/local/domain/0/backend/vif/6/0/online 0" }
INFO [S177] < { type = 11, req_id = 0, tx_id = 210, len = 2, msg = "OK" }
INFO [S177] > { type = 11, req_id = 0, tx_id = 210, len = 39, msg = "/local/domain/0/backend/vif/6/0/state 5" }
INFO [S177] < { type = 11, req_id = 0, tx_id = 210, len = 2, msg = "OK" }
INFO [S177] > { type = 7, req_id = 0, tx_id = 210, len = 2, msg = "T " }
INFO [S177] < { type = 7, req_id = 0, tx_id = 210, len = 2, msg = "OK" }
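In the xenstore wire protocol (xs_wire.h), type 11 is XS_WRITE and type 7 is XS_TRANSACTION_END, so the excerpt shows "online" and "state" being written inside transaction 210, which is then committed. To search a longer lixs log for every value written to the state node, something like the following Python sketch could be used. The log line format is assumed from the excerpt above only, and the function name `state_writes` is hypothetical:

```python
import re

XS_WRITE = 11  # message type for a write, per xs_wire.h

# Line layout assumed from the lixs log excerpt above.
_LINE_RE = re.compile(
    r'\[(\w+)\] ([<>]) \{ type = (\d+), req_id = (\d+), '
    r'tx_id = (\d+), len = (\d+), msg = "(.*)" \}'
)

def state_writes(lines, path):
    """Return (tx_id, value) for each write request to `path`."""
    found = []
    for line in lines:
        m = _LINE_RE.search(line)
        if m is None:
            continue
        direction, msg_type, tx_id = m.group(2), int(m.group(3)), int(m.group(5))
        if direction != ">" or msg_type != XS_WRITE:
            continue  # keep only incoming write requests
        # The log shows the NUL between path and value as a space.
        parts = re.split(r"[ \x00]", m.group(7), maxsplit=1)
        if len(parts) == 2 and parts[0] == path:
            found.append((tx_id, parts[1]))
    return found
```

If the result for /local/domain/0/backend/vif/6/0/state never contains the value "6", the Closed state was never written through lixs, which would match what strace shows.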
If we set the state to 6 manually before the destroy, using:
xenstore-write /local/domain/0/backend/vif/6/0/state 6
The destroy then succeeds and xl does not complain, but only part of the interface is deleted: the -qemu one is removed, while the vif remains (ifconfig -a is needed to see it).
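To check for leftover interfaces after a destroy without reading through the ifconfig -a output, a short Python sketch like the one below could help. It assumes the backend interfaces are named vif<domid>.<devid>, possibly with a suffix, and reads /sys/class/net, which on Linux also lists interfaces that are down. The function name `leftover_vifs` is made up for illustration:

```python
import os

def leftover_vifs(domid, names=None):
    """Return interface names that start with 'vif<domid>.'.

    `names` defaults to the entries of /sys/class/net; pass a list
    explicitly to check names collected some other way.
    """
    if names is None:
        names = os.listdir("/sys/class/net")
    prefix = f"vif{domid}."  # trailing dot keeps domain 6 apart from 60
    return sorted(n for n in names if n.startswith(prefix))
```

Calling `leftover_vifs(6)` in dom0 after the destroy should return an empty list once cleanup works as expected.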
We are running Xen 4.13.
Any ideas on how to troubleshoot this further?