vm_manager response in degraded cluster

Edit after initial posting:
The problem was not correctly identified at first. This issue now covers:
- Add a proper timeout on ceph commands (currently the command never returns if ceph doesn't respond)
- Add a proper error message when the quorum is not formed
- Make the vm_manager create command work even when one node is in standby or maintenance mode (the other commands work, from what I can see)
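
As an illustration of the first point, here is a minimal sketch of a timeout wrapper. It assumes the ceph call is made through a subprocess; if vm_manager talks to ceph through Python bindings instead, a different mechanism would be needed. The names `run_with_timeout` and `CommandTimeout` are hypothetical and do not exist in vm_manager.

```python
import subprocess


class CommandTimeout(RuntimeError):
    """Raised when an external command does not finish in time."""


def run_with_timeout(cmd, timeout_s=10.0):
    """Run cmd (a list of arguments) and return its stdout.

    Hypothetical helper: raises CommandTimeout instead of blocking forever
    when the command runs longer than timeout_s seconds.
    """
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # subprocess kills the child when this expires
            check=True,         # non-zero exit raises CalledProcessError
        )
    except subprocess.TimeoutExpired as exc:
        raise CommandTimeout(
            f"'{' '.join(cmd)}' did not answer within {timeout_s}s; "
            "the storage cluster may be unreachable or without quorum"
        ) from exc
    return result.stdout
```

With something like this, a hung ceph command would turn into an error the caller can report, instead of a command that never returns.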

**Original issue:**


**Describe the bug**
When the cluster is in a degraded state (one of the hypervisors is powered off), no vm_manager command (start, stop, list ...) responds.

**To Reproduce**
- Create a SEAPATH cluster (no matter the distribution) and configure it
- Deploy a VM in it
- Shut down one hypervisor
- Run `vm-mgr list` on one of the other hypervisors
- The command never returns

**Expected behavior**
When the quorum is formed (at least two machines are up and connected), vm_manager should respond correctly, including when creating and deploying VMs.

**First investigations**
It seems that vm_manager was developed with only a fully running cluster in mind. There is no mention of the word "quorum" in the code.
vm_manager should be able to detect whether the quorum is formed on the current machine:
- If it is: schedule the command correctly
- If it is not: fail with a meaningful message
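
A rough sketch of such a check, not taken from the vm_manager code. It assumes the quorum information comes from the JSON printed by `ceph quorum_status --format json`, and that this output contains a `quorum_names` list and a `monmap.mons` list of objects with a `name` field; those field names should be verified against the deployed ceph version. `QuorumError` and `check_quorum` are hypothetical names.

```python
import json


class QuorumError(RuntimeError):
    """Raised when the current machine cannot safely run a command."""


def check_quorum(status_json, local_host):
    """Return the monitors missing from the quorum, or raise QuorumError.

    status_json: JSON text, assumed to have the shape of
                 `ceph quorum_status --format json` output.
    local_host:  monitor name of the machine running vm_manager.
    """
    status = json.loads(status_json)
    all_mons = [mon["name"] for mon in status["monmap"]["mons"]]
    in_quorum = status["quorum_names"]

    # A quorum needs a strict majority of the known monitors.
    majority = len(all_mons) // 2 + 1
    if len(in_quorum) < majority:
        raise QuorumError(
            f"quorum not formed: {len(in_quorum)}/{len(all_mons)} monitors "
            f"available, {majority} required"
        )
    if local_host not in in_quorum:
        raise QuorumError(f"{local_host} is not part of the quorum {in_quorum}")

    # Degraded but usable: report who is missing so the caller can log it.
    return sorted(set(all_mons) - set(in_quorum))
```

A command could call this first, continue normally when it returns (possibly logging the missing nodes), and print the `QuorumError` message otherwise. Note that when ceph itself has no quorum the status command may not answer at all, so this check would need to be combined with a timeout.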

IMO, we should even be able to revert this PR https://github.com/seapath/ansible/pull/870, as vm_manager should be able to deploy a VM even when one physical machine is out of the quorum.
