Summary
- How CloudStack migrates VMs when a hypervisor is broken.
1. Detect that the Cnode is unreachable or broken
2. Call scheduleRestartForVmsOnHost(HostVO, boolean)
- file: server.src.com.cloud.ha.HighAvailabilityManagerImpl.java
3. Find the VMs on that host
- send an alert email
- scheduleRestart(vm, investigate)
4. scheduleRestart()
- new HaWorkVO()
- call wakeupWorkers();
5. run work
- there is no Investigator for OVM, so the VM state is unknown
- fence off the VM
- but there is no OvmFenceBuilder either, so fencing fails
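Steps 2 to 4 above can be sketched roughly as follows. This is a minimal illustration in Java, not the actual HighAvailabilityManagerImpl code: HaSchedulerSketch, HaWorkSketch and the queue are hypothetical stand-ins for the manager, HaWorkVO and wakeupWorkers(), and sending a single alert per host is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical scheduler; it only mirrors the step names used in these notes.
final class HaSchedulerSketch {
    // Workers would block on this queue, so adding an item plays the role of wakeupWorkers().
    final BlockingQueue<HaWorkSketch> queue = new LinkedBlockingQueue<>();
    // Stand-in for the alert email (assumption: one alert per failed host).
    final List<String> alerts = new ArrayList<>();

    // Steps 2-3: alert, then schedule a restart for every VM on the failed host.
    void scheduleRestartForVmsOnHost(String hostName, List<String> vmNames, boolean investigate) {
        alerts.add("Host is down: " + hostName + ". Starting HA on " + vmNames.size() + " VM(s)");
        for (String vmName : vmNames) {
            scheduleRestart(vmName, investigate);
        }
    }

    // Step 4: create the work item and hand it to the workers.
    void scheduleRestart(String vmName, boolean investigate) {
        queue.add(new HaWorkSketch(vmName, investigate));
    }

    public static void main(String[] args) {
        HaSchedulerSketch scheduler = new HaSchedulerSketch();
        scheduler.scheduleRestartForVmsOnHost("host-94", Arrays.asList("i-9-88-VM"), true);
        System.out.println("queued work items: " + scheduler.queue.size());
    }
}

// Hypothetical stand-in for HaWorkVO: one unit of HA work for one VM.
final class HaWorkSketch {
    final String vmName;
    final boolean investigate;

    HaWorkSketch(String vmName, boolean investigate) {
        this.vmName = vmName;
        this.investigate = investigate;
    }
}
```

Using a blocking queue keeps the sketch short; the real manager persists HaWorkVO records, which this example does not attempt to model.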
- How CloudStack migrates VMs when a hypervisor is broken.
1. Detect that the Cnode is unreachable or broken
2012-03-16 11:44:21,952 ERROR [agent.manager.AgentManagerImpl] (AgentTaskPool-13:null) Host is down: 94-cnode04-m.pod1.kr-0.dmz.xxx.com. Starting HA on the VMs
2. Call scheduleRestartForVmsOnHost(HostVO, boolean)
- file: server.src.com.cloud.ha.HighAvailabilityManagerImpl.java
2012-03-16 11:44:21,963 WARN [cloud.ha.HighAvailabilityManagerImpl] (AgentTaskPool-13:null) Scheduling restart for VMs on host 94
3. Find the VMs on that host
- send an alert email
- scheduleRestart(vm, investigate)
2012-03-16 11:44:21,987 DEBUG [cloud.ha.HighAvailabilityManagerImpl] (AgentTaskPool-13:null) Notifying HA Mgr of to restart vm 88-i-9-88-VM
2012-03-16 11:44:21,998 INFO [cloud.ha.HighAvailabilityManagerImpl] (AgentTaskPool-13:null) Schedule vm for HA: VM[User|i-9-88-VM]
4. scheduleRestart()
- new HaWorkVO()
- call wakeupWorkers();
5. run work
2012-03-16 11:44:22,012 INFO [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-4:work-40) Processing HAWork[40-HA-88-Running-Investigating]
6. restart()
- find an Investigator
- for OVM this fails, since no OvmInvestigator is implemented
2012-03-16 11:44:22,027 INFO [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-4:work-40) HA on VM[User|i-9-88-VM]
2012-03-16 11:44:22,052 INFO [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-4:work-40) SimpleInvestigator found VM[User|i-9-88-VM]to be alive? null
2012-03-16 11:44:22,053 INFO [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-4:work-40) VmwareInvestigator found VM[User|i-9-88-VM]to be alive? null
2012-03-16 11:44:22,053 INFO [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-4:work-40) XenServerInvestigator found VM[User|i-9-88-VM]to be alive? null
- there is no Investigator for OVM, so the VM state is unknown
- fence off the VM
- but there is no OvmFenceBuilder either, so fencing fails
2012-03-16 11:46:28,712 DEBUG [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-4:work-40) Fencing off VM that we don't know the state of
2012-03-16 11:46:28,712 DEBUG [cloud.ha.XenServerFencer] (HA-Worker-4:work-40) Don't know how to fence non XenServer hosts Ovm
2012-03-16 11:46:28,712 INFO [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-4:work-40) Fencer XenServerFenceBuilder returned null
2012-03-16 11:46:28,713 DEBUG [cloud.ha.KVMFencer] (HA-Worker-4:work-40) Don't know how to fence non kvm hosts Ovm
2012-03-16 11:46:28,713 INFO [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-4:work-40) Fencer KVMFenceBuilder returned null
Finally, the VM cannot be restarted
2012-03-16 11:56:28,923 DEBUG [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-4:work-40) We were unable to fence off the VM VM[User|i-9-88-VM]
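
The investigate-then-fence decision traced in the logs above can be sketched as below. This is a hedged illustration in Java, not CloudStack's real code: HaRestartSketch, InvestigatorSketch, FencerSketch, TypedInvestigator and TypedFencer are hypothetical names, and the only behavior taken from the logs is that a component answers null for a hypervisor it does not handle. What a supported investigator or fencer actually returns is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the restart decision; not the real HighAvailabilityManagerImpl.
final class HaRestartSketch {
    enum Outcome { VM_ALIVE, RESTARTABLE, UNABLE_TO_FENCE }

    static Outcome restart(String vm, String hypervisor,
                           List<InvestigatorSketch> investigators,
                           List<FencerSketch> fencers) {
        // Ask each investigator in turn; null means "cannot tell for this hypervisor".
        Boolean alive = null;
        for (InvestigatorSketch investigator : investigators) {
            alive = investigator.isVmAlive(vm, hypervisor);
            if (alive != null) {
                break;
            }
        }
        if (Boolean.TRUE.equals(alive)) {
            return Outcome.VM_ALIVE;
        }
        if (alive == null) {
            // State unknown: the VM must be fenced off before it may be restarted elsewhere.
            Boolean fenced = null;
            for (FencerSketch fencer : fencers) {
                fenced = fencer.fenceOff(vm, hypervisor);
                if (fenced != null) {
                    break;
                }
            }
            if (!Boolean.TRUE.equals(fenced)) {
                return Outcome.UNABLE_TO_FENCE;
            }
        }
        return Outcome.RESTARTABLE;
    }

    public static void main(String[] args) {
        List<InvestigatorSketch> investigators = Arrays.<InvestigatorSketch>asList(
                new TypedInvestigator("VMware"), new TypedInvestigator("XenServer"));
        List<FencerSketch> fencers = Arrays.<FencerSketch>asList(
                new TypedFencer("XenServer"), new TypedFencer("KVM"));
        // No component handles "Ovm", so this prints UNABLE_TO_FENCE.
        System.out.println(restart("i-9-88-VM", "Ovm", investigators, fencers));
    }
}

interface InvestigatorSketch {
    // TRUE = alive, FALSE = dead, null = unknown.
    Boolean isVmAlive(String vm, String hypervisor);
}

interface FencerSketch {
    // TRUE = fenced, FALSE = fencing failed, null = hypervisor not handled.
    Boolean fenceOff(String vm, String hypervisor);
}

final class TypedInvestigator implements InvestigatorSketch {
    private final String supported;

    TypedInvestigator(String supported) {
        this.supported = supported;
    }

    public Boolean isVmAlive(String vm, String hypervisor) {
        // Assumption: a handled hypervisor reports the VM as dead; others yield null.
        return supported.equals(hypervisor) ? Boolean.FALSE : null;
    }
}

final class TypedFencer implements FencerSketch {
    private final String supported;

    TypedFencer(String supported) {
        this.supported = supported;
    }

    public Boolean fenceOff(String vm, String hypervisor) {
        // Assumption: a handled hypervisor is always fenced successfully; others yield null.
        return supported.equals(hypervisor) ? Boolean.TRUE : null;
    }
}
```

Under these assumptions, adding an investigator or a fencer that handles "Ovm" to the lists would be enough to change the outcome, which matches the gap the logs point at.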