[queens][nova] nova host-evacuate error


Thanks for the explanation, Ignazio.

I tried the same thing by forcing a failure on the compute node (echo
'c' > /proc/sysrq-trigger). The compute node hung and I was not able to
connect to it.
All the VMs are now in Error state.
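
Nova only accepts an evacuation once it considers the nova-compute service on the failed host to be down, and that may take a short while after the crash itself. A minimal sketch of that check, assuming admin credentials are already loaded in the shell and that `nova service-list` prints its usual table; the function name `compute_is_down` is hypothetical:

```shell
#!/bin/sh
# Sketch: succeed (exit 0) only if Nova reports nova-compute on the given host as "down".
# Assumes admin credentials are already exported, e.g. by sourcing an openrc file.
compute_is_down() {
    host=$1
    # Keep the table row for this host, then look for a "down" cell in it.
    nova service-list --host "$host" --binary nova-compute \
        | grep -F "| $host " | grep -q '| down '
}
```

For example, `compute_is_down h017 && nova host-evacuate h017` would request the evacuation only once the service shows as down.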

Running host-evacuate on the controller node succeeded, but I cannot use
the VMs because they are all still in Error state.

root@h004:~$ nova host-evacuate h017
+--------------------------------------+-------------------+---------------+
| Server UUID                          | Evacuate Accepted | Error Message |
+--------------------------------------+-------------------+---------------+
| f3545f7d-b85e-49ee-b407-333a4c5b5ab9 | True              |               |
| 9094494b-cfa3-459b-8d51-d9aae0ea9636 | True              |               |
| abe7075b-ac22-4168-bf3d-d302ba37d80e | True              |               |
| c9919371-5f2e-4155-a01a-5f41d9c8b0e7 | True              |               |
| ffd983bb-851e-4314-9d1d-375303c278f3 | True              |               |
+--------------------------------------+-------------------+---------------+
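
"Evacuate Accepted: True" indicates that the API accepted each request; the rebuild on the target host happens afterwards, so it can still fail. A minimal sketch for inspecting what happened to each instance, assuming admin credentials are loaded and that `nova show` prints its usual two-column table; the function name `show_evacuation_state` is hypothetical:

```shell
#!/bin/sh
# Sketch: print status, task state, current host, and fault for each instance UUID.
# Assumes admin credentials are already exported, e.g. by sourcing an openrc file.
show_evacuation_state() {
    for uuid in "$@"; do
        echo "== $uuid"
        # Keep only the rows of the `nova show` table that describe instance state.
        nova show "$uuid" \
            | grep -E '^\| (status|OS-EXT-STS:task_state|OS-EXT-SRV-ATTR:host|fault) ' \
            || echo "(no state fields found)"
    done
}
```

For example, `show_evacuation_state f3545f7d-b85e-49ee-b407-333a4c5b5ab9` shows whether that instance is still recorded on h017 and, if a fault was stored, its message.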

I have now restarted the compute node manually and can connect to it
again, but the VMs are still in Error state.
1. Any ideas on how to recover the VMs?
2. Are there any other methods to evacuate, as this method does not seem
to work in the Mitaka version?
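
One commonly used recovery path for instances stuck in Error is to reset their state in the Nova database and then hard-reboot them. The following is only a sketch under assumptions: admin credentials are loaded, the instance disks are still intact on the host that Nova records for each instance, and the function name `recover_instances` is hypothetical. Whether it helps in this particular case depends on why the evacuation left the instances in Error.

```shell
#!/bin/sh
# Sketch: for each UUID, clear the ERROR state in Nova, then hard-reboot the instance.
# Assumes admin credentials are already exported, e.g. by sourcing an openrc file.
recover_instances() {
    for uuid in "$@"; do
        # reset-state only rewrites the database record; it does not touch the guest.
        nova reset-state --active "$uuid" || {
            echo "reset-state failed for $uuid" >&2
            continue
        }
        # A hard reboot asks the host recorded for the instance to recreate the guest.
        nova reboot --hard "$uuid" || echo "hard reboot failed for $uuid" >&2
    done
}
```

Trying it on a single instance first, for example `recover_instances f3545f7d-b85e-49ee-b407-333a4c5b5ab9`, limits the impact if the assumption about the disks does not hold.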

~Jay.

On Thu, Jul 11, 2019 at 1:33 PM Ignazio Cassano <ignaziocassano at gmail.com>
wrote:

> OK Jay,
> let me describe my environment.
> I have an OpenStack deployment made up of 3 controller nodes and several
> compute nodes.
> The controller node services are managed by Pacemaker and the compute
> node services are managed by Pacemaker Remote.
> My hardware is Dell, so I am using an IPMI fencing device.
> I wrote a service controlled by Pacemaker:
> this service checks whether a compute node has failed. To avoid split
> brain, if a compute node does not respond on both the management network
> and the storage network, STONITH powers off the node and the service then
> runs nova host-evacuate.
>
> Anyway, to simulate this before writing a service like the one described
> above, you can do the following:
>
> Connect to a compute node where some virtual machines are running.
> Run the command: echo 'c' > /proc/sysrq-trigger (it stops the node
> immediately, as in a real failure).
> On a controller node run: nova host-evacuate "name of failed compute node"
> Instances that were running on the failed compute node should be restarted
> on another compute node.
>
>
> Ignazio
>
> On Thu, Jul 11, 2019 at 11:57 AM Jay See <jayachander.it at gmail.com>
> wrote:
>
>> Hi,
>>
>> I have tried it on a failed compute node, which is now powered off.
>> I have also tried it on a running compute node: no errors, but nothing
>> happens.
>> On the running compute node I disabled the compute service and also tried
>> migration.
>>
>> Maybe I have not followed the proper steps. I just wanted to know the
>> steps you followed. Otherwise, I was planning to migrate manually
>> if possible.
>> ~Jay.
>>
>> On Thu, Jul 11, 2019 at 11:52 AM Ignazio Cassano <
>> ignaziocassano at gmail.com> wrote:
>>
>>> Hi Jay,
>>> do you want to evacuate a failed compute node or a running
>>> compute node?
>>>
>>> Ignazio
>>>
>>> On Thu, Jul 11, 2019 at 11:48 AM Jay See <
>>> jayachander.it at gmail.com> wrote:
>>>
>>>> Hi Ignazio,
>>>>
>>>> I am trying to evacuate a compute host on an older version (Mitaka).
>>>> Could you please share the process you followed? I have not been able to
>>>> succeed: openstack live-migration fails with an error message (a known
>>>> issue in older versions), and with nova live-migration nothing happens
>>>> even after initiating the VM migration. It has been almost 4 days.
>>>>
>>>> ~Jay.
>>>>
>>>> On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano <
>>>> ignaziocassano at gmail.com> wrote:
>>>>
>>>>> I am sorry.
>>>>> I used the wrong procedure to simulate a host crash.
>>>>> Using "echo 'c' > /proc/sysrq-trigger", everything works fine.
>>>>>
>>>>> On Thu, Jul 11, 2019 at 11:01 AM Ignazio Cassano <
>>>>> ignaziocassano at gmail.com> wrote:
>>>>>
>>>>>> Hello all,
>>>>>> on Ocata, when I power off a node with active instances, running nova
>>>>>> host-evacuate works fine
>>>>>> and the instances are restarted on an active node.
>>>>>> On Queens it does not evacuate the instances; instead nova-api reports
>>>>>> the following for each instance:
>>>>>>
>>>>>> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi
>>>>>> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9
>>>>>> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown:
>>>>>> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is
>>>>>> in task_state powering-off
>>>>>>
>>>>>> So it powers off all instances on the failed node but does not start
>>>>>> them on active nodes.
>>>>>>
>>>>>> What has changed?
>>>>>> Ignazio
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>> --
>>>> *SAVE PAPER - Please do not print this e-mail unless absolutely
>>>> necessary.*
>>>>
>>>
>>
>>
>
