
[queens][nova] nova host-evacuate error


Sorry... the question was: how many compute nodes do you have?
instead of "how many compute nodes do gli have"...


In any case:
did you configure Cinder?

On Fri, Jul 12, 2019 at 11:26 AM Jay See <jayachander.it at gmail.com>
wrote:

> Ignazio,
>
> One instance is stuck in the error state and I am not able to recover it.
> All other instances are running now.
>
> root at h004:~$ nova reset-state --all-tenants my-instance-1-2
> Reset state for server my-instance-1-2 succeeded; new state is error
>
> I have several compute nodes (14). I am not sure what "gli" is.
> Live migration is not working; I have tried it and it did not throw any
> errors, but nothing seems to happen.
> I am not completely sure, as I have not heard of "gli" before. (This setup
> was deployed by someone else.)
>
> ~Jay.
>
> On Fri, Jul 12, 2019 at 6:12 AM Ignazio Cassano <ignaziocassano at gmail.com>
> wrote:
>
>> Jay, to recover the VM state use the command nova reset-state....
>>
>> Run nova help reset-state to check the required parameters. Note that by
>> default it sets the state to error; nova reset-state --active sets the
>> instance back to active.
>>
>> As far as evacuation is concerned, how many compute nodes do gli have?
>> Does instance live migration work?
>> Are gli using shared Cinder storage?
>> Ignazio
>>
>> On Thu, Jul 11, 2019 at 8:51 PM Jay See <jayachander.it at gmail.com> wrote:
>>
>>> Thanks for the explanation, Ignazio.
>>>
>>> I have tried the same thing by putting the compute node into a failure
>>> state (echo 'c' > /proc/sysrq-trigger). The compute node was stuck and I
>>> was not able to connect to it.
>>> All the VMs are now in Error state.
>>>
>>> Running nova host-evacuate succeeded on the controller node, but now I am
>>> not able to use the VMs, because they are all in error state.
>>>
>>> root at h004:~$ nova host-evacuate h017
>>> +--------------------------------------+-------------------+---------------+
>>> | Server UUID                          | Evacuate Accepted | Error Message |
>>> +--------------------------------------+-------------------+---------------+
>>> | f3545f7d-b85e-49ee-b407-333a4c5b5ab9 | True              |               |
>>> | 9094494b-cfa3-459b-8d51-d9aae0ea9636 | True              |               |
>>> | abe7075b-ac22-4168-bf3d-d302ba37d80e | True              |               |
>>> | c9919371-5f2e-4155-a01a-5f41d9c8b0e7 | True              |               |
>>> | ffd983bb-851e-4314-9d1d-375303c278f3 | True              |               |
>>> +--------------------------------------+-------------------+---------------+
>>>
>>> I have now restarted the compute node manually and I am able to connect
>>> to it, but the VMs are still in Error state.
>>> 1. Any ideas how to recover the VMs?
>>> 2. Are there any other methods to evacuate? This method does not seem to
>>> work on the Mitaka version.
>>>
>>> ~Jay.
>>>
>>> On Thu, Jul 11, 2019 at 1:33 PM Ignazio Cassano <
>>> ignaziocassano at gmail.com> wrote:
>>>
>>>> Ok Jay,
>>>> let me describe my environment.
>>>> I have an OpenStack deployment made up of 3 controller nodes and several
>>>> compute nodes.
>>>> The controller node services are managed by pacemaker and the compute
>>>> node services by pacemaker-remote.
>>>> My hardware is Dell, so I am using an IPMI fencing device.
>>>> I wrote a service controlled by pacemaker: it checks whether a compute
>>>> node has failed and, to avoid split brain, if the compute node does not
>>>> respond on both the management network and the storage network, stonith
>>>> powers off the node and the service then executes a nova host-evacuate.
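The split-brain guard Ignazio describes can be sketched as follows. This is a minimal illustration of the decision logic only (the function names are assumptions, not the actual pacemaker resource agent):

```python
# Sketch of the split-brain guard: fence and evacuate a compute node
# only when it is unreachable on BOTH the management and the storage
# network.  Names here are illustrative assumptions.

def should_fence(mgmt_reachable: bool, storage_reachable: bool) -> bool:
    """Fence only if the node answers on neither network."""
    return not mgmt_reachable and not storage_reachable

def handle_node(name: str, mgmt_reachable: bool, storage_reachable: bool) -> str:
    """Return the action the service would take for this node."""
    if should_fence(mgmt_reachable, storage_reachable):
        # Real service: IPMI STONITH power-off, then run
        # `nova host-evacuate <name>` once the node is confirmed down.
        return f"fence {name}, then: nova host-evacuate {name}"
    # Reachable on at least one network: do nothing, so a glitch on a
    # single network does not trigger fencing and evacuation.
    return f"leave {name} alone"
```

For example, a node that still answers on the storage network is left alone, so a management-network outage alone does not evacuate it.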
>>>>
>>>> In any case, to simulate this before writing the service described
>>>> above, you can do as follows:
>>>>
>>>> 1. Connect to a compute node where some virtual machines are running.
>>>> 2. Run: echo 'c' > /proc/sysrq-trigger (it stops the node immediately,
>>>>    as in a real failure).
>>>> 3. On a controller node run: nova host-evacuate "name of failed compute
>>>>    node"
>>>> 4. Instances running on the failed compute node should be restarted on
>>>>    another compute node.
>>>>
>>>>
>>>> Ignazio
>>>>
>>>> On Thu, Jul 11, 2019 at 11:57 AM Jay See <
>>>> jayachander.it at gmail.com> wrote:
>>>>
>>>>> Hi ,
>>>>>
>>>>> I have tried it on a failed compute node, which is now powered off.
>>>>> I have also tried it on a running compute node: no errors, but nothing
>>>>> happens. On the running compute node I also disabled the compute
>>>>> service and tried migration.
>>>>>
>>>>> Maybe I have not followed the proper steps; I just wanted to know the
>>>>> steps you followed. Otherwise I was planning to try manual migration,
>>>>> if possible.
>>>>> ~Jay.
>>>>>
>>>>> On Thu, Jul 11, 2019 at 11:52 AM Ignazio Cassano <
>>>>> ignaziocassano at gmail.com> wrote:
>>>>>
>>>>>> Hi Jay,
>>>>>> would you like to evacuate a failed compute node or a running
>>>>>> compute node?
>>>>>>
>>>>>> Ignazio
>>>>>>
>>>>>> On Thu, Jul 11, 2019 at 11:48 AM Jay See <
>>>>>> jayachander.it at gmail.com> wrote:
>>>>>>
>>>>>>> Hi Ignazio,
>>>>>>>
>>>>>>> I am trying to evacuate a compute host on an older version (Mitaka).
>>>>>>> Could you please share the process you followed? I am not able to
>>>>>>> succeed: openstack live-migration fails with an error message (this
>>>>>>> is a known issue in older versions), and with nova live-migration
>>>>>>> nothing happens even after initiating the VM migration. It has been
>>>>>>> almost 4 days.
>>>>>>>
>>>>>>> ~Jay.
>>>>>>>
>>>>>>> On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano <
>>>>>>> ignaziocassano at gmail.com> wrote:
>>>>>>>
>>>>>>>> I am sorry.
>>>>>>>> For simulating a host crash I had used a wrong procedure.
>>>>>>>> Using "echo 'c' > /proc/sysrq-trigger", everything works fine.
>>>>>>>>
>>>>>>>> On Thu, Jul 11, 2019 at 11:01 AM Ignazio Cassano <
>>>>>>>> ignaziocassano at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hello All,
>>>>>>>>> on Ocata, when I power off a node with active instances, nova
>>>>>>>>> host-evacuate works fine and the instances are restarted on an
>>>>>>>>> active node.
>>>>>>>>> On Queens it does not evacuate the instances; for each instance
>>>>>>>>> nova-api reports the following:
>>>>>>>>>
>>>>>>>>> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi
>>>>>>>>> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9
>>>>>>>>> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown:
>>>>>>>>> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is
>>>>>>>>> in task_state powering-off
>>>>>>>>>
>>>>>>>>> So it powers off all the instances on the failed node but does not
>>>>>>>>> start them on the active nodes.
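The error in the log above comes from nova refusing to act on an instance that already has a task in progress (task_state powering-off). A minimal sketch of that guard, modelling only the observed behaviour (this is not nova's actual code; the function names are assumptions):

```python
# Simplified model of the guard behind the HTTP error above: an
# instance can only be evacuated when it has no task in progress.
# This mimics the observed behaviour; it is NOT nova's actual code.

def can_evacuate(task_state) -> bool:
    """Evacuation is only accepted when no task is in progress."""
    return task_state is None

def try_evacuate(instance_id: str, task_state) -> str:
    """Return the API response for an evacuate request."""
    if not can_evacuate(task_state):
        # Mirrors the message seen in the nova-api log above.
        return (f"Cannot 'evacuate' instance {instance_id} "
                f"while it is in task_state {task_state}")
    return "evacuate accepted"
```

This matches the log: the instances were still in task_state powering-off when the per-instance evacuate calls arrived, so every call was rejected.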
>>>>>>>>>
>>>>>>>>> What has changed?
>>>>>>>>> Ignazio
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> *SAVE PAPER - Please do not print this e-mail unless absolutely
>>>>>>> necessary.*
>>>>>>
>>>>>
>>>>
>>>
>>
>