[ironic] - automated cleaning


We have enabled automated cleaning, and I have noticed that our UEFI-enabled nodes most often end up in the "clean failed" state.
Basic troubleshooting reveals that PXE correctly boots iPXE, but iPXE then boots the OS from disk instead of downloading the deploy_kernel / ramdisk.

Here's the timeline of events.

Dec 18 12:19:39 sc-ironic04 Node 56e58642-12ac-4455-bc95-2a328198f845 moved to provision state "cleaning"
Dec 18 12:20:32 sc-ironic04 Successfully set node 56e58642-12ac-4455-bc95-2a328198f845 power state to power on
Dec 18 12:21:14  DHCPACK(ns-601d0738-39) 10.33.23.7 6c:b3:11:4f:8b:18 (PXE boot gets IP: 10.33.23.7)
Dec 18 12:21:15 sc-ironic04 in.tftpd[367896]: Client 10.33.23.7 finished ipxe_x86_64.efi (TFTP of iPXE is complete)
Dec 18 12:21:23 DHCPACK(ns-261f35c5-7e) 10.33.23.7 6c:b3:11:4f:8b:18 host-10-33-23-7 (iPXE acquires IP: 10.33.23.7)

NO HTTP request for deploy_kernel

Instead:

Dec 18 12:22:40 sc-control04 dnsmasq-dhcp[3508449]: 3990148443 DHCPREQUEST(ns-261f35c5-7e) 10.33.22.188 6c:b3:11:4f:8b:18
Dec 18 12:22:40 sc-control04 dnsmasq-dhcp[3508449]: 3990148443 DHCPNAK(ns-261f35c5-7e) 10.33.22.188 6c:b3:11:4f:8b:18 wrong address
When the OS was last running, it had the IP address 10.33.22.188. The OS came up and tried to renew its lease for 10.33.22.188, which was NAKed by the DHCP server. The OS then did a full DHCP exchange (DISCOVER, etc.), got a valid IP address, ran cloud-init, etc.
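
In case it helps anyone reproduce the analysis, here is a minimal Python sketch of how the DHCP events for one MAC address could be pulled out of the logs. It assumes dnsmasq-style lines of the form KIND(interface) IP MAC, as in the excerpts quoted in this message. The names dhcp_events, nak_addresses and SAMPLE are hypothetical helpers made up for illustration, not part of Ironic or dnsmasq.

```python
import re

# Assumed log shape (taken from the excerpts in this message):
#   DHCPACK(ns-601d0738-39) 10.33.23.7 6c:b3:11:4f:8b:18 ...
DHCP_RE = re.compile(
    r"(?P<kind>DHCP[A-Z]+)\((?P<iface>[^)]+)\)\s+"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})",
    re.IGNORECASE,
)

# Sample lines copied from the timeline in this message.
SAMPLE = [
    "Dec 18 12:21:14  DHCPACK(ns-601d0738-39) 10.33.23.7 6c:b3:11:4f:8b:18",
    "Dec 18 12:21:23 DHCPACK(ns-261f35c5-7e) 10.33.23.7 6c:b3:11:4f:8b:18 host-10-33-23-7",
    "Dec 18 12:22:40 sc-control04 dnsmasq-dhcp[3508449]: 3990148443 "
    "DHCPREQUEST(ns-261f35c5-7e) 10.33.22.188 6c:b3:11:4f:8b:18",
    "Dec 18 12:22:40 sc-control04 dnsmasq-dhcp[3508449]: 3990148443 "
    "DHCPNAK(ns-261f35c5-7e) 10.33.22.188 6c:b3:11:4f:8b:18 wrong address",
]


def dhcp_events(lines, mac):
    """Return (kind, interface, ip) tuples for DHCP log lines that mention mac."""
    events = []
    for line in lines:
        match = DHCP_RE.search(line)
        if match and match.group("mac").lower() == mac.lower():
            events.append(
                (match.group("kind").upper(), match.group("iface"), match.group("ip"))
            )
    return events


def nak_addresses(events):
    """Return the set of IP addresses the DHCP server refused with a DHCPNAK."""
    return {ip for kind, _iface, ip in events if kind == "DHCPNAK"}


if __name__ == "__main__":
    events = dhcp_events(SAMPLE, "6c:b3:11:4f:8b:18")
    for event in events:
        print(event)
    print("NAKed addresses:", nak_addresses(events))
```

A NAKed address that differs from the one handed out during PXE/iPXE is a hint that the installed OS booted, rather than the cleaning ramdisk.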
Is this a known issue that has been fixed? Any pointers would be appreciated.

Thanks,
Farad.