[nova][scheduler] scheduler spawns to the same compute node only


Hi all,

We have a kolla-ansible-deployed OpenStack "Queens" release with 8 
compute nodes and an external Percona XtraDB Cluster (with read-write 
splitting via haproxy).


New VMs are currently always scheduled to the same compute node, 
even though manual live migration to other compute nodes works fine.


We're not sure what the issue is, but perhaps someone can spot it from 
our config:


# nova.conf  scheduler config

default_availability_zone = az1

...

[filter_scheduler]
available_filters = nova.scheduler.filters.all_filters
enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, AggregateInstanceExtraSpecsFilter, AggregateMultiTenancyIsolation, DifferentHostFilter, RamFilter, SameHostFilter, NUMATopologyFilter



Database is an external Percona XtraDB Cluster (Version 5.7.24) with 
haproxy for read-write-splitting (currently only one write node).

We do see mysql errors in the nova-scheduler.log on the write DB node 
when an instance is created.


2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db [-] 
Unexpected error while reporting service status: OperationalError: 
(pymysql.err.OperationalError) (1213, u'WSREP detected deadlock/conflict 
and aborted the transaction. Try restarting the transaction') 
(Background on this error at: http://sqlalche.me/e/e3q8)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db Traceback 
(most recent call last):
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/servicegroup/drivers/db.py", 
line 91, in _report_state
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
service.service_ref.save()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_versionedobjects/base.py", 
line 226, in wrapper
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db return 
fn(self, *args, **kwargs)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/objects/service.py", 
line 397, in save
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db db_service 
= db.service_update(self._context, self.id, updates)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/db/api.py", 
line 183, in service_update
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db return 
IMPL.service_update(context, service_id, values)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_db/api.py", 
line 154, in wrapper
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
ectxt.value = e.inner_exc
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", 
line 220, in __exit__
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
self.force_reraise()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", 
line 196, in force_reraise
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
six.reraise(self.type_, self.value, self.tb)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_db/api.py", 
line 142, in wrapper
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db return 
f(*args, **kwargs)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", 
line 227, in wrapped
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db return 
f(context, *args, **kwargs)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
self.gen.next()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", 
line 1043, in _transaction_scope
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db yield resource
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
self.gen.next()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", 
line 653, in _session
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
self.session.rollback()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", 
line 220, in __exit__
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
self.force_reraise()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", 
line 196, in force_reraise
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
six.reraise(self.type_, self.value, self.tb)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", 
line 650, in _session
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
self._end_session_transaction(self.session)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", 
line 678, in _end_session_transaction
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
session.commit()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", 
line 943, in commit
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
self.transaction.commit()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", 
line 471, in commit
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db t[1].commit()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", 
line 1643, in commit
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
self._do_commit()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", 
line 1674, in _do_commit
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
self.connection._commit_impl()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", 
line 726, in _commit_impl
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
self._handle_dbapi_exception(e, None, None, None, None)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", 
line 1409, in _handle_dbapi_exception
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
util.raise_from_cause(newraise, exc_info)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", 
line 265, in raise_from_cause
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
reraise(type(exception), exception, tb=exc_tb, cause=cause)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", 
line 724, in _commit_impl
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
self.engine.dialect.do_commit(self.connection)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/dialects/mysql/base.py", 
line 1765, in do_commit
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
dbapi_connection.commit()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/pymysql/connections.py", 
line 422, in commit
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
self._read_ok_packet()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/pymysql/connections.py", 
line 396, in _read_ok_packet
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db pkt = 
self._read_packet()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/pymysql/connections.py", 
line 683, in _read_packet
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
packet.check_error()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/pymysql/protocol.py", 
line 220, in check_error
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
err.raise_mysql_exception(self._data)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/pymysql/err.py", 
line 109, in raise_mysql_exception
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db raise 
errorclass(errno, errval)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db 
OperationalError: (pymysql.err.OperationalError) (1213, u'WSREP detected 
deadlock/conflict and aborted the transaction. Try restarting the 
transaction') (Background on this error at: http://sqlalche.me/e/e3q8)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
2019-04-15 16:52:20.020 24 INFO nova.servicegroup.drivers.db [-] 
Recovered from being unable to report status.


The deadlock message is quite strange, as we have haproxy configured so 
all write requests are handled by one node.
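
For illustration: error 1213 is a retryable condition, and the message itself 
asks the client to restart the transaction. Below is a minimal sketch of such a 
retry loop. The names `DeadlockError`, `retry_on_deadlock` and `report_state` 
are hypothetical and only simulate the failure; this is not the code nova or 
oslo.db actually runs.

```python
import functools
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a MySQL error 1213 raised by the driver."""


def retry_on_deadlock(max_retries=3, delay=0.0):
    """Re-run the wrapped function when it raises DeadlockError."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return fn(*args, **kwargs)
                except DeadlockError:
                    if attempt == max_retries:
                        raise  # give up after the last allowed retry
                    time.sleep(delay * (attempt + 1))  # simple linear backoff
        return wrapper
    return decorator


calls = {"count": 0}


@retry_on_deadlock(max_retries=3)
def report_state():
    """Hypothetical heartbeat update: aborted twice, then commits."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeadlockError("WSREP detected deadlock/conflict")
    return "reported"


result = report_state()
```

In this simulation the third attempt succeeds, which mirrors the "Recovered 
from being unable to report status" line that follows the traceback roughly 
ten seconds later.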


There are NO errors in the mysqld.log WHILE creating an instance, but 
from time to time we see aborted connections from nova.

2019-04-15T14:22:36.232108Z 30616972 [Note] Aborted connection 30616972 
to db: 'nova' user: 'nova' host: '10.x.y.z' (Got an error reading 
communication packets)



As I said, all instances are allocated to the same compute node. 
nova-compute.log doesn't show an error while creating the instance.


Besides that, we also see messages like the following from 
nova.scheduler.host_manager on all other nodes (but those messages are 
_not_ triggered when an instance is spawned!):


2019-04-15 16:28:47.771 22 INFO nova.scheduler.host_manager 
[req-f92e340e-a88a-44a0-8cad-588390c25bc2 - - - - -] The instance sync 
for host 'xxx' did not match. Re-created its InstanceList.



Don't know if that may be relevant, but somehow our (currently single) 
AZ is listed several times.


# openstack availability zone list
+------------+-------------+
| Zone Name  | Zone Status |
+------------+-------------+
| internal   | available   |
| az1        | available   |
| az1        | available   |
| az1        | available   |
| az1        | available   |
+------------+-------------+

Could that be related somehow?
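
To make the duplication explicit, here is a small sketch that counts how 
often each zone name appears in the CLI output. The table text is copied from 
above; the helper name `count_zones` is made up for this example.

```python
from collections import Counter

# Output of `openstack availability zone list`, copied from above.
TABLE = """\
+------------+-------------+
| Zone Name  | Zone Status |
+------------+-------------+
| internal   | available   |
| az1        | available   |
| az1        | available   |
| az1        | available   |
| az1        | available   |
+------------+-------------+
"""


def count_zones(table_text):
    """Count how often each zone name appears in a CLI table."""
    counts = Counter()
    for line in table_text.splitlines():
        if not line.startswith("|"):
            continue  # skip the +----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if cells[0] == "Zone Name":
            continue  # skip the header row
        counts[cells[0]] += 1
    return counts


zone_counts = count_zones(TABLE)
# Keep only the zone names that are listed more than once.
duplicates = {name: n for name, n in zone_counts.items() if n > 1}
print(duplicates)
```

For the output above this reports az1 four times and internal once.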


Thanks for any consideration and support!


kind regards


Nicolas
