d4530e242d
Changes in oslo.messaging for 2023.1 exposed a known race condition in RabbitMQ when dealing with non-HA classic queues. When an RMQ cluster member is taken down, clients failing over to other members may erroneously be told that a queue exists while it is in the process of being deleted. This can leave them waiting indefinitely for messages from a queue that no longer exists, until their services are restarted.

Making the reply queues HA resolves this issue, at the expense of a threefold increase in reply queues across the cluster. My assumption is that reply queues were previously excluded from the HA policy as a performance gain, given their link to the number of compute nodes in an OpenStack deployment.

Context: https://bugs.launchpad.net/oslo.messaging/+bug/2031512
Depends-On: https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/916042
Change-Id: Iee6b5f8cc1ad04988c8634f8b6e026e2f8c75b52
8 lines
329 B
YAML
---
upgrade:
  - |
    When using RabbitMQ in a high availability cluster (non-quorum queues),
    transient 'reply_' queues are now included in the HA policy where they
    previously were not. Note that this will increase the load on the RabbitMQ
    cluster, particularly for deployments with large numbers of compute nodes.
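
To illustrate what "included in the HA policy" means in practice, here is a minimal Python sketch of how a RabbitMQ policy pattern decides which queue names it covers. Both patterns below are assumptions for illustration and are not copied from this change; the queue names and the `covered_by_policy` helper are likewise hypothetical. The sketch only shows the idea that dropping a `reply_` exclusion from a negative lookahead makes reply queues match the policy.

```python
import re

# Hypothetical policy patterns, for illustration only. The real patterns
# live in the deployment's RabbitMQ policy configuration.
OLD_PATTERN = r"^(?!(amq\.)|(.*_fanout_)|(reply_)).*"  # reply_ queues excluded
NEW_PATTERN = r"^(?!(amq\.)|(.*_fanout_)).*"           # reply_ queues covered


def covered_by_policy(pattern: str, queue_name: str) -> bool:
    """Return True if the policy pattern matches the given queue name."""
    return re.search(pattern, queue_name) is not None


# Hypothetical queue names of the kinds mentioned in the release note.
queues = ["reply_1f0c", "conductor", "amq.gen-abc", "scheduler_fanout_9d2"]

for name in queues:
    before = covered_by_policy(OLD_PATTERN, name)
    after = covered_by_policy(NEW_PATTERN, name)
    print(f"{name}: before={before} after={after}")
```

Under these assumed patterns, only the `reply_` queue changes status, which is why the extra load scales with the number of reply queues and therefore with the number of compute nodes.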