Adjust RabbitMQ HA policy to make reply queues HA
Changes in oslo.messaging for 2023.1 exposed a known race condition in RabbitMQ when dealing with non-HA classic queues. When an RMQ cluster member is taken down, clients failing over to other members may erroneously be told that a queue exists while it is in the process of being deleted. Those clients can then wait indefinitely for messages from a queue that no longer exists, and only recover when their services are restarted.

Making the reply queues HA resolves this issue, at the expense of a 3x increase in reply queues across the cluster.

My assumption is that reply queues were previously excluded from the HA policy as a performance gain, since their number scales with the number of compute nodes in an OpenStack deployment.

Context: https://bugs.launchpad.net/oslo.messaging/+bug/2031512
Depends-On: https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/916042
Change-Id: Iee6b5f8cc1ad04988c8634f8b6e026e2f8c75b52
parent 506d3bae49
commit d4530e242d
@@ -32,7 +32,7 @@ oslomsg_rabbit_quorum_queues: False
 
 rabbitmq_policies:
   - name: "HA"
-    pattern: '^(?!(amq\.)|(.*_fanout_)|(reply_)).*'
+    pattern: '^(?!(amq\.)|(.*_fanout_)).*'
     priority: 0
     tags: "ha-mode=all"
     state: "{{ (oslomsg_rabbit_quorum_queues | default(True) or not rabbitmq_queue_replication) | ternary('absent', 'present') }}"
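To illustrate the effect of the pattern change, here is a minimal sketch that checks a few queue names against the old and new policy patterns. The queue names are hypothetical examples, and the sketch assumes that Python's `re` module treats this negative lookahead the same way as the regex engine RabbitMQ uses for policy matching; it is not part of the change itself.

```python
import re

# Patterns copied from the diff above: the old one excludes reply_ queues,
# the new one no longer does.
OLD_PATTERN = r'^(?!(amq\.)|(.*_fanout_)|(reply_)).*'
NEW_PATTERN = r'^(?!(amq\.)|(.*_fanout_)).*'

# Hypothetical queue names, chosen only for illustration.
QUEUES = ["conductor", "reply_0a1b2c", "amq.gen-XyZ", "scheduler_fanout_9f8e"]


def covered_by_policy(pattern: str, queue: str) -> bool:
    """Return True if the queue name matches the policy pattern."""
    return re.search(pattern, queue) is not None


for queue in QUEUES:
    old = covered_by_policy(OLD_PATTERN, queue)
    new = covered_by_policy(NEW_PATTERN, queue)
    print(f"{queue:24} old={old!s:5} new={new!s:5}")
```

Under these assumptions, only the `reply_` queue changes status: it was skipped by the old pattern and is picked up by the new one, while `amq.` and `_fanout_` queues remain excluded.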
@@ -0,0 +1,7 @@
+---
+upgrade:
+  - |
+    When using RabbitMQ in a high availability cluster (non-quorum queues),
+    transient 'reply\_' queues are now included in the HA policy where they
+    previously were not. Note that this will increase the load on the RabbitMQ
+    cluster, particularly for deployments with large numbers of compute nodes.