Reduce RabbitMQ busy waiting, lowering CPU load

On machines with many cores, we were seeing excessive CPU load on systems
that were not very busy. With the following Erlang VM argument we saw
RabbitMQ CPU usage drop from about 150% to around 20%, on a system with
40 hyperthreads.

    +S 2:2

By default RabbitMQ starts N schedulers, where N is the number of CPU
cores, including hyper-threaded cores. This is fine when all of the
CPUs are dedicated to RabbitMQ, but it is not a good idea in a typical
Kolla Ansible setup, where RabbitMQ is not running on a dedicated host.
Here we use two scheduler threads.
More details can be found here:
https://www.rabbitmq.com/runtime.html#scheduling
and here:
https://erlang.org/doc/man/erl.html#emulator-flags
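
As a rough sketch of how an operator might pick a different scheduler
count, the variable changed by this patch can be overridden. The value
4:4 is purely illustrative, and setting the override in globals.yml is
an assumption about the deployment:

```yaml
# Hypothetical operator override, for example in Kolla Ansible's globals.yml.
# +S takes the form Schedulers:SchedulersOnline; 4:4 is an illustrative
# value for a host that can spare more cores, not a recommendation.
rabbitmq_server_additional_erl_args: "+S 4:4 +sbwt none"
```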

    +sbwt none

This stops the schedulers from busy waiting. For more details see:
https://www.rabbitmq.com/runtime.html#busy-waiting
Newer versions of RabbitMQ may need additional flags:
"+sbwt none +sbwtdcpu none +sbwtdio none"
However, this patch should be backportable to the older versions of
RabbitMQ used in Train and Stein.
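
For deployments on a newer RabbitMQ, the additional flags mentioned
above could be supplied through the same variable. This is a minimal
sketch, assuming the installed Erlang runtime accepts all three flags
and that the override is set in globals.yml:

```yaml
# Sketch only: keeps the two schedulers from this patch and also disables
# busy waiting for the dirty CPU and dirty I/O schedulers.
# Assumes a RabbitMQ/Erlang version that supports +sbwtdcpu and +sbwtdio.
rabbitmq_server_additional_erl_args: "+S 2:2 +sbwt none +sbwtdcpu none +sbwtdio none"
```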

Note that information on this tuning was found by looking at data from:

    rabbitmq-diagnostics runtime_thread_stats

More details on that can be found here:
https://www.rabbitmq.com/runtime.html#thread-stats

Related-Bug: #1846467

Change-Id: Iced014acee7e590c10848e73feca166f48b622dc
John Garbutt 2020-04-27 10:59:06 +01:00 committed by Mark Goddard
parent 008ada9062
commit 70f6f8e4c0
4 changed files with 40 additions and 9 deletions

@@ -71,7 +71,7 @@ rabbitmq_user: "openstack"
 rabbitmq_cluster_name: "openstack"
 rabbitmq_hostname: "{{ ansible_hostname }}"
 rabbitmq_pid_file: "/var/lib/rabbitmq/mnesia/rabbitmq.pid"
-rabbitmq_server_additional_erl_args: ""
+rabbitmq_server_additional_erl_args: "+S 2:2 +sbwt none"
 # Dict of TLS options for RabbitMQ. Keys will be prefixed with 'ssl_options.'.
 rabbitmq_tls_options: {}
 # To avoid split-brain

@@ -86,12 +86,25 @@ internal VIP. As such, traffic to this endpoint is encrypted when
 Passing arguments to RabbitMQ server's Erlang VM
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Erlang programs run in Erlang VM (virtual machine) and use Erlang runtime.
-Erlang VM can be configured.
+Erlang programs run in an Erlang VM (virtual machine) and use the Erlang
+runtime. The Erlang VM can be configured.
-Kolla Ansible makes it possible to pass arguments to the Erlang VM via the
-usage of ``rabbitmq_server_additional_erl_args`` variable. The contents of it
-are appended to ``RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS`` environment variable
-passed to RabbitMQ server startup script. Kolla Ansible already configures
-RabbitMQ server for IPv6 (if necessary). Any argument can be passed there as
-documented in https://www.rabbitmq.com/runtime.html
+usage of the ``rabbitmq_server_additional_erl_args`` variable. The contents of
+it are appended to the ``RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS`` environment
+variable which is passed to the RabbitMQ server startup script. Kolla Ansible
+already configures RabbitMQ server for IPv6 (if necessary). Any argument can be
+passed there as documented in https://www.rabbitmq.com/runtime.html
+The default value for ``rabbitmq_server_additional_erl_args`` is ``+S 2:2 +sbwt
+none``.
+By default RabbitMQ starts N schedulers where N is the number of CPU cores,
+including hyper-threaded cores. This is fine when you assume all CPUs are
+dedicated to RabbitMQ. It's not a good idea in a typical Kolla Ansible setup.
+Here we go for two scheduler threads (``+S 2:2``). More details can be found
+here: https://www.rabbitmq.com/runtime.html#scheduling and here:
+https://erlang.org/doc/man/erl.html#emulator-flags
+The ``+sbwt`` argument prevents busy waiting of the scheduler; for more details
+see: https://www.rabbitmq.com/runtime.html#busy-waiting.

@@ -396,7 +396,12 @@
 # See Kolla Ansible docs RabbitMQ section for details.
 # These are appended to args already provided by Kolla Ansible
 # to configure IPv6 in RabbitMQ server.
-#rabbitmq_server_additional_erl_args: ""
+# More details can be found in the RabbitMQ docs:
+# https://www.rabbitmq.com/runtime.html#scheduling
+# https://www.rabbitmq.com/runtime.html#busy-waiting
+# The default tells RabbitMQ to always use two cores (+S 2:2),
+# and not to busy wait (+sbwt none):
+#rabbitmq_server_additional_erl_args: "+S 2:2 +sbwt none"
 # Whether to enable TLS encryption for RabbitMQ client-server communication.
 #rabbitmq_enable_tls: "no"
 # CA certificate bundle in RabbitMQ container.

@@ -0,0 +1,13 @@
+---
+fixes:
+  - |
+    Fixes an issue where RabbitMQ consumes a large amount of CPU, particularly
+    on multi-core systems. The default RabbitMQ tuning assumes that RabbitMQ
+    is running on a dedicated host, which is the opposite of a typical Kolla
+    Ansible container setup. For more details on tuning RabbitMQ in your
+    environment, please see: https://www.rabbitmq.com/runtime.html#busy-waiting
+    https://www.rabbitmq.com/runtime.html#scheduling
+upgrade:
+  - |
+    Modifies the default value of ``rabbitmq_server_additional_erl_args`` from
+    an empty string to ``+S 2:2 +sbwt none``.