kolla-ansible/ansible
Michal Arbet 7989756699 Improvement of ProxySQL Monitoring Configuration
This update enhances the monitoring of the database cluster
in ProxySQL. The default monitoring intervals were insufficient
for reliably detecting failures in the Galera cluster environment.

A detailed configuration for monitoring intervals has been
introduced, providing better control over how quickly and accurately
ProxySQL can identify issues.

  - Variables such as `mariadb_monitor_connect_interval`,
    `mariadb_monitor_galera_healthcheck_interval`, and
    `mariadb_monitor_ping_interval` significantly reduce
    the time between monitoring checks.

  - Timeouts like `mariadb_monitor_galera_healthcheck_timeout`
    and `mariadb_monitor_ping_timeout` allow faster failure
    detection, while `mariadb_monitor_galera_healthcheck_max_timeout_count`
    sets the maximum number of allowed timeouts before marking a node as down.

Calculation:

 - Galera healthcheck:

   4 seconds (interval) + 1 second (timeout) + 4 seconds (interval)
   + 1 second (timeout) = 10 seconds.

 - Ping healthcheck:

   3 seconds (interval) + 2 seconds (timeout) + 3 seconds (interval)
   + 2 seconds (timeout) = 10 seconds.
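
The arithmetic above can be reproduced with a short Python sketch. It is only an illustration of the calculation, not part of the change itself. The function name and the `FAILED_CHECKS` constant are hypothetical, and the value 2 is an assumption inferred from the two interval + timeout cycles shown in each calculation.

```python
# Worst-case detection time, modelled as in the calculation above:
# each failed check costs one interval plus one timeout.


def worst_case_detection_seconds(interval_s, timeout_s, failed_checks):
    """Return the upper bound on detection time, in seconds."""
    return failed_checks * (interval_s + timeout_s)


# Assumption: two consecutive failed checks mark a node as down,
# inferred from the two interval + timeout cycles in the calculation.
FAILED_CHECKS = 2

# Interval and timeout values (in seconds) come from the calculation above.
galera = worst_case_detection_seconds(4, 1, FAILED_CHECKS)
ping = worst_case_detection_seconds(3, 2, FAILED_CHECKS)

# The two checks are described as independent, so the first one to
# trip bounds the overall detection time.
overall = min(galera, ping)

print(galera, ping, overall)
```

With the values from the calculation, all three results are 10 seconds.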

The Galera health check and the ping check each detect a node failure
within a maximum of 10 seconds. The two mechanisms operate independently,
and a failure in either one marks the node as failed.

Health Check Failure Detection: Up to 10 seconds.
Ping Failure Detection: Up to 10 seconds.
Connect Attempts: ProxySQL also tries to connect every 2 seconds, which
helps monitor connectivity.

These changes ensure that ProxySQL detects a failed node within 10 seconds,
as HAProxy does, instead of relying on the slower default settings. Faster
and more reliable monitoring improves system stability and reduces
potential downtime in production environments.

Change-Id: Ic28801519cdb35ed2387a1468b9df661847a5476
2024-09-23 15:38:10 +02:00
action_plugins Fix maximum width of the DIB Multiline-YAML 2023-04-14 16:36:22 +03:00
filter_plugins haproxy: support single external frontend 2023-06-29 01:44:00 +02:00
group_vars Improvement of ProxySQL Monitoring Configuration 2024-09-23 15:38:10 +02:00
inventory Drop prometheus-msteams support 2024-08-27 11:24:48 +02:00
library Separate outputs of kolla_toolbox inner module 2024-09-12 20:19:49 +02:00
module_utils systemd: Add Wants=docker.service for docker 2024-09-13 09:38:06 +02:00
roles Improvement of ProxySQL Monitoring Configuration 2024-09-23 15:38:10 +02:00
bifrost.yml Update "openstack_release" variable to static brach name 2019-09-16 12:42:44 +00:00
certificates.yml certificates: generate libvirt TLS certificates 2022-02-03 14:32:38 +00:00
destroy.yml Support Ansible max_fail_percentage 2023-12-05 11:49:42 +01:00
gather-facts.yml Merge "Revert "Allow setting any_errors_fatal true for gather-facts"" 2024-08-23 13:52:21 +00:00
kolla-host.yml Support Ansible max_fail_percentage 2023-12-05 11:49:42 +01:00
mariadb_backup.yml Support Ansible max_fail_percentage 2023-12-05 11:49:42 +01:00
mariadb_recovery.yml Support Ansible max_fail_percentage 2023-12-05 11:49:42 +01:00
mariadb.yml Support Ansible max_fail_percentage 2023-12-05 11:49:42 +01:00
nova-libvirt-cleanup.yml Support Ansible max_fail_percentage 2023-12-05 11:49:42 +01:00
nova.yml Support Ansible max_fail_percentage 2023-12-05 11:49:42 +01:00
octavia-certificates.yml octavia: generate certificates automatically 2020-10-08 16:50:30 +02:00
post-deploy.yml Template system scoped admin-openrc and clouds.yml files 2024-02-15 15:01:59 +00:00
prune-images.yml Support Ansible max_fail_percentage 2023-12-05 11:49:42 +01:00
rabbitmq-reset-state.yml Add command to force reset the state of RabbitMQ 2023-08-25 10:09:58 +00:00
rabbitmq-upgrade.yml Add command to upgrade to a target version of RMQ 2024-08-12 15:05:42 +01:00
rabbitmq.yml RMQ: enable all stable feature flags at once 2024-05-13 13:26:10 +01:00
site.yml Move Nova roles after OVS 2024-06-13 07:39:53 +00:00