This update improves ProxySQL's monitoring of the database cluster.
The default monitoring intervals were too long to detect failures
reliably in a Galera cluster environment.
Explicit monitoring intervals and timeouts are now configured,
giving better control over how quickly and accurately
ProxySQL identifies issues.
- Variables such as `mariadb_monitor_connect_interval`,
`mariadb_monitor_galera_healthcheck_interval`, and
`mariadb_monitor_ping_interval` set much shorter
intervals between checks than the defaults.
- Timeouts like `mariadb_monitor_galera_healthcheck_timeout`
and `mariadb_monitor_ping_timeout` allow faster failure
detection, while `mariadb_monitor_galera_healthcheck_max_timeout_count`
sets the maximum number of allowed timeouts before marking a node as down.
Calculation:
- Galera healthcheck:
4 seconds (interval) + 1 second (timeout) + 4 seconds (interval)
+ 1 second (timeout) = 10 seconds.
- Ping healthcheck:
3 seconds (interval) + 2 seconds (timeout) + 3 seconds (interval)
+ 2 seconds (timeout) = 10 seconds.
The Galera health check and the ping check each detect a node failure
within at most 10 seconds. The two checks operate independently,
and a failure reported by either one marks the node as failed.
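The arithmetic above can be reproduced with a minimal sketch. The function
name and its `failed_checks` parameter are hypothetical, and the value of 2
consecutive failed checks is an assumption inferred from the two
interval/timeout cycles in the calculation, not a value stated here.

```python
def worst_case_detection_seconds(interval_s, timeout_s, failed_checks=2):
    """Upper bound on detection time for one check type.

    Each failed check costs one interval (waiting for the next check)
    plus one timeout (waiting for the check to give up).
    failed_checks=2 is an assumption taken from the calculation above.
    """
    return failed_checks * (interval_s + timeout_s)


# Values from the calculation: Galera 4 s interval / 1 s timeout,
# ping 3 s interval / 2 s timeout.
galera_seconds = worst_case_detection_seconds(interval_s=4, timeout_s=1)
ping_seconds = worst_case_detection_seconds(interval_s=3, timeout_s=2)

print(f"Galera healthcheck: up to {galera_seconds} s")
print(f"Ping healthcheck: up to {ping_seconds} s")
```

Both calls yield 10 seconds with the values listed, so changing any interval
or timeout shifts the bound for that check only.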
- Health Check Failure Detection: Up to 10 seconds.
- Ping Failure Detection: Up to 10 seconds.
- Connect Attempts: ProxySQL also tries to connect every 2 seconds, which
helps monitor connectivity.
These changes ensure that ProxySQL detects node failures within 10 seconds,
on par with HAProxy, significantly reducing downtime compared to the
default settings. The result is faster and more reliable monitoring,
which improves stability in production environments.
Change-Id: Ic28801519cdb35ed2387a1468b9df661847a5476