Increasing MariaDB startupProbe timeout

After an uncontrolled reboot on a Standard configuration, the MariaDB
pods kept trying to elect a leader and restarting until they reached
CrashLoopBackOff. After checking the logs and reproducing the problem
by deleting both pods at the same time, we concluded that the cluster
did not have enough time to elect a new leader and recover from the
crash. This patch increases the startup probe timeout of the mariadb
statefulset, with some slack so that production databases can fully
resync their data between the 2 pods.

Closes-Bug: #1938346

Signed-off-by: Thiago Brito <thiago.brito@windriver.com>
Change-Id: I19e49dab55f3a8661fa71be315093029adb0947e
Thiago Brito 2021-07-28 19:09:37 -03:00 committed by Thiago Paiva Brito
parent 31c4390122
commit 52b3185a19

@@ -47,9 +47,9 @@ index 2d75f39..444bba3 100644
 + startup:
 + enabled: false
 + params:
-+ initialDelaySeconds: 30
-+ periodSeconds: 30
-+ failureThreshold: 3
++ initialDelaySeconds: 60
++ periodSeconds: 60
++ failureThreshold: 10
 security_context:
 server:
 pod:
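
To get a rough sense of how much extra time the new values give the pods, the startup budget can be estimated from the three probe parameters in the diff. The sketch below assumes the probe is enabled and uses the usual approximation for Kubernetes startup probes (initial delay plus failure threshold times period); it ignores per-probe timeouts, and the helper name `startup_budget_seconds` is hypothetical, not part of the chart.

```python
def startup_budget_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Approximate time a container is given to start before the kubelet restarts it.

    Assumption: budget ~= initialDelaySeconds + failureThreshold * periodSeconds,
    ignoring timeoutSeconds and scheduling jitter.
    """
    return initial_delay + failure_threshold * period


# Values taken from the diff: old (30, 30, 3) and new (60, 60, 10).
old_budget = startup_budget_seconds(initial_delay=30, period=30, failure_threshold=3)
new_budget = startup_budget_seconds(initial_delay=60, period=60, failure_threshold=10)

print(f"old budget: {old_budget}s, new budget: {new_budget}s")
```

Under this approximation the budget grows from about 2 minutes to about 11 minutes, which is the slack the commit message refers to.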