Changing tiller pod networking settings to improve swact time

Investigation by Matt showed that the tiller-deploy pod was running
in the cluster network namespace and was therefore not inheriting the
host TCP keepalive parameters.

During a swact, when the floating IP is taken down, the tiller keepalive
interval is so large that the kube-apiserver detects the timeout only
after 15 minutes (5 probes * 180 seconds).

The cluster namespace values are 9 probes at 75 second intervals.
The host TCP values are 5 consecutive probes at 1 second intervals.
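
The arithmetic behind these figures can be sketched in shell. This is only
an illustration: probe_window is a hypothetical helper, and it covers only
the probe phase (probes * interval). The 10 second figure quoted for the
host presumably also includes the idle time before the first probe, which
this message does not state.

```shell
#!/usr/bin/env bash
# Sketch: seconds spent sending keepalive probes before a dead peer is declared.
# Usage: probe_window PROBES INTERVAL_SECONDS
# (hypothetical helper, not part of the playbook)
probe_window() {
    local probes=$1 interval=$2
    echo $(( probes * interval ))
}

# Values quoted in this commit message.
echo "observed swact delay: $(probe_window 5 180) s"   # 900 s, i.e. 15 minutes
echo "cluster namespace:    $(probe_window 9 75) s"    # 675 s
echo "host:                 $(probe_window 5 1) s"     # 5 s
```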

By changing the tiller pod to be deployed using the host network,
it will inherit the host sysctl values and detect the failure much more
quickly (10 seconds).
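
One way to compare what the host and a pod actually see is to print the
three Linux TCP keepalive sysctls from each network namespace. A minimal
sketch follows; show_keepalive and its optional root argument are
hypothetical (the argument exists only so the function can be pointed at a
fake directory tree for testing).

```shell
#!/usr/bin/env bash
# Sketch: print the TCP keepalive sysctls visible in the current network namespace.
# Usage: show_keepalive [PROC_ROOT]   (PROC_ROOT defaults to /proc)
show_keepalive() {
    local root=${1:-/proc}
    local name
    for name in tcp_keepalive_time tcp_keepalive_intvl tcp_keepalive_probes; do
        printf '%s=%s\n' "$name" "$(cat "$root/sys/net/ipv4/$name")"
    done
}

# Only run against the live system when the sysctl files exist (Linux).
if [ -r /proc/sys/net/ipv4/tcp_keepalive_time ]; then
    show_keepalive
fi
```

Running this on the host and again inside a pod's namespace should show
whether the pod shares the host values. Whether the tiller image ships a
shell to run it in is not something this commit states.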

Adding an additional override setting for tiller during helm init:
    helm init <params> --override spec.template.spec.hostNetwork=true

These changes were added to the ansible playbook.
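
To confirm that the override took effect, the rendered tiller manifest can
be checked for hostNetwork: true. The helper below is a hypothetical
sketch that reads a manifest on stdin. It does a plain text match, not a
real YAML parse, so it assumes block-style YAML as printed by kubectl.

```shell
#!/usr/bin/env bash
# Sketch: succeed if the manifest on stdin contains a "hostNetwork: true" line.
# (hypothetical helper; text match only, does not parse YAML)
has_host_network() {
    grep -Eq '^[[:space:]]*hostNetwork:[[:space:]]*true[[:space:]]*$'
}

# Inline manifest fragment for illustration only, not real cluster output.
if printf 'spec:\n  template:\n    spec:\n      hostNetwork: true\n' | has_host_network; then
    echo "pod template uses the host network"
fi
```

On a live system the input could come from something like
kubectl -n kube-system get deployment tiller-deploy -o yaml, assuming
tiller is installed in the kube-system namespace.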

Change-Id: I218e4ef37100950c8ac5a0cb9759d9df50d9e368
Closes-Bug: 1817941
Partial-Bug: 1818123
Co-Authored-By: Matt Peters <Matt.Peters@windriver.com>
Signed-off-by: Al Bailey <Al.Bailey@windriver.com>
Al Bailey 2019-05-03 15:11:57 -05:00
parent e336e6e58a
commit c85f6a1142
2 changed files with 3 additions and 1 deletions

@@ -1,2 +1,2 @@
 SRC_DIR="playbookconfig"
-TIS_PATCH_VER=1
+TIS_PATCH_VER=2

@@ -132,6 +132,7 @@
 command: >-
   helm init --skip-refresh --service-account tiller --node-selectors
   "node-role.kubernetes.io/master"="" --tiller-image={{ tiller_img }}
+  --override spec.template.spec.hostNetwork=true
 become_user: wrsroot
 environment:
   KUBECONFIG: /etc/kubernetes/admin.conf
@@ -145,6 +146,7 @@
 command: >-
   helm init --skip-refresh --service-account tiller --node-selectors
   "node-role.kubernetes.io/master"="" --tiller-image={{ tiller_img }}
+  --override spec.template.spec.hostNetwork=true
 environment:
   KUBECONFIG: /etc/kubernetes/admin.conf
   HOME: /home/wrsroot