Sanitize reserved-cpus list before kubelet starts

The sanitize_kubelet_reserved_cpus.sh script runs every time before the
kubelet service is started (as an ExecStartPre action).

It reads the reserved-cpus list for the kubelet from the service
environment file and sanitizes it on the basis of online CPUs.

If none of the reserved CPUs is online, it removes the
--reserved-cpus flag from the environment file, which allows
the kubelet to choose CPUs itself.

Sanitizing the reserved-cpus list every time before the kubelet starts
ensures that the kubelet will not fail to start because one or more
CPUs in the list are unavailable.
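
The sanitization described above amounts to intersecting the reserved list with the online list. The following is a minimal sketch of that idea, not the script from this commit: `expand_cpu_list` and `filter_online` are hypothetical helper names (the real script relies on `expand_sequence` and `in_list` from cpumap_functions.sh), and the CPU lists are example values.

```shell
#!/bin/bash
# Illustrative sketch only. expand_cpu_list and filter_online are
# hypothetical names, not the helpers used by the real script.

# Expand a list such as "0-2,5" into one CPU id per line.
expand_cpu_list() {
    local part
    for part in $(echo "$1" | tr ',' ' '); do
        case "${part}" in
            *-*) seq "${part%-*}" "${part#*-}" ;;  # range "a-b"
            *)   echo "${part}" ;;                 # single id
        esac
    done
}

# Print the reserved CPUs ($1) that also appear in the online list ($2),
# as a comma-separated list. Prints an empty line if none are online.
filter_online() {
    local online_ids result cpu
    # Space-delimited lookup string, e.g. " 0 1 2 3 "
    online_ids=" $(expand_cpu_list "$2" | tr '\n' ' ')"
    result=""
    for cpu in $(expand_cpu_list "$1"); do
        case "${online_ids}" in
            *" ${cpu} "*) result="${result},${cpu}" ;;
        esac
    done
    echo "${result#,}"
}

# Example: CPUs 24-25 are offline, so only 0 and 1 survive.
filter_online "0-1,24-25" "0-23"   # prints 0,1
```

An empty result corresponds to the case where the --reserved-cpus flag is removed altogether.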

Enabling or disabling CPU hyperthreading changes the set of available
CPUs. This change ensures that changing the CPU hyperthreading setting
does not cause the kubelet to fail to start after the system boots up.
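
To see the effect of a hyperthreading change, it can help to count the CPUs in the kernel's list format (the format of /sys/devices/system/cpu/online). This is a small sketch: `count_cpus` is a hypothetical helper, and the two lists are example values, since the ids that go offline depend on the platform's CPU numbering.

```shell
#!/bin/bash
# Illustrative sketch only; count_cpus is a hypothetical helper name.

# Count the CPUs described by a list such as "0-23,36-45".
count_cpus() {
    local total part first last
    total=0
    for part in $(echo "$1" | tr ',' ' '); do
        case "${part}" in
            *-*)
                first=${part%-*}
                last=${part#*-}
                total=$((total + last - first + 1))
                ;;
            *)
                total=$((total + 1))
                ;;
        esac
    done
    echo "${total}"
}

count_cpus "0-47"   # example online list with hyperthreading enabled: prints 48
count_cpus "0-23"   # example online list with hyperthreading disabled: prints 24
```

On a live system the argument would be the contents of /sys/devices/system/cpu/online.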

Test Plan: (On AIO-SX)

PASS:
Initial Hyperthreading state: enabled
Host-lock->Reboot->Disable CPU hyperthreading and reboot->Host-unlock
Observe that the kubelet does not fail to start before host-unlock.
All pod states are as expected. Host-unlock succeeds.

PASS:
Initial Hyperthreading state: disabled
Host-lock->Reboot->Enable CPU hyperthreading and reboot->Host-unlock
Observe that the kubelet does not fail to start before host-unlock.
All pod states are as expected. Host-unlock succeeds.

PASS:
Manually restart the kubelet service.
Observe that the kubelet does not fail to start.
All pod states are as expected.

PASS:
Host-lock->Host-unlock (without any config change).
Observe that the kubelet does not fail to start.
All pod states are as expected.

PASS:
Packages built successfully on both Debian and CentOS.

Closes-Bug: 1955608

Change-Id: I699c5c36a56a50d4c48faa816edad69c17058079
Signed-off-by: Kaustubh Dhokte <kaustubh.dhokte@windriver.com>
kdhokte 2022-02-01 21:09:03 -05:00 committed by Kaustubh Dhokte
parent 2ed687c8c1
commit c9b781b7c0
12 changed files with 214 additions and 0 deletions

@@ -9,6 +9,7 @@ EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
ExecStartPre=-/usr/local/sbin/sanitize_kubelet_reserved_cpus.sh /etc/sysconfig/kubelet
ExecStartPre=-/usr/bin/kubelet-cgroup-setup.sh
ExecStartPost=/bin/bash -c 'echo $MAINPID > /var/run/kubelet.pid;'
ExecStopPost=/bin/rm -f /var/run/kubelet.pid

@@ -9,6 +9,7 @@ EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
ExecStartPre=-/usr/local/sbin/sanitize_kubelet_reserved_cpus.sh /etc/sysconfig/kubelet
ExecStartPre=-/usr/bin/kubelet-cgroup-setup.sh
ExecStartPost=/bin/bash -c 'echo $MAINPID > /var/run/kubelet.pid;'
ExecStopPost=/bin/rm -f /var/run/kubelet.pid

@@ -9,6 +9,7 @@ EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
ExecStartPre=-/usr/local/sbin/sanitize_kubelet_reserved_cpus.sh /etc/sysconfig/kubelet
ExecStartPre=-/usr/bin/kubelet-cgroup-setup.sh
ExecStartPost=/bin/bash -c 'echo $MAINPID > /var/run/kubelet.pid;'
ExecStopPost=/bin/rm -f /var/run/kubelet.pid

@@ -9,6 +9,7 @@ EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
ExecStartPre=-/usr/local/sbin/sanitize_kubelet_reserved_cpus.sh /etc/sysconfig/kubelet
ExecStartPre=-/usr/bin/kubelet-cgroup-setup.sh
ExecStartPost=/bin/bash -c 'echo $MAINPID > /var/run/kubelet.pid;'
ExecStopPost=/bin/rm -f /var/run/kubelet.pid

@@ -9,6 +9,7 @@ EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
ExecStartPre=-/usr/local/sbin/sanitize_kubelet_reserved_cpus.sh /etc/sysconfig/kubelet
ExecStartPre=-/usr/bin/kubelet-cgroup-setup.sh
ExecStartPost=/bin/bash -c 'echo $MAINPID > /var/run/kubelet.pid;'
ExecStopPost=/bin/rm -f /var/run/kubelet.pid

@@ -9,6 +9,7 @@ EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
ExecStartPre=-/usr/local/sbin/sanitize_kubelet_reserved_cpus.sh /etc/sysconfig/kubelet
ExecStartPre=-/usr/bin/kubelet-cgroup-setup.sh
ExecStartPost=/bin/bash -c 'echo $MAINPID > /var/run/kubelet.pid;'
ExecStopPost=/bin/rm -f /var/run/kubelet.pid

@@ -0,0 +1,98 @@
#! /bin/bash
# Copyright (c) 2022 Wind River Systems, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# The script runs every time before the kubelet service is started.
# (Runs as an "ExecStartPre" action)
#
# It reads the reserved-cpus list for the kubelet from the kubelet
# environment file and sanitizes it on the basis of online CPUs.
#
# If none of the reserved CPUs is online, it removes the --reserved-cpus flag
# from the environment file, which allows the kubelet to choose CPUs itself.
#

ENVIRONMENT_FILE=$1

# Log info message to /var/log/daemon.log
function LOG {
    logger -p daemon.info "$0($$): $*"
}

# Log error message to /var/log/daemon.log
function ERROR {
    logger -s -p daemon.error "$0($$): ERROR: $*"
}

function sanitize_reserved_cpus {
    kubelet_extra_args=$(cat "${ENVIRONMENT_FILE}" 2>/dev/null)
    RC=$?
    if [ "${RC}" != "0" ]; then
        ERROR "Error reading kubelet extra arguments. Error code: [${RC}]"
        exit ${RC}
    fi

    # Get reserved-cpus comma-separated-values string from environment file and strip double quotes
    # format of kubelet_extra_args is:
    # "KUBELET_EXTRA_ARGS=--cni-bin-dir=/usr/libexec/cni --node-ip=abcd:204::2
    # --system-reserved=memory=9000Mi --reserved-cpus="0-29" --pod-max-pids 10000"
    if [[ ${kubelet_extra_args} =~ --reserved-cpus=\"([0-9,-]+)\" ]]; then
        reserved_cpus=${BASH_REMATCH[1]}
    else
        reserved_cpus=""
    fi

    if test -z "${reserved_cpus}"; then
        LOG "No reserved-cpu list found for kubelet. Nothing to do."
        exit 0
    fi

    LOG "Current reserved-cpus for the kubelet service: ${reserved_cpus}"
    cpus_online=$(cat /sys/devices/system/cpu/online)
    RC=$?
    if [ "${RC}" != "0" ]; then
        ERROR "Error reading online CPU list. Error code: [${RC}]"
        exit ${RC}
    fi

    LOG "Online CPUs: ${cpus_online}"

    # Possible formats for reserved_cpus could be
    # 0,2,4,6
    # 0-23,36-45
    # 0-4,6,9,13,23-34
    expanded_reserved_cpus=$(expand_sequence "${reserved_cpus}")
    reserved_cpus_array=(${expanded_reserved_cpus//,/ })

    sanitized_reserved_cpus=""
    for element in "${reserved_cpus_array[@]}"; do
        if in_list "${element}" "${cpus_online}"; then
            sanitized_reserved_cpus+=",${element}"
        fi
    done

    # Remove the extra leading ','
    sanitized_reserved_cpus=${sanitized_reserved_cpus#","}
    LOG "Sanitized reserved-cpus list for the kubelet: ${sanitized_reserved_cpus}"

    if test -z "${sanitized_reserved_cpus}"; then
        # Strip out --reserved-cpus option if no reserved-cpus are online
        sed -i "s/ --reserved-cpus=\"${reserved_cpus}\"//g" "${ENVIRONMENT_FILE}"
    else
        # Replace existing reserved-cpus with sanitized list
        sed -i "s/--reserved-cpus=\"${reserved_cpus}\"/--reserved-cpus=\"${sanitized_reserved_cpus}\"/g" "${ENVIRONMENT_FILE}"
    fi

    RC="$?"
    if [ "${RC}" != "0" ]; then
        ERROR "Error updating reserved-cpus list for the kubelet. Error code: [${RC}]"
        exit ${RC}
    fi

    LOG "Successfully updated reserved-cpus list for the kubelet."
}

source /etc/init.d/cpumap_functions.sh
sanitize_reserved_cpus
exit 0
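
The extract-and-rewrite steps of the script above can be tried out against a throwaway file instead of /etc/sysconfig/kubelet. This is a hedged sketch rather than the commit's code: `get_reserved_cpus` and `set_reserved_cpus` are hypothetical helpers, the sample line is modelled on the format shown in the script's comments, and GNU `sed -i` is assumed. Unlike the script, the sed patterns here match any list value rather than interpolating the old one.

```shell
#!/bin/bash
# Illustrative sketch only; helper names and the sample file are hypothetical.

# Print the quoted value of --reserved-cpus in file $1 (nothing if absent).
get_reserved_cpus() {
    sed -n 's/.*--reserved-cpus="\([0-9,-]*\)".*/\1/p' "$1"
}

# Set --reserved-cpus in file $1 to $2, or drop the flag when $2 is empty.
set_reserved_cpus() {
    local file="$1" new="$2"
    if [ -z "${new}" ]; then
        # Also remove the space that precedes the flag
        sed -i 's/ --reserved-cpus="[0-9,-]*"//' "${file}"
    else
        sed -i "s/--reserved-cpus=\"[0-9,-]*\"/--reserved-cpus=\"${new}\"/" "${file}"
    fi
}

# Build a sample environment file in a temporary location.
env_file=$(mktemp)
cat > "${env_file}" <<'EOF'
KUBELET_EXTRA_ARGS=--node-ip=abcd:204::2 --reserved-cpus="0-29" --pod-max-pids 10000
EOF

get_reserved_cpus "${env_file}"          # current value
set_reserved_cpus "${env_file}" "0,1,2"  # replace with a sanitized list
cat "${env_file}"
set_reserved_cpus "${env_file}" ""       # no reserved CPU online: drop the flag
cat "${env_file}"
rm -f "${env_file}"
```

Working on a temporary copy keeps the experiment away from the file that the kubelet unit actually reads.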

@@ -41,6 +41,8 @@ Source3: kubelet_override.yaml
Source4: upgrade_k8s_config.sh
Source5: sanitize_kubelet_reserved_cpus.sh
Patch1: kubelet-service-remove-docker-dependency.patch
BuildArch: noarch
@@ -101,6 +103,8 @@ install -d %{buildroot}%{local_sbindir}
# install execution scripts
install -m 700 %{SOURCE4} %{buildroot}/%{local_sbindir}/upgrade_k8s_config.sh
install -m 700 %{SOURCE5} %{buildroot}/%{local_sbindir}/sanitize_kubelet_reserved_cpus.sh
# install service files
install -v -d -m 0755 %{buildroot}%{_unitdir}
install -v -m 0644 -t %{buildroot}%{_unitdir} contrib/init/systemd/kubelet.service
@@ -120,6 +124,7 @@ install -v -p -m 0644 -t %{buildroot}/%{_sysconfdir}/systemd/system.conf.d %{SOU
# the following are execution scripts
%{local_sbindir}/upgrade_k8s_config.sh
%{local_sbindir}/sanitize_kubelet_reserved_cpus.sh
# the following are symlinks
%{_bindir}/kubeadm

@@ -5,3 +5,4 @@ etc/kubernetes/kubelet.kubeconfig
etc/kubernetes/proxy
etc/systemd/system.conf.d/kubernetes-accounting.conf
usr/lib/tmpfiles.d/kubernetes.conf
usr/local/sbin/sanitize_kubelet_reserved_cpus.sh

@@ -1,6 +1,7 @@
usr/bin/kubeadm
usr/bin/kubelet
usr/bin/kubelet-cgroup-setup.sh
usr/local/sbin/sanitize_kubelet_reserved_cpus.sh
usr/bin/kubectl
etc/systemd/system/kubelet.service.d/kubeadm.conf
usr/share/bash-completion/completions/kubectl

@@ -5,6 +5,7 @@
_k8s_name := kubernetes
_bindir := /usr/bin
_local_sbindir := /usr/local/sbin
_curr_stage1 := /usr/local/kubernetes/current/stage1
_curr_stage2 := /usr/local/kubernetes/current/stage2
@@ -60,6 +61,10 @@ override_dh_install:
install -v -d -m 0755 ${DEBIAN_DESTDIR}/etc/systemd/system.conf.d
install -v -p -m 0644 -t ${DEBIAN_DESTDIR}/etc/systemd/system.conf.d debian/kubernetes-accounting.conf
# install scripts
install -v -m 0700 -d ${DEBIAN_DESTDIR}${_local_sbindir}
install -v -m 0700 -d ${DEBIAN_DESTDIR}${_local_sbindir}/sanitize_kubelet_reserved_cpus.sh
dh_install
override_dh_usrlocal:

@@ -0,0 +1,98 @@
#! /bin/bash
# Copyright (c) 2022 Wind River Systems, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# The script runs every time before the kubelet service is started.
# (Runs as an "ExecStartPre" action)
#
# It reads the reserved-cpus list for the kubelet from the kubelet
# environment file and sanitizes it on the basis of online CPUs.
#
# If none of the reserved CPUs is online, it removes the --reserved-cpus flag
# from the environment file, which allows the kubelet to choose CPUs itself.
#

ENVIRONMENT_FILE=$1

# Log info message to /var/log/daemon.log
function LOG {
    logger -p daemon.info "$0($$): $*"
}

# Log error message to /var/log/daemon.log
function ERROR {
    logger -s -p daemon.error "$0($$): ERROR: $*"
}

function sanitize_reserved_cpus {
    kubelet_extra_args=$(cat "${ENVIRONMENT_FILE}" 2>/dev/null)
    RC=$?
    if [ "${RC}" != "0" ]; then
        ERROR "Error reading kubelet extra arguments. Error code: [${RC}]"
        exit ${RC}
    fi

    # Get reserved-cpus comma-separated-values string from environment file and strip double quotes
    # format of kubelet_extra_args is:
    # "KUBELET_EXTRA_ARGS=--cni-bin-dir=/usr/libexec/cni --node-ip=abcd:204::2
    # --system-reserved=memory=9000Mi --reserved-cpus="0-29" --pod-max-pids 10000"
    if [[ ${kubelet_extra_args} =~ --reserved-cpus=\"([0-9,-]+)\" ]]; then
        reserved_cpus=${BASH_REMATCH[1]}
    else
        reserved_cpus=""
    fi

    if test -z "${reserved_cpus}"; then
        LOG "No reserved-cpu list found for kubelet. Nothing to do."
        exit 0
    fi

    LOG "Current reserved-cpus for the kubelet service: ${reserved_cpus}"
    cpus_online=$(cat /sys/devices/system/cpu/online)
    RC=$?
    if [ "${RC}" != "0" ]; then
        ERROR "Error reading online CPU list. Error code: [${RC}]"
        exit ${RC}
    fi

    LOG "Online CPUs: ${cpus_online}"

    # Possible formats for reserved_cpus could be
    # 0,2,4,6
    # 0-23,36-45
    # 0-4,6,9,13,23-34
    expanded_reserved_cpus=$(expand_sequence "${reserved_cpus}")
    reserved_cpus_array=(${expanded_reserved_cpus//,/ })

    sanitized_reserved_cpus=""
    for element in "${reserved_cpus_array[@]}"; do
        if in_list "${element}" "${cpus_online}"; then
            sanitized_reserved_cpus+=",${element}"
        fi
    done

    # Remove the extra leading ','
    sanitized_reserved_cpus=${sanitized_reserved_cpus#","}
    LOG "Sanitized reserved-cpus list for the kubelet: ${sanitized_reserved_cpus}"

    if test -z "${sanitized_reserved_cpus}"; then
        # Strip out --reserved-cpus option if no reserved-cpus are online
        sed -i "s/ --reserved-cpus=\"${reserved_cpus}\"//g" "${ENVIRONMENT_FILE}"
    else
        # Replace existing reserved-cpus with sanitized list
        sed -i "s/--reserved-cpus=\"${reserved_cpus}\"/--reserved-cpus=\"${sanitized_reserved_cpus}\"/g" "${ENVIRONMENT_FILE}"
    fi

    RC="$?"
    if [ "${RC}" != "0" ]; then
        ERROR "Error updating reserved-cpus list for the kubelet. Error code: [${RC}]"
        exit ${RC}
    fi

    LOG "Successfully updated reserved-cpus list for the kubelet."
}

source /etc/init.d/cpumap_functions.sh
sanitize_reserved_cpus
exit 0