The k8s-pod-recovery service failed to restore the SRIOV device
plugin, which pods that use SRIOV interfaces need in order to create
the resource. Those pods must carry the label
'restart-on-reboot=true' to be restarted during boot. The failure was
observed during an upgrade and, although rare, it left the operator
to restart the pods manually afterwards.
This change adds a wait for the pod to stabilize (it is considered
stable once its state stops transitioning) and, if it is still in
failure, makes 2 attempts to restore the plugin. Logs were added to
better record the pod state in case of an error.
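
The wait-then-retry flow described above can be sketched roughly as
follows. This is an illustration only, not the code in the service:
the function names, the "unchanged for 3 consecutive reads" stability
criterion, the polling limits and the "Running" healthy state are all
assumptions made for the example.

```python
import time
from typing import Callable


def wait_until_stable(get_state: Callable[[], str],
                      settle_checks: int = 3,      # assumed criterion
                      interval: float = 1.0,       # assumed poll period
                      max_checks: int = 30,        # assumed upper bound
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the state is unchanged for `settle_checks` reads."""
    last = get_state()
    unchanged = 1
    for _ in range(max_checks - 1):
        if unchanged >= settle_checks:
            break
        sleep(interval)
        current = get_state()
        if current == last:
            unchanged += 1
        else:
            # Log every transition so a later failure can be diagnosed.
            print(f"pod state changed: {last} -> {current}")
            last, unchanged = current, 1
    return last


def recover_plugin(get_state: Callable[[], str],
                   restore: Callable[[], None],
                   attempts: int = 2,
                   healthy: str = "Running",       # assumed healthy state
                   **wait_kwargs) -> bool:
    """Wait for the pod to settle, then retry the restore if needed."""
    state = wait_until_stable(get_state, **wait_kwargs)
    for attempt in range(1, attempts + 1):
        if state == healthy:
            return True
        print(f"pod in state {state!r}; restore attempt "
              f"{attempt}/{attempts}")
        restore()
        state = wait_until_stable(get_state, **wait_kwargs)
    if state != healthy:
        print(f"pod still in state {state!r} after {attempts} attempts")
    return state == healthy
```

Here get_state and restore are hypothetical callables supplied by the
caller, for example thin wrappers that query the pod status and
re-create the plugin pod. Injecting them keeps the retry logic
independent of how the cluster is actually queried.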

Test Plan:
[PASS] Execute 7 upgrades in an AIO-SX lab

Closes-Bug: 1999074
Signed-off-by: Andre Fernando Zanella Kantek <AndreFernandoZanella.Kantek@windriver.com>
Change-Id: I838c35d3e0a3557c71344945a8e00f22ccb50eb4