[ceph-osd] Fix post-apply job failure related to fault tolerance
A recent change to wait_for_pods() to allow for fault tolerance appears to be causing wait_for_pgs() to fail and exit the post-apply script prematurely in some cases. The existing wait_for_degraded_objects() logic won't pass until pods and PGs have recovered while the noout flag is set, so the pod and PG waits can simply be removed.

Change-Id: I5fd7f422d710c18dee237c0ae97ae1a770606605
parent 15ad6e9a6c
commit 791b0de5ee
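The whole fix leans on the degraded-object wait, so it is worth sketching what such a wait looks like. The snippet below is a minimal illustration only, not the chart's actual wait_for_degraded_objects() implementation; the function name, grep pattern, and 3-second interval are assumptions.

    # Sketch only: poll cluster status until no objects are reported degraded.
    # The real helper in the chart may use different commands and timing.
    wait_for_degraded_objects_sketch() {
      while ceph -s | grep -q "objects degraded"; do
        echo "waiting for degraded objects to recover"
        sleep 3
      done
    }

Because OSD pods must rejoin and PGs must become healthy before the degraded count reaches zero, a single wait of this kind implicitly covers the pod and PG conditions that the removed waits checked explicitly.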
@@ -15,6 +15,6 @@ apiVersion: v1
 appVersion: v1.0.0
 description: OpenStack-Helm Ceph OSD
 name: ceph-osd
-version: 0.1.10
+version: 0.1.11
 home: https://github.com/ceph/ceph
 ...
@@ -148,9 +148,8 @@ function restart_by_rack() {
     # The pods will not be ready in first 60 seconds. Thus we can reduce
     # amount of queries to kubernetes.
     sleep 60
-    wait_for_pods $CEPH_NAMESPACE
-    echo "waiting for inactive pgs after osds restarted from rack $rack"
-    wait_for_pgs
+    # Degraded objects won't recover with noout set unless pods come back and
+    # PGs become healthy, so simply wait for 0 degraded objects
     wait_for_degraded_objects
     ceph -s
   done
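For context, the post-apply flow restarts OSDs one failure domain at a time while the noout flag keeps Ceph from marking the stopped OSDs out, and the degraded-object wait gates progress to the next rack. The ordering below is illustrative only and may not match the script's exact commands:

    # Illustrative ordering, assuming noout is held for the duration of the
    # rolling restart (the chart's script may manage the flag differently).
    ceph osd set noout
    restart_by_rack            # restart OSD pods rack by rack, waiting on recovery
    ceph osd unset noout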