Docs: Update ceph documentation
- Add a section for Ceph troubleshooting
- Rearrange the Testing section to include Ceph

Co-Authored-By: portdirect <pete@port.direct>
Change-Id: Ib04e9b59fea2557cf6cad177dfcc76390c161e06
Signed-off-by: Pete Birley <pete@port.direct>

@@ -17,7 +17,7 @@ Contents:
    install/index
    readme
    specs/index
-   testing
+   testing/index
    troubleshooting/index
 
 Indices and Tables

doc/source/testing/ceph-resiliency/README.rst (new file, 25 lines)
@@ -0,0 +1,25 @@
========================================
Resiliency Tests for OpenStack-Helm/Ceph
========================================

Mission
=======

The goal of our resiliency tests for `OpenStack-Helm/Ceph
<https://github.com/openstack/openstack-helm/tree/master/ceph>`_ is to
demonstrate the symptoms of software and hardware failures and to provide
solutions.

Caveats:

- Our focus lies on resiliency for various failure scenarios, not on
  performance or stress testing.

Software Failure
================

* `Monitor failure <./monitor-failure.html>`_
* `OSD failure <./osd-failure.html>`_

Hardware Failure
================

* `Disk failure <./disk-failure.html>`_
* `Host failure <./host-failure.html>`_

doc/source/testing/ceph-resiliency/disk-failure.rst (new file, 171 lines)
@@ -0,0 +1,171 @@
============
Disk Failure
============

Test Environment
================

- Cluster size: 4 host machines
- Number of disks: 24 (= 6 disks per host * 4 hosts)
- Kubernetes version: 1.10.5
- Ceph version: 12.2.3
- OpenStack-Helm commit: 25e50a34c66d5db7604746f4d2e12acbdd6c1459

Case: A disk fails
==================

Symptom:
--------

This test covers the scenario where a disk fails. Monitoring the Ceph
status, we notice that one OSD (osd.2) on voyager4, which has ``/dev/sdh``
as its backing disk, is down.

.. code-block:: console

  (mon-pod):/# ceph -s
    cluster:
      id:     9d4d8c61-cf87-4129-9cef-8fbf301210ad
      health: HEALTH_WARN
              too few PGs per OSD (23 < min 30)
              mon voyager1 is low on available space

    services:
      mon: 3 daemons, quorum voyager1,voyager2,voyager3
      mgr: voyager1(active), standbys: voyager3
      mds: cephfs-1/1/1 up {0=mds-ceph-mds-65bb45dffc-cslr6=up:active}, 1 up:standby
      osd: 24 osds: 23 up, 23 in
      rgw: 2 daemons active

    data:
      pools:   18 pools, 182 pgs
      objects: 240 objects, 3359 bytes
      usage:   2548 MB used, 42814 GB / 42816 GB avail
      pgs:     182 active+clean

.. code-block:: console

  (mon-pod):/# ceph osd tree
  ID CLASS WEIGHT   TYPE NAME          STATUS REWEIGHT PRI-AFF
  -1       43.67981 root default
  -9       10.91995     host voyager1
   5   hdd  1.81999         osd.5          up  1.00000 1.00000
   6   hdd  1.81999         osd.6          up  1.00000 1.00000
  10   hdd  1.81999         osd.10         up  1.00000 1.00000
  17   hdd  1.81999         osd.17         up  1.00000 1.00000
  19   hdd  1.81999         osd.19         up  1.00000 1.00000
  21   hdd  1.81999         osd.21         up  1.00000 1.00000
  -3       10.91995     host voyager2
   1   hdd  1.81999         osd.1          up  1.00000 1.00000
   4   hdd  1.81999         osd.4          up  1.00000 1.00000
  11   hdd  1.81999         osd.11         up  1.00000 1.00000
  13   hdd  1.81999         osd.13         up  1.00000 1.00000
  16   hdd  1.81999         osd.16         up  1.00000 1.00000
  18   hdd  1.81999         osd.18         up  1.00000 1.00000
  -2       10.91995     host voyager3
   0   hdd  1.81999         osd.0          up  1.00000 1.00000
   3   hdd  1.81999         osd.3          up  1.00000 1.00000
  12   hdd  1.81999         osd.12         up  1.00000 1.00000
  20   hdd  1.81999         osd.20         up  1.00000 1.00000
  22   hdd  1.81999         osd.22         up  1.00000 1.00000
  23   hdd  1.81999         osd.23         up  1.00000 1.00000
  -4       10.91995     host voyager4
   2   hdd  1.81999         osd.2        down        0 1.00000
   7   hdd  1.81999         osd.7          up  1.00000 1.00000
   8   hdd  1.81999         osd.8          up  1.00000 1.00000
   9   hdd  1.81999         osd.9          up  1.00000 1.00000
  14   hdd  1.81999         osd.14         up  1.00000 1.00000
  15   hdd  1.81999         osd.15         up  1.00000 1.00000

Solution:
---------

To replace the failed OSD, execute the following procedure:

1. From the Kubernetes cluster, remove the failed OSD pod, which is running on ``voyager4``:

.. code-block:: console

  $ kubectl label nodes --all ceph_maintenance_window=inactive
  $ kubectl label nodes voyager4 --overwrite ceph_maintenance_window=active
  $ kubectl patch -n ceph ds ceph-osd-default-64779b8c -p='{"spec":{"template":{"spec":{"nodeSelector":{"ceph-osd":"enabled","ceph_maintenance_window":"inactive"}}}}}'

Note: To find the DaemonSet associated with a failed OSD, check the following:

.. code-block:: console

  (voyager4)$ ps -ef | grep /usr/bin/ceph-osd
  (voyager1)$ kubectl get ds -n ceph
  (voyager1)$ kubectl get ds <daemonset-name> -n ceph -o yaml

3. Remove the failed OSD (OSD ID = 2 in this example) from the Ceph cluster:

.. code-block:: console

  (mon-pod):/# ceph osd lost 2
  (mon-pod):/# ceph osd crush remove osd.2
  (mon-pod):/# ceph auth del osd.2
  (mon-pod):/# ceph osd rm 2
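
On Ceph Luminous (12.2.x) and later, the last three removal commands can
usually be collapsed into a single ``ceph osd purge`` call. The following is
an optional shortcut, not part of the procedure validated above:

.. code-block:: console

  # purge combines "crush remove", "auth del" and "osd rm" for osd.2
  (mon-pod):/# ceph osd purge 2 --yes-i-really-mean-it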

4. Verify that Ceph is healthy with the lost OSD removed (i.e., a total of 23 OSDs):

.. code-block:: console

  (mon-pod):/# ceph -s
    cluster:
      id:     9d4d8c61-cf87-4129-9cef-8fbf301210ad
      health: HEALTH_WARN
              too few PGs per OSD (23 < min 30)
              mon voyager1 is low on available space

    services:
      mon: 3 daemons, quorum voyager1,voyager2,voyager3
      mgr: voyager1(active), standbys: voyager3
      mds: cephfs-1/1/1 up {0=mds-ceph-mds-65bb45dffc-cslr6=up:active}, 1 up:standby
      osd: 23 osds: 23 up, 23 in
      rgw: 2 daemons active

    data:
      pools:   18 pools, 182 pgs
      objects: 240 objects, 3359 bytes
      usage:   2551 MB used, 42814 GB / 42816 GB avail
      pgs:     182 active+clean

5. Replace the failed disk with a new one. If you repair (rather than replace)
   the failed disk, you may need to run the following:

.. code-block:: console

  (voyager4)$ parted /dev/sdh mklabel msdos

6. Start a new OSD pod on ``voyager4``:

.. code-block:: console

  $ kubectl label nodes voyager4 --overwrite ceph_maintenance_window=inactive
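
Before validating the cluster, it can help to confirm that an OSD pod has
actually been rescheduled onto ``voyager4``. This is a minimal check; the pod
names will differ in your environment:

.. code-block:: console

  # list OSD pods scheduled on the repaired host (names are illustrative)
  $ kubectl get pods -n ceph -o wide | grep ceph-osd | grep voyager4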

7. Validate the Ceph status (i.e., one OSD is added, so the total number of OSDs becomes 24):

.. code-block:: console

  (mon-pod):/# ceph -s
    cluster:
      id:     9d4d8c61-cf87-4129-9cef-8fbf301210ad
      health: HEALTH_WARN
              too few PGs per OSD (22 < min 30)
              mon voyager1 is low on available space

    services:
      mon: 3 daemons, quorum voyager1,voyager2,voyager3
      mgr: voyager1(active), standbys: voyager3
      mds: cephfs-1/1/1 up {0=mds-ceph-mds-65bb45dffc-cslr6=up:active}, 1 up:standby
      osd: 24 osds: 24 up, 24 in
      rgw: 2 daemons active

    data:
      pools:   18 pools, 182 pgs
      objects: 240 objects, 3359 bytes
      usage:   2665 MB used, 44675 GB / 44678 GB avail
      pgs:     182 active+clean

doc/source/testing/ceph-resiliency/host-failure.rst (new file, 98 lines)
@@ -0,0 +1,98 @@
============
Host Failure
============

Test Environment
================

- Cluster size: 4 host machines
- Number of disks: 24 (= 6 disks per host * 4 hosts)
- Kubernetes version: 1.10.5
- Ceph version: 12.2.3
- OpenStack-Helm commit: 25e50a34c66d5db7604746f4d2e12acbdd6c1459

Case: One host machine where ceph-mon is running is rebooted
============================================================

Symptom:
--------

After the reboot of node ``voyager3``, the node status changes to ``NotReady``.

.. code-block:: console

  $ kubectl get nodes
  NAME       STATUS     ROLES     AGE       VERSION
  voyager1   Ready      master    6d        v1.10.5
  voyager2   Ready      <none>    6d        v1.10.5
  voyager3   NotReady   <none>    6d        v1.10.5
  voyager4   Ready      <none>    6d        v1.10.5

The Ceph status shows that the ceph-mon running on ``voyager3`` is out of quorum.
Also, the six OSDs running on ``voyager3`` are down, so only 18 of the 24 OSDs are up.

.. code-block:: console

  (mon-pod):/# ceph -s
    cluster:
      id:     9d4d8c61-cf87-4129-9cef-8fbf301210ad
      health: HEALTH_WARN
              6 osds down
              1 host (6 osds) down
              Degraded data redundancy: 195/624 objects degraded (31.250%), 8 pgs degraded
              too few PGs per OSD (17 < min 30)
              mon voyager1 is low on available space
              1/3 mons down, quorum voyager1,voyager2

    services:
      mon: 3 daemons, quorum voyager1,voyager2, out of quorum: voyager3
      mgr: voyager1(active), standbys: voyager3
      mds: cephfs-1/1/1 up {0=mds-ceph-mds-65bb45dffc-cslr6=up:active}, 1 up:standby
      osd: 24 osds: 18 up, 24 in
      rgw: 2 daemons active

    data:
      pools:   18 pools, 182 pgs
      objects: 208 objects, 3359 bytes
      usage:   2630 MB used, 44675 GB / 44678 GB avail
      pgs:     195/624 objects degraded (31.250%)
               126 active+undersized
               48  active+clean
               8   active+undersized+degraded
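
To see exactly which OSDs are affected, the OSD tree can be filtered to the
down ones. This is a minimal sketch assuming a shell in a Ceph mon pod and a
Luminous-or-later release:

.. code-block:: console

  # list only the OSDs currently reported as down
  (mon-pod):/# ceph osd tree down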

Recovery:
---------

The node status of ``voyager3`` changes to ``Ready`` after the node is up again.
Also, the Ceph pods are restarted automatically.
The Ceph status shows that the monitor running on ``voyager3`` is back in quorum.

.. code-block:: console

  $ kubectl get nodes
  NAME       STATUS    ROLES     AGE       VERSION
  voyager1   Ready     master    6d        v1.10.5
  voyager2   Ready     <none>    6d        v1.10.5
  voyager3   Ready     <none>    6d        v1.10.5
  voyager4   Ready     <none>    6d        v1.10.5

.. code-block:: console

  (mon-pod):/# ceph -s
    cluster:
      id:     9d4d8c61-cf87-4129-9cef-8fbf301210ad
      health: HEALTH_WARN
              too few PGs per OSD (22 < min 30)
              mon voyager1 is low on available space

    services:
      mon: 3 daemons, quorum voyager1,voyager2,voyager3
      mgr: voyager1(active), standbys: voyager3
      mds: cephfs-1/1/1 up {0=mds-ceph-mds-65bb45dffc-cslr6=up:active}, 1 up:standby
      osd: 24 osds: 24 up, 24 in
      rgw: 2 daemons active

    data:
      pools:   18 pools, 182 pgs
      objects: 208 objects, 3359 bytes
      usage:   2635 MB used, 44675 GB / 44678 GB avail
      pgs:     182 active+clean

doc/source/testing/ceph-resiliency/index.rst (new file, 12 lines)
@@ -0,0 +1,12 @@
===============
Ceph Resiliency
===============

.. toctree::
   :maxdepth: 2

   README
   monitor-failure
   osd-failure
   disk-failure
   host-failure

doc/source/testing/ceph-resiliency/monitor-failure.rst (new file, 125 lines)
@@ -0,0 +1,125 @@
===============
Monitor Failure
===============

Test Environment
================

- Cluster size: 4 host machines
- Number of disks: 24 (= 6 disks per host * 4 hosts)
- Kubernetes version: 1.9.3
- Ceph version: 12.2.3
- OpenStack-Helm commit: 28734352741bae228a4ea4f40bcacc33764221eb

We have 3 Monitors in this Ceph cluster, one on each of the 3 Monitor
hosts.

Case: 1 out of 3 Monitor Processes is Down
==========================================

This is to test a scenario where 1 out of 3 Monitor processes is down.

To bring down 1 Monitor process (out of 3), we identify a Monitor
process and kill it from the Monitor host (not from inside a pod).

.. code-block:: console

  $ ps -ef | grep ceph-mon
  ceph     16112 16095  1 14:58 ?        00:00:03 /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph -d -i voyager2 --mon-data /var/lib/ceph/mon/ceph-voyager2 --public-addr 135.207.240.42:6789
  $ sudo kill -9 16112

In the meantime, we monitored the status of Ceph and noted that it
takes about 24 seconds for the killed Monitor process to recover from
``down`` to ``up``. The reason is that Kubernetes automatically
restarts pods whenever they are killed.

.. code-block:: console

  (mon-pod):/# ceph -s
    cluster:
      id:     fd366aef-b356-4fe7-9ca5-1c313fe2e324
      health: HEALTH_WARN
              mon voyager1 is low on available space
              1/3 mons down, quorum voyager1,voyager3

    services:
      mon: 3 daemons, quorum voyager1,voyager3, out of quorum: voyager2
      mgr: voyager4(active)
      osd: 24 osds: 24 up, 24 in

.. code-block:: console

  (mon-pod):/# ceph -s
    cluster:
      id:     fd366aef-b356-4fe7-9ca5-1c313fe2e324
      health: HEALTH_WARN
              mon voyager1 is low on available space
              1/3 mons down, quorum voyager1,voyager2

    services:
      mon: 3 daemons, quorum voyager1,voyager2,voyager3
      mgr: voyager4(active)
      osd: 24 osds: 24 up, 24 in

We also monitored the status of the Monitor pod through ``kubectl get
pods -n ceph``, and the status of the pod (where a Monitor process was
killed) changed as follows: ``Running`` -> ``Error`` -> ``Running``.
This recovery process takes about 24 seconds.
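
To observe these state transitions live, the mon pods can be watched. This is
a minimal sketch that assumes the ``application=ceph`` and ``component=mon``
labels used elsewhere in this guide:

.. code-block:: console

  # stream Monitor pod status changes as they happen
  $ kubectl get pods -n ceph -l application=ceph,component=mon -w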

Case: 2 out of 3 Monitor Processes are Down
===========================================

This is to test a scenario where 2 out of 3 Monitor processes are down.
To bring down 2 Monitor processes (out of 3), we identify two Monitor
processes and kill them from the 2 Monitor hosts (not from inside the pods),
as sketched below.
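
The procedure mirrors the single-Monitor case above; the host names and PIDs
in this sketch are illustrative only:

.. code-block:: console

  # on the first Monitor host (PID is illustrative)
  (voyager2)$ ps -ef | grep /usr/bin/ceph-mon
  (voyager2)$ sudo kill -9 <ceph-mon-pid>

  # on the second Monitor host
  (voyager3)$ ps -ef | grep /usr/bin/ceph-mon
  (voyager3)$ sudo kill -9 <ceph-mon-pid>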

We monitored the status of Ceph while the Monitor processes were killed
and noted that the symptoms are similar to those seen when 1 Monitor
process is killed:

- It takes longer (about 1 minute) for the killed Monitor processes to
  recover from ``down`` to ``up``.

- The status of the pods (where the two Monitor processes were killed)
  changed as follows: ``Running`` -> ``Error`` -> ``CrashLoopBackOff``
  -> ``Running``, and this recovery process takes about 1 minute.


Case: 3 out of 3 Monitor Processes are Down
===========================================

This is to test a scenario where 3 out of 3 Monitor processes are down.
To bring down 3 Monitor processes (out of 3), we identify all 3
Monitor processes and kill them from the 3 Monitor hosts (not from inside the pods).

We monitored the status of the Ceph Monitor pods and noted that the
symptoms are similar to those seen when 1 or 2 Monitor processes are killed:

.. code-block:: console

  $ kubectl get pods -n ceph -o wide | grep ceph-mon
  NAME             READY     STATUS    RESTARTS   AGE
  ceph-mon-8tml7   0/1       Error     4          10d
  ceph-mon-kstf8   0/1       Error     4          10d
  ceph-mon-z4sl9   0/1       Error     7          10d

.. code-block:: console

  $ kubectl get pods -n ceph -o wide | grep ceph-mon
  NAME             READY     STATUS             RESTARTS   AGE
  ceph-mon-8tml7   0/1       CrashLoopBackOff   4          10d
  ceph-mon-kstf8   0/1       Error              4          10d
  ceph-mon-z4sl9   0/1       CrashLoopBackOff   7          10d

.. code-block:: console

  $ kubectl get pods -n ceph -o wide | grep ceph-mon
  NAME             READY     STATUS    RESTARTS   AGE
  ceph-mon-8tml7   1/1       Running   5          10d
  ceph-mon-kstf8   1/1       Running   5          10d
  ceph-mon-z4sl9   1/1       Running   8          10d

The status of the pods (where the three Monitor processes were killed)
changed as follows: ``Running`` -> ``Error`` -> ``CrashLoopBackOff``
-> ``Running``, and this recovery process takes about 1 minute.

doc/source/testing/ceph-resiliency/osd-failure.rst (new file, 107 lines)
@@ -0,0 +1,107 @@
===========
OSD Failure
===========

Test Environment
================

- Cluster size: 4 host machines
- Number of disks: 24 (= 6 disks per host * 4 hosts)
- Kubernetes version: 1.9.3
- Ceph version: 12.2.3
- OpenStack-Helm commit: 28734352741bae228a4ea4f40bcacc33764221eb

Case: OSD processes are killed
==============================

This is to test a scenario where some of the OSDs are down.

To bring down 6 OSDs (out of 24), we identify the OSD processes and
kill them from a storage host (not from inside a pod).

.. code-block:: console

  $ ps -ef | grep /usr/bin/ceph-osd
  ceph     44587 43680  1 18:12 ?        00:00:01 /usr/bin/ceph-osd --cluster ceph --osd-journal /dev/sdb5 -f -i 4 --setuser ceph --setgroup disk
  ceph     44627 43744  1 18:12 ?        00:00:01 /usr/bin/ceph-osd --cluster ceph --osd-journal /dev/sdb2 -f -i 6 --setuser ceph --setgroup disk
  ceph     44720 43927  2 18:12 ?        00:00:01 /usr/bin/ceph-osd --cluster ceph --osd-journal /dev/sdb6 -f -i 3 --setuser ceph --setgroup disk
  ceph     44735 43868  1 18:12 ?        00:00:01 /usr/bin/ceph-osd --cluster ceph --osd-journal /dev/sdb1 -f -i 9 --setuser ceph --setgroup disk
  ceph     44806 43855  1 18:12 ?        00:00:01 /usr/bin/ceph-osd --cluster ceph --osd-journal /dev/sdb4 -f -i 0 --setuser ceph --setgroup disk
  ceph     44896 44011  2 18:12 ?        00:00:01 /usr/bin/ceph-osd --cluster ceph --osd-journal /dev/sdb3 -f -i 1 --setuser ceph --setgroup disk
  root     46144 45998  0 18:13 pts/10   00:00:00 grep --color=auto /usr/bin/ceph-osd

  $ sudo kill -9 44587 44627 44720 44735 44806 44896

.. code-block:: console

  (mon-pod):/# ceph -s
    cluster:
      id:     fd366aef-b356-4fe7-9ca5-1c313fe2e324
      health: HEALTH_WARN
              6 osds down
              1 host (6 osds) down
              Reduced data availability: 8 pgs inactive, 58 pgs peering
              Degraded data redundancy: 141/1002 objects degraded (14.072%), 133 pgs degraded
              mon voyager1 is low on available space

    services:
      mon: 3 daemons, quorum voyager1,voyager2,voyager3
      mgr: voyager4(active)
      osd: 24 osds: 18 up, 24 in

In the meantime, we monitored the status of Ceph and noted that it takes
about 30 seconds for the 6 OSDs to recover from ``down`` to ``up``. The
reason is that Kubernetes automatically restarts OSD pods whenever they
are killed.

.. code-block:: console

  (mon-pod):/# ceph -s
    cluster:
      id:     fd366aef-b356-4fe7-9ca5-1c313fe2e324
      health: HEALTH_WARN
              mon voyager1 is low on available space

    services:
      mon: 3 daemons, quorum voyager1,voyager2,voyager3
      mgr: voyager4(active)
      osd: 24 osds: 24 up, 24 in

Case: An OSD pod is deleted
===========================

This is to test a scenario where an OSD pod is deleted by
``kubectl delete $OSD_POD_NAME`` (see the sketch below).
Meanwhile, we monitored the status of Ceph and noted that it takes about
90 seconds for the OSD running in the deleted pod to recover from
``down`` to ``up``.
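
A minimal way to pick and delete one OSD pod; the pod name below is purely
illustrative and will differ in your deployment:

.. code-block:: console

  # list the OSD pods, then delete one of them (name is illustrative)
  $ kubectl get pods -n ceph -o wide | grep ceph-osd
  $ OSD_POD_NAME=ceph-osd-default-64779b8c-example
  $ kubectl delete pod -n ceph ${OSD_POD_NAME}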

.. code-block:: console

  root@voyager3:/# ceph -s
    cluster:
      id:     fd366aef-b356-4fe7-9ca5-1c313fe2e324
      health: HEALTH_WARN
              1 osds down
              Degraded data redundancy: 43/945 objects degraded (4.550%), 35 pgs degraded, 109 pgs undersized
              mon voyager1 is low on available space

    services:
      mon: 3 daemons, quorum voyager1,voyager2,voyager3
      mgr: voyager4(active)
      osd: 24 osds: 23 up, 24 in

.. code-block:: console

  (mon-pod):/# ceph -s
    cluster:
      id:     fd366aef-b356-4fe7-9ca5-1c313fe2e324
      health: HEALTH_WARN
              mon voyager1 is low on available space

    services:
      mon: 3 daemons, quorum voyager1,voyager2,voyager3
      mgr: voyager4(active)
      osd: 24 osds: 24 up, 24 in

We also monitored the pod status through ``kubectl get pods -n ceph``
during this process. The deleted OSD pod status changed as follows:
``Terminating`` -> ``Init:1/3`` -> ``Init:2/3`` -> ``Init:3/3`` ->
``Running``, and this process takes about 90 seconds. The reason is
that Kubernetes automatically recreates OSD pods whenever they are
deleted.

@@ -1,9 +1,6 @@
-=======
-Testing
-=======
-
+==========
 Helm Tests
-----------
+==========
 
 Every OpenStack-Helm chart should include any required Helm tests necessary to
 provide a sanity check for the OpenStack service. Information on using the Helm

@@ -27,7 +24,6 @@ chart. If Rally tests are not appropriate or adequate for a service chart, any
 additional tests should be documented appropriately and adhere to the same
 expectations.
 
-
 Running Tests
 -------------
 

doc/source/testing/index.rst (new file, 9 lines)
@@ -0,0 +1,9 @@
=======
Testing
=======

.. toctree::
   :maxdepth: 2

   helm-tests
   ceph-resiliency/index

doc/source/troubleshooting/ceph.rst (new file, 59 lines)
@@ -0,0 +1,59 @@
Backing up a PVC
^^^^^^^^^^^^^^^^

Backing up a PVC stored in Ceph is fairly straightforward. In this example we
use the PVC ``mysql-data-mariadb-server-0``, but the same procedure applies to
any other service using PVCs, e.g. RabbitMQ or Postgres.

.. code-block:: shell

  # Gather the required details
  NS_NAME="openstack"
  PVC_NAME="mysql-data-mariadb-server-0"
  # you can check this by running: kubectl get pvc -n ${NS_NAME}

  PV_NAME="$(kubectl get -n ${NS_NAME} pvc "${PVC_NAME}" --no-headers | awk '{ print $3 }')"
  RBD_NAME="$(kubectl get pv "${PV_NAME}" -o json | jq -r '.spec.rbd.image')"
  MON_POD=$(kubectl get pods \
    --namespace=ceph \
    --selector="application=ceph" \
    --selector="component=mon" \
    --no-headers | awk '{ print $1; exit }')

  # Copy the admin keyring and ceph.conf from a Ceph mon pod to the host node
  kubectl exec -it ${MON_POD} -n ceph -- cat /etc/ceph/ceph.client.admin.keyring > /etc/ceph/ceph.client.admin.keyring
  sudo kubectl get cm -n ceph ceph-etc -o json | jq -j .data[] > /etc/ceph/ceph.conf

  export CEPH_MON_NAME="ceph-mon-discovery.ceph.svc.cluster.local"

  # Create a snapshot of the RBD image
  rbd snap create rbd/${RBD_NAME}@snap1 -m ${CEPH_MON_NAME}
  rbd snap list rbd/${RBD_NAME} -m ${CEPH_MON_NAME}

  # Export the snapshot and compress it; make sure the host has enough
  # space to accommodate the resulting files.

  # a. If the host has enough space, export first and then compress:
  rbd export rbd/${RBD_NAME}@snap1 /backup/${RBD_NAME}.img -m ${CEPH_MON_NAME}
  cd /backup
  time xz -0vk --threads=0 /backup/${RBD_NAME}.img

  # b. If space on the host is limited, export and compress in a single command:
  rbd export rbd/${RBD_NAME}@snap1 -m ${CEPH_MON_NAME} - | xz -0v --threads=0 > /backup/${RBD_NAME}.img.xz
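
Optionally, verify the compressed backup before relying on it; a minimal
sketch, assuming ``xz`` is installed on the host:

.. code-block:: shell

  # test the archive integrity without decompressing it to disk
  xz -t /backup/${RBD_NAME}.img.xz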

Restoring is just as straightforward. Once the workload consuming the device has
been stopped and the raw RBD device removed, the following will import the
backup and create a device:

.. code-block:: shell

  cd /backup
  unxz -k ${RBD_NAME}.img.xz
  rbd import /backup/${RBD_NAME}.img rbd/${RBD_NAME} -m ${CEPH_MON_NAME}
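
If desired, the restored image can be checked before bringing the workload
back; this is a sketch using a standard ``rbd`` command:

.. code-block:: shell

  # confirm the image was recreated with the expected size
  rbd info rbd/${RBD_NAME} -m ${CEPH_MON_NAME}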

Once this has been done, the workload can be restarted.

@@ -11,6 +11,7 @@ Sometimes things go wrong. These guides will help you solve many common issues w
    persistent-storage
    proxy
    ubuntu-hwe-kernel
+   ceph
 
 Getting help
 ============