.. awp1552678699112
.. _replication-groups:

==================
Replication Groups
==================

The storage hosts on Ceph systems are organized into replication groups to
provide redundancy.

Each replication group contains a number of hosts, referred to as peers.
Each peer independently replicates the same data. |prod| supports a minimum
of two peers and a maximum of three peers per replication group. This
replication factor is defined when the Ceph storage backend is added.

For a system with two peers per replication group, up to four replication
groups are supported. For a system with three peers per replication group,
up to three replication groups are supported.
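
You can review the configured replication factor for the Ceph backend from
the CLI. The following is a sketch only; the backend name (``ceph-store`` in
this example) may differ on your system:

.. code-block:: none

    ~(keystone_admin)]$ system storage-backend-list
    ~(keystone_admin)]$ system storage-backend-show ceph-store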

For best performance, |org| recommends a balanced storage capacity, in
which each peer has sufficient resources to meet the operational
requirements of the system.

A replication group is considered healthy when all of its peers are
available. When only the minimum number of peers is available \(as indicated
by the **min\_replication** value reported for the group\), the group
continues to provide storage services, but without full replication, and a
HEALTH\_WARN state is declared. When the number of available peers falls
below the **min\_replication** value, the group no longer provides storage
services, and a HEALTH\_ERR state is declared. The **min\_replication** value
is always one less than the replication factor for the group.
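
These conditions are reflected in the overall Ceph health state, which you
can query from a controller. For example (output varies with the state of
the system):

.. code-block:: none

    ~(keystone_admin)]$ ceph -s
    ~(keystone_admin)]$ ceph health detail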

It is not possible to lock more than one peer at a time in a replication
group.
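
For example, if **storage-0** is already locked, a request to lock its peer
in the same group is rejected until **storage-0** is unlocked and available
again. The host names below are illustrative:

.. code-block:: none

    ~(keystone_admin)]$ system host-lock storage-0
    ~(keystone_admin)]$ system host-lock storage-1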

Replication groups are created automatically. When a new storage host is
added and an incomplete replication group exists, the host is added to the
existing group. If all existing replication groups are complete, then a new
incomplete replication group is defined and the host becomes its first
member.
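
After a new storage host is added and unlocked, you can list the hosts and
then confirm its group membership using the ``replication_groups`` field
reported by :command:`system cluster-show`, as shown at the end of this
section:

.. code-block:: none

    ~(keystone_admin)]$ system host-list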

.. note::
    Ceph relies on monitoring to detect when to switch from a primary OSD
    to a replicated OSD. The Ceph parameter :command:`osd heartbeat grace`
    sets the amount of time to wait before switching OSDs when the primary
    OSD is not responding. |prod| currently uses the default value of 20
    seconds. This means that a Ceph filesystem may not respond to I/O for
    up to 20 seconds when a storage node or OSD goes out of service.

    For more information, see the Ceph documentation:
    `http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction
    <http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction>`__.
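
If you need to confirm the heartbeat grace value in effect for a particular
OSD, you can query it through the OSD admin socket on the storage host where
that OSD runs. This is a sketch only; ``osd.0`` is an example OSD id:

.. code-block:: none

    ceph daemon osd.0 config get osd_heartbeat_grace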

Replication groups are shown on the Hosts Inventory page in association
with the storage hosts. You can also use the following CLI commands to
obtain information about replication groups:

.. code-block:: none

    ~(keystone_admin)]$ system cluster-list
    +--------------------------------------+--------------+------+----------+------------------+
    | uuid                                 | cluster_uuid | type | name     | deployment_model |
    +--------------------------------------+--------------+------+----------+------------------+
    | 335766eb-8564-4f4d-8665-681f73d13dfb | None         | ceph | ceph_clu | controller-nodes |
    |                                      |              |      | ster     |                  |
    |                                      |              |      |          |                  |
    +--------------------------------------+--------------+------+----------+------------------+

.. code-block:: none

    ~(keystone_admin)]$ system cluster-show 335766eb-8564-4f4d-8665-681f73d13dfb
    +--------------------+----------------------------------------+
    | Property           | Value                                  |
    +--------------------+----------------------------------------+
    | uuid               | 335766eb-8564-4f4d-8665-681f73d13dfb   |
    | cluster_uuid       | None                                   |
    | type               | ceph                                   |
    | name               | ceph_cluster                           |
    | replication_groups | ["group-0:['storage-0', 'storage-1']"] |
    | storage_tiers      | ['storage (in-use)']                   |
    | deployment_model   | controller-nodes                       |
    +--------------------+----------------------------------------+
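
Depending on the deployment model, the replication groups and their peers
are also reflected in the Ceph CRUSH hierarchy, which you can inspect
directly (output varies with the deployment):

.. code-block:: none

    ~(keystone_admin)]$ ceph osd tree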