From c98ee54f6857e4bbcf29bae706a19b7ca49abd8f Mon Sep 17 00:00:00 2001
From: MORITA Kazutaka
Date: Wed, 21 Dec 2011 02:08:40 +0900
Subject: [PATCH] Explain how replication works more clearly

A replicator creates an extra replica when it detects a remote disk
failure. However, when it fails to create a replica for other reasons
(e.g. entire node failures), it doesn't create another replica at all.
We should state this explicitly so that users are aware of it.

This fixes bug 906976.

Change-Id: I2f56428ccbbb0cf0d8538ca6e08f7da71257e661
---
 doc/source/overview_replication.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/source/overview_replication.rst b/doc/source/overview_replication.rst
index 21ba25818e..ab2b2c1523 100644
--- a/doc/source/overview_replication.rst
+++ b/doc/source/overview_replication.rst
@@ -8,7 +8,7 @@ Replication uses a push model, with records and files generally only being copie
 
 Every deleted record or file in the system is marked by a tombstone, so that deletions can be replicated alongside creations. These tombstones are cleaned up by the replication process after a period of time referred to as the consistency window, which is related to replication duration and how long transient failures can remove a node from the cluster. Tombstone cleanup must be tied to replication to reach replica convergence.
 
-If a replicator detects that a remote drive is has failed, it will use the ring's "get_more_nodes" interface to choose an alternate node to synchronize with. The replicator can generally maintain desired levels of replication in the face of hardware failures, though some replicas may not be in an immediately usable location.
+If a replicator detects that a remote drive has failed, it will use the ring's "get_more_nodes" interface to choose an alternate node to synchronize with. The replicator can maintain desired levels of replication in the face of disk failures, though some replicas may not be in an immediately usable location. Note that the replicator doesn't maintain desired levels of replication in the case of other failures (e.g. entire node failures) because most such failures are transient.
 
 Replication is an area of active development, and likely rife with potential improvements to speed and correctness.
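
The behaviour described in the changed paragraph can be sketched in Python as follows. This is a minimal toy model under stated assumptions, not Swift's implementation: `ToyRing`, `choose_sync_targets`, and the `failed_drives` / `unreachable` sets are hypothetical names introduced for illustration, and the way the toy ring orders handoff nodes is invented. Only the `get_more_nodes` name and the two failure cases come from the text above.

```python
from typing import Iterator, List, Set


class ToyRing:
    """Hypothetical stand-in for a ring; not Swift's real Ring class."""

    def __init__(self, nodes: List[str], replicas: int = 3) -> None:
        self.nodes = nodes
        self.replicas = replicas

    def get_nodes(self, part: int) -> List[str]:
        # Primary nodes: `replicas` consecutive nodes starting at the partition's slot.
        start = part % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.replicas)]

    def get_more_nodes(self, part: int) -> Iterator[str]:
        # Handoff candidates: the remaining nodes, in ring order after the primaries.
        start = part % len(self.nodes)
        for i in range(self.replicas, len(self.nodes)):
            yield self.nodes[(start + i) % len(self.nodes)]


def choose_sync_targets(ring: ToyRing, part: int, local: str,
                        failed_drives: Set[str],
                        unreachable: Set[str]) -> List[str]:
    """Return the nodes a replicator on `local` would push to in this toy model."""
    handoffs = ring.get_more_nodes(part)
    targets: List[str] = []
    for node in ring.get_nodes(part):
        if node == local:
            continue
        if node in failed_drives:
            # Detected disk failure: substitute the next handoff node, if any.
            alternate = next(handoffs, None)
            if alternate is not None:
                targets.append(alternate)
        elif node in unreachable:
            # Other failure (e.g. whole node down): assumed transient, so no
            # extra replica is created and the node is simply skipped.
            continue
        else:
            targets.append(node)
    return targets


if __name__ == "__main__":
    ring = ToyRing(["a", "b", "c", "d", "e"])
    print(choose_sync_targets(ring, 0, "a", {"b"}, set()))  # disk failure on b
    print(choose_sync_targets(ring, 0, "a", set(), {"b"}))  # node b unreachable
```

In this sketch a failed drive on a primary node causes a handoff node to be chosen in its place, so the replica count is preserved, whereas an unreachable node is skipped and the partition is left one replica short until that node returns.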