swift/doc/source/overview_policies.rst
Romain LE DISEZ 673fda7620 Configure diskfile per storage policy
With this commit, each storage policy can define the diskfile to use to
access objects. Selection of the diskfile is done in swift.conf.

Example:
    [storage-policy:0]
    name = gold
    policy_type = replication
    default = yes
    diskfile = egg:swift#replication.fs

The diskfile configuration item accepts the same format than middlewares
declaration: [[scheme:]egg_name#]entry_point
The egg_name is optional and default to "swift". The scheme is optional
and default to the only valid value "egg". The upstream entry points are
"replication.fs" and "erasure_coding.fs".

Co-Authored-By: Alexandre Lécuyer <alexandre.lecuyer@corp.ovh.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I070c21bc1eaf1c71ac0652cec9e813cadcc14851
2018-08-24 02:29:13 +00:00

678 lines
34 KiB
ReStructuredText

================
Storage Policies
================
Storage Policies allow for some level of segmenting the cluster for various
purposes through the creation of multiple object rings. The Storage Policies
feature is implemented throughout the entire code base so it is an important
concept in understanding Swift architecture.
As described in :doc:`overview_ring`, Swift uses modified hashing rings to
determine where data should reside in the cluster. There is a separate ring for
account databases, container databases, and there is also one object ring per
storage policy. Each object ring behaves exactly the same way and is maintained
in the same manner, but with policies, different devices can belong to different
rings. By supporting multiple object rings, Swift allows the application and/or
deployer to essentially segregate the object storage within a single cluster.
There are many reasons why this might be desirable:
* Different levels of durability: If a provider wants to offer, for example,
2x replication and 3x replication but doesn't want to maintain 2 separate
clusters, they would setup a 2x and a 3x replication policy and assign the
nodes to their respective rings. Furthermore, if a provider wanted to offer a
cold storage tier, they could create an erasure coded policy.
* Performance: Just as SSDs can be used as the exclusive members of an account
or database ring, an SSD-only object ring can be created as well and used to
implement a low-latency/high performance policy.
* Collecting nodes into group: Different object rings may have different
physical servers so that objects in specific storage policies are always
placed in a particular data center or geography.
* Different Storage implementations: Another example would be to collect
together a set of nodes that use a different Diskfile (e.g., Kinetic,
GlusterFS) and use a policy to direct traffic just to those nodes.
* Different read and write affinity settings: proxy-servers can be configured
to use different read and write affinity options for each policy. See
:ref:`proxy_server_per_policy_config` for more details.
.. note::
Today, Swift supports two different policy types: Replication and Erasure
Code. See :doc:`overview_erasure_code` for details.
Also note that Diskfile refers to backend object storage plug-in
architecture. See :doc:`development_ondisk_backends` for details.
-----------------------
Containers and Policies
-----------------------
Policies are implemented at the container level. There are many advantages to
this approach, not the least of which is how easy it makes life on
applications that want to take advantage of them. It also ensures that
Storage Policies remain a core feature of Swift independent of the auth
implementation. Policies were not implemented at the account/auth layer
because it would require changes to all auth systems in use by Swift
deployers. Each container has a new special immutable metadata element called
the storage policy index. Note that internally, Swift relies on policy
indexes and not policy names. Policy names exist for human readability and
translation is managed in the proxy. When a container is created, one new
optional header is supported to specify the policy name. If no name is
specified, the default policy is used (and if no other policies defined,
Policy-0 is considered the default). We will be covering the difference
between default and Policy-0 in the next section.
Policies are assigned when a container is created. Once a container has been
assigned a policy, it cannot be changed (unless it is deleted/recreated). The
implications on data placement/movement for large datasets would make this a
task best left for applications to perform. Therefore, if a container has an
existing policy of, for example 3x replication, and one wanted to migrate that
data to an Erasure Code policy, the application would create another container
specifying the other policy parameters and then simply move the data from one
container to the other. Policies apply on a per container basis allowing for
minimal application awareness; once a container has been created with a specific
policy, all objects stored in it will be done so in accordance with that policy.
If a container with a specific name is deleted (requires the container be empty)
a new container may be created with the same name without any restriction on
storage policy enforced by the deleted container which previously shared the
same name.
Containers have a many-to-one relationship with policies meaning that any number
of containers can share one policy. There is no limit to how many containers
can use a specific policy.
The notion of associating a ring with a container introduces an interesting
scenario: What would happen if 2 containers of the same name were created with
different Storage Policies on either side of a network outage at the same time?
Furthermore, what would happen if objects were placed in those containers, a
whole bunch of them, and then later the network outage was restored? Well,
without special care it would be a big problem as an application could end up
using the wrong ring to try and find an object. Luckily there is a solution for
this problem, a daemon known as the Container Reconciler works tirelessly to
identify and rectify this potential scenario.
--------------------
Container Reconciler
--------------------
Because atomicity of container creation cannot be enforced in a
distributed eventually consistent system, object writes into the wrong
storage policy must be eventually merged into the correct storage policy
by an asynchronous daemon. Recovery from object writes during a network
partition which resulted in a split brain container created with
different storage policies are handled by the
`swift-container-reconciler` daemon.
The container reconciler works off a queue similar to the
object-expirer. The queue is populated during container-replication.
It is never considered incorrect to enqueue an object to be evaluated by
the container-reconciler because if there is nothing wrong with the location
of the object the reconciler will simply dequeue it. The
container-reconciler queue is an indexed log for the real location of an
object for which a discrepancy in the storage policy of the container was
discovered.
To determine the correct storage policy of a container, it is necessary
to update the status_changed_at field in the container_stat table when a
container changes status from deleted to re-created. This transaction
log allows the container-replicator to update the correct storage policy
both when replicating a container and handling REPLICATE requests.
Because each object write is a separate distributed transaction it is
not possible to determine the correctness of the storage policy for each
object write with respect to the entire transaction log at a given
container database. As such, container databases will always record the
object write regardless of the storage policy on a per object row basis.
Object byte and count stats are tracked per storage policy in each
container and reconciled using normal object row merge semantics.
The object rows are ensured to be fully durable during replication using
the normal container replication. After the container
replicator pushes its object rows to available primary nodes any
misplaced object rows are bulk loaded into containers based off the
object timestamp under the ``.misplaced_objects`` system account. The
rows are initially written to a handoff container on the local node, and
at the end of the replication pass the ``.misplaced_objects`` containers are
replicated to the correct primary nodes.
The container-reconciler processes the ``.misplaced_objects`` containers in
descending order and reaps its containers as the objects represented by
the rows are successfully reconciled. The container-reconciler will
always validate the correct storage policy for enqueued objects using
direct container HEAD requests which are accelerated via caching.
Because failure of individual storage nodes in aggregate is assumed to
be common at scale, the container-reconciler will make forward progress
with a simple quorum majority. During a combination of failures and
rebalances it is possible that a quorum could provide an incomplete
record of the correct storage policy - so an object write may have to be
applied more than once. Because storage nodes and container databases
will not process writes with an ``X-Timestamp`` less than or equal to
their existing record when objects writes are re-applied their timestamp
is slightly incremented. In order for this increment to be applied
transparently to the client a second vector of time has been added to
Swift for internal use. See :class:`~swift.common.utils.Timestamp`.
As the reconciler applies object writes to the correct storage policy it
cleans up writes which no longer apply to the incorrect storage policy
and removes the rows from the ``.misplaced_objects`` containers. After all
rows have been successfully processed it sleeps and will periodically
check for newly enqueued rows to be discovered during container
replication.
.. _default-policy:
-------------------------
Default versus 'Policy-0'
-------------------------
Storage Policies is a versatile feature intended to support both new and
pre-existing clusters with the same level of flexibility. For that reason, we
introduce the ``Policy-0`` concept which is not the same as the "default"
policy. As you will see when we begin to configure policies, each policy has
a single name and an arbitrary number of aliases (human friendly,
configurable) as well as an index (or simply policy number). Swift reserves
index 0 to map to the object ring that's present in all installations
(e.g., ``/etc/swift/object.ring.gz``). You can name this policy anything you
like, and if no policies are defined it will report itself as ``Policy-0``,
however you cannot change the index as there must always be a policy with
index 0.
Another important concept is the default policy which can be any policy
in the cluster. The default policy is the policy that is automatically
chosen when a container creation request is sent without a storage
policy being specified. :ref:`configure-policy` describes how to set the
default policy. The difference from ``Policy-0`` is subtle but
extremely important. ``Policy-0`` is what is used by Swift when
accessing pre-storage-policy containers which won't have a policy - in
this case we would not use the default as it might not have the same
policy as legacy containers. When no other policies are defined, Swift
will always choose ``Policy-0`` as the default.
In other words, default means "create using this policy if nothing else is
specified" and ``Policy-0`` means "use the legacy policy if a container doesn't
have one" which really means use ``object.ring.gz`` for lookups.
.. note::
With the Storage Policy based code, it's not possible to create a
container that doesn't have a policy. If nothing is provided, Swift will
still select the default and assign it to the container. For containers
created before Storage Policies were introduced, the legacy Policy-0 will
be used.
.. _deprecate-policy:
--------------------
Deprecating Policies
--------------------
There will be times when a policy is no longer desired; however simply
deleting the policy and associated rings would be problematic for existing
data. In order to ensure that resources are not orphaned in the cluster (left
on disk but no longer accessible) and to provide proper messaging to
applications when a policy needs to be retired, the notion of deprecation is
used. :ref:`configure-policy` describes how to deprecate a policy.
Swift's behavior with deprecated policies is as follows:
* The deprecated policy will not appear in /info
* PUT/GET/DELETE/POST/HEAD are still allowed on the pre-existing containers
created with a deprecated policy
* Clients will get an ''400 Bad Request'' error when trying to create a new
container using the deprecated policy
* Clients still have access to policy statistics via HEAD on pre-existing
containers
.. note::
A policy cannot be both the default and deprecated. If you deprecate the
default policy, you must specify a new default.
You can also use the deprecated feature to rollout new policies. If you
want to test a new storage policy before making it generally available
you could deprecate the policy when you initially roll it the new
configuration and rings to all nodes. Being deprecated will render it
innate and unable to be used. To test it you will need to create a
container with that storage policy; which will require a single proxy
instance (or a set of proxy-servers which are only internally
accessible) that has been one-off configured with the new policy NOT
marked deprecated. Once the container has been created with the new
storage policy any client authorized to use that container will be able
to add and access data stored in that container in the new storage
policy. When satisfied you can roll out a new ``swift.conf`` which does
not mark the policy as deprecated to all nodes.
.. _configure-policy:
--------------------
Configuring Policies
--------------------
.. note::
See :doc:`policies_saio` for a step by step guide on adding a policy to the
SAIO setup.
It is important that the deployer have a solid understanding of the semantics
for configuring policies. Configuring a policy is a three-step process:
#. Edit your ``/etc/swift/swift.conf`` file to define your new policy.
#. Create the corresponding policy object ring file.
#. (Optional) Create policy-specific proxy-server configuration settings.
Defining a policy
-----------------
Each policy is defined by a section in the ``/etc/swift/swift.conf`` file. The
section name must be of the form ``[storage-policy:<N>]`` where ``<N>`` is the
policy index. There's no reason other than readability that policy indexes be
sequential but the following rules are enforced:
* If a policy with index ``0`` is not declared and no other policies are
defined, Swift will create a default policy with index ``0``.
* The policy index must be a non-negative integer.
* Policy indexes must be unique.
.. warning::
The index of a policy should never be changed once a policy has been
created and used. Changing a policy index may cause loss of access to data.
Each policy section contains the following options:
* ``name = <policy_name>`` (required)
- The primary name of the policy.
- Policy names are case insensitive.
- Policy names must contain only letters, digits or a dash.
- Policy names must be unique.
- Policy names can be changed.
- The name ``Policy-0`` can only be used for the policy with
index ``0``.
* ``aliases = <policy_name>[, <policy_name>, ...]`` (optional)
- A comma-separated list of alternative names for the policy.
- The default value is an empty list (i.e. no aliases).
- All alias names must follow the rules for the ``name`` option.
- Aliases can be added to and removed from the list.
- Aliases can be useful to retain support for old primary names if the
primary name is changed.
* ``default = [true|false]`` (optional)
- If ``true`` then this policy will be used when the client does not
specify a policy.
- The default value is ``false``.
- The default policy can be changed at any time, by setting
``default = true`` in the desired policy section.
- If no policy is declared as the default and no other policies are
defined, the policy with index ``0`` is set as the default;
- Otherwise, exactly one policy must be declared default.
- Deprecated policies cannot be declared the default.
- See :ref:`default-policy` for more information.
* ``deprecated = [true|false]`` (optional)
- If ``true`` then new containers cannot be created using this policy.
- The default value is ``false``.
- Any policy may be deprecated by adding the ``deprecated`` option to
the desired policy section. However, a deprecated policy may not also
be declared the default. Therefore, since there must always be a
default policy, there must also always be at least one policy which
is not deprecated.
- See :ref:`deprecate-policy` for more information.
* ``policy_type = [replication|erasure_coding]`` (optional)
- The option ``policy_type`` is used to distinguish between different
policy types.
- The default value is ``replication``.
- When defining an EC policy use the value ``erasure_coding``.
* ``diskfile_module = <entry point>`` (optional)
- The option ``diskfile_module`` is used to load an alternate backend
object storage plug-in architecture.
- The default value is ``egg:swift#replication.fs`` or
``egg:swift#erasure_coding.fs`` depending on the policy type. The scheme
and package name are optionals and default to ``egg`` and ``swift``.
The EC policy type has additional required options. See
:ref:`using_ec_policy` for details.
The following is an example of a properly configured ``swift.conf`` file. See
:doc:`policies_saio` for full instructions on setting up an all-in-one with
this example configuration.::
[swift-hash]
# random unique strings that can never change (DO NOT LOSE)
# Use only printable chars (python -c "import string; print(string.printable)")
swift_hash_path_prefix = changeme
swift_hash_path_suffix = changeme
[storage-policy:0]
name = gold
aliases = yellow, orange
policy_type = replication
default = yes
[storage-policy:1]
name = silver
policy_type = replication
diskfile_module = replication.fs
deprecated = yes
Creating a ring
---------------
Once ``swift.conf`` is configured for a new policy, a new ring must be created.
The ring tools are not policy name aware so it's critical that the correct
policy index be used when creating the new policy's ring file. Additional
object rings are created using ``swift-ring-builder`` in the same manner as the
legacy ring except that ``-N`` is appended after the word ``object`` in the
builder file name, where ``N`` matches the policy index used in ``swift.conf``.
So, to create the ring for policy index ``1``::
swift-ring-builder object-1.builder create 10 3 1
Continue to use the same naming convention when using ``swift-ring-builder`` to
add devices, rebalance etc. This naming convention is also used in the pattern
for per-policy storage node data directories.
.. note::
The same drives can indeed be used for multiple policies and the details
of how that's managed on disk will be covered in a later section, it's
important to understand the implications of such a configuration before
setting one up. Make sure it's really what you want to do, in many cases
it will be, but in others maybe not.
Proxy server configuration (optional)
-------------------------------------
The :ref:`proxy-server` configuration options related to read and write
affinity may optionally be overridden for individual storage policies. See
:ref:`proxy_server_per_policy_config` for more details.
--------------
Using Policies
--------------
Using policies is very simple - a policy is only specified when a container is
initially created. There are no other API changes. Creating a container can
be done without any special policy information::
curl -v -X PUT -H 'X-Auth-Token: <your auth token>' \
http://127.0.0.1:8080/v1/AUTH_test/myCont0
Which will result in a container created that is associated with the
policy name 'gold' assuming we're using the swift.conf example from
above. It would use 'gold' because it was specified as the default.
Now, when we put an object into this container, it will get placed on
nodes that are part of the ring we created for policy 'gold'.
If we wanted to explicitly state that we wanted policy 'gold' the command
would simply need to include a new header as shown below::
curl -v -X PUT -H 'X-Auth-Token: <your auth token>' \
-H 'X-Storage-Policy: gold' http://127.0.0.1:8080/v1/AUTH_test/myCont0
And that's it! The application does not need to specify the policy name ever
again. There are some illegal operations however:
* If an invalid (typo, non-existent) policy is specified: 400 Bad Request
* if you try to change the policy either via PUT or POST: 409 Conflict
If you'd like to see how the storage in the cluster is being used, simply HEAD
the account and you'll see not only the cumulative numbers, as before, but
per policy statistics as well. In the example below there's 3 objects total
with two of them in policy 'gold' and one in policy 'silver'::
curl -i -X HEAD -H 'X-Auth-Token: <your auth token>' \
http://127.0.0.1:8080/v1/AUTH_test
and your results will include (some output removed for readability)::
X-Account-Container-Count: 3
X-Account-Object-Count: 3
X-Account-Bytes-Used: 21
X-Storage-Policy-Gold-Object-Count: 2
X-Storage-Policy-Gold-Bytes-Used: 14
X-Storage-Policy-Silver-Object-Count: 1
X-Storage-Policy-Silver-Bytes-Used: 7
--------------
Under the Hood
--------------
Now that we've explained a little about what Policies are and how to
configure/use them, let's explore how Storage Policies fit in at the
nuts-n-bolts level.
Parsing and Configuring
-----------------------
The module, :ref:`storage_policy`, is responsible for parsing the
``swift.conf`` file, validating the input, and creating a global collection of
configured policies via class :class:`.StoragePolicyCollection`. This
collection is made up of policies of class :class:`.StoragePolicy`. The
collection class includes handy functions for getting to a policy either by
name or by index , getting info about the policies, etc. There's also one
very important function, :meth:`~.StoragePolicyCollection.get_object_ring`.
Object rings are members of the :class:`.StoragePolicy` class and are
actually not instantiated until the :meth:`~.StoragePolicy.load_ring`
method is called. Any caller anywhere in the code base that needs to access
an object ring must use the :data:`.POLICIES` global singleton to access the
:meth:`~.StoragePolicyCollection.get_object_ring` function and provide the
policy index which will call :meth:`~.StoragePolicy.load_ring` if
needed; however, when starting request handling services such as the
:ref:`proxy-server` rings are proactively loaded to provide moderate
protection against a mis-configuration resulting in a run time error. The
global is instantiated when Swift starts and provides a mechanism to patch
policies for the test code.
Middleware
----------
Middleware can take advantage of policies through the :data:`.POLICIES` global
and by importing :func:`.get_container_info` to gain access to the policy index
associated with the container in question. From the index it can then use the
:data:`.POLICIES` singleton to grab the right ring. For example,
:ref:`list_endpoints` is policy aware using the means just described. Another
example is :ref:`recon` which will report the md5 sums for all of the rings.
Proxy Server
------------
The :ref:`proxy-server` module's role in Storage Policies is essentially to make
sure the correct ring is used as its member element. Before policies, the one
object ring would be instantiated when the :class:`.Application` class was
instantiated and could be overridden by test code via init parameter. With
policies, however, there is no init parameter and the :class:`.Application`
class instead depends on the :data:`.POLICIES` global singleton to retrieve the
ring which is instantiated the first time it's needed. So, instead of an object
ring member of the :class:`.Application` class, there is an accessor function,
:meth:`~.Application.get_object_ring`, that gets the ring from
:data:`.POLICIES`.
In general, when any module running on the proxy requires an object ring, it
does so via first getting the policy index from the cached container info. The
exception is during container creation where it uses the policy name from the
request header to look up policy index from the :data:`.POLICIES` global. Once
the proxy has determined the policy index, it can use the
:meth:`~.Application.get_object_ring` method described earlier to gain access to
the correct ring. It then has the responsibility of passing the index
information, not the policy name, on to the back-end servers via the header ``X
-Backend-Storage-Policy-Index``. Going the other way, the proxy also strips the
index out of headers that go back to clients, and makes sure they only see the
friendly policy names.
On Disk Storage
---------------
Policies each have their own directories on the back-end servers and are
identified by their storage policy indexes. Organizing the back-end directory
structures by policy index helps keep track of things and also allows for
sharing of disks between policies which may or may not make sense depending on
the needs of the provider. More on this later, but for now be aware of the
following directory naming convention:
* ``/objects`` maps to objects associated with Policy-0
* ``/objects-N`` maps to storage policy index #N
* ``/async_pending`` maps to async pending update for Policy-0
* ``/async_pending-N`` maps to async pending update for storage policy index #N
* ``/tmp`` maps to the DiskFile temporary directory for Policy-0
* ``/tmp-N`` maps to the DiskFile temporary directory for policy index #N
* ``/quarantined/objects`` maps to the quarantine directory for Policy-0
* ``/quarantined/objects-N`` maps to the quarantine directory for policy index #N
Note that these directory names are actually owned by the specific Diskfile
implementation, the names shown above are used by the default Diskfile.
Object Server
-------------
The :ref:`object-server` is not involved with selecting the storage policy
placement directly. However, because of how back-end directory structures are
setup for policies, as described earlier, the object server modules do play a
role. When the object server gets a :class:`.Diskfile`, it passes in the
policy index and leaves the actual directory naming/structure mechanisms to
:class:`.Diskfile`. By passing in the index, the instance of
:class:`.Diskfile` being used will assure that data is properly located in the
tree based on its policy.
For the same reason, the :ref:`object-updater` also is policy aware. As
previously described, different policies use different async pending directories
so the updater needs to know how to scan them appropriately.
The :ref:`object-replicator` is policy aware in that, depending on the policy,
it may have to do drastically different things, or maybe not. For example, the
difference in handling a replication job for 2x versus 3x is trivial; however,
the difference in handling replication between 3x and erasure code is most
definitely not. In fact, the term 'replication' really isn't appropriate for
some policies like erasure code; however, the majority of the framework for
collecting and processing jobs is common. Thus, those functions in the
replicator are leveraged for all policies and then there is policy specific code
required for each policy, added when the policy is defined if needed.
The ssync functionality is policy aware for the same reason. Some of the
other modules may not obviously be affected, but the back-end directory
structure owned by :class:`.Diskfile` requires the policy index
parameter. Therefore ssync being policy aware really means passing the
policy index along. See :class:`~swift.obj.ssync_sender` and
:class:`~swift.obj.ssync_receiver` for more information on ssync.
For :class:`.Diskfile` itself, being policy aware is all about managing the
back-end structure using the provided policy index. In other words, callers who
get a :class:`.Diskfile` instance provide a policy index and
:class:`.Diskfile`'s job is to keep data separated via this index (however it
chooses) such that policies can share the same media/nodes if desired. The
included implementation of :class:`.Diskfile` lays out the directory structure
described earlier but that's owned within :class:`.Diskfile`; external modules
have no visibility into that detail. A common function is provided to map
various directory names and/or strings based on their policy index. For example
:class:`.Diskfile` defines ``get_data_dir`` which builds off of a generic
:func:`.get_policy_string` to consistently build policy aware strings for
various usage.
Container Server
----------------
The :ref:`container-server` plays a very important role in Storage Policies, it
is responsible for handling the assignment of a policy to a container and the
prevention of bad things like changing policies or picking the wrong policy to
use when nothing is specified (recall earlier discussion on Policy-0 versus
default).
The :ref:`container-updater` is policy aware, however its job is very simple, to
pass the policy index along to the :ref:`account-server` via a request header.
The :ref:`container-backend` is responsible for both altering existing DB
schema as well as assuring new DBs are created with a schema that supports
storage policies. The "on-demand" migration of container schemas allows Swift
to upgrade without downtime (sqlite's alter statements are fast regardless of
row count). To support rolling upgrades (and downgrades) the incompatible
schema changes to the ``container_stat`` table are made to a
``container_info`` table, and the ``container_stat`` table is replaced with a
view that includes an ``INSTEAD OF UPDATE`` trigger which makes it behave like
the old table.
The policy index is stored here for use in reporting information
about the container as well as managing split-brain scenario induced
discrepancies between containers and their storage policies. Furthermore,
during split-brain, containers must be prepared to track object updates from
multiple policies so the object table also includes a
``storage_policy_index`` column. Per-policy object counts and bytes are
updated in the ``policy_stat`` table using ``INSERT`` and ``DELETE`` triggers
similar to the pre-policy triggers that updated ``container_stat`` directly.
The :ref:`container-replicator` daemon will pro-actively migrate legacy
schemas as part of its normal consistency checking process when it updates the
``reconciler_sync_point`` entry in the ``container_info`` table. This ensures
that read heavy containers which do not encounter any writes will still get
migrated to be fully compatible with the post-storage-policy queries without
having to fall back and retry queries with the legacy schema to service
container read requests.
The :ref:`container-sync-daemon` functionality only needs to be policy aware in
that it accesses the object rings. Therefore, it needs to pull the policy index
out of the container information and use it to select the appropriate object
ring from the :data:`.POLICIES` global.
Account Server
--------------
The :ref:`account-server`'s role in Storage Policies is really limited to
reporting. When a HEAD request is made on an account (see example provided
earlier), the account server is provided with the storage policy index and
builds the ``object_count`` and ``byte_count`` information for the client on a
per policy basis.
The account servers are able to report per-storage-policy object and byte
counts because of some policy specific DB schema changes. A policy specific
table, ``policy_stat``, maintains information on a per policy basis (one row
per policy) in the same manner in which the ``account_stat`` table does. The
``account_stat`` table still serves the same purpose and is not replaced by
``policy_stat``, it holds the total account stats whereas ``policy_stat`` just
has the break downs. The backend is also responsible for migrating
pre-storage-policy accounts by altering the DB schema and populating the
``policy_stat`` table for Policy-0 with current ``account_stat`` data at that
point in time.
The per-storage-policy object and byte counts are not updated with each object
PUT and DELETE request, instead container updates to the account server are
performed asynchronously by the ``swift-container-updater``.
.. _upgrade-policy:
Upgrading and Confirming Functionality
--------------------------------------
Upgrading to a version of Swift that has Storage Policy support is not
difficult, in fact, the cluster administrator isn't required to make any special
configuration changes to get going. Swift will automatically begin using the
existing object ring as both the default ring and the Policy-0 ring. Adding the
declaration of policy 0 is totally optional and in its absence, the name given
to the implicit policy 0 will be 'Policy-0'. Let's say for testing purposes
that you wanted to take an existing cluster that already has lots of data on it
and upgrade to Swift with Storage Policies. From there you want to go ahead and
create a policy and test a few things out. All you need to do is:
#. Upgrade all of your Swift nodes to a policy-aware version of Swift
#. Define your policies in ``/etc/swift/swift.conf``
#. Create the corresponding object rings
#. Create containers and objects and confirm their placement is as expected
For a specific example that takes you through these steps, please see
:doc:`policies_saio`
.. note::
If you downgrade from a Storage Policy enabled version of Swift to an
older version that doesn't support policies, you will not be able to
access any data stored in policies other than the policy with index 0 but
those objects WILL appear in container listings (possibly as duplicates if
there was a network partition and un-reconciled objects). It is EXTREMELY
important that you perform any necessary integration testing on the
upgraded deployment before enabling an additional storage policy to ensure
a consistent API experience for your clients. DO NOT downgrade to a
version of Swift that does not support storage policies once you expose
multiple storage policies.