swift/releasenotes/notes/2_26_0_release-6548eadcba544f72.yaml
Tim Burke d6c6ab764e Authors/ChangeLog for 2.26.0
Change-Id: Ia8e31ed0d5aefe67f2f926dc92d9acd6c0c98007
2020-09-17 11:56:22 -07:00

---
features:
- |
Extend concurrent reads to erasure coded policies. Previously, the
options ``concurrent_gets`` and ``concurrency_timeout`` only applied to
replicated policies.
- |
Add a new ``concurrent_ec_extra_requests`` option to allow the proxy to
make some extra backend requests immediately. The proxy will respond as
soon as there are enough responses available to reconstruct.
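The "respond as soon as there are enough responses" behaviour can be illustrated with a small simulation. This is a minimal sketch, not the proxy's implementation: ``fetch_fragment``, ``get_enough_fragments``, the ``needed`` parameter, and the delays are all hypothetical stand-ins for backend requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_fragment(index, delay):
    """Hypothetical stand-in for one backend GET: sleep, then return the index."""
    time.sleep(delay)
    return index


def get_enough_fragments(delays, needed, extra_requests):
    """Start needed + extra_requests fetches and return once `needed` finish."""
    initial = min(len(delays), needed + extra_requests)
    pool = ThreadPoolExecutor(max_workers=initial)
    futures = [pool.submit(fetch_fragment, i, delays[i]) for i in range(initial)]
    done = []
    for future in as_completed(futures):
        done.append(future.result())
        if len(done) == needed:
            break
    pool.shutdown(wait=False)  # do not block on slow stragglers
    return sorted(done)


# Node 1 is slow; every other node answers immediately.
DELAYS = [0.0, 0.3, 0.0, 0.0, 0.0, 0.0]
without_extra = get_enough_fragments(DELAYS, needed=4, extra_requests=0)
with_extra = get_enough_fragments(DELAYS, needed=4, extra_requests=1)
```

With no extra requests, the simulated read has to wait for the slow node. With one extra request, the four fast responses are enough and the slow node is never waited on.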
- |
The concurrent read options (``concurrent_gets``, ``concurrency_timeout``,
and ``concurrent_ec_extra_requests``) may now be configured per
storage-policy.
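As a rough illustration of per-policy settings, the sketch below parses a hypothetical ``proxy-server.conf`` fragment with Python's ``configparser``. The section names ``[app:proxy-server]`` and ``[proxy-server:policy:1]``, the values, and the fall-back-to-the-app-section behaviour of the hypothetical ``effective_read_options`` helper are assumptions made for this example; consult ``proxy-server.conf-sample`` for the authoritative layout.

```python
import configparser

# Hypothetical configuration fragment; values are illustrative only.
SAMPLE = """
[app:proxy-server]
use = egg:swift#proxy
concurrent_gets = true
concurrency_timeout = 0.5

[proxy-server:policy:1]
concurrency_timeout = 0.1
concurrent_ec_extra_requests = 2
"""

OPTIONS = (
    "concurrent_gets",
    "concurrency_timeout",
    "concurrent_ec_extra_requests",
)


def effective_read_options(conf_text, policy_index):
    """Return the read options for a policy, falling back to the app section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    base = parser["app:proxy-server"]
    section = "proxy-server:policy:%d" % policy_index
    override = parser[section] if parser.has_section(section) else {}
    return {name: override.get(name, base.get(name)) for name in OPTIONS}


policy_0 = effective_read_options(SAMPLE, 0)
policy_1 = effective_read_options(SAMPLE, 1)
```

Options that are not set anywhere come back as ``None`` here, which in a real deployment would mean the built-in default applies.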
- |
Replication servers can now handle all request methods. This allows
ssync to work with a separate replication network.
- |
All background daemons now use the replication network. This allows
better isolation between external, client-facing traffic and internal,
background traffic. Note that during a rolling upgrade, replication
servers may respond with ``405 Method Not Allowed``. To avoid this,
operators should remove the config option ``replication_server = true``
from their replication servers; this will allow them to handle all
request methods before upgrading.
- |
S3 API improvements:
* Fixed some SignatureDoesNotMatch errors when using the AWS .NET SDK.
* Add basic read support for object tagging. This improves
compatibility with AWS CLI version 2. Write support is not
yet implemented, so the tag set will always be empty.
* CompleteMultipartUpload requests may now be safely retried.
* Improved quota-exceeded error messages.
* Improved logging and statsd metrics. Be aware that this will cause
an increase in the proxy-logging statsd metrics emitted for S3
responses. However, this should more accurately reflect the state
of the system.
* S3 requests are now less demanding on the container layer.
- |
Servers now open one listen socket per worker, ensuring each worker
serves roughly the same number of concurrent connections.
- |
Server workers may now be gracefully terminated via ``SIGHUP`` or
``SIGUSR1``. The parent process will then spawn a fresh worker.
- |
Allow proxy-logging middlewares to be configured more independently.
- |
Improve performance when increasing partition power.
issues:
- |
In a rolling upgrade from liberasurecode 1.5.0 or earlier to 1.6.0 or
later, object-servers may quarantine newly-written data, leading to
availability issues or even data loss. See `bug 1886088
<https://bugs.launchpad.net/liberasurecode/+bug/1886088>`__ for more
information, including how to determine whether you are affected.
Several mitigations are available to operators:
* If proxy and object layers can be upgraded independently and proxies
can be upgraded quickly:
1. Stop and disable the object-reconstructor before upgrading. This
ensures no upgraded object server starts writing new fragments
that old object servers would quarantine.
2. Upgrade liberasurecode on all object servers. Object servers can
now read both old and new fragments.
3. Upgrade liberasurecode on all proxy servers. Newly-written data
will now use new fragments. Note that not-yet-upgraded proxies
will not be able to read these newly-written fragments but will
instead respond ``500 Internal Server Error``.
4. After upgrading, re-enable and restart the object-reconstructor.
* If your users can tolerate it, consider a read-only rolling upgrade.
Before upgrading, enable the `read-only middleware
<https://docs.openstack.org/swift/latest/middleware.html#read-only>`__
cluster-wide to prevent new writes during the upgrade. Additionally,
stop and disable the object-reconstructor as above. Upgrade normally,
then disable the read-only middleware and re-enable and restart the
object-reconstructor.
* Avoid upgrading liberasurecode until Swift and liberasurecode
better support a rolling upgrade. Swift remains compatible with
liberasurecode 1.5.0 and earlier.
.. note::
Ubuntu 18.04 and RDO's CentOS 7 repos package liberasurecode 1.5.0,
while Ubuntu 20.04 and RDO's CentOS 8 repos currently package
liberasurecode 1.6.0 or 1.6.1. Take care when upgrading major distro
versions!
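The version boundary described above can be checked mechanically before an upgrade. The following is a minimal sketch that encodes only the "1.5.0 or earlier to 1.6.0 or later" rule from this note; the helper names are hypothetical, and the linked bug report remains the authoritative guide to whether a cluster is affected.

```python
OLD_MAX = (1, 5, 0)  # last release writing the old fragment format, per this note
NEW_MIN = (1, 6, 0)  # first release writing the new fragment format, per this note


def parse_version(text):
    """Turn a dotted version string such as '1.6.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def crosses_fragment_boundary(installed, target):
    """True if upgrading from `installed` to `target` crosses the boundary."""
    return parse_version(installed) <= OLD_MAX and parse_version(target) >= NEW_MIN


# For example, the Ubuntu 18.04 to 20.04 jump mentioned above.
needs_mitigation = crosses_fragment_boundary("1.5.0", "1.6.1")
```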
upgrade:
- |
**If your cluster has encryption enabled and is still running Swift
under Python 2**, we recommend upgrading Swift *before* transitioning to
Python 3. Otherwise, new writes to objects with non-ASCII characters
in their paths may result in corrupted downloads when read from a
proxy-server still running old Swift on Python 2. See `bug 1888037
<https://bugs.launchpad.net/swift/+bug/1888037>`__ for more information.
Note that new tags including a fix for the bug are planned for all
maintained stable branches; upgrading to any one of those should be
sufficient to ensure a smooth upgrade to the latest Swift.
- |
The above bug was caused by a difference in string types that resulted
in ambiguity when decrypting. To prevent the ambiguity for new data, set
``meta_version_to_write = 3`` in your keymaster configuration *after*
upgrading all proxy servers.
If upgrading from Swift 2.20.0 or Swift 2.19.1 or earlier, set
``meta_version_to_write = 1`` in your keymaster configuration *prior*
to upgrading.
See the provided ``keymaster.conf-sample`` for more information about
this setting.
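The guidance above can be summarized as a small decision helper. This is only a sketch of the rules as stated in this note: ``recommended_meta_version`` is a hypothetical function, and returning ``None`` (meaning "leave the setting alone") for cases the note does not cover is an assumption of the example.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.19.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def recommended_meta_version(upgrading_from, all_proxies_upgraded):
    """Suggest a meta_version_to_write value, or None if the note gives no advice."""
    if all_proxies_upgraded:
        # Only after every proxy runs the new code is it safe to write version 3.
        return 3
    old = parse_version(upgrading_from)
    if old == (2, 20, 0) or old <= (2, 19, 1):
        # Set *prior* to upgrading from these older releases.
        return 1
    return None


before = recommended_meta_version("2.19.1", all_proxies_upgraded=False)
after = recommended_meta_version("2.19.1", all_proxies_upgraded=True)
```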
- |
**If your cluster is configured with a separate replication network**,
note that background daemons will switch to using this network for all
traffic. If your account, container, or object replication servers are
configured with ``replication_server = true``, these daemons may log a
flood of ``405 Method Not Allowed`` messages during a rolling upgrade.
To avoid this, comment out the option and restart replication servers
before upgrading.
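Before a rolling upgrade, it may help to scan server configs for the option in question. The sketch below is one way to do that with ``configparser``; the sample config text, the ``sections_with_replication_server`` helper, and the list of accepted "true" spellings are assumptions made for illustration.

```python
import configparser

# Spellings treated as "true" in this sketch (an assumption for the example).
TRUE_VALUES = {"true", "1", "yes", "on", "t", "y"}

# Hypothetical object-server.conf fragment.
SAMPLE = """
[DEFAULT]
bind_port = 6210

[pipeline:main]
pipeline = object-server

[app:object-server]
use = egg:swift#object
replication_server = true
"""


def sections_with_replication_server(conf_text):
    """Return the names of sections that enable replication_server."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return [
        name
        for name in parser.sections()
        if parser.get(name, "replication_server", fallback="").strip().lower()
        in TRUE_VALUES
    ]


flagged = sections_with_replication_server(SAMPLE)
# Commenting the option out, as recommended above, clears the finding.
cleared = sections_with_replication_server(
    SAMPLE.replace("replication_server", "# replication_server")
)
```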
fixes:
- |
Python 3 bug fixes:
* Fixed an error when reading encrypted data that was written while
running Python 2 for a path that includes non-ASCII characters.
* Object expiration respects the ``expiring_objects_container_divisor``
config option.
* ``fallocate_reserve`` may be specified as a percentage in more places.
* The ETag-quoting middleware no longer raises TypeErrors.
- |
Sharding improvements:
* Prevent object updates from auto-creating shard containers. This
ensures more consistent listings for sharded containers during
rebalances.
* Deleted shard containers are no longer considered root containers.
This prevents unnecessary sharding audit failures and allows the
deleted shard database to actually be unlinked.
* ``swift-container-info`` now summarizes shard range information.
Pass ``-v``/``--verbose`` if you want to see all of them.
* Improved container-sharder stat reporting to reduce load on root
container databases.
* ``swift-manage-shard-ranges`` no longer injects shard ranges when the
user quits.
- |
During rebalances, clients should no longer get 404s for data that
exists but whose replicas are overloaded.
- |
Improved cache management for account and container responses.
- |
Allow operators to pass either raw or URL-quoted paths to
``swift-get-nodes``. Notably, this allows ``swift-get-nodes`` to
work with the reserved namespace used for object versioning.
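To see why accepting URL-quoted paths matters for unusual names, consider a path containing bytes that are awkward to type in a shell. The example below uses only ``urllib.parse`` from the standard library; the path itself, including its use of null bytes, is a hypothetical illustration and not an exact description of how the reserved namespace lays out names.

```python
from urllib.parse import quote, unquote

# Hypothetical path with embedded null bytes, which cannot be passed as a
# raw command-line argument.
raw_path = "/AUTH_test/\x00versions\x00photos/\x00cat.jpg\x001600370000.00000"

# The URL-quoted form is plain ASCII and safe to paste into a command line.
quoted_path = quote(raw_path)

# Unquoting recovers exactly the original path.
round_trip = unquote(quoted_path)
```

One caveat of any scheme that accepts both forms: a raw path that itself contains a literal ``%`` sequence is ambiguous, so quoting such paths explicitly is the safer choice.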
- |
Container read ACLs now work with object versioning. This only
allows access to the most-recent version via an unversioned URL.
- |
Improved how containers reclaim deleted rows to reduce locking and improve
object update throughput.
- |
Large object reads log fewer client disconnects.
- |
Allow ratelimit to be placed multiple times in a proxy pipeline,
such as both before s3api and auth (to handle swift requests without
needing to make an auth decision) and after (to limit S3 requests).
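The placement described above can be checked against a pipeline string. In this sketch the pipeline is a shortened, hypothetical example (a real proxy pipeline contains more middlewares), ``tempauth`` stands in for whichever auth middleware is deployed, and ``ratelimit_brackets_auth`` is a helper invented for illustration.

```python
# Hypothetical, shortened proxy pipeline with ratelimit listed twice.
PIPELINE = (
    "catch_errors proxy-logging cache ratelimit s3api tempauth "
    "ratelimit proxy-logging proxy-server"
)


def ratelimit_brackets_auth(pipeline, auth="tempauth"):
    """True if ratelimit appears both before s3api and after the auth filter."""
    names = pipeline.split()
    if "s3api" not in names or auth not in names:
        return False
    spots = [i for i, name in enumerate(names) if name == "ratelimit"]
    if not spots:
        return False
    return spots[0] < names.index("s3api") and spots[-1] > names.index(auth)


placed_twice = ratelimit_brackets_auth(PIPELINE)
placed_once = ratelimit_brackets_auth(
    "catch_errors cache ratelimit s3api tempauth proxy-server"
)
```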
- |
Shuffle object-updater work. This somewhat reduces the impact a
single overloaded database has on other containers' listings.
- |
Fix a proxy-server error when retrieving erasure coded data when
there are durable fragments but not enough to reconstruct.
- |
Fix an error in the proxy server when finalizing data.
- |
Various other minor bug fixes and improvements.