Confirm receipt of SLO PUT with etag

With a multipart-manifest PUT request, if client sends the md5 of the
segments' etags, a 422 Unprocessable Entity response is returned. This
patch fixes that and confirms the etag

Change-Id: I4598a2a3f16ca8727bb07bbb6d8efcfcae777796
Closes-Bug: #1213200
Co-Authored-By: Tim Burke <tim@swiftstack.com>
This commit is contained in:
Mahati Chamarthy 2016-10-26 20:57:51 +05:30
parent 43a175ebd2
commit fd6edf7bc5
5 changed files with 127 additions and 19 deletions

View File

@ -152,18 +152,21 @@ ETag_obj_copied:
ETag_obj_received:
description: |
The MD5 checksum of the uploaded object content.
The value is not quoted.
The value is not quoted. If it is an SLO, it would
be MD5 checksum of the segments' etags.
in: header
required: true
type: string
ETag_obj_req:
description: |
The MD5 checksum value of the request body. For
example, the MD5 checksum value of the object content. You are
strongly recommended to compute the MD5 checksum value of object
content and include it in the request. This enables the Object
Storage API to check the integrity of the upload. The value is not
quoted.
example, the MD5 checksum value of the object content. For
manifest objects, this value is the MD5 checksum of the
concatenated string of ETag values for each of the segments in
the manifest. You are strongly recommended to compute
the MD5 checksum value and include it in the request. This
enables the Object Storage API to check the integrity of the
upload. The value is not quoted.
in: header
required: false
type: string
@ -172,7 +175,7 @@ ETag_obj_resp:
For objects smaller than 5 GB, this value is the
MD5 checksum of the object content. The value is not quoted. For
manifest objects, this value is the MD5 checksum of the
concatenated string of MD5 checksums and ETags for each of the
concatenated string of ETag values for each of the
segments in the manifest, and not the MD5 checksum of the content
that was downloaded. Also the value is enclosed in double-quote
characters. You are strongly recommended to compute the MD5

View File

@ -105,12 +105,10 @@ contrast to dynamic large objects.
The ``Content-Length`` request header must contain the length of the
json content—not the length of the segment objects. However, after the
**PUT** operation completes, the ``Content-Length`` metadata is set to
the total length of all the object segments. A similar situation applies
to the ``ETag``. If used in the **PUT** operation, it must contain the
MD5 checksum of the JSON content. The ``ETag`` metadata value is then
set to be the MD5 checksum of the concatenated ``ETag`` values of the
object segments. You can also set the ``Content-Type`` request header
and custom object metadata.
the total length of all the object segments. When using the ``ETag``
request header in a **PUT** operation, it must contain the MD5 checksum
of the concatenated ``ETag`` values of the object segments. You can also
set the ``Content-Type`` request header and custom object metadata.
When the **PUT** operation sees the ``multipart-manifest=put`` query
string, it reads the request body and verifies that each segment

View File

@ -88,6 +88,49 @@ segments of a SLO manifest can even be other SLO manifests. Treat them as any
other object i.e., use the Etag and Content-Length given on the PUT of the
sub-SLO in the manifest to the parent SLO.
While uploading a manifest, a user can send Etag for verification. It needs to
be md5 of the segments' etags, if there is no range specified. For example, if
the manifest to be uploaded looks like this:
.. code::
[{"path": "/cont/object1",
"etag": "etagoftheobjectsegment1",
"size_bytes": 10485760},
{"path": "/cont/object2",
"etag": "etagoftheobjectsegment2",
"size_bytes": 10485760}]
The Etag of the above manifest would be md5 of etagoftheobjectsegment1 and
etagoftheobjectsegment2. This could be computed in the following way:
.. code::
echo -n 'etagoftheobjectsegment1etagoftheobjectsegment2' | md5sum
If a manifest to be uploaded with a segment range looks like this:
.. code::
[{"path": "/cont/object1",
"etag": "etagoftheobjectsegmentone",
"size_bytes": 10485760,
"range": "1-2"},
{"path": "/cont/object2",
"etag": "etagoftheobjectsegmenttwo",
"size_bytes": 10485760,
"range": "3-4"}]
While computing the Etag of the above manifest, internally each segment's etag
will be taken in the form of 'etagvalue:rangevalue;'. Hence the Etag of the
above manifest would be:
.. code::
echo -n 'etagoftheobjectsegmentone:1-2;etagoftheobjectsegmenttwo:3-4;' \
| md5sum
-------------------
Range Specification
-------------------
@ -211,7 +254,7 @@ from swift.common.exceptions import ListingIterError, SegmentError
from swift.common.swob import Request, HTTPBadRequest, HTTPServerError, \
HTTPMethodNotAllowed, HTTPRequestEntityTooLarge, HTTPLengthRequired, \
HTTPOk, HTTPPreconditionFailed, HTTPException, HTTPNotFound, \
HTTPUnauthorized, HTTPConflict, Response, Range
HTTPUnauthorized, HTTPConflict, HTTPUnprocessableEntity, Response, Range
from swift.common.utils import get_logger, config_true_value, \
get_valid_utf8_str, override_bytes_from_content_type, split_path, \
register_swift_info, RateLimitedIterator, quote, close_if_possible, \
@ -982,16 +1025,20 @@ class StaticLargeObject(object):
slo_etag.update(seg_data['hash'])
slo_etag = slo_etag.hexdigest()
req.headers.update({
SYSMETA_SLO_ETAG: slo_etag,
SYSMETA_SLO_SIZE: total_size,
'X-Static-Large-Object': 'True',
})
client_etag = req.headers.get('Etag')
if client_etag and client_etag.strip('"') != slo_etag:
raise HTTPUnprocessableEntity(request=req)
json_data = json.dumps(data_for_storage)
if six.PY3:
json_data = json_data.encode('utf-8')
req.body = json_data
req.headers.update({
SYSMETA_SLO_ETAG: slo_etag,
SYSMETA_SLO_SIZE: total_size,
'X-Static-Large-Object': 'True',
'Etag': md5(json_data).hexdigest(),
})
env = req.environ
if not env.get('CONTENT_TYPE'):

View File

@ -451,6 +451,35 @@ class TestSlo(Base):
else:
self.fail("Expected ResponseError but didn't get it")
def test_slo_client_etag_mismatch(self):
file_item = self.env.container.file("manifest-a-mismatch-etag")
try:
file_item.write(
json.dumps([{
'size_bytes': 1024 * 1024,
'etag': hashlib.md5('a' * 1024 * 1024).hexdigest(),
'path': '/%s/%s' % (self.env.container.name, 'seg_a')}]),
parms={'multipart-manifest': 'put'},
hdrs={'Etag': 'NOTetagofthesegments'})
except ResponseError as err:
self.assertEqual(422, err.status)
def test_slo_client_etag(self):
file_item = self.env.container.file("manifest-a-b-etag")
etag_a = hashlib.md5('a' * 1024 * 1024).hexdigest()
etag_b = hashlib.md5('b' * 1024 * 1024).hexdigest()
file_item.write(
json.dumps([{
'size_bytes': 1024 * 1024,
'etag': etag_a,
'path': '/%s/%s' % (self.env.container.name, 'seg_a')}, {
'size_bytes': 1024 * 1024,
'etag': etag_b,
'path': '/%s/%s' % (self.env.container.name, 'seg_b')}]),
parms={'multipart-manifest': 'put'},
hdrs={'Etag': hashlib.md5(etag_a + etag_b).hexdigest()})
self.assert_status(201)
def test_slo_unspecified_etag(self):
file_item = self.env.container.file("manifest-a-unspecified-etag")
file_item.write(

View File

@ -436,6 +436,37 @@ class TestSloPutManifest(SloTestCase):
'Content-Type %r does not end with swift_bytes=100' %
req.headers['Content-Type'])
def test_manifest_put_no_etag_success(self):
req = Request.blank(
'/v1/AUTH_test/c/man?multipart-manifest=put',
method='PUT', body=test_json_data)
resp = req.get_response(self.slo)
self.assertEqual(resp.status_int, 201)
def test_manifest_put_with_etag_success(self):
req = Request.blank(
'/v1/AUTH_test/c/man?multipart-manifest=put',
method='PUT', body=test_json_data)
req.headers['Etag'] = md5hex('etagoftheobjectsegment')
resp = req.get_response(self.slo)
self.assertEqual(resp.status_int, 201)
def test_manifest_put_with_etag_with_quotes_success(self):
req = Request.blank(
'/v1/AUTH_test/c/man?multipart-manifest=put',
method='PUT', body=test_json_data)
req.headers['Etag'] = '"%s"' % md5hex('etagoftheobjectsegment')
resp = req.get_response(self.slo)
self.assertEqual(resp.status_int, 201)
def test_manifest_put_bad_etag_fail(self):
req = Request.blank(
'/v1/AUTH_test/c/man?multipart-manifest=put',
method='PUT', body=test_json_data)
req.headers['Etag'] = md5hex('NOTetagoftheobjectsegment')
resp = req.get_response(self.slo)
self.assertEqual(resp.status_int, 422)
def test_handle_multipart_put_disallow_empty_first_segment(self):
test_json_data = json.dumps([{'path': '/cont/object',
'etag': 'etagoftheobjectsegment',