Atomically concatenate blob objects

We are seeing problems where manifests are invalid because the size in
the manifest for blobs is smaller than the actual size of a blob. The
reason for this appears to be that the docker client will HEAD blobs to
get their size for use in the manifest, and zuul-registry will return a
HEAD value if the blob is on disk. With larger blobs, concatenating the
chunks into the final blob location may be slow, and we'll return a short
HEAD response while the blob is still being concatenated.

Avoid this problem by concatenating into a tmpfile and then atomically
moving that file into its final position. The file length should be
complete at that point and HEAD will only see the path as existing once
it is complete.

Change-Id: I0c85fa5657106175140fbb8b8193659bcfe2d6c8
Clark Boylan 2022-02-24 13:49:32 -08:00
parent 120cadf2f6
commit 0acf995175

@@ -108,9 +108,18 @@ class FilesystemDriver(storageutils.StorageDriver):
         os.rename(src_path, dst_path)
 
     def cat_objects(self, path, chunks, uuid=None):
+        # We cat the objects into a tmp file that we will atomically move into
+        # its final resting place. The reason for this is that HEAD requests
+        # for the blob may arrive before we have completely written the
+        # concatenated file out. If this occurs the presence of the half
+        # written file is sufficient to respond with an invalid size indicating
+        # to the client that the upload is complete and that the small size
+        # can be used to generate a manifest. When this manifest is pushed
+        # we fail due to a size mismatch.
         path = os.path.join(self.root, path)
+        tmp_path = path + '.tmp'
         os.makedirs(os.path.dirname(path), exist_ok=True)
-        with open(path, 'wb') as outf:
+        with open(tmp_path, 'wb') as outf:
             for chunk in chunks:
                 chunk_path = os.path.join(self.root, chunk['path'])
                 with open(chunk_path, 'rb') as inf:
@@ -121,6 +130,7 @@ class FilesystemDriver(storageutils.StorageDriver):
                         outf.write(d)
             outf.flush()
             os.fsync(outf.fileno())
+        os.rename(tmp_path, path)
         for chunk in chunks:
             chunk_path = os.path.join(self.root, chunk['path'])
             os.unlink(chunk_path)
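
The approach described in the commit message can be sketched outside the
driver as a standalone function. This is a minimal illustration, not the
zuul-registry implementation: the name cat_files_atomically, the bufsize
parameter, and the plain list of chunk paths are assumptions made for the
example. It relies on os.rename being atomic on POSIX when the temporary
file and the destination are on the same filesystem.

```python
import os


def cat_files_atomically(dst_path, chunk_paths, bufsize=4096):
    """Concatenate chunk files into dst_path without exposing a partial file.

    Hypothetical helper for illustration only. dst_path must include a
    directory component, and tmp_path must live on the same filesystem
    as dst_path for the rename to be atomic.
    """
    tmp_path = dst_path + '.tmp'
    os.makedirs(os.path.dirname(dst_path), exist_ok=True)
    with open(tmp_path, 'wb') as outf:
        for chunk_path in chunk_paths:
            with open(chunk_path, 'rb') as inf:
                while True:
                    d = inf.read(bufsize)
                    if not d:
                        break
                    outf.write(d)
        # Push the data to disk before the file becomes visible.
        outf.flush()
        os.fsync(outf.fileno())
    # Until this point dst_path does not exist, so a size check on it
    # (such as one backing a HEAD request) cannot observe a short file.
    os.rename(tmp_path, dst_path)
    # Only remove the chunks once the complete file is in place.
    for chunk_path in chunk_paths:
        os.unlink(chunk_path)
    return os.path.getsize(dst_path)
```

A reader checking for dst_path sees either no file at all or the complete
file, never an intermediate length. On platforms where os.rename refuses to
overwrite an existing destination, os.replace is the usual substitute.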