Implement paging in swift object listing

Swift by default only returns a maximum of 10k objects per object
listing. This is problematic when pruning (and potentially otherwise) as
we may not list all valid manifests, resulting in us never marking the
blobs belonging to those manifests as valid. These blobs would then get
erroneously deleted.

Address this by paging through object listings using Swift's marker
parameter.

Change-Id: Ida85076b716a7718a8ca5fe50e4fbb90b3a41cbf
Clark Boylan 2024-11-15 12:08:38 -08:00 committed by James E. Blair
parent aa4a1be00b
commit 1105d2bada
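
The paging approach described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not the driver's code: `fake_swift_listing` and `list_all` are hypothetical names, and the in-memory backend only mimics the documented Swift behaviour of returning names strictly greater than `marker`, in sorted order, capped at a per-request limit.

```python
from typing import Callable, Iterable, List


def fake_swift_listing(names: Iterable[str],
                       limit: int) -> Callable[[str, str], List[str]]:
    """Build a hypothetical stand-in for one Swift listing request."""
    ordered = sorted(names)

    def list_page(prefix: str, marker: str) -> List[str]:
        # Only names after the marker are returned, and never more than
        # `limit` of them in a single response.
        matching = [n for n in ordered
                    if n.startswith(prefix) and n > marker]
        return matching[:limit]

    return list_page


def list_all(list_page: Callable[[str, str], List[str]],
             prefix: str) -> List[str]:
    """Accumulate pages until the backend returns an empty page."""
    marker = ''
    results: List[str] = []
    while page := list_page(prefix, marker):
        results.extend(page)
        # The last name seen becomes the marker for the next request.
        marker = page[-1]
    return results
```

With a limit of 10 and 25 matching names, a single request sees only the first 10 entries, while `list_all` issues four requests (the last one empty) and returns all 25.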

@@ -65,7 +65,19 @@ class SwiftDriver(storageutils.StorageDriver):
     def list_objects(self, path):
         self.log.debug("List objects %s", path)
-        url = self.get_url('') + '?prefix=%s&delimiter=/&format=json' % (path,)
+        marker = ''
+        ret = []
+        while inner_ret := self._list_objects(path, marker):
+            # Swift limits the total number of responses per request
+            # (typically 10k) so we have to paginate and accumulate responses.
+            ret.extend(inner_ret)
+            marker = inner_ret[-1].path
+        return ret
+
+    def _list_objects(self, path, marker):
+        # TODO should path and marker be url encoded?
+        url = self.get_url('') + \
+            '?prefix=%s&delimiter=/&format=json&marker=%s' % (path, marker)
         ret = retry_function(
             lambda: self.conn.session.get(url).content.decode('utf8'))
         data = json.loads(ret)
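
Regarding the TODO in `_list_objects`: one possible way to encode `path` and `marker` is to build the query string with the standard library instead of `%` formatting. The sketch below is an assumption about how that could look, not part of this change; `build_listing_url` is a hypothetical helper, and whether encoding is needed depends on which characters can appear in object names.

```python
from urllib.parse import quote, urlencode


def build_listing_url(base_url: str, path: str, marker: str) -> str:
    """Hypothetical helper: build the listing URL with encoded parameters."""
    params = {
        'prefix': path,
        'delimiter': '/',
        'format': 'json',
        'marker': marker,
    }
    # quote_via=quote percent-encodes spaces as %20 rather than '+';
    # characters such as '&' and '/' in the values are escaped as well.
    return base_url + '?' + urlencode(params, quote_via=quote)
```

Without encoding, a `&` or `#` inside a marker would end or truncate the `marker` parameter and the next page could start from the wrong position.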