Speed up get_more_nodes() when there is an empty zone

The ring has some optimizations in get_more_nodes() so that it can
find handoffs that span all the regions/zones/et cetera and then stop
looking. The stopping is the important part.

Previously, it would quickly find a handoff in each unused region,
then spend way too long looking for more unused regions; the same was
true for zones, IPs, and so on. Thus, in commit 9cd7c6c, we started
counting regions and zones, then stopping when we found them all.

This count included every region and zone in the ring, regardless of
whether any partitions were actually assigned to it. In rings with an
empty region, i.e. a region containing only zero-weight devices, no
handoff could ever be found in that region, so the count was never
reached, the early exit never fired, and get_more_nodes() would be
very slow.
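
To illustrate the slowdown, here is a toy sketch of an early-exit
search. It is not Swift's actual get_more_nodes(); the function
find_region_handoffs() and the candidate list are hypothetical
stand-ins that only model the "stop once every region is covered"
check.

```python
def find_region_handoffs(candidates, used_regions, num_regions):
    """Toy handoff search: pick one device per not-yet-used region.

    Returns (handoffs, devices_examined). Simplified stand-in, not
    Swift's real algorithm.
    """
    seen = set(used_regions)
    handoffs = []
    examined = 0
    for dev in candidates:
        if len(seen) >= num_regions:
            break  # every counted region is covered; stop looking
        examined += 1
        if dev['region'] not in seen:
            seen.add(dev['region'])
            handoffs.append(dev)
    return handoffs, examined


# Hypothetical ring: regions 1 and 2 hold partitions. Region 3 exists
# only as zero-weight devices, so it never appears among candidates.
candidates = [{'id': i, 'region': 1 + i % 2} for i in range(1000)]

# Counting only populated regions: the search stops after 2 devices.
fast = find_region_handoffs(candidates, used_regions={1}, num_regions=2)

# Counting the empty region too: the target of 3 is unreachable, so
# the search walks all 1000 candidates before giving up.
slow = find_region_handoffs(candidates, used_regions={1}, num_regions=3)
```

Both calls return the same handoffs; only the amount of work differs.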

This commit ignores devices with no assigned partitions when counting
regions, zones, and so forth, thus greatly speeding things up.
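
As a minimal standalone sketch of that counting rule, assuming a
device list and a replica-to-partition-to-device table shaped like
the ring's (count_tiers() and the sample data are hypothetical, made
up for illustration):

```python
def count_tiers(devs, replica2part2dev_id):
    """Count regions, zones and IPs among devices that hold partitions."""
    dev_ids_with_parts = set()
    for part2dev_id in replica2part2dev_id:
        dev_ids_with_parts.update(part2dev_id)

    regions, zones, ips = set(), set(), set()
    for dev in devs:
        # Skip holes (None) and devices with no partitions assigned.
        if dev and dev['id'] in dev_ids_with_parts:
            regions.add(dev['region'])
            zones.add((dev['region'], dev['zone']))
            ips.add((dev['region'], dev['zone'], dev['ip']))
    return len(regions), len(zones), len(ips)


devs = [
    {'id': 0, 'region': 1, 'zone': 1, 'ip': '10.0.0.1'},
    {'id': 1, 'region': 1, 'zone': 2, 'ip': '10.0.0.2'},
    None,  # hole left by a removed device
    {'id': 3, 'region': 2, 'zone': 1, 'ip': '10.0.1.1'},  # zero weight
]
# Two replicas of four partitions, all on devices 0 and 1.
replica2part2dev_id = [[0, 1, 0, 1], [1, 0, 1, 0]]

counts = count_tiers(devs, replica2part2dev_id)  # (1, 2, 2)
```

Region 2 contains only the partition-less device 3, so it does not
count toward the region total.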

The output of get_more_nodes() is unchanged. This is purely an
optimization.

Closes-Bug: 1534303

Change-Id: I4a5c57205e87e1205d40fd5d9458d4114e524332
Samuel Merritt 2016-01-13 18:08:45 -08:00
parent 8460ddd607
commit 3c0cf549f1

@@ -203,12 +203,23 @@ class Ring(object):
             # Do this now, when we know the data has changed, rather than
             # doing it on every call to get_more_nodes().
+            #
+            # Since this is to speed up the finding of handoffs, we only
+            # consider devices with at least one partition assigned. This
+            # way, a region, zone, or server with no partitions assigned
+            # does not count toward our totals, thereby keeping the early
+            # bailouts in get_more_nodes() working.
+            dev_ids_with_parts = set()
+            for part2dev_id in self._replica2part2dev_id:
+                for dev_id in part2dev_id:
+                    dev_ids_with_parts.add(dev_id)
             regions = set()
             zones = set()
             ips = set()
             self._num_devs = 0
             for dev in self._devs:
-                if dev:
+                if dev and dev['id'] in dev_ids_with_parts:
                     regions.add(dev['region'])
                     zones.add((dev['region'], dev['zone']))
                     ips.add((dev['region'], dev['zone'], dev['ip']))