Clear allocation when unlocking failed node requests

If a subset of nodes in a multinode request fails, we decline the
request and unlock the nodes, but we do not clear the node
allocation. This means the ready nodes cannot be reused until the
original node request is resolved (and the request may stick around
to be tried by another provider).

Handle this case by explicitly clearing the node allocations when
unlocking the nodes from the declined request.

Change-Id: I37560293976f594b1360ed4b17dac4f8a831d0cc
James E. Blair 2025-02-27 10:56:01 -08:00 committed by Clark Boylan
parent 4a9e7b4642
commit 454f1edb39

@@ -757,9 +757,11 @@ class NodeRequestHandler(NodeRequestHandlerNotifications,
                                "when attempting to clean up the lock")
                 return True
 
+        clear_allocation = False
         if self.failed_nodes:
             self.log.debug("Declining node request because nodes failed")
             self._declineRequest()
+            clear_allocation = True
             # We perform our own cleanup
         elif aborted_nodes:
             # Because nodes are added to the satisfied types list before they
@@ -781,7 +783,7 @@ class NodeRequestHandler(NodeRequestHandlerNotifications,
             self.log.debug("Fulfilled node request")
             self.request.state = zk.FULFILLED
 
-        self.unlockNodeSet()
+        self.unlockNodeSet(clear_allocation=clear_allocation)
         self.zk.storeNodeRequest(self.request)
         self.zk.unlockNodeRequest(self.request)
         return True
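
The mechanism above can be sketched as follows. This is a minimal, self-contained illustration of the idea behind `unlockNodeSet(clear_allocation=...)`; the `Node`, `FakeZK`, and `Handler` classes are hypothetical stand-ins, not the actual nodepool implementation.

```python
class Node:
    """Hypothetical node record; mirrors the fields the diff implies."""
    def __init__(self, node_id):
        self.id = node_id
        self.allocated_to = None  # id of the request this node is allocated to
        self.locked = True

class FakeZK:
    """Stand-in for the ZooKeeper client used in this sketch."""
    def storeNode(self, node):
        # Persist the updated node record (no-op in this sketch).
        pass

    def unlockNode(self, node):
        node.locked = False

class Handler:
    def __init__(self, zk, nodeset):
        self.zk = zk
        self.nodeset = nodeset

    def unlockNodeSet(self, clear_allocation=False):
        for node in self.nodeset:
            if clear_allocation:
                # When the request is declined because some nodes failed,
                # dropping the allocation lets the ready nodes be reused
                # immediately rather than staying tied to the unresolved
                # request.
                node.allocated_to = None
                self.zk.storeNode(node)
            self.zk.unlockNode(node)
```

Without `clear_allocation=True`, the nodes would remain allocated to the declined request and could not be handed to another request until that request was resolved.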