Delete init nodes when resetting lost requests
When a new node is created, the following steps happen in order:

1. Store the new node with state 'init', linked to a node request
2. Lock the node
3. Set the node state to 'building'

If a launcher is killed between steps 1 and 3, we leak znodes in state
'init'. Nodes that are tied to a node request get deallocated from it
when the request is reset after the corresponding launcher went
offline. However, a node in state 'init' will never be deleted, so we
leak that znode.

While resetting a lost request we can be sure that a node in state
'init' is orphaned, since the lock on the corresponding request has
already been lost. Thus we can mark those nodes for deletion to
prevent this leak.

Change-Id: I83ec79ebf89e935339e9f3b39411f6ea23951a9b
parent c85472be6c
commit 43c1a28b84
@@ -390,6 +390,14 @@ class CleanupWorker(BaseCleanupWorker):
                         "request", node.id)
                     return
 
+                # If the node is in state init then the launcher that worked
+                # on the lost request has been interrupted between creating
+                # the znode and locking/setting to building. In this case the
+                # znode is leaked and we should delete the node instead of
+                # just deallocating it.
+                if node.state == zk.INIT:
+                    node.state = zk.DELETING
+
                 node.allocated_to = None
                 try:
                     zk_conn.storeNode(node)
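
The effect of this hunk can be tried out in isolation with a minimal sketch. It is not nodepool code: `Node`, `FakeStore`, and `reset_lost_request` are hypothetical stand-ins, the state strings are assumed values for the `zk.INIT` / `zk.DELETING` constants, and node locking is left out entirely. It only shows that a node still in 'init' is marked for deletion while other nodes of the lost request are merely deallocated.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Assumed string values standing in for the zk.* state constants in the diff.
INIT = "init"
READY = "ready"
DELETING = "deleting"


@dataclass
class Node:
    """Hypothetical, simplified node record."""
    id: str
    state: str
    allocated_to: Optional[str] = None


class FakeStore:
    """Hypothetical in-memory replacement for the ZooKeeper connection."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def storeNode(self, node: Node) -> None:
        self.nodes[node.id] = node


def reset_lost_request(store: FakeStore, request_id: str) -> None:
    """Deallocate nodes from a lost request; mark 'init' nodes for deletion."""
    for node in list(store.nodes.values()):
        if node.allocated_to != request_id:
            continue
        if node.state == INIT:
            # The launcher died before locking the node and setting it to
            # 'building', so nothing else would ever clean this znode up.
            node.state = DELETING
        node.allocated_to = None
        store.storeNode(node)


store = FakeStore()
store.storeNode(Node("0001", INIT, allocated_to="req-1"))   # orphaned node
store.storeNode(Node("0002", READY, allocated_to="req-1"))  # only deallocated
store.storeNode(Node("0003", INIT, allocated_to="req-2"))   # other request

reset_lost_request(store, "req-1")
```

Nodes belonging to other requests are left untouched, since a node in 'init' is only known to be orphaned when the request it is tied to has itself been lost.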