545dc2106b
The agent command exec model is based upon an incoming heartbeat, however heartbeats are independent and commands can take a long time. For example, software RAID setup in CI can encounter this. From an IPA log: [-] Picked root device /dev/md0 for node c6ca0af2-baec-40d6-879d-cbb5c751aafb based on root device hints {'name': '/dev/md0'} [-] Attempting to download image from http://199.204.45.248:3928/agent_images/ c6ca0af2-baec-40d6-879d-cbb5c751aafb [-] Executing command: standby.get_partition_uuids with args: {} execute_command /usr/local/lib/python3.6/site-packages/ironic_python_agent/extensions/base.py:255 [-] Tried to execute standby.get_partition_uuids, agent is still executing Command name: execute_deploy_step, params: {'step': {'interface': 'deploy', 'step': 'write_image', 'args': {'image_info': {'id': 'cb9e199a-af1b-4a6f-b00e-f284008b8046', 'urls': ['http://199.204.45.248:3928/agent_images/c6ca0af2-baec-40d6-879d-cbb5c751aafb'], 'disk_format': 'raw', 'container_format': 'bare', 'stream_raw_images': True, 'os_hash_algo': 'sha512', 'os_hash_value':<trimed> This was with code built on master, using master images. Inside the conductor log, it notes that it is likely an out of date agent because only AgentAPIError is evaluated, however any API error is evaluated this way. In reality, we need to explicitly flag *when* we have an error that is because we've tried to soon as something is already being worked upon. The result, is to evaluate and return an exception indicating work is already in flight. Update - It looks like, the original fix to prevent busy agent recognition did not fully detect all cases as getting steps is a command which can get skipped by accident with a busy agent, under certain circumstances. Change I5d86878b5ed6142ed2630adee78c0867c49b663f in ironic-python-agent also changed the string that was being checked for the previous handling, where we really should have just made the string we were checking lower case in ironic. Oh well! This should fix things right up. Story: 2008167 Task: 41175 Change-Id: Ia169640b7084d17d26f22e457c7af512db6d21d6 |
||
---|---|---|
api-ref | ||
devstack | ||
doc | ||
etc | ||
ironic | ||
playbooks/ci-workarounds | ||
releasenotes | ||
tools | ||
zuul.d | ||
.gitignore | ||
.gitreview | ||
.mailmap | ||
.stestr.conf | ||
bindep.txt | ||
CONTRIBUTING.rst | ||
driver-requirements.txt | ||
LICENSE | ||
lower-constraints.txt | ||
README.rst | ||
reno.yaml | ||
requirements.txt | ||
setup.cfg | ||
setup.py | ||
test-requirements.txt | ||
tox.ini |
Ironic
Team and repository tags
Overview
Ironic consists of an API and plug-ins for managing and provisioning physical machines in a security-aware and fault-tolerant manner. It can be used with nova as a hypervisor driver, or standalone service using bifrost. By default, it will use PXE and IPMI to interact with bare metal machines. Ironic also supports vendor-specific plug-ins which may implement additional functionality.
Ironic is distributed under the terms of the Apache License, Version 2.0. The full terms and conditions of this license are detailed in the LICENSE file.
Project resources
- Documentation: https://docs.openstack.org/ironic/latest
- Source: https://opendev.org/openstack/ironic
- Bugs: https://storyboard.openstack.org/#!/project/943
- Wiki: https://wiki.openstack.org/wiki/Ironic
- APIs: https://docs.openstack.org/api-ref/baremetal/index.html
- Release Notes: https://docs.openstack.org/releasenotes/ironic/
- Design Specifications: https://specs.openstack.org/openstack/ironic-specs/
Project status, bugs, and requests for feature enhancements (RFEs) are tracked in StoryBoard: https://storyboard.openstack.org/#!/project/943
For information on how to contribute to ironic, see https://docs.openstack.org/ironic/latest/contributor