First pass at the new in-band inspection docs

Reorganize the existing docs to give space to more information.
Cover the most critical topics, except for installation.

Change-Id: If0f185e0303d6f8071306edbc64b9c5704f58d16
Dmitry Tantsur 2023-10-13 18:07:17 +02:00
parent 28b3f56b2c
commit 9c0996d1a3
7 changed files with 392 additions and 84 deletions

@ -1,5 +1,3 @@
.. _inspection:
=================== ===================
Hardware Inspection Hardware Inspection
=================== ===================
@ -14,12 +12,16 @@ discovered ethernet MACs. Operators will have to manually delete the Bare Metal
service ports for which physical media is not connected. This is required due service ports for which physical media is not connected. This is required due
to the `bug 1405131 <https://bugs.launchpad.net/ironic/+bug/1405131>`_. to the `bug 1405131 <https://bugs.launchpad.net/ironic/+bug/1405131>`_.
There are two kinds of inspection supported by Bare Metal service: There are three kinds of inspection supported by Bare Metal service:
#. Out-of-band inspection is currently implemented by several hardware types, #. Out-of-band inspection is currently implemented by several hardware types,
including ``ilo``, ``idrac`` and ``irmc``. including ``ilo``, ``idrac`` and ``irmc``.
#. `In-band inspection`_ by utilizing the ironic-inspector_ project. #. :doc:`In-band inspection </admin/inspection/inspector>` utilizing
the ironic-inspector_ project.
#. New experimental built-in :doc:`in-band inspection
</admin/inspection/index>`.
The node should be in the ``manageable`` state before inspection is initiated. The node should be in the ``manageable`` state before inspection is initiated.
If it is in the ``enroll`` or ``available`` state, move it to ``manageable`` If it is in the ``enroll`` or ``available`` state, move it to ``manageable``
@ -31,6 +33,8 @@ Then inspection can be initiated using the following command::
baremetal node inspect <node_UUID> baremetal node inspect <node_UUID>
.. _ironic-inspector: https://pypi.org/project/ironic-inspector
.. _capabilities-discovery: .. _capabilities-discovery:
Capabilities discovery Capabilities discovery
@ -76,84 +80,12 @@ In-band inspection
In-band inspection involves booting a ramdisk on the target node and fetching In-band inspection involves booting a ramdisk on the target node and fetching
information directly from it. This process is more fragile and time-consuming information directly from it. This process is more fragile and time-consuming
than the out-of-band inspection, but it is not vendor-specific and works than the out-of-band inspection, but it is not vendor-specific and works
across a wide range of hardware. In-band inspection is using the across a wide range of hardware.
ironic-inspector_ project.
It is supported by all hardware types, and used by default, if enabled, by the .. toctree::
``ipmi`` hardware type. The ``inspector`` *inspect* interface has to be
enabled to use it:
.. code-block:: ini inspection/inspector
[DEFAULT] .. toctree::
enabled_inspect_interfaces = inspector,no-inspect
If the ironic-inspector service is not registered in the service catalog, set inspection/index
the following option:
.. code-block:: ini
[inspector]
endpoint_override = http://inspector.example.com:5050
In order to ensure that ports in Bare Metal service are synchronized with
NIC ports on the node, the following settings in the ironic-inspector
configuration file must be set:
.. code-block:: ini
[processing]
add_ports = all
keep_ports = present
There are two modes of in-band inspection: `managed inspection`_ and `unmanaged
inspection`_.
.. _ironic-inspector: https://pypi.org/project/ironic-inspector
.. _python-ironicclient: https://pypi.org/project/python-ironicclient
Managed inspection
~~~~~~~~~~~~~~~~~~
Inspection is *managed* when the Bare Metal conductor fully configures the node
for inspection, including setting boot device, boot mode and power state. This
is the only way to conduct inspection using :ref:`redfish-virtual-media` or
with :doc:`/admin/dhcp-less`. This mode is engaged automatically when the node
has sufficient information to configure boot (e.g. ports in case of iPXE).
There are a few configuration options that tune managed inspection, the most
important is ``extra_kernel_params``, which allows adding kernel parameters for
inspection specifically. This is where you can configure
:ironic-python-agent-doc:`inspection collectors and other parameters
<admin/how_it_works.html#inspection>`, for example:
.. code-block:: ini
[inspector]
extra_kernel_params = ipa-inspection-collectors=default,logs ipa-collect-lldp=1
For the callback URL the ironic-inspector endpoint from the service catalog is
used. If you want to override the endpoint for callback only, set the following
option:
.. code-block:: ini
[inspector]
callback_endpoint_override = https://example.com/baremetal-introspection/v1/continue
Unmanaged inspection
~~~~~~~~~~~~~~~~~~~~
Under *unmanaged* inspection we understand in-band inspection orchestrated by
ironic-inspector or a third party. This was the only inspection mode before the
Ussuri release, and it is still used when the node's boot cannot be configured
by the conductor. The options described above do not affect unmanaged
inspection. See :ironic-inspector-doc:`ironic-inspector installation guide
<install/index.html>` for more information.
If you want to **prevent** unmanaged inspection from working, set this option:
.. code-block:: ini
[inspector]
require_managed_boot = True

@ -0,0 +1,89 @@
Inspection data
===============
The in-band inspection process collects a lot of information about the node.
This data consists of two parts:
* *Inventory* is :ironic-python-agent-doc:`hardware inventory
<admin/how_it_works.html#hardware-inventory>` reported by the agent.
* *Plugin data* is data populated by ramdisk-side and server-side plug-ins.
After a successful inspection, you can get both parts as JSON with:
.. code-block:: console
$ baremetal node inventory save <NODE>
Use ``jq`` to extract only the parts you need, e.g. a single inventory field or a plugin data entry:
.. code-block:: console
$ # System vendor information from the inventory
$ baremetal node inventory save <NODE> | jq .inventory.system_vendor
{
"product_name": "KVM (9.2.0)",
"serial_number": "",
"manufacturer": "Red Hat",
"firmware": {
"vendor": "EDK II",
"version": "edk2-20221207gitfff6d81270b5-7.el9",
"build_date": "12/07/2022"
}
}
$ # Interfaces used to create ports
$ baremetal node inventory save <NODE> | jq .plugin_data.valid_interfaces
{
"eth0": {
"name": "eth0",
"mac_address": "52:54:00:5e:09:ff",
"ipv4_address": "192.168.122.164",
"ipv6_address": "fe80::5054:ff:fe5e:9ff",
"has_carrier": true,
"lldp": null,
"vendor": "0x1af4",
"product": "0x0001",
"client_id": null,
"biosdevname": null,
"speed_mbps": null,
"pxe_enabled": true
}
}
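The same filtering can be done in a script. The following is a minimal sketch, assuming the saved JSON has the top-level ``inventory`` and ``plugin_data`` keys used in the ``jq`` examples above. The helper name ``pxe_macs`` and the trimmed sample document are illustrative only, and the plugin data format may change with your configuration.

```python
import json

# Sample trimmed from the output shown above; a real script would read the
# JSON printed by `baremetal node inventory save <NODE>` instead.
SAMPLE = """
{
  "inventory": {"system_vendor": {"manufacturer": "Red Hat"}},
  "plugin_data": {
    "valid_interfaces": {
      "eth0": {
        "name": "eth0",
        "mac_address": "52:54:00:5e:09:ff",
        "has_carrier": true,
        "pxe_enabled": true
      }
    }
  }
}
"""


def pxe_macs(data):
    """Return MAC addresses of valid interfaces that had PXE enabled.

    Hypothetical helper: it tolerates missing keys because the plugin data
    format is not covered by the API stability promise.
    """
    interfaces = data.get("plugin_data", {}).get("valid_interfaces") or {}
    return sorted(
        iface["mac_address"]
        for iface in interfaces.values()
        if iface.get("pxe_enabled")
    )


print(pxe_macs(json.loads(SAMPLE)))
```

Reading the keys defensively keeps the script working when a hook that populates a field is not enabled.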
Plugin data
-----------
Plugin data is the storage for all information that is collected or processed
by various plugins. Its format is not a part of the API stability promise
and may change depending on your configuration.
Plugin data comes from two sources:
* :ironic-python-agent-doc:`inspection collectors
<admin/how_it_works.html#inspection-data>` - ramdisk-side inspection
plug-ins.
* :doc:`hooks` - server-side inspection plug-ins.
.. TODO(dtantsur): inspection rules API once it's ready
Data storage
------------
There are several options to store the inspection data, specified via the
:oslo.config:option:`inventory.data_backend` option:
``none``
Do not store inspection data at all. The API will always return 404 NOT
FOUND.
``database``
Store inspection data in a separate table in the main database.
``swift``
Store inspection data in the Object Store service (swift) in the container
specified by the :oslo.config:option:`inventory.swift_data_container`
option.
.. warning::
There is currently no way to migrate data between backends. Changing the
backend will remove access to existing data.

@ -0,0 +1,142 @@
Inspection hooks
================
*Inspection hooks* are a type of Bare Metal service plug-in responsible
for processing data from in-band inspection. By configuring these hooks, an
operator can fully customize the inspection processing phase. How the data is
collected can be configured with `inspection collectors
<https://docs.openstack.org/ironic-python-agent/latest/admin/how_it_works.html#inspection-data>`_.
Configuring hooks
-----------------
Two configuration options are responsible for inspection hooks:
:oslo.config:option:`inspector.default_hooks` defines which hooks run by
default, while :oslo.config:option:`inspector.hooks` defines which hooks to run
in your deployment. Only the second option should be modified by operators,
while the first one is to provide the defaults without hardcoding them:
.. code-block:: ini
[inspector]
hooks = $default_hooks
To make a hook run after the default ones, append it to the list, for example:
.. code-block:: ini
[inspector]
hooks = $default_hooks,extra-hardware
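To illustrate the resulting order, here is a rough sketch of how the value above expands. It assumes the default list matches the order given in the "Default hooks" section of this document. ``expand_hooks`` is a hypothetical helper written for this illustration, not the substitution code the service actually uses.

```python
# Order taken from the "Default hooks" section of this document; check the
# configuration reference for the real default value of your release.
DEFAULT_HOOKS = "ramdisk-error,validate-interfaces,ports,architecture"


def expand_hooks(value, default_hooks=DEFAULT_HOOKS):
    """Expand ``$default_hooks`` and return the hook names in list order."""
    expanded = value.replace("$default_hooks", default_hooks)
    return [name.strip() for name in expanded.split(",") if name.strip()]


print(expand_hooks("$default_hooks,extra-hardware"))
```

With the assumed defaults, ``extra-hardware`` ends up last, after the four default hooks.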
Default hooks
-------------
In the order they go in the :oslo.config:option:`inspector.default_hooks`
option:
``ramdisk-error``
Processes the ``error`` field from the ramdisk, aborting inspection if
it is not empty.
``validate-interfaces``
Validates network interfaces and stores the result in the ``plugin_data``
in two fields:
* ``all_interfaces`` - all interfaces that pass the basic sanity check.
* ``valid_interfaces`` - interfaces that satisfy the configuration
in the :oslo.config:option:`inspector.add_ports` option.
In both cases, interfaces get an additional field:
* ``pxe_enabled`` - whether PXE was enabled on this interface during
the inspection boot.
``ports``
Creates ports for interfaces in ``valid_interfaces`` as set by the
``validate-interfaces`` hook.
Deletes ports that don't match the
:oslo.config:option:`inspector.keep_ports` setting.
``architecture``
Populates the ``cpu_arch`` property on the node.
Optional hooks
--------------
``accelerators``
Populates the ``accelerators`` property based on the reported PCI devices.
The known accelerators are specified in the YAML file linked in the
:oslo.config:option:`inspector.known_accelerators` option. The default
file is the following:
.. literalinclude:: ../../../../ironic/drivers/modules/inspector/hooks/known_accelerators.yaml
:language: YAML
``boot-mode``
Sets the ``boot_mode`` capability based on the observed boot mode, see
:ref:`boot_mode_support`.
``cpu-capabilities``
Uses the CPU flags to :ref:`discover CPU capabilities
<capabilities-discovery>`. The exact mapping can be customized via
configuration:
.. code-block:: ini
[inspector]
cpu_capabilities = vmx:cpu_vt,svm:cpu_vt
See :oslo.config:option:`inspector.cpu_capabilities` for the default
mapping.
``extra-hardware``
Converts the data collected by python-hardware_ from its raw format
into nested dictionaries under the ``extra`` plugin data field.
``local-link-connection``
Uses the LLDP information from the ramdisk to populate the
``local_link_connection`` field on ports with the physical switch
information.
``memory``
Populates the ``memory_mb`` property based on physical RAM information
from DMI.
``parse-lldp``
Parses the raw binary LLDP information from the ramdisk and populates
the ``parsed_lldp`` dictionary in plugin data. The keys are network
interface names, the values are dictionaries with LLDP values. Example:
.. code-block:: json
"parsed_lldp": {
"eth0": {
"switch_chassis_id": "11:22:33:aa:bb:cc",
"switch_system_name": "sw01-dist-1b-b12"
}
}
``pci-devices``
Populates the capabilities based on PCI devices. The mapping is provided
by the :oslo.config:option:`inspector.pci_device_alias` option.
``physical-network``
Populates the ``physical_network`` port field for
:doc:`/admin/multitenancy` based on the detected IP addresses. The mapping
is provided by the
:oslo.config:option:`inspector.physical_network_cidr_map` option.
``raid-device``
Detects the newly created RAID device and populates the ``root_device``
property used in :ref:`root device hints <root-device-hints>`. Requires two
inspections: one before and one after the RAID creation.
``root-device``
Uses :ref:`root device hints <root-device-hints>` on the node and the
storage device information from the ramdisk to calculate the expected root
device and populate the ``local_gb`` property (taking the
:oslo.config:option:`inspector.disk_partitioning_spacing` option into
account).
.. _python-hardware: https://github.com/redhat-cip/hardware
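As an illustration of the ``cpu-capabilities`` mapping format described above, the sketch below parses a ``flag:capability`` list and applies it to a set of CPU flags. Both function names are hypothetical, and storing the capability value as the string ``"true"`` is an assumption made for this example. The real hook may behave differently.

```python
def parse_mapping(value):
    """Parse a ``flag:capability`` comma-separated list into a dict."""
    mapping = {}
    for item in value.split(","):
        flag, _, capability = item.strip().partition(":")
        if flag and capability:
            mapping[flag] = capability
    return mapping


def capabilities_from_flags(cpu_flags, mapping):
    """Return capabilities whose CPU flag is present (value is assumed)."""
    return {mapping[flag]: "true" for flag in cpu_flags if flag in mapping}


mapping = parse_mapping("vmx:cpu_vt,svm:cpu_vt")
print(capabilities_from_flags(["fpu", "vmx", "sse2"], mapping))
```

Both ``vmx`` and ``svm`` map to the same capability, so a node gets ``cpu_vt`` when either flag is reported.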

@ -0,0 +1,52 @@
==================
In-Band Inspection
==================
In-band inspection involves booting a ramdisk on the target node and fetching
information directly from it. This process is more fragile and time-consuming
than the out-of-band inspection, but it is not vendor-specific and works
across a wide range of hardware.
In the 2023.2 "Bobcat" release series, Ironic received an experimental
implementation of in-band inspection that does not require the separate
ironic-inspector_ service.
.. note::
The implementation described in this document is not 100% compatible with
the previous one (based on ironic-inspector_). Check the documentation and
the release notes for which features are currently available.
Use :doc:`inspector` for production deployments of Ironic 2023.2 or earlier
releases.
.. _ironic-inspector: https://pypi.org/project/ironic-inspector
.. toctree::
managed
data
hooks
Configuration
-------------
In-band inspection is supported by all hardware types. The ``agent``
*inspect* interface has to be enabled to use it:
.. code-block:: ini
[DEFAULT]
enabled_inspect_interfaces = agent,no-inspect
You can make it the default if you want all nodes to use it automatically:
.. code-block:: ini
[DEFAULT]
default_inspect_interface = agent
You can also configure it per node:
.. code-block:: console
$ baremetal node set --inspect-interface agent <NODE>

@ -0,0 +1,85 @@
Inspector Support
=================
Ironic supports in-band inspection using the ironic-inspector_ project. This
is the original in-band inspection implementation, which is being gradually
phased out in favour of a similar implementation inside Ironic proper.
It is supported by all hardware types, and used by default, if enabled, by the
``ipmi`` hardware type. The ``inspector`` *inspect* interface has to be
enabled to use it:
.. code-block:: ini
[DEFAULT]
enabled_inspect_interfaces = inspector,no-inspect
If the ironic-inspector service is not registered in the service catalog, set
the following option:
.. code-block:: ini
[inspector]
endpoint_override = http://inspector.example.com:5050
In order to ensure that ports in Bare Metal service are synchronized with
NIC ports on the node, the following settings in the ironic-inspector
configuration file must be set:
.. code-block:: ini
[processing]
add_ports = all
keep_ports = present
There are two modes of in-band inspection: `managed inspection`_ and `unmanaged
inspection`_.
.. _ironic-inspector: https://pypi.org/project/ironic-inspector
.. _python-ironicclient: https://pypi.org/project/python-ironicclient
Managed inspection
~~~~~~~~~~~~~~~~~~
Inspection is *managed* when the Bare Metal conductor fully configures the node
for inspection, including setting boot device, boot mode and power state. This
is the only way to conduct inspection using :ref:`redfish-virtual-media` or
with :doc:`/admin/dhcp-less`. This mode is engaged automatically when the node
has sufficient information to configure boot (e.g. ports in case of iPXE).
There are a few configuration options that tune managed inspection, the most
important is ``extra_kernel_params``, which allows adding kernel parameters for
inspection specifically. This is where you can configure
:ironic-python-agent-doc:`inspection collectors and other parameters
<admin/how_it_works.html#inspection>`, for example:
.. code-block:: ini
[inspector]
extra_kernel_params = ipa-inspection-collectors=default,logs ipa-collect-lldp=1
For the callback URL the ironic-inspector endpoint from the service catalog is
used. If you want to override the endpoint for callback only, set the following
option:
.. code-block:: ini
[inspector]
callback_endpoint_override = https://example.com/baremetal-introspection/v1/continue
Unmanaged inspection
~~~~~~~~~~~~~~~~~~~~
*Unmanaged* inspection means in-band inspection orchestrated by
ironic-inspector or a third party. This was the only inspection mode before the
Ussuri release, and it is still used when the node's boot cannot be configured
by the conductor. The options described above do not affect unmanaged
inspection. See :ironic-inspector-doc:`ironic-inspector installation guide
<install/index.html>` for more information.
If you want to **prevent** unmanaged inspection from working, set this option:
.. code-block:: ini
[inspector]
require_managed_boot = True

@ -0,0 +1,8 @@
Managed and unmanaged inspection
================================
In-band inspection can be *managed* or *unmanaged*. Please see :doc:`the
Inspector documentation <inspector>` for information on these concepts and
how to configure them.
.. TODO(dtantsur): migrate that information here once inspector is deprecated

@ -695,9 +695,9 @@ Add node ``clean_step`` field.
1.6 (Kilo) 1.6 (Kilo)
---------- ----------
Add :ref:`inspection` process: introduce ``inspecting`` and ``inspectfail`` Add :doc:`inspection </admin/inspection>` process: introduce ``inspecting`` and
provision states, and ``inspect`` action that can be used when a node is in ``inspectfail`` provision states, and ``inspect`` action that can be used when
``manageable`` provision state. a node is in ``manageable`` provision state.
1.5 (Kilo) 1.5 (Kilo)
---------- ----------