
The mtcClient is required to 'start host services' autonomously
following a node reboot. This handles the use case where the
administrator disables maintenance heartbeat loss auto recovery.
If that node then reboots on its own, for whatever reason,
maintenance needs to ensure that it auto starts 'host services'.

A fairly recent update delivered support for that use case:
https://opendev.org/starlingx/metal/commit/1335bc484df331771e995ae822df3af84cc5739d

However, the mechanism the mtcClient used to manage auto-starting
host services did not handle the worker subfunction case.

Moreover, the current implementation does not handle the potential
concurrency between the mtcClient process startup case and mtcAgent
requests during unlock recovery.

This change also fixes an issue where the mtcClient sometimes gets
into a mode where it floods the mtcAgent with a start host services
result message; 20 unnecessary messages / sec. The aforementioned
update modified the mtcAgent to log receipt of this message, which
then floods the mtcAgent log, leading to unnecessary message
handling and log rotations.

Test Plan:

Success Path:

PASS: Verify mtcClient success path handling of start and stop
      host services function for the various node types in a ...
      - standard system with worker and storage nodes
      - all-in-one system with worker node
PASS: Verify appropriate start host services are run on each node
      type following a Dead Office Recovery (DOR).
      - standard system with worker and storage nodes
      - all-in-one system with worker node
PASS: Verify the mtcClient does not unnecessarily send host
      services result messages.
PASS: Verify handling of periodic start host services message
      while a node is in service.

Failure Path:

PASS: Verify mtcClient failure path handling of start and stop
      host services function for the various node types in a ...
      - standard system with worker and storage nodes
      - all-in-one system with worker node
PASS: Verify mtcClient start host services command handling when
      message requests interleave with auto start handling during
      unlock recovery.

Closes-Bug: 2073802
Change-Id: I0da7a16c1f600cc60364f6bcec7587e2ff71c624
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>