fault/fm-common/sources
Eric MacDonald 54f9fed7c3 Set 5 second socket read timeout
FM messaging socket reads triggered by FM API calls from client
services have occasionally been seen to block (stall) the fmManager
process. An fmManager stall can in turn stall other client service
processes, which in the case of mtcAgent has been seen to lead to an
uncontrolled switch of activity (Swact).

This update adds a 5 second socket read timeout to FM's client services
socket setup to avoid the prolonged blocking cases that lead to Swact
or adversely affect (block) other client service process execution.
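
As an illustration only, a read timeout of this kind can be set on Linux
with the SO_RCVTIMEO socket option. The sketch below is not the code in
fmSocket.cpp; the helper names set_read_timeout and read_timed_out are
hypothetical and exist only for this example.

```cpp
#include <cerrno>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper (not the actual fmSocket.cpp code): apply a receive
// timeout so a blocking recv()/read() returns instead of stalling forever.
// Returns true on success, false if setsockopt() fails (errno is set).
bool set_read_timeout(int fd, int seconds) {
    struct timeval tv;
    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0;
}

// Hypothetical helper: attempt one read and report whether it failed
// because the receive timeout expired (EAGAIN or EWOULDBLOCK).
bool read_timed_out(int fd, void *buf, size_t len) {
    ssize_t n = recv(fd, buf, len, 0);
    return n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```

A caller would invoke set_read_timeout(fd, 5) once during socket setup,
log an error if it returns false, and treat a timed-out read as a
failed request rather than waiting indefinitely.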

Setting a read timeout on Linux sockets is good programming practice.
It helps ensure that applications, here FM and its client services,
do not hang indefinitely if a network operation such as a socket read
becomes unresponsive.

Configuring a timeout helps manage network communication reliability
and efficiency, especially where responsiveness is critical, as in a
client-server application such as FM.

Test Plan:

PASS: Verify AIO DX system install.
PASS: Verify blocked socket timeout and error log after 5 seconds.
PASS: Verify unblocked socket reads complete successfully.
PASS: Verify alarm assert/clear functions operate normally.
PASS: Verify set socket timeout failure handling.
PASS: Verify fmManager is not leaking files or memory.
PASS: Verify rook-ceph apply/remove 100-loop soak
      - no stall or swact
      - AIO DX
      - with 2 OSDs on each controller

Closes-Bug: 2088025
Change-Id: I1d947bccf9faeedcc2b96c7bc398fbab77b7ae09
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2024-11-14 11:20:17 +00:00
fm_cli.cpp Fix indentation and style issues in fm_cli 2019-09-16 23:35:47 -05:00
fm_db_sync_event_suppression.py Support newer version of yaml 2023-05-31 16:36:27 +00:00
fm_log.py Use alternate log handler if running in container 2021-11-19 10:07:36 -03:00
fm_python_mod_main.cpp Add new FM API methods 2024-08-12 16:43:59 -03:00
fmAlarm.h StarlingX open source release updates 2018-05-31 07:36:00 -07:00
fmAlarmUtils.cpp Fix indentation and style issues in fmAlarmUtils 2019-09-17 00:22:37 -05:00
fmAlarmUtils.h Fix indentation and style issues in fmAlarmUtils 2019-09-17 00:22:37 -05:00
fmAPI.cpp Add new FM API methods 2024-08-12 16:43:59 -03:00
fmAPI.h Fix stx-fm-subagent image build failure 2024-08-14 21:25:37 -03:00
fmConfig.cpp Update FM Manager old net-snmp related code 2020-12-17 10:03:38 -03:00
fmConfig.h Decouple Fault Management from stx-config 2018-08-16 13:23:33 -04:00
fmConstants.h Reimplementation logic for trap generation 2020-12-08 18:37:50 -03:00
fmDb.cpp Fix clear_fault/clear_all in FaultAPIsV2 raise exception when alarm is 2019-04-01 01:31:49 +08:00
fmDb.h Fix clear_fault/clear_all in FaultAPIsV2 raise exception when alarm is 2019-04-01 01:31:49 +08:00
fmDbAlarm.cpp Add new FM API methods 2024-08-12 16:43:59 -03:00
fmDbAlarm.h Add new FM API methods 2024-08-12 16:43:59 -03:00
fmDbAPI.h StarlingX open source release updates 2018-05-31 07:36:00 -07:00
fmDbEventLog.cpp replace strncpy by snprintf 2018-11-03 00:56:57 +08:00
fmDbEventLog.h Decouple Fault Management from stx-config 2018-08-16 13:23:33 -04:00
fmDbUtils.cpp Add new FM API methods 2024-08-12 16:43:59 -03:00
fmDbUtils.h Decouple Fault Management from stx-config 2018-08-16 13:23:33 -04:00
fmEventSuppression.cpp Fix clear_fault/clear_all in FaultAPIsV2 raise exception when alarm is 2019-04-01 01:31:49 +08:00
fmEventSuppression.h Decouple Fault Management from stx-config 2018-08-16 13:23:33 -04:00
fmFile.cpp StarlingX open source release updates 2018-05-31 07:36:00 -07:00
fmFile.h StarlingX open source release updates 2018-05-31 07:36:00 -07:00
fmLog.cpp Add "proposed_repair_action" attr to event log file 2023-11-02 19:07:49 +00:00
fmLog.h Decouple Fault Management from stx-config 2018-08-16 13:23:33 -04:00
fmMsg.h Add new FM API methods 2024-08-12 16:43:59 -03:00
fmMsgServer.cpp Add new FM API methods 2024-08-12 16:43:59 -03:00
fmMsgServer.h Add new FM API methods 2024-08-12 16:43:59 -03:00
fmMutex.cpp StarlingX open source release updates 2018-05-31 07:36:00 -07:00
fmMutex.h StarlingX open source release updates 2018-05-31 07:36:00 -07:00
fmSnmpConstants.h Add missing fields for Event traps 2023-08-31 12:01:35 -03:00
fmSnmpUtils.cpp Add missing fields for Event traps 2023-08-31 12:01:35 -03:00
fmSnmpUtils.h Update FM Manager old net-snmp related code 2020-12-17 10:03:38 -03:00
fmSocket.cpp Set 5 second socket read timeout 2024-11-14 11:20:17 +00:00
fmSocket.h Set 5 second socket read timeout 2024-11-14 11:20:17 +00:00
fmThread.cpp StarlingX open source release updates 2018-05-31 07:36:00 -07:00
fmThread.h StarlingX open source release updates 2018-05-31 07:36:00 -07:00
fmTime.cpp Fix zuul jobs broken due to pip upversion 2020-12-18 08:41:07 -06:00
fmTime.h StarlingX open source release updates 2018-05-31 07:36:00 -07:00
LICENSE StarlingX open source release updates 2018-05-31 07:36:00 -07:00
Makefile Restrict fmClientCli binary permissions 2022-09-28 11:19:55 -03:00
requirements.txt Use DevStack's setup_*() functions for Python packages 2019-05-17 13:11:58 -05:00
setup.cfg Use DevStack's setup_*() functions for Python packages 2019-05-17 13:11:58 -05:00
setup.py Add hooks for python wheel generation 2018-10-24 17:12:50 +00:00