Add collectd memory plugin entity IDs to fm-doc

This update adds the following collectd memory
plugin instance-based alarm entity IDs to
fm-doc's events.yaml file.

  host=<hostname>.memory=platform
  host=<hostname>.memory=total
  host=<hostname>.numa=node<number>

It also removes the obsolete MINOR threshold
level from the cpu, memory and filesystem alarm
definitions.

Change-Id: I259ba5c84ff90c3e1acd82fc7e72ba63a2fab50a
Partial-Bug: 1903731
Depends-On: https://review.opendev.org/c/starlingx/monitoring/+/764388
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Eric MacDonald 2020-11-16 12:16:02 -05:00
parent 42a9cfa882
commit 3b341c6e5e


@@ -93,9 +93,8 @@
 Platform CPU threshold exceeded; threshold x%, actual y% .
 CRITICAL @ 95%
 MAJOR @ 90%
-MINOR @ 80%
 Entity_Instance_ID: host=<hostname>
-Severity: [critical, major, minor]
+Severity: [critical, major]
 Proposed_Repair_Action: "Monitor and if condition persists, contact next level of support."
 Maintenance_Action:
 critical: degrade
@@ -135,9 +134,15 @@
 Memory threshold exceeded; threshold x%, actual y% .
 CRITICAL @ 90%
 MAJOR @ 80%
-MINOR @ 70%
-Entity_Instance_ID: host=<hostname>
-Severity: [critical, major, minor]
+Entity_Instance_ID: |-
+host=<hostname>
+OR
+host=<hostname>.memory=total
+OR
+host=<hostname>.memory=platform
+OR
+host=<hostname>.numa=node<number>
+Severity: [critical, major]
 Proposed_Repair_Action: "Monitor and if condition persists, contact next level of support; may require additional memory on Host."
 Maintenance_Action:
 critical: degrade
@@ -157,7 +162,6 @@
 File System threshold exceeded; threshold x%, actual y% .
 CRITICAL @ 90%
 MAJOR @ 80%
-MINOR @ 70%
 OR
 host=<hostname>.volumegroup=<volumegroup-name>
 Monitor and if condition persists, consider adding additional physical volumes to the volume group.
@@ -165,7 +169,7 @@
 host=<hostname>.filesystem=<mount-dir>
 OR
 host=<hostname>.volumegroup=<volumegroup-name>
-Severity: [critical, major, minor]
+Severity: [critical, major]
 Proposed_Repair_Action: "Monitor and if condition persists, contact next level of support."
 Maintenance_Action:
 critical: degrade
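
For illustration only, the four memory alarm Entity_Instance_ID forms in the hunk above could be recognized with a small parser like the one below. This is a sketch, not code from fm-doc or the collectd plugin: the names `MEMORY_ENTITY_RE` and `parse_memory_entity` are hypothetical, and the pattern assumes that `<hostname>` contains no `.` or `=` characters and that `<number>` is a decimal NUMA node index.

```python
import re

# Hypothetical pattern for the four documented forms:
#   host=<hostname>
#   host=<hostname>.memory=total
#   host=<hostname>.memory=platform
#   host=<hostname>.numa=node<number>
# Assumption: <hostname> contains neither '.' nor '='.
MEMORY_ENTITY_RE = re.compile(
    r"^host=(?P<host>[^.=]+)"
    r"(?:\.(?:memory=(?P<memory>total|platform)|numa=node(?P<node>\d+)))?$"
)


def parse_memory_entity(entity_id):
    """Return (host, scope) for a memory alarm entity ID, or None if it does not match."""
    m = MEMORY_ENTITY_RE.match(entity_id)
    if m is None:
        return None
    if m.group("memory"):
        scope = "memory=" + m.group("memory")
    elif m.group("node") is not None:
        scope = "numa=node" + m.group("node")
    else:
        # Bare host form, the only form documented before this change.
        scope = "host"
    return m.group("host"), scope
```

A caller could use the returned scope to tell a host-wide memory alarm apart from a platform or per-NUMA-node one; anything outside the four documented forms yields `None`.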