systemd: fix rate-limiting of mount events

Backport the patches for this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1968528
It reports:
The fix for Bug 1819868 has introduced a new issue related to its
implementation of rate limiting.
Rate limiting the mount_event_source can cause unmount events to be
missed, which leads to mount unit cgroups being leaked (not cleaned up
when the mount is gone).

The fix for 1968528 also resolves the issue we hit:
While subclouds reboot (via either lock-unlock or "sudo reboot"),
unmount failure messages repeat a few hundred times.

The patches are listed at:
https://github.com/redhat-plumbers/systemd-rhel8/pull/198/commits
They were picked from https://github.com/systemd-rhel/rhel-8/ (branch
rhel-8.4.0).

Verification:
  On an AIO-SX lab, the bug reproduces as follows: running
  "sudo reboot" on the controller prints unmount failure logs
  endlessly during the shutdown phase of the reboot.
  After reinstalling with a fixed image, the issue no longer
  appeared across 5 reboots. Ran sanity on the lab and saw no
  new issues.

Closes-Bug: #1948899
Signed-off-by: Li Zhou <li.zhou@windriver.com>
Change-Id: If95932ceead1bea973f2219d3a8d6b04cf0fd5f8
Li Zhou 2021-10-26 04:41:24 -04:00
parent a33039e3da
commit bed1e46362
7 changed files with 435 additions and 5 deletions


@ -1,4 +1,4 @
-From 8ca176c7d8a3d21b928db29744f950c3cb336658 Mon Sep 17 00:00:00 2001
+From b2c505a9b2a532974b5f69332bd87f03087d74a4 Mon Sep 17 00:00:00 2001
 From: Li Zhou <li.zhou@windriver.com>
 Date: Wed, 21 Apr 2021 14:41:27 +0800
 Subject: [PATCH] Add STX patches
@ -6,14 +6,14 @@ Subject: [PATCH] Add STX patches
 Signed-off-by: Jim Somerville <Jim.Somerville@windriver.com>
 Signed-off-by: Li Zhou <li.zhou@windriver.com>
 ---
-SPECS/systemd.spec | 59 ++++++++++++++++++++++++++++++++++++++++++++++
+SPECS/systemd.spec | 68 ++++++++++++++++++++++++++++++++++++++++++++++
-1 file changed, 59 insertions(+)
+1 file changed, 68 insertions(+)
 diff --git a/SPECS/systemd.spec b/SPECS/systemd.spec
-index 9ed7715..bd14767 100644
+index 07e022e..a9b6fe0 100644
 --- a/SPECS/systemd.spec
 +++ b/SPECS/systemd.spec
-@@ -884,6 +884,65 @@ Patch0842: 0842-core-don-t-update-unit-description-if-it-is-already-.patch
+@@ -884,6 +884,74 @@ Patch0842: 0842-core-don-t-update-unit-description-if-it-is-already-.patch
 Patch0843: 0843-unit-don-t-emit-PropertiesChanged-signal-if-adding-a.patch
 Patch0844: 0844-core-fix-unnecessary-fallback-to-the-rescue-mode-cau.patch
 Patch0845: 0845-core-Detect-initial-timer-state-from-serialized-data.patch
@ -75,6 +75,15 @@ index 9ed7715..bd14767 100644
 +# upstream patches as unmodified as possible to facilitate maintaining them, so instead
 +# of individually changing them for compilation, we just have one patch at the end to do it.
 +Patch0921: 921-systemd-Fix-compiling-errors-when-merging-1819868.patch
++
++# This cluster of patches relates to fixing redhat bug #1968528
++# "fix rate-limiting of mount events"
++Patch0922: 922-sd-event-change-ordering-of-pending-ratelimited-even.patch
++Patch0923: 923-sd-event-drop-unnecessary-else.patch
++Patch0924: 924-sd-event-use-CMP-macro.patch
++Patch0925: 925-sd-event-use-usec_add.patch
++Patch0926: 926-sd-event-make-event_source_time_prioq_reshuffle-acce.patch
++Patch0927: 927-sd-event-always-reshuffle-time-prioq-on-changing-onl.patch
 +
 Patch9999: 9999-Update-kernel-install-script-by-backporting-fedora-p.patch


@ -0,0 +1,104 @@
From 762ba1d9cd3571f294965cb86525999e81fdec5d Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Tue, 8 Jun 2021 00:07:51 -0700
Subject: [PATCH 1/6] sd-event: change ordering of pending/ratelimited events
Instead of ordering non-pending before pending we should order
"non-pending OR ratelimited" before "pending AND not-ratelimited".
This fixes a bug where ratelimited events were ordered at the end of the
priority queue and could be stuck there for an indeterminate amount of
time.
(cherry picked from commit 81107b8419c39f726fd2805517a5b9faab204e59)
Related: #1984406
[commit 93de7820843c175f4c9661dbfcb312e8ee09fbd3 from
https://github.com/systemd-rhel/rhel-8/ (branch rhel-8.4.0)]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 48 +++++++++++++-----------------
1 file changed, 20 insertions(+), 28 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 282b38f..fcf333e 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -352,25 +352,6 @@ static usec_t time_event_source_next(const sd_event_source *s) {
return USEC_INFINITY;
}
-static int earliest_time_prioq_compare(const void *a, const void *b) {
- const sd_event_source *x = a, *y = b;
-
- /* Enabled ones first */
- if (x->enabled != SD_EVENT_OFF && y->enabled == SD_EVENT_OFF)
- return -1;
- if (x->enabled == SD_EVENT_OFF && y->enabled != SD_EVENT_OFF)
- return 1;
-
- /* Move the pending ones to the end */
- if (!x->pending && y->pending)
- return -1;
- if (x->pending && !y->pending)
- return 1;
-
- /* Order by time */
- return CMP(time_event_source_next(x), time_event_source_next(y));
-}
-
static usec_t time_event_source_latest(const sd_event_source *s) {
assert(s);
@@ -389,7 +370,15 @@ static usec_t time_event_source_latest(const sd_event_source *s) {
return USEC_INFINITY;
}
-static int latest_time_prioq_compare(const void *a, const void *b) {
+static bool event_source_timer_candidate(const sd_event_source *s) {
+ assert(s);
+
+ /* Returns true for event sources that either are not pending yet (i.e. where it's worth to mark them pending)
+ * or which are currently ratelimited (i.e. where it's worth leaving the ratelimited state) */
+ return !s->pending || s->ratelimited;
+}
+
+static int time_prioq_compare(const void *a, const void *b, usec_t (*time_func)(const sd_event_source *s)) {
const sd_event_source *x = a, *y = b;
/* Enabled ones first */
@@ -398,19 +387,22 @@ static int latest_time_prioq_compare(const void *a, const void *b) {
if (x->enabled == SD_EVENT_OFF && y->enabled != SD_EVENT_OFF)
return 1;
- /* Move the pending ones to the end */
- if (!x->pending && y->pending)
+ /* Order "non-pending OR ratelimited" before "pending AND not-ratelimited" */
+ if (event_source_timer_candidate(x) && !event_source_timer_candidate(y))
return -1;
- if (x->pending && !y->pending)
+ if (!event_source_timer_candidate(x) && event_source_timer_candidate(y))
return 1;
/* Order by time */
- if (time_event_source_latest(x) < time_event_source_latest(y))
- return -1;
- if (time_event_source_latest(x) > time_event_source_latest(y))
- return 1;
+ return CMP(time_func(x), time_func(y));
+}
- return 0;
+static int earliest_time_prioq_compare(const void *a, const void *b) {
+ return time_prioq_compare(a, b, time_event_source_next);
+}
+
+static int latest_time_prioq_compare(const void *a, const void *b) {
+ return time_prioq_compare(a, b, time_event_source_latest);
}
static int exit_prioq_compare(const void *a, const void *b) {
--
2.17.1
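
The ordering rule described in the patch above ("non-pending OR ratelimited" before "pending AND not-ratelimited") can be sketched in isolation. This is a minimal illustration and not the patched function: `sketch_source` and its fields are cut-down, hypothetical stand-ins for `sd_event_source`, `enabled` stands in for `enabled != SD_EVENT_OFF`, and `next` stands in for the value of `time_event_source_next()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for sd_event_source: only the fields the comparator reads. */
typedef struct sketch_source {
        bool enabled;      /* stands in for "enabled != SD_EVENT_OFF" */
        bool pending;
        bool ratelimited;
        uint64_t next;     /* stands in for time_event_source_next() */
} sketch_source;

/* True for sources the timer path should still look at: not yet pending,
 * or waiting to leave the ratelimited state. */
static bool sketch_timer_candidate(const sketch_source *s) {
        assert(s);
        return !s->pending || s->ratelimited;
}

/* Three-level ordering: enabled first, then timer candidates, then by time. */
static int sketch_time_compare(const sketch_source *x, const sketch_source *y) {
        if (x->enabled != y->enabled)
                return x->enabled ? -1 : 1;
        if (sketch_timer_candidate(x) != sketch_timer_candidate(y))
                return sketch_timer_candidate(x) ? -1 : 1;
        return (x->next > y->next) - (x->next < y->next);
}

static void sketch_ordering_selfcheck(void) {
        sketch_source limited = { .enabled = true, .pending = true, .ratelimited = true, .next = 500 };
        sketch_source dispatching = { .enabled = true, .pending = true, .ratelimited = false, .next = 100 };
        sketch_source armed = { .enabled = true, .pending = false, .ratelimited = false, .next = 300 };

        /* A ratelimited source no longer sinks behind a plain pending one. */
        assert(sketch_time_compare(&limited, &dispatching) < 0);
        /* Among timer candidates, the earlier time still wins. */
        assert(sketch_time_compare(&armed, &limited) < 0);
}
```

Under the old "pending ones last" rule, `limited` and `dispatching` would both have landed in the pending bucket and been ordered by time alone, which is how a ratelimited source could stay at the tail of the queue.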


@ -0,0 +1,35 @@
From 9824f4e131b5ffea0be23dd25b24b953314f1a79 Mon Sep 17 00:00:00 2001
From: Yu Watanabe <watanabe.yu+github@gmail.com>
Date: Tue, 15 Jun 2021 00:44:04 +0900
Subject: [PATCH 2/6] sd-event: drop unnecessary "else"
(cherry picked from commit 7e2bf71ca3638e36ee33215ceee386ba8013da6d)
Related: #1984406
[commit 3e7e54c63236c65aa01bb332fd5135a13e51b992 from
https://github.com/systemd-rhel/rhel-8/ (branch rhel-8.4.0)]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index fcf333e..9b6d2f0 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -2199,8 +2199,8 @@ static int event_arm_timer(
if (!d->needs_rearm)
return 0;
- else
- d->needs_rearm = false;
+
+ d->needs_rearm = false;
a = prioq_peek(d->earliest);
if (!a || a->enabled == SD_EVENT_OFF || time_event_source_next(a) == USEC_INFINITY) {
--
2.17.1


@ -0,0 +1,98 @@
From 766177b0afd897e43504c1228fe46ea03833df29 Mon Sep 17 00:00:00 2001
From: Yu Watanabe <watanabe.yu+github@gmail.com>
Date: Tue, 15 Jun 2021 00:51:33 +0900
Subject: [PATCH 3/6] sd-event: use CMP() macro
(cherry picked from commit 06e131477d82b83c5d516e66d6e413affd7c774a)
Related: #1984406
[commit eaab8d57d9db0d98d7e618ba634983c34cdb9c06 from
https://github.com/systemd-rhel/rhel-8/ (branch rhel-8.4.0)]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 37 ++++++++++++++----------------
1 file changed, 17 insertions(+), 20 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 9b6d2f0..84a874d 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -283,10 +283,9 @@ static int pending_prioq_compare(const void *a, const void *b) {
assert(y->pending);
/* Enabled ones first */
- if (x->enabled != SD_EVENT_OFF && y->enabled == SD_EVENT_OFF)
- return -1;
- if (x->enabled == SD_EVENT_OFF && y->enabled != SD_EVENT_OFF)
- return 1;
+ r = CMP(x->enabled == SD_EVENT_OFF, y->enabled == SD_EVENT_OFF);
+ if (r != 0)
+ return r;
/* Non rate-limited ones first. */
r = CMP(!!x->ratelimited, !!y->ratelimited);
@@ -310,10 +309,9 @@ static int prepare_prioq_compare(const void *a, const void *b) {
assert(y->prepare);
/* Enabled ones first */
- if (x->enabled != SD_EVENT_OFF && y->enabled == SD_EVENT_OFF)
- return -1;
- if (x->enabled == SD_EVENT_OFF && y->enabled != SD_EVENT_OFF)
- return 1;
+ r = CMP(x->enabled == SD_EVENT_OFF, y->enabled == SD_EVENT_OFF);
+ if (r != 0)
+ return r;
/* Non rate-limited ones first. */
r = CMP(!!x->ratelimited, !!y->ratelimited);
@@ -380,18 +378,17 @@ static bool event_source_timer_candidate(const sd_event_source *s) {
static int time_prioq_compare(const void *a, const void *b, usec_t (*time_func)(const sd_event_source *s)) {
const sd_event_source *x = a, *y = b;
+ int r;
/* Enabled ones first */
- if (x->enabled != SD_EVENT_OFF && y->enabled == SD_EVENT_OFF)
- return -1;
- if (x->enabled == SD_EVENT_OFF && y->enabled != SD_EVENT_OFF)
- return 1;
+ r = CMP(x->enabled == SD_EVENT_OFF, y->enabled == SD_EVENT_OFF);
+ if (r != 0)
+ return r;
/* Order "non-pending OR ratelimited" before "pending AND not-ratelimited" */
- if (event_source_timer_candidate(x) && !event_source_timer_candidate(y))
- return -1;
- if (!event_source_timer_candidate(x) && event_source_timer_candidate(y))
- return 1;
+ r = CMP(!event_source_timer_candidate(x), !event_source_timer_candidate(y));
+ if (r != 0)
+ return r;
/* Order by time */
return CMP(time_func(x), time_func(y));
@@ -407,15 +404,15 @@ static int latest_time_prioq_compare(const void *a, const void *b) {
static int exit_prioq_compare(const void *a, const void *b) {
const sd_event_source *x = a, *y = b;
+ int r;
assert(x->type == SOURCE_EXIT);
assert(y->type == SOURCE_EXIT);
/* Enabled ones first */
- if (x->enabled != SD_EVENT_OFF && y->enabled == SD_EVENT_OFF)
- return -1;
- if (x->enabled == SD_EVENT_OFF && y->enabled != SD_EVENT_OFF)
- return 1;
+ r = CMP(x->enabled == SD_EVENT_OFF, y->enabled == SD_EVENT_OFF);
+ if (r != 0)
+ return r;
/* Lower priority values first */
if (x->priority < y->priority)
--
2.17.1
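
The rewrite in the patch above relies on a three-way comparison of two boolean "is off" flags giving the same result as the pair of explicit branches it replaces. The sketch below checks that equivalence for all four combinations. `SKETCH_CMP` is a simplified stand-in written for this illustration; systemd's own `CMP()` is defined differently, and only its -1/0/1 result is mirrored here.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified three-way compare, assumed for illustration only. */
#define SKETCH_CMP(a, b) (((a) > (b)) - ((a) < (b)))

/* Old form: two explicit branches on the "is off" flags. */
static int enabled_first_branches(bool x_off, bool y_off) {
        if (!x_off && y_off)
                return -1;
        if (x_off && !y_off)
                return 1;
        return 0;
}

/* New form: false (enabled) sorts before true (off). */
static int enabled_first_cmp(bool x_off, bool y_off) {
        return SKETCH_CMP(x_off, y_off);
}

/* Walks all four flag combinations and checks that both forms agree. */
static void sketch_cmp_selfcheck(void) {
        for (int x = 0; x <= 1; x++)
                for (int y = 0; y <= 1; y++)
                        assert(enabled_first_branches(x, y) == enabled_first_cmp(x, y));
}
```

The same reasoning covers `CMP(!event_source_timer_candidate(x), !event_source_timer_candidate(y))` in the patch: negating both sides makes candidates (false after negation) sort first.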


@ -0,0 +1,35 @@
From e2088e9fd7dd09c542d8c456b62dbd2d21ee9e51 Mon Sep 17 00:00:00 2001
From: Yu Watanabe <watanabe.yu+github@gmail.com>
Date: Tue, 15 Jun 2021 01:01:48 +0900
Subject: [PATCH 4/6] sd-event: use usec_add()
(cherry picked from commit a595fb5ca9c69c589e758e9ebe3b70ac90450ba3)
Related: #1984406
[commit b8732d647162b50ce9b34de2ad7ae11a53f6e7ba from
https://github.com/systemd-rhel/rhel-8/ (branch rhel-8.4.0)]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 84a874d..1cf1c41 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -2663,8 +2663,8 @@ static int arm_watchdog(sd_event *e) {
assert(e->watchdog_fd >= 0);
t = sleep_between(e,
- e->watchdog_last + (e->watchdog_period / 2),
- e->watchdog_last + (e->watchdog_period * 3 / 4));
+ usec_add(e->watchdog_last, (e->watchdog_period / 2)),
+ usec_add(e->watchdog_last, (e->watchdog_period * 3 / 4)));
timespec_store(&its.it_value, t);
--
2.17.1
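
The patch above does not state its motivation. One plausible reading is that a saturating add avoids wrap-around when an operand is at or near `USEC_INFINITY`. The sketch below shows that behaviour under stated assumptions: `sketch_usec_add` is a hypothetical helper written for this illustration, and `usec_t` is assumed to be a 64-bit unsigned microsecond count whose maximum value serves as `USEC_INFINITY`.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t usec_t;
#define USEC_INFINITY ((usec_t) UINT64_MAX)

/* Saturating add: a sum that would reach or pass USEC_INFINITY stays at USEC_INFINITY. */
static usec_t sketch_usec_add(usec_t a, usec_t b) {
        if (a > USEC_INFINITY - b)
                return USEC_INFINITY;
        return a + b;
}

static void sketch_usec_add_selfcheck(void) {
        /* Ordinary values add as usual. */
        assert(sketch_usec_add(1, 2) == 3);
        /* Plain unsigned addition wraps around to a small value... */
        assert((usec_t) (USEC_INFINITY + 5) == 4);
        /* ...while the saturating form keeps "infinity" as infinity. */
        assert(sketch_usec_add(USEC_INFINITY, 5) == USEC_INFINITY);
}
```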


@ -0,0 +1,48 @@
From 9a3a48fde35fd02981b44ff6b2e184f33377d36c Mon Sep 17 00:00:00 2001
From: Yu Watanabe <watanabe.yu+github@gmail.com>
Date: Tue, 15 Jun 2021 02:03:02 +0900
Subject: [PATCH 5/6] sd-event: make event_source_time_prioq_reshuffle() accept
all event source type
But it does nothing for an event source which is neither a timer nor
ratelimited.
(cherry picked from commit 5c08c7ab23dbf02aaf4e4bbae8e08a195da230a4)
Related: #1984406
[commit 9f044118dbc6a0f04b3820ffaa9d4c7807ae48a7
https://github.com/systemd-rhel/rhel-8/ (branch rhel-8.4.0)]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 1cf1c41..6215bac 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -845,14 +845,15 @@ static void event_source_time_prioq_reshuffle(sd_event_source *s) {
assert(s);
/* Called whenever the event source's timer ordering properties changed, i.e. time, accuracy,
- * pending, enable state. Makes sure the two prioq's are ordered properly again. */
+ * pending, enable state, and ratelimiting state. Makes sure the two prioq's are ordered
+ * properly again. */
if (s->ratelimited)
d = &s->event->monotonic;
- else {
- assert(EVENT_SOURCE_IS_TIME(s->type));
+ else if (EVENT_SOURCE_IS_TIME(s->type))
assert_se(d = event_get_clock_data(s->event, s->type));
- }
+ else
+ return; /* no-op for an event source which is neither a timer nor ratelimited. */
prioq_reshuffle(d->earliest, s, &s->earliest_index);
prioq_reshuffle(d->latest, s, &s->latest_index);
--
2.17.1


@ -0,0 +1,101 @@
From 71552a073ed08beb227b9b007fb9818f08923baa Mon Sep 17 00:00:00 2001
From: Li Zhou <li.zhou@windriver.com>
Date: Tue, 26 Oct 2021 11:46:48 +0800
Subject: [PATCH 6/6] sd-event: always reshuffle time prioq on changing
online/offline state
Before 81107b8419c39f726fd2805517a5b9faab204e59, the compare functions
for the latest or earliest prioq did not handle ratelimited flag.
So, it was ok to not reshuffle the time prioq when changing the flag.
But now, those two compare functions also compare the source is
ratelimited or not. So, it is necessary to reshuffle the time prioq
after changing the ratelimited flag.
Hopefully fixes #19903.
(cherry picked from commit 2115b9b6629eeba7bc9f42f757f38205febb1cb7)
Related: #1984406
[commit f5611a22d4a65ef440352792085774ce898adb0f from
https://github.com/systemd-rhel/rhel-8/ (branch rhel-8.4.0)
LZ: Adapt the patch for context changes and no real change on the
patch.]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 33 ++++++++++--------------------
1 file changed, 11 insertions(+), 22 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 6215bac..c17b368 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -1724,14 +1724,6 @@ static int event_source_offline(
source_io_unregister(s);
break;
- case SOURCE_TIME_REALTIME:
- case SOURCE_TIME_BOOTTIME:
- case SOURCE_TIME_MONOTONIC:
- case SOURCE_TIME_REALTIME_ALARM:
- case SOURCE_TIME_BOOTTIME_ALARM:
- event_source_time_prioq_reshuffle(s);
- break;
-
case SOURCE_SIGNAL:
event_gc_signal_data(s->event, &s->priority, s->signal.sig);
break;
@@ -1749,6 +1741,11 @@ static int event_source_offline(
prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index);
break;
+ case SOURCE_TIME_REALTIME:
+ case SOURCE_TIME_BOOTTIME:
+ case SOURCE_TIME_MONOTONIC:
+ case SOURCE_TIME_REALTIME_ALARM:
+ case SOURCE_TIME_BOOTTIME_ALARM:
case SOURCE_DEFER:
case SOURCE_POST:
break;
@@ -1757,6 +1754,9 @@ static int event_source_offline(
assert_not_reached("Wut? I shouldn't exist.");
}
+ /* Always reshuffle time prioq, as the ratelimited flag may be changed. */
+ event_source_time_prioq_reshuffle(s);
+
return 1;
}
@@ -1837,22 +1837,11 @@ static int event_source_online(
s->ratelimited = ratelimited;
/* Non-failing operations below */
- switch (s->type) {
- case SOURCE_TIME_REALTIME:
- case SOURCE_TIME_BOOTTIME:
- case SOURCE_TIME_MONOTONIC:
- case SOURCE_TIME_REALTIME_ALARM:
- case SOURCE_TIME_BOOTTIME_ALARM:
- event_source_time_prioq_reshuffle(s);
- break;
-
- case SOURCE_EXIT:
+ if (s->type == SOURCE_EXIT)
prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index);
- break;
- default:
- break;
- }
+ /* Always reshuffle time prioq, as the ratelimited flag may be changed. */
+ event_source_time_prioq_reshuffle(s);
return 1;
}
--
2.17.1
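
The reason given in the patch above, that the queue must be reshuffled once the comparator depends on the ratelimited flag, can be illustrated with a toy queue. This is a sketch under stated assumptions and not systemd's prioq: a two-element sorted array stands in for the priority queue, `qsort()` stands in for `prioq_reshuffle()`, and `sketch_source` is a hypothetical cut-down source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cut-down source: only the ordering-relevant fields. */
typedef struct sketch_source {
        bool pending;
        bool ratelimited;
        uint64_t next;
} sketch_source;

/* Orders "non-pending OR ratelimited" first, then by time. The queue holds pointers. */
static int sketch_compare(const void *a, const void *b) {
        const sketch_source *x = *(sketch_source *const *) a, *y = *(sketch_source *const *) b;
        bool xc = !x->pending || x->ratelimited, yc = !y->pending || y->ratelimited;

        if (xc != yc)
                return xc ? -1 : 1;
        return (x->next > y->next) - (x->next < y->next);
}

/* Stand-in for a reshuffle: re-establish the ordering after a key field changed. */
static void sketch_reshuffle(sketch_source **q, size_t n) {
        qsort(q, n, sizeof *q, sketch_compare);
}

/* Returns true if the queue head is stale before the reshuffle and correct after it. */
static bool sketch_stale_head_demo(void) {
        sketch_source mount = { .pending = true, .ratelimited = false, .next = 50 };
        sketch_source timer = { .pending = false, .ratelimited = false, .next = 100 };
        sketch_source *q[] = { &mount, &timer };
        bool stale, fixed;

        sketch_reshuffle(q, 2);        /* timer leads: mount is pending and not ratelimited */
        mount.ratelimited = true;      /* the flag flips without touching the queue */
        stale = q[0] == &timer;        /* head is unchanged although mount (50) should now lead */
        sketch_reshuffle(q, 2);
        fixed = q[0] == &mount;
        return stale && fixed;
}
```

In this toy setup the head stays wrong until the reshuffle runs, which matches the patch comment "Always reshuffle time prioq, as the ratelimited flag may be changed."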