From ccfeeef59d39e42b2775bb5a216732c4999f6e42 Mon Sep 17 00:00:00 2001
From: Li Zhou
Date: Mon, 12 Apr 2021 02:15:25 -0400
Subject: [PATCH] systemd: Prevent excessive /proc/1/mountinfo reparsing

Backport the patches for this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1819868

We hit the following issue:
When testing a large number of pods (> 230), we occasionally observed a
number of problems related to the systemd process:
  systemd ran continually at 90-100% CPU usage
  systemd memory usage started increasing rapidly (20GB/hour)
  systemctl commands would always time out (Failed to get properties:
    Connection timed out)
  sm services failed and could not recover: open-ldap,
    registry-token-server, docker-distribution, etcd
  new pods could not start and got stuck in state ContainerCreating

These patches prevent excessive /proc/1/mountinfo reparsing.
It has been verified that they greatly improve performance in this
scenario.

The 16 commits for the issue are listed in sequence (from [1] to [16])
at the link below:
https://github.com/systemd-rhel/rhel-8/pull/154/commits

[16](10)core: prevent excessive /proc/self/mountinfo parsing
[15][Dropped-6]test: add ratelimiting test
[14](9)sd-event: add ability to ratelimit event sources
[13](8)sd-event: increase n_enabled_child_sources just once
[12](7)sd-event: update state at the end in event_source_enable
[11](6)sd-event: remove earliest_index/latest_index into common part of
    event source objects
[10][Dropped-5]sd-event: follow coding style with naming return
    parameter
[9] [Dropped-4]sd-event: ref event loop while in sd_event_prepare() ot
    sd_event_run()
[8] (5)sd-event: refuse running default event loops in any other thread
    than the one they are default for
[7] [Dropped-3]sd-event: let's suffix last_run/last_log with "_usec"
[6] [Dropped-2]sd-event: fix delays assert brain-o (#17790)
[5] (4)sd-event: split out code to add/remove timer event sources to
    earliest/latest prioq
[4] (3)sd-event: split clock data allocation out of sd_event_add_time()
[3] [Dropped-1]sd-event: mention that two debug logged events are
    ignored
[2] (2)sd-event: split out enable and disable codepaths from
    sd_event_source_set_enabled()
[1] (1)sd-event: split out helper functions for reshuffling prioqs

I ported 10 of them back (from (1) to (10)) to fix this issue and
dropped the other 6 (from [Dropped-1] to [Dropped-6]) for these reasons:
[Dropped-1]Only changes an error log.
[Dropped-2]Fixes a bug introduced in a commit which doesn't exist in
    this version.
[Dropped-3]Only changes variable names; there is no functional change.
[Dropped-4]Merging it requires more commits, and it does not help with
    adding the rate-limiting ability.
[Dropped-5]Changes coding style for a function which isn't really used
    by anyone.
[Dropped-6]Adds test cases.

Closes-Bug: #1924686
Signed-off-by: Li Zhou
Change-Id: Ia4c8f162cb1a47b40d1b26cf4d604976b97e92d6
---
 .../centos/meta_patches/Add-STX-patches.patch |  75 +-
 ...event-don-t-touch-fd-s-accross-forks.patch |  54 ++
 ...make-sure-RT-signals-are-not-dropped.patch | 815 +++++++++++++++++
 ...ut-helper-functions-for-reshuffling-.patch | 216 +++++
 ...nding-events-when-we-turn-off-on-an-.patch |  50 ++
 ...t-fix-call-to-event_make_signal_data.patch |  31 +
 ...re-to-create-a-signal-queue-for-the-.patch |  37 +
 ...ut-enable-and-disable-codepaths-from.patch | 315 +++++++
 ...rioq_ensure_allocated-where-possible.patch |  73 ++
 ...lock-data-allocation-out-of-sd_event.patch |  79 ++
 ...ut-code-to-add-remove-timer-event-so.patch | 120 +++
 ...me-PASSIVE-PREPARED-to-INITIAL-ARMED.patch | 126 +++
 ...running-default-event-loops-in-any-o.patch |  39 +
 ...earliest_index-latest_index-into-com.patch | 106 +++
 ...state-at-the-end-in-event_source_ena.patch | 125 +++
 ...se-n_enabled_child_sources-just-once.patch |  44 +
 ...ent-don-t-provide-priority-stability.patch |  97 ++
 ...termining-the-last-allowed-time-a-ti.patch |  53 ++
 ...a-USEC_INFINITY-timeout-as-an-altern.patch |  60 ++
 ...d-ability-to-ratelimit-event-sources.patch | 841
++++++++++++++++++
 ...xcessive-proc-self-mountinfo-parsing.patch |  37 +
 ...ompiling-errors-when-merging-1819868.patch |  64 ++
 22 files changed, 3445 insertions(+), 12 deletions(-)
 create mode 100644 base/systemd/centos/patches/901-sd-event-don-t-touch-fd-s-accross-forks.patch
 create mode 100644 base/systemd/centos/patches/902-sd-event-make-sure-RT-signals-are-not-dropped.patch
 create mode 100644 base/systemd/centos/patches/903-sd-event-split-out-helper-functions-for-reshuffling-.patch
 create mode 100644 base/systemd/centos/patches/904-sd-event-drop-pending-events-when-we-turn-off-on-an-.patch
 create mode 100644 base/systemd/centos/patches/905-sd-event-fix-call-to-event_make_signal_data.patch
 create mode 100644 base/systemd/centos/patches/906-sd-event-make-sure-to-create-a-signal-queue-for-the-.patch
 create mode 100644 base/systemd/centos/patches/907-sd-event-split-out-enable-and-disable-codepaths-from.patch
 create mode 100644 base/systemd/centos/patches/908-sd-event-use-prioq_ensure_allocated-where-possible.patch
 create mode 100644 base/systemd/centos/patches/909-sd-event-split-clock-data-allocation-out-of-sd_event.patch
 create mode 100644 base/systemd/centos/patches/910-sd-event-split-out-code-to-add-remove-timer-event-so.patch
 create mode 100644 base/systemd/centos/patches/911-sd-event-rename-PASSIVE-PREPARED-to-INITIAL-ARMED.patch
 create mode 100644 base/systemd/centos/patches/912-sd-event-refuse-running-default-event-loops-in-any-o.patch
 create mode 100644 base/systemd/centos/patches/913-sd-event-remove-earliest_index-latest_index-into-com.patch
 create mode 100644 base/systemd/centos/patches/914-sd-event-update-state-at-the-end-in-event_source_ena.patch
 create mode 100644 base/systemd/centos/patches/915-sd-event-increase-n_enabled_child_sources-just-once.patch
 create mode 100644 base/systemd/centos/patches/916-sd-event-don-t-provide-priority-stability.patch
 create mode 100644 base/systemd/centos/patches/917-sd-event-when-determining-the-last-allowed-time-a-ti.patch
 create mode 100644 base/systemd/centos/patches/918-sd-event-permit-a-USEC_INFINITY-timeout-as-an-altern.patch
 create mode 100644 base/systemd/centos/patches/919-sd-event-add-ability-to-ratelimit-event-sources.patch
 create mode 100644 base/systemd/centos/patches/920-core-prevent-excessive-proc-self-mountinfo-parsing.patch
 create mode 100644 base/systemd/centos/patches/921-systemd-Fix-compiling-errors-when-merging-1819868.patch

diff --git a/base/systemd/centos/meta_patches/Add-STX-patches.patch b/base/systemd/centos/meta_patches/Add-STX-patches.patch
index a33a95edf..3508aad2d 100644
--- a/base/systemd/centos/meta_patches/Add-STX-patches.patch
+++ b/base/systemd/centos/meta_patches/Add-STX-patches.patch
@@ -1,21 +1,19 @@
-From 3c0e59a677c921f60f27002a27eb5f4776475e44 Mon Sep 17 00:00:00 2001
-Message-Id: <3c0e59a677c921f60f27002a27eb5f4776475e44.1574265913.git.Jim.Somerville@windriver.com>
-In-Reply-To:
-References:
-From: Jim Somerville
-Date: Wed, 20 Nov 2019 10:59:45 -0500
-Subject: [PATCH 3/3] Add STX patches
+From 91090adc8d4c774796d36f7563eea224569a9b0f Mon Sep 17 00:00:00 2001
+From: Li Zhou
+Date: Wed, 21 Apr 2021 13:59:22 +0800
+Subject: [PATCH] Add STX patches
 
 Signed-off-by: Jim Somerville
+Signed-off-by: Li Zhou
 ---
- SPECS/systemd.spec | 5 +++++
- 1 file changed, 5 insertions(+)
+ SPECS/systemd.spec | 58 ++++++++++++++++++++++++++++++++++++++++++++++
+ 1 file changed, 58 insertions(+)
 
 diff --git a/SPECS/systemd.spec b/SPECS/systemd.spec
-index 4c83150..e1e98bb 100644
+index 4c83150..72de7d6 100644
 --- a/SPECS/systemd.spec
 +++ b/SPECS/systemd.spec
-@@ -786,6 +786,11 @@ Patch0744: 0744-selinux-don-t-log-SELINUX_INFO-and-SELINUX_WARNING-m.patch
+@@ -786,6 +786,64 @@ Patch0744: 0744-selinux-don-t-log-SELINUX_INFO-and-SELINUX_WARNING-m.patch
  Patch0745: 0745-fix-mis-merge.patch
  Patch0746: 0746-fs-util-chase_symlinks-prevent-double-free.patch
  
@@ -23,10 +21,63 @@ index
4c83150..e1e98bb 100644 +Patch0801: 801-inject-millisec-in-syslog-date.patch +Patch0802: 802-fix-build-error-for-unused-variable.patch +Patch0803: 803-Fix-compile-failure-due-to-deprecated-value.patch ++ ++# This cluster of patches relates to fixing redhat bug #1819868 ++# "systemd excessively reads mountinfo and udev in dense container environments" ++ ++# Below patches are added for merging patch (1) ++Patch0901: 901-sd-event-don-t-touch-fd-s-accross-forks.patch ++Patch0902: 902-sd-event-make-sure-RT-signals-are-not-dropped.patch ++# Patch (1) for solving #1819868 ++Patch0903: 903-sd-event-split-out-helper-functions-for-reshuffling-.patch ++ ++# Below patches are added for merging patch (2) ++Patch0904: 904-sd-event-drop-pending-events-when-we-turn-off-on-an-.patch ++Patch0905: 905-sd-event-fix-call-to-event_make_signal_data.patch ++Patch0906: 906-sd-event-make-sure-to-create-a-signal-queue-for-the-.patch ++# Patch (2) for solving #1819868 ++Patch0907: 907-sd-event-split-out-enable-and-disable-codepaths-from.patch ++ ++# Below patch is added for merging patch (3) ++Patch0908: 908-sd-event-use-prioq_ensure_allocated-where-possible.patch ++# Patch (3) for solving #1819868 ++Patch0909: 909-sd-event-split-clock-data-allocation-out-of-sd_event.patch ++ ++# Patch (4) for solving #1819868 ++Patch0910: 910-sd-event-split-out-code-to-add-remove-timer-event-so.patch ++ ++# Below patch is added for merging patch (5) ++Patch0911: 911-sd-event-rename-PASSIVE-PREPARED-to-INITIAL-ARMED.patch ++# Patch (5) for solving #1819868 ++Patch0912: 912-sd-event-refuse-running-default-event-loops-in-any-o.patch ++ ++# Patch (6) for solving #1819868 ++Patch0913: 913-sd-event-remove-earliest_index-latest_index-into-com.patch ++ ++# Patch (7) for solving #1819868 ++Patch0914: 914-sd-event-update-state-at-the-end-in-event_source_ena.patch ++ ++# Patch (8) for solving #1819868 ++Patch0915: 915-sd-event-increase-n_enabled_child_sources-just-once.patch ++ ++# Below patches are added for merging 
patch (9) ++Patch0916: 916-sd-event-don-t-provide-priority-stability.patch ++Patch0917: 917-sd-event-when-determining-the-last-allowed-time-a-ti.patch ++Patch0918: 918-sd-event-permit-a-USEC_INFINITY-timeout-as-an-altern.patch ++# Patch (9) for solving #1819868 ++Patch0919: 919-sd-event-add-ability-to-ratelimit-event-sources.patch ++ ++# Patch (10) for solving #1819868 ++Patch0920: 920-core-prevent-excessive-proc-self-mountinfo-parsing.patch ++ ++# This patch fixes build issues related to the above patches. Our goal is to keep ++# upstream patches as unmodified as possible to facilitate maintaining them, so instead ++# of individually changing them for compilation, we just have one patch at the end to do it. ++Patch0921: 921-systemd-Fix-compiling-errors-when-merging-1819868.patch + Patch9999: 9999-Update-kernel-install-script-by-backporting-fedora-p.patch %global num_patches %{lua: c=0; for i,p in ipairs(patches) do c=c+1; end; print(c);} -- -1.8.3.1 +2.17.1 diff --git a/base/systemd/centos/patches/901-sd-event-don-t-touch-fd-s-accross-forks.patch b/base/systemd/centos/patches/901-sd-event-don-t-touch-fd-s-accross-forks.patch new file mode 100644 index 000000000..6a93bde91 --- /dev/null +++ b/base/systemd/centos/patches/901-sd-event-don-t-touch-fd-s-accross-forks.patch @@ -0,0 +1,54 @@ +From 5de71cb7d887a569bfb987efdceda493338990bf Mon Sep 17 00:00:00 2001 +From: Tom Gundersen +Date: Thu, 4 Jun 2015 16:54:45 +0200 +Subject: [PATCH 01/20] sd-event: don't touch fd's accross forks + +We protect most of the API from use accross forks, but we still allow both +sd_event and sd_event_source objects to be unref'ed. This would cause +problems as it would unregister sources from the underlying eventfd, hence +also affecting the original instance in the parent process. + +This fixes the issue by not touching the fds on unref when done accross a fork, +but still free the memory. 
+ +This fixes a regression introduced by + "udevd: move main-loop to sd-event": 693d371d30fee + +where the worker processes were disabling the inotify event source in the +main daemon. + +[commit f68067348f58cd08d8f4f5325ce22f9a9d2c2140 from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 6 ++++++ + 1 file changed, 6 insertions(+) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 9d48e5a..a84bfbb 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -474,6 +474,9 @@ static int source_io_unregister(sd_event_source *s) { + assert(s); + assert(s->type == SOURCE_IO); + ++ if (event_pid_changed(s->event)) ++ return 0; ++ + if (!s->io.registered) + return 0; + +@@ -604,6 +607,9 @@ static int event_update_signal_fd(sd_event *e) { + + assert(e); + ++ if (event_pid_changed(e)) ++ return 0; ++ + add_to_epoll = e->signal_fd < 0; + + r = signalfd(e->signal_fd, &e->sigset, SFD_NONBLOCK|SFD_CLOEXEC); +-- +2.17.1 + diff --git a/base/systemd/centos/patches/902-sd-event-make-sure-RT-signals-are-not-dropped.patch b/base/systemd/centos/patches/902-sd-event-make-sure-RT-signals-are-not-dropped.patch new file mode 100644 index 000000000..b66e0852f --- /dev/null +++ b/base/systemd/centos/patches/902-sd-event-make-sure-RT-signals-are-not-dropped.patch @@ -0,0 +1,815 @@ +From 2976f3b959bef0e6f0a1f4d55d998c5d60e56b0d Mon Sep 17 00:00:00 2001 +From: Lennart Poettering +Date: Thu, 3 Sep 2015 20:13:09 +0200 +Subject: [PATCH 02/20] sd-event: make sure RT signals are not dropped + +RT signals operate in a queue, and we should be careful to never merge +two queued signals into one. Hence, makes sure we only ever dequeue a +single signal at a time and leave the remaining ones queued in the +signalfd. 
In order to implement correct priorities for the signals +introduce one signalfd per priority, so that we only process the highest +priority signal at a time. + +[commit 9da4cb2be260ed123f2676cb85cb350c527b1492 from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 430 ++++++++++++++++++--------- + src/libsystemd/sd-event/test-event.c | 66 +++- + 2 files changed, 357 insertions(+), 139 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index a84bfbb..26ef3ea 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -56,9 +56,22 @@ typedef enum EventSourceType { + _SOURCE_EVENT_SOURCE_TYPE_INVALID = -1 + } EventSourceType; + ++/* All objects we use in epoll events start with this value, so that ++ * we know how to dispatch it */ ++typedef enum WakeupType { ++ WAKEUP_NONE, ++ WAKEUP_EVENT_SOURCE, ++ WAKEUP_CLOCK_DATA, ++ WAKEUP_SIGNAL_DATA, ++ _WAKEUP_TYPE_MAX, ++ _WAKEUP_TYPE_INVALID = -1, ++} WakeupType; ++ + #define EVENT_SOURCE_IS_TIME(t) IN_SET((t), SOURCE_TIME_REALTIME, SOURCE_TIME_BOOTTIME, SOURCE_TIME_MONOTONIC, SOURCE_TIME_REALTIME_ALARM, SOURCE_TIME_BOOTTIME_ALARM) + + struct sd_event_source { ++ WakeupType wakeup; ++ + unsigned n_ref; + + sd_event *event; +@@ -120,6 +133,7 @@ struct sd_event_source { + }; + + struct clock_data { ++ WakeupType wakeup; + int fd; + + /* For all clocks we maintain two priority queues each, one +@@ -136,11 +150,23 @@ struct clock_data { + bool needs_rearm:1; + }; + ++struct signal_data { ++ WakeupType wakeup; ++ ++ /* For each priority we maintain one signal fd, so that we ++ * only have to dequeue a single event per priority at a ++ * time. 
*/ ++ ++ int fd; ++ int64_t priority; ++ sigset_t sigset; ++ sd_event_source *current; ++}; ++ + struct sd_event { + unsigned n_ref; + + int epoll_fd; +- int signal_fd; + int watchdog_fd; + + Prioq *pending; +@@ -157,8 +183,8 @@ struct sd_event { + + usec_t perturb; + +- sigset_t sigset; +- sd_event_source **signal_sources; ++ sd_event_source **signal_sources; /* indexed by signal number */ ++ Hashmap *signal_data; /* indexed by priority */ + + Hashmap *child_sources; + unsigned n_enabled_child_sources; +@@ -355,6 +381,7 @@ static int exit_prioq_compare(const void *a, const void *b) { + + static void free_clock_data(struct clock_data *d) { + assert(d); ++ assert(d->wakeup == WAKEUP_CLOCK_DATA); + + safe_close(d->fd); + prioq_free(d->earliest); +@@ -378,7 +405,6 @@ static void event_free(sd_event *e) { + *(e->default_event_ptr) = NULL; + + safe_close(e->epoll_fd); +- safe_close(e->signal_fd); + safe_close(e->watchdog_fd); + + free_clock_data(&e->realtime); +@@ -392,6 +418,7 @@ static void event_free(sd_event *e) { + prioq_free(e->exit); + + free(e->signal_sources); ++ hashmap_free(e->signal_data); + + hashmap_free(e->child_sources); + set_free(e->post_sources); +@@ -409,13 +436,12 @@ _public_ int sd_event_new(sd_event** ret) { + return -ENOMEM; + + e->n_ref = 1; +- e->signal_fd = e->watchdog_fd = e->epoll_fd = e->realtime.fd = e->boottime.fd = e->monotonic.fd = e->realtime_alarm.fd = e->boottime_alarm.fd = -1; ++ e->watchdog_fd = e->epoll_fd = e->realtime.fd = e->boottime.fd = e->monotonic.fd = e->realtime_alarm.fd = e->boottime_alarm.fd = -1; + e->realtime.next = e->boottime.next = e->monotonic.next = e->realtime_alarm.next = e->boottime_alarm.next = USEC_INFINITY; ++ e->realtime.wakeup = e->boottime.wakeup = e->monotonic.wakeup = e->realtime_alarm.wakeup = e->boottime_alarm.wakeup = WAKEUP_CLOCK_DATA; + e->original_pid = getpid(); + e->perturb = USEC_INFINITY; + +- assert_se(sigemptyset(&e->sigset) == 0); +- + e->pending = prioq_new(pending_prioq_compare); + if 
(!e->pending) { + r = -ENOMEM; +@@ -510,7 +536,6 @@ static int source_io_register( + r = epoll_ctl(s->event->epoll_fd, EPOLL_CTL_MOD, s->io.fd, &ev); + else + r = epoll_ctl(s->event->epoll_fd, EPOLL_CTL_ADD, s->io.fd, &ev); +- + if (r < 0) + return -errno; + +@@ -592,45 +617,171 @@ static struct clock_data* event_get_clock_data(sd_event *e, EventSourceType t) { + } + } + +-static bool need_signal(sd_event *e, int signal) { +- return (e->signal_sources && e->signal_sources[signal] && +- e->signal_sources[signal]->enabled != SD_EVENT_OFF) +- || +- (signal == SIGCHLD && +- e->n_enabled_child_sources > 0); +-} ++static int event_make_signal_data( ++ sd_event *e, ++ int sig, ++ struct signal_data **ret) { + +-static int event_update_signal_fd(sd_event *e) { + struct epoll_event ev = {}; +- bool add_to_epoll; ++ struct signal_data *d; ++ bool added = false; ++ sigset_t ss_copy; ++ int64_t priority; + int r; + + assert(e); + + if (event_pid_changed(e)) +- return 0; ++ return -ECHILD; + +- add_to_epoll = e->signal_fd < 0; ++ if (e->signal_sources && e->signal_sources[sig]) ++ priority = e->signal_sources[sig]->priority; ++ else ++ priority = 0; + +- r = signalfd(e->signal_fd, &e->sigset, SFD_NONBLOCK|SFD_CLOEXEC); +- if (r < 0) +- return -errno; ++ d = hashmap_get(e->signal_data, &priority); ++ if (d) { ++ if (sigismember(&d->sigset, sig) > 0) { ++ if (ret) ++ *ret = d; ++ return 0; ++ } ++ } else { ++ r = hashmap_ensure_allocated(&e->signal_data, &uint64_hash_ops); ++ if (r < 0) ++ return r; ++ ++ d = new0(struct signal_data, 1); ++ if (!d) ++ return -ENOMEM; ++ ++ d->wakeup = WAKEUP_SIGNAL_DATA; ++ d->fd = -1; ++ d->priority = priority; ++ ++ r = hashmap_put(e->signal_data, &d->priority, d); ++ if (r < 0) ++ return r; + +- e->signal_fd = r; ++ added = true; ++ } ++ ++ ss_copy = d->sigset; ++ assert_se(sigaddset(&ss_copy, sig) >= 0); ++ ++ r = signalfd(d->fd, &ss_copy, SFD_NONBLOCK|SFD_CLOEXEC); ++ if (r < 0) { ++ r = -errno; ++ goto fail; ++ } ++ ++ d->sigset = ss_copy; 
+ +- if (!add_to_epoll) ++ if (d->fd >= 0) { ++ if (ret) ++ *ret = d; + return 0; ++ } ++ ++ d->fd = r; + + ev.events = EPOLLIN; +- ev.data.ptr = INT_TO_PTR(SOURCE_SIGNAL); ++ ev.data.ptr = d; + +- r = epoll_ctl(e->epoll_fd, EPOLL_CTL_ADD, e->signal_fd, &ev); +- if (r < 0) { +- e->signal_fd = safe_close(e->signal_fd); +- return -errno; ++ r = epoll_ctl(e->epoll_fd, EPOLL_CTL_ADD, d->fd, &ev); ++ if (r < 0) { ++ r = -errno; ++ goto fail; + } + ++ if (ret) ++ *ret = d; ++ + return 0; ++ ++fail: ++ if (added) { ++ d->fd = safe_close(d->fd); ++ hashmap_remove(e->signal_data, &d->priority); ++ free(d); ++ } ++ ++ return r; ++} ++ ++static void event_unmask_signal_data(sd_event *e, struct signal_data *d, int sig) { ++ assert(e); ++ assert(d); ++ ++ /* Turns off the specified signal in the signal data ++ * object. If the signal mask of the object becomes empty that ++ * way removes it. */ ++ ++ if (sigismember(&d->sigset, sig) == 0) ++ return; ++ ++ assert_se(sigdelset(&d->sigset, sig) >= 0); ++ ++ if (sigisemptyset(&d->sigset)) { ++ ++ /* If all the mask is all-zero we can get rid of the structure */ ++ hashmap_remove(e->signal_data, &d->priority); ++ assert(!d->current); ++ safe_close(d->fd); ++ free(d); ++ return; ++ } ++ ++ assert(d->fd >= 0); ++ ++ if (signalfd(d->fd, &d->sigset, SFD_NONBLOCK|SFD_CLOEXEC) < 0) ++ log_debug_errno(errno, "Failed to unset signal bit, ignoring: %m"); ++} ++ ++static void event_gc_signal_data(sd_event *e, const int64_t *priority, int sig) { ++ struct signal_data *d; ++ static const int64_t zero_priority = 0; ++ ++ assert(e); ++ ++ /* Rechecks if the specified signal is still something we are ++ * interested in. If not, we'll unmask it, and possibly drop ++ * the signalfd for it. 
*/ ++ ++ if (sig == SIGCHLD && ++ e->n_enabled_child_sources > 0) ++ return; ++ ++ if (e->signal_sources && ++ e->signal_sources[sig] && ++ e->signal_sources[sig]->enabled != SD_EVENT_OFF) ++ return; ++ ++ /* ++ * The specified signal might be enabled in three different queues: ++ * ++ * 1) the one that belongs to the priority passed (if it is non-NULL) ++ * 2) the one that belongs to the priority of the event source of the signal (if there is one) ++ * 3) the 0 priority (to cover the SIGCHLD case) ++ * ++ * Hence, let's remove it from all three here. ++ */ ++ ++ if (priority) { ++ d = hashmap_get(e->signal_data, priority); ++ if (d) ++ event_unmask_signal_data(e, d, sig); ++ } ++ ++ if (e->signal_sources && e->signal_sources[sig]) { ++ d = hashmap_get(e->signal_data, &e->signal_sources[sig]->priority); ++ if (d) ++ event_unmask_signal_data(e, d, sig); ++ } ++ ++ d = hashmap_get(e->signal_data, &zero_priority); ++ if (d) ++ event_unmask_signal_data(e, d, sig); + } + + static void source_disconnect(sd_event_source *s) { +@@ -669,17 +820,11 @@ static void source_disconnect(sd_event_source *s) { + + case SOURCE_SIGNAL: + if (s->signal.sig > 0) { ++ + if (s->event->signal_sources) + s->event->signal_sources[s->signal.sig] = NULL; + +- /* If the signal was on and now it is off... */ +- if (s->enabled != SD_EVENT_OFF && !need_signal(s->event, s->signal.sig)) { +- assert_se(sigdelset(&s->event->sigset, s->signal.sig) == 0); +- +- (void) event_update_signal_fd(s->event); +- /* If disabling failed, we might get a spurious event, +- * but otherwise nothing bad should happen. */ +- } ++ event_gc_signal_data(s->event, &s->priority, s->signal.sig); + } + + break; +@@ -689,18 +834,10 @@ static void source_disconnect(sd_event_source *s) { + if (s->enabled != SD_EVENT_OFF) { + assert(s->event->n_enabled_child_sources > 0); + s->event->n_enabled_child_sources--; +- +- /* We know the signal was on, if it is off now... 
*/ +- if (!need_signal(s->event, SIGCHLD)) { +- assert_se(sigdelset(&s->event->sigset, SIGCHLD) == 0); +- +- (void) event_update_signal_fd(s->event); +- /* If disabling failed, we might get a spurious event, +- * but otherwise nothing bad should happen. */ +- } + } + +- hashmap_remove(s->event->child_sources, INT_TO_PTR(s->child.pid)); ++ (void) hashmap_remove(s->event->child_sources, INT_TO_PTR(s->child.pid)); ++ event_gc_signal_data(s->event, &s->priority, SIGCHLD); + } + + break; +@@ -779,6 +916,14 @@ static int source_set_pending(sd_event_source *s, bool b) { + d->needs_rearm = true; + } + ++ if (s->type == SOURCE_SIGNAL && !b) { ++ struct signal_data *d; ++ ++ d = hashmap_get(s->event->signal_data, &s->priority); ++ if (d && d->current == s) ++ d->current = NULL; ++ } ++ + return 0; + } + +@@ -828,6 +973,7 @@ _public_ int sd_event_add_io( + if (!s) + return -ENOMEM; + ++ s->wakeup = WAKEUP_EVENT_SOURCE; + s->io.fd = fd; + s->io.events = events; + s->io.callback = callback; +@@ -884,7 +1030,7 @@ static int event_setup_timer_fd( + return -errno; + + ev.events = EPOLLIN; +- ev.data.ptr = INT_TO_PTR(clock_to_event_source_type(clock)); ++ ev.data.ptr = d; + + r = epoll_ctl(e->epoll_fd, EPOLL_CTL_ADD, fd, &ev); + if (r < 0) { +@@ -994,9 +1140,9 @@ _public_ int sd_event_add_signal( + void *userdata) { + + sd_event_source *s; ++ struct signal_data *d; + sigset_t ss; + int r; +- bool previous; + + assert_return(e, -EINVAL); + assert_return(sig > 0, -EINVAL); +@@ -1021,8 +1167,6 @@ _public_ int sd_event_add_signal( + } else if (e->signal_sources[sig]) + return -EBUSY; + +- previous = need_signal(e, sig); +- + s = source_new(e, !ret, SOURCE_SIGNAL); + if (!s) + return -ENOMEM; +@@ -1034,14 +1178,10 @@ _public_ int sd_event_add_signal( + + e->signal_sources[sig] = s; + +- if (!previous) { +- assert_se(sigaddset(&e->sigset, sig) == 0); +- +- r = event_update_signal_fd(e); +- if (r < 0) { +- source_free(s); +- return r; +- } ++ r = event_make_signal_data(e, sig, &d); ++ if 
(r < 0) { ++ source_free(s); ++ return r; + } + + /* Use the signal name as description for the event source by default */ +@@ -1063,7 +1203,6 @@ _public_ int sd_event_add_child( + + sd_event_source *s; + int r; +- bool previous; + + assert_return(e, -EINVAL); + assert_return(pid > 1, -EINVAL); +@@ -1080,8 +1219,6 @@ _public_ int sd_event_add_child( + if (hashmap_contains(e->child_sources, INT_TO_PTR(pid))) + return -EBUSY; + +- previous = need_signal(e, SIGCHLD); +- + s = source_new(e, !ret, SOURCE_CHILD); + if (!s) + return -ENOMEM; +@@ -1100,14 +1237,11 @@ _public_ int sd_event_add_child( + + e->n_enabled_child_sources ++; + +- if (!previous) { +- assert_se(sigaddset(&e->sigset, SIGCHLD) == 0); +- +- r = event_update_signal_fd(e); +- if (r < 0) { +- source_free(s); +- return r; +- } ++ r = event_make_signal_data(e, SIGCHLD, NULL); ++ if (r < 0) { ++ e->n_enabled_child_sources--; ++ source_free(s); ++ return r; + } + + e->need_process_child = true; +@@ -1407,6 +1541,8 @@ _public_ int sd_event_source_get_priority(sd_event_source *s, int64_t *priority) + } + + _public_ int sd_event_source_set_priority(sd_event_source *s, int64_t priority) { ++ int r; ++ + assert_return(s, -EINVAL); + assert_return(s->event->state != SD_EVENT_FINISHED, -ESTALE); + assert_return(!event_pid_changed(s->event), -ECHILD); +@@ -1414,7 +1550,25 @@ _public_ int sd_event_source_set_priority(sd_event_source *s, int64_t priority) + if (s->priority == priority) + return 0; + +- s->priority = priority; ++ if (s->type == SOURCE_SIGNAL && s->enabled != SD_EVENT_OFF) { ++ struct signal_data *old, *d; ++ ++ /* Move us from the signalfd belonging to the old ++ * priority to the signalfd of the new priority */ ++ ++ assert_se(old = hashmap_get(s->event->signal_data, &s->priority)); ++ ++ s->priority = priority; ++ ++ r = event_make_signal_data(s->event, s->signal.sig, &d); ++ if (r < 0) { ++ s->priority = old->priority; ++ return r; ++ } ++ ++ event_unmask_signal_data(s->event, old, s->signal.sig); ++ 
} else ++ s->priority = priority; + + if (s->pending) + prioq_reshuffle(s->event->pending, s, &s->pending_index); +@@ -1482,34 +1636,18 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { + } + + case SOURCE_SIGNAL: +- assert(need_signal(s->event, s->signal.sig)); +- + s->enabled = m; + +- if (!need_signal(s->event, s->signal.sig)) { +- assert_se(sigdelset(&s->event->sigset, s->signal.sig) == 0); +- +- (void) event_update_signal_fd(s->event); +- /* If disabling failed, we might get a spurious event, +- * but otherwise nothing bad should happen. */ +- } +- ++ event_gc_signal_data(s->event, &s->priority, s->signal.sig); + break; + + case SOURCE_CHILD: +- assert(need_signal(s->event, SIGCHLD)); +- + s->enabled = m; + + assert(s->event->n_enabled_child_sources > 0); + s->event->n_enabled_child_sources--; + +- if (!need_signal(s->event, SIGCHLD)) { +- assert_se(sigdelset(&s->event->sigset, SIGCHLD) == 0); +- +- (void) event_update_signal_fd(s->event); +- } +- ++ event_gc_signal_data(s->event, &s->priority, SIGCHLD); + break; + + case SOURCE_EXIT: +@@ -1555,37 +1693,33 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { + } + + case SOURCE_SIGNAL: +- /* Check status before enabling. */ +- if (!need_signal(s->event, s->signal.sig)) { +- assert_se(sigaddset(&s->event->sigset, s->signal.sig) == 0); +- +- r = event_update_signal_fd(s->event); +- if (r < 0) { +- s->enabled = SD_EVENT_OFF; +- return r; +- } +- } + + s->enabled = m; ++ ++ r = event_make_signal_data(s->event, s->signal.sig, NULL); ++ if (r < 0) { ++ s->enabled = SD_EVENT_OFF; ++ event_gc_signal_data(s->event, &s->priority, s->signal.sig); ++ return r; ++ } ++ + break; + + case SOURCE_CHILD: +- /* Check status before enabling. 
*/ +- if (s->enabled == SD_EVENT_OFF) { +- if (!need_signal(s->event, SIGCHLD)) { +- assert_se(sigaddset(&s->event->sigset, s->signal.sig) == 0); +- +- r = event_update_signal_fd(s->event); +- if (r < 0) { +- s->enabled = SD_EVENT_OFF; +- return r; +- } +- } + ++ if (s->enabled == SD_EVENT_OFF) + s->event->n_enabled_child_sources++; +- } + + s->enabled = m; ++ ++ r = event_make_signal_data(s->event, s->signal.sig, SIGCHLD); ++ if (r < 0) { ++ s->enabled = SD_EVENT_OFF; ++ s->event->n_enabled_child_sources--; ++ event_gc_signal_data(s->event, &s->priority, SIGCHLD); ++ return r; ++ } ++ + break; + + case SOURCE_EXIT: +@@ -2029,20 +2163,35 @@ static int process_child(sd_event *e) { + return 0; + } + +-static int process_signal(sd_event *e, uint32_t events) { ++static int process_signal(sd_event *e, struct signal_data *d, uint32_t events) { + bool read_one = false; + int r; + + assert(e); +- + assert_return(events == EPOLLIN, -EIO); + ++ /* If there's a signal queued on this priority and SIGCHLD is ++ on this priority too, then make sure to recheck the ++ children we watch. This is because we only ever dequeue ++ the first signal per priority, and if we dequeue one, and ++ SIGCHLD might be enqueued later we wouldn't know, but we ++ might have higher priority children we care about hence we ++ need to check that explicitly. 
*/ ++ ++ if (sigismember(&d->sigset, SIGCHLD)) ++ e->need_process_child = true; ++ ++ /* If there's already an event source pending for this ++ * priority we don't read another */ ++ if (d->current) ++ return 0; ++ + for (;;) { + struct signalfd_siginfo si; + ssize_t n; + sd_event_source *s = NULL; + +- n = read(e->signal_fd, &si, sizeof(si)); ++ n = read(d->fd, &si, sizeof(si)); + if (n < 0) { + if (errno == EAGAIN || errno == EINTR) + return read_one; +@@ -2057,24 +2206,21 @@ static int process_signal(sd_event *e, uint32_t events) { + + read_one = true; + +- if (si.ssi_signo == SIGCHLD) { +- r = process_child(e); +- if (r < 0) +- return r; +- if (r > 0) +- continue; +- } +- + if (e->signal_sources) + s = e->signal_sources[si.ssi_signo]; +- + if (!s) + continue; ++ if (s->pending) ++ continue; + + s->signal.siginfo = si; ++ d->current = s; ++ + r = source_set_pending(s, true); + if (r < 0) + return r; ++ ++ return 1; + } + } + +@@ -2393,23 +2539,31 @@ _public_ int sd_event_wait(sd_event *e, uint64_t timeout) { + + for (i = 0; i < m; i++) { + +- if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_TIME_REALTIME)) +- r = flush_timer(e, e->realtime.fd, ev_queue[i].events, &e->realtime.next); +- else if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_TIME_BOOTTIME)) +- r = flush_timer(e, e->boottime.fd, ev_queue[i].events, &e->boottime.next); +- else if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_TIME_MONOTONIC)) +- r = flush_timer(e, e->monotonic.fd, ev_queue[i].events, &e->monotonic.next); +- else if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_TIME_REALTIME_ALARM)) +- r = flush_timer(e, e->realtime_alarm.fd, ev_queue[i].events, &e->realtime_alarm.next); +- else if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_TIME_BOOTTIME_ALARM)) +- r = flush_timer(e, e->boottime_alarm.fd, ev_queue[i].events, &e->boottime_alarm.next); +- else if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_SIGNAL)) +- r = process_signal(e, ev_queue[i].events); +- else if (ev_queue[i].data.ptr == 
INT_TO_PTR(SOURCE_WATCHDOG)) ++ if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_WATCHDOG)) + r = flush_timer(e, e->watchdog_fd, ev_queue[i].events, NULL); +- else +- r = process_io(e, ev_queue[i].data.ptr, ev_queue[i].events); ++ else { ++ WakeupType *t = ev_queue[i].data.ptr; ++ ++ switch (*t) { ++ ++ case WAKEUP_EVENT_SOURCE: ++ r = process_io(e, ev_queue[i].data.ptr, ev_queue[i].events); ++ break; + ++ case WAKEUP_CLOCK_DATA: { ++ struct clock_data *d = ev_queue[i].data.ptr; ++ r = flush_timer(e, d->fd, ev_queue[i].events, &d->next); ++ break; ++ } ++ ++ case WAKEUP_SIGNAL_DATA: ++ r = process_signal(e, ev_queue[i].data.ptr, ev_queue[i].events); ++ break; ++ ++ default: ++ assert_not_reached("Invalid wake-up pointer"); ++ } ++ } + if (r < 0) + goto finish; + } +diff --git a/src/libsystemd/sd-event/test-event.c b/src/libsystemd/sd-event/test-event.c +index 721700b..6bb1420 100644 +--- a/src/libsystemd/sd-event/test-event.c ++++ b/src/libsystemd/sd-event/test-event.c +@@ -160,7 +160,7 @@ static int exit_handler(sd_event_source *s, void *userdata) { + return 3; + } + +-int main(int argc, char *argv[]) { ++static void test_basic(void) { + sd_event *e = NULL; + sd_event_source *w = NULL, *x = NULL, *y = NULL, *z = NULL, *q = NULL, *t = NULL; + static const char ch = 'x'; +@@ -248,6 +248,70 @@ int main(int argc, char *argv[]) { + safe_close_pair(b); + safe_close_pair(d); + safe_close_pair(k); ++} ++ ++static int last_rtqueue_sigval = 0; ++static int n_rtqueue = 0; ++ ++static int rtqueue_handler(sd_event_source *s, const struct signalfd_siginfo *si, void *userdata) { ++ last_rtqueue_sigval = si->ssi_int; ++ n_rtqueue ++; ++ return 0; ++} ++ ++static void test_rtqueue(void) { ++ sd_event_source *u = NULL, *v = NULL, *s = NULL; ++ sd_event *e = NULL; ++ ++ assert_se(sd_event_default(&e) >= 0); ++ ++ assert_se(sigprocmask_many(SIG_BLOCK, NULL, SIGRTMIN+2, SIGRTMIN+3, SIGUSR2, -1) >= 0); ++ assert_se(sd_event_add_signal(e, &u, SIGRTMIN+2, rtqueue_handler, NULL) >= 0); ++ 
assert_se(sd_event_add_signal(e, &v, SIGRTMIN+3, rtqueue_handler, NULL) >= 0); ++ assert_se(sd_event_add_signal(e, &s, SIGUSR2, rtqueue_handler, NULL) >= 0); ++ ++ assert_se(sd_event_source_set_priority(v, -10) >= 0); ++ ++ assert(sigqueue(getpid(), SIGRTMIN+2, (union sigval) { .sival_int = 1 }) >= 0); ++ assert(sigqueue(getpid(), SIGRTMIN+3, (union sigval) { .sival_int = 2 }) >= 0); ++ assert(sigqueue(getpid(), SIGUSR2, (union sigval) { .sival_int = 3 }) >= 0); ++ assert(sigqueue(getpid(), SIGRTMIN+3, (union sigval) { .sival_int = 4 }) >= 0); ++ assert(sigqueue(getpid(), SIGUSR2, (union sigval) { .sival_int = 5 }) >= 0); ++ ++ assert_se(n_rtqueue == 0); ++ assert_se(last_rtqueue_sigval == 0); ++ ++ assert_se(sd_event_run(e, (uint64_t) -1) >= 1); ++ assert_se(n_rtqueue == 1); ++ assert_se(last_rtqueue_sigval == 2); /* first SIGRTMIN+3 */ ++ ++ assert_se(sd_event_run(e, (uint64_t) -1) >= 1); ++ assert_se(n_rtqueue == 2); ++ assert_se(last_rtqueue_sigval == 4); /* second SIGRTMIN+3 */ ++ ++ assert_se(sd_event_run(e, (uint64_t) -1) >= 1); ++ assert_se(n_rtqueue == 3); ++ assert_se(last_rtqueue_sigval == 3); /* first SIGUSR2 */ ++ ++ assert_se(sd_event_run(e, (uint64_t) -1) >= 1); ++ assert_se(n_rtqueue == 4); ++ assert_se(last_rtqueue_sigval == 1); /* SIGRTMIN+2 */ ++ ++ assert_se(sd_event_run(e, 0) == 0); /* the other SIGUSR2 is dropped, because the first one was still queued */ ++ assert_se(n_rtqueue == 4); ++ assert_se(last_rtqueue_sigval == 1); ++ ++ sd_event_source_unref(u); ++ sd_event_source_unref(v); ++ sd_event_source_unref(s); ++ ++ sd_event_unref(e); ++} ++ ++int main(int argc, char *argv[]) { ++ ++ test_basic(); ++ test_rtqueue(); + + return 0; + } +-- +2.17.1 + diff --git a/base/systemd/centos/patches/903-sd-event-split-out-helper-functions-for-reshuffling-.patch b/base/systemd/centos/patches/903-sd-event-split-out-helper-functions-for-reshuffling-.patch new file mode 100644 index 000000000..488463e2f --- /dev/null +++ 
b/base/systemd/centos/patches/903-sd-event-split-out-helper-functions-for-reshuffling-.patch @@ -0,0 +1,216 @@ +From ea762f1c0206c99d2ba4d3cba41cadf70311a3cc Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Michal=20Sekleta=CC=81r?= +Date: Fri, 23 Oct 2020 18:29:27 +0200 +Subject: [PATCH 03/20] sd-event: split out helper functions for reshuffling + prioqs + +We typically don't just reshuffle a single prioq at once, but always +two. Let's add two helper functions that do this, and reuse them +everywhere. + +(Note that this drops one minor optimization: +sd_event_source_set_time_accuracy() previously only reshuffled the +"latest" prioq, since changing the accuracy has no effect on the +earliest time of an event source, just the latest time an event source +can run. This optimization is removed to simplify things, given that +it's not really worth the effort as prioq_reshuffle() on properly +ordered prioqs has practically zero cost O(1)). + +(Slightly generalized, commented and split out of #17284 by Lennart) + +(cherry picked from commit e1951c16a8fbe5b0b9ecc08f4f835a806059d28f) + +Related: #1819868 + +[commit 4ce10f8e41a85a56ad9b805442eb1149ece7c82a from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 96 ++++++++++++------------------ + 1 file changed, 38 insertions(+), 58 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 26ef3ea..eb3182f 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -784,6 +784,33 @@ static void event_gc_signal_data(sd_event *e, const int64_t *priority, int sig) + event_unmask_signal_data(e, d, sig); + } + ++static void event_source_pp_prioq_reshuffle(sd_event_source *s) { ++ assert(s); ++ ++ /* Reshuffles the pending + prepare prioqs. Called whenever the dispatch order changes, i.e. when ++ * they are enabled/disabled or marked pending and such. 
*/ ++ ++ if (s->pending) ++ prioq_reshuffle(s->event->pending, s, &s->pending_index); ++ ++ if (s->prepare) ++ prioq_reshuffle(s->event->prepare, s, &s->prepare_index); ++} ++ ++static void event_source_time_prioq_reshuffle(sd_event_source *s) { ++ struct clock_data *d; ++ ++ assert(s); ++ assert(EVENT_SOURCE_IS_TIME(s->type)); ++ ++ /* Called whenever the event source's timer ordering properties changed, i.e. time, accuracy, ++ * pending, enable state. Makes sure the two prioq's are ordered properly again. */ ++ assert_se(d = event_get_clock_data(s->event, s->type)); ++ prioq_reshuffle(d->earliest, s, &s->time.earliest_index); ++ prioq_reshuffle(d->latest, s, &s->time.latest_index); ++ d->needs_rearm = true; ++} ++ + static void source_disconnect(sd_event_source *s) { + sd_event *event; + +@@ -905,16 +932,8 @@ static int source_set_pending(sd_event_source *s, bool b) { + } else + assert_se(prioq_remove(s->event->pending, s, &s->pending_index)); + +- if (EVENT_SOURCE_IS_TIME(s->type)) { +- struct clock_data *d; +- +- d = event_get_clock_data(s->event, s->type); +- assert(d); +- +- prioq_reshuffle(d->earliest, s, &s->time.earliest_index); +- prioq_reshuffle(d->latest, s, &s->time.latest_index); +- d->needs_rearm = true; +- } ++ if (EVENT_SOURCE_IS_TIME(s->type)) ++ event_source_time_prioq_reshuffle(s); + + if (s->type == SOURCE_SIGNAL && !b) { + struct signal_data *d; +@@ -1570,11 +1589,7 @@ _public_ int sd_event_source_set_priority(sd_event_source *s, int64_t priority) + } else + s->priority = priority; + +- if (s->pending) +- prioq_reshuffle(s->event->pending, s, &s->pending_index); +- +- if (s->prepare) +- prioq_reshuffle(s->event->prepare, s, &s->prepare_index); ++ event_source_pp_prioq_reshuffle(s); + + if (s->type == SOURCE_EXIT) + prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index); +@@ -1622,18 +1637,10 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { + case SOURCE_TIME_BOOTTIME: + case SOURCE_TIME_MONOTONIC: + case 
SOURCE_TIME_REALTIME_ALARM: +- case SOURCE_TIME_BOOTTIME_ALARM: { +- struct clock_data *d; +- ++ case SOURCE_TIME_BOOTTIME_ALARM: + s->enabled = m; +- d = event_get_clock_data(s->event, s->type); +- assert(d); +- +- prioq_reshuffle(d->earliest, s, &s->time.earliest_index); +- prioq_reshuffle(d->latest, s, &s->time.latest_index); +- d->needs_rearm = true; ++ event_source_time_prioq_reshuffle(s); + break; +- } + + case SOURCE_SIGNAL: + s->enabled = m; +@@ -1679,18 +1686,10 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { + case SOURCE_TIME_BOOTTIME: + case SOURCE_TIME_MONOTONIC: + case SOURCE_TIME_REALTIME_ALARM: +- case SOURCE_TIME_BOOTTIME_ALARM: { +- struct clock_data *d; +- ++ case SOURCE_TIME_BOOTTIME_ALARM: + s->enabled = m; +- d = event_get_clock_data(s->event, s->type); +- assert(d); +- +- prioq_reshuffle(d->earliest, s, &s->time.earliest_index); +- prioq_reshuffle(d->latest, s, &s->time.latest_index); +- d->needs_rearm = true; ++ event_source_time_prioq_reshuffle(s); + break; +- } + + case SOURCE_SIGNAL: + +@@ -1737,11 +1736,7 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { + } + } + +- if (s->pending) +- prioq_reshuffle(s->event->pending, s, &s->pending_index); +- +- if (s->prepare) +- prioq_reshuffle(s->event->prepare, s, &s->prepare_index); ++ event_source_pp_prioq_reshuffle(s); + + return 0; + } +@@ -1757,7 +1752,6 @@ _public_ int sd_event_source_get_time(sd_event_source *s, uint64_t *usec) { + } + + _public_ int sd_event_source_set_time(sd_event_source *s, uint64_t usec) { +- struct clock_data *d; + + assert_return(s, -EINVAL); + assert_return(usec != (uint64_t) -1, -EINVAL); +@@ -1769,13 +1763,7 @@ _public_ int sd_event_source_set_time(sd_event_source *s, uint64_t usec) { + + source_set_pending(s, false); + +- d = event_get_clock_data(s->event, s->type); +- assert(d); +- +- prioq_reshuffle(d->earliest, s, &s->time.earliest_index); +- prioq_reshuffle(d->latest, s, &s->time.latest_index); +- d->needs_rearm 
= true; +- ++ event_source_time_prioq_reshuffle(s); + return 0; + } + +@@ -1790,7 +1778,6 @@ _public_ int sd_event_source_get_time_accuracy(sd_event_source *s, uint64_t *use + } + + _public_ int sd_event_source_set_time_accuracy(sd_event_source *s, uint64_t usec) { +- struct clock_data *d; + + assert_return(s, -EINVAL); + assert_return(usec != (uint64_t) -1, -EINVAL); +@@ -1805,12 +1792,7 @@ _public_ int sd_event_source_set_time_accuracy(sd_event_source *s, uint64_t usec + + source_set_pending(s, false); + +- d = event_get_clock_data(s->event, s->type); +- assert(d); +- +- prioq_reshuffle(d->latest, s, &s->time.latest_index); +- d->needs_rearm = true; +- ++ event_source_time_prioq_reshuffle(s); + return 0; + } + +@@ -2088,9 +2070,7 @@ static int process_timer( + if (r < 0) + return r; + +- prioq_reshuffle(d->earliest, s, &s->time.earliest_index); +- prioq_reshuffle(d->latest, s, &s->time.latest_index); +- d->needs_rearm = true; ++ event_source_time_prioq_reshuffle(s); + } + + return 0; +-- +2.17.1 + diff --git a/base/systemd/centos/patches/904-sd-event-drop-pending-events-when-we-turn-off-on-an-.patch b/base/systemd/centos/patches/904-sd-event-drop-pending-events-when-we-turn-off-on-an-.patch new file mode 100644 index 000000000..e53b00c16 --- /dev/null +++ b/base/systemd/centos/patches/904-sd-event-drop-pending-events-when-we-turn-off-on-an-.patch @@ -0,0 +1,50 @@ +From 76969d09522ca2ab58bc157eb9ce357af5677f3a Mon Sep 17 00:00:00 2001 +From: Lennart Poettering +Date: Fri, 25 May 2018 17:06:39 +0200 +Subject: [PATCH 04/20] sd-event: drop pending events when we turn off/on an + event source + +[commit ac989a783a31df95e6c0ce2a90a8d2e1abe73592 from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 15 +++++++++++++++ + 1 file changed, 15 insertions(+) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index eb3182f..6e93059 100644 +--- 
a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -1623,6 +1623,13 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { + + if (m == SD_EVENT_OFF) { + ++ /* Unset the pending flag when this event source is disabled */ ++ if (!IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) { ++ r = source_set_pending(s, false); ++ if (r < 0) ++ return r; ++ } ++ + switch (s->type) { + + case SOURCE_IO: +@@ -1672,6 +1679,14 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { + } + + } else { ++ ++ /* Unset the pending flag when this event source is enabled */ ++ if (s->enabled == SD_EVENT_OFF && !IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) { ++ r = source_set_pending(s, false); ++ if (r < 0) ++ return r; ++ } ++ + switch (s->type) { + + case SOURCE_IO: +-- +2.17.1 + diff --git a/base/systemd/centos/patches/905-sd-event-fix-call-to-event_make_signal_data.patch b/base/systemd/centos/patches/905-sd-event-fix-call-to-event_make_signal_data.patch new file mode 100644 index 000000000..68c62d8ce --- /dev/null +++ b/base/systemd/centos/patches/905-sd-event-fix-call-to-event_make_signal_data.patch @@ -0,0 +1,31 @@ +From 7380d2cca8bda0f8c821645f8a5ddb8ac47aec46 Mon Sep 17 00:00:00 2001 +From: Thomas Hindoe Paaboel Andersen +Date: Sun, 6 Sep 2015 22:06:45 +0200 +Subject: [PATCH 05/20] sd-event: fix call to event_make_signal_data + +This looks like a typo from commit 9da4cb2b where it was added. 
+ +[commit b8a50a99a6e158a5b3ceacf0764dbe9f42558f3e from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 6e93059..7c33dcd 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -1726,7 +1726,7 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { + + s->enabled = m; + +- r = event_make_signal_data(s->event, s->signal.sig, SIGCHLD); ++ r = event_make_signal_data(s->event, s->signal.sig, NULL); + if (r < 0) { + s->enabled = SD_EVENT_OFF; + s->event->n_enabled_child_sources--; +-- +2.17.1 + diff --git a/base/systemd/centos/patches/906-sd-event-make-sure-to-create-a-signal-queue-for-the-.patch b/base/systemd/centos/patches/906-sd-event-make-sure-to-create-a-signal-queue-for-the-.patch new file mode 100644 index 000000000..2ab52989c --- /dev/null +++ b/base/systemd/centos/patches/906-sd-event-make-sure-to-create-a-signal-queue-for-the-.patch @@ -0,0 +1,37 @@ +From 0a2519a5ab04e775115c90039d30bdc576a79c06 Mon Sep 17 00:00:00 2001 +From: Lennart Poettering +Date: Mon, 7 Sep 2015 00:31:24 +0200 +Subject: [PATCH 06/20] sd-event: make sure to create a signal queue for the + right signal + +We should never access the "signal" part of the event source unless the +event source is actually for a signal. In this case it's a child pid +handler however, hence make sure to use the right signal. + +This is a fix for PR #1177, which in turn was a fix for +9da4cb2be260ed123f2676cb85cb350c527b1492. 
+ +[commit 10edebf6cd69cfbe0d38dbaf5478264fbb60a51e from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 7c33dcd..2f5ff23 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -1726,7 +1726,7 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { + + s->enabled = m; + +- r = event_make_signal_data(s->event, s->signal.sig, NULL); ++ r = event_make_signal_data(s->event, SIGCHLD, NULL); + if (r < 0) { + s->enabled = SD_EVENT_OFF; + s->event->n_enabled_child_sources--; +-- +2.17.1 + diff --git a/base/systemd/centos/patches/907-sd-event-split-out-enable-and-disable-codepaths-from.patch b/base/systemd/centos/patches/907-sd-event-split-out-enable-and-disable-codepaths-from.patch new file mode 100644 index 000000000..a52d8c95e --- /dev/null +++ b/base/systemd/centos/patches/907-sd-event-split-out-enable-and-disable-codepaths-from.patch @@ -0,0 +1,315 @@ +From 477bbfd4f5012613144c5ba5517aa8de1f300da6 Mon Sep 17 00:00:00 2001 +From: Lennart Poettering +Date: Fri, 23 Oct 2020 21:21:58 +0200 +Subject: [PATCH 07/20] sd-event: split out enable and disable codepaths from + sd_event_source_set_enabled() +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +So far half of sd_event_source_set_enabled() was doing enabling, the +other half was doing disabling. Let's split that into two separate +calls. + +(This also adds a new shortcut to sd_event_source_set_enabled(): if the +caller toggles between "ON" and "ONESHOT" we'll now shortcut this, since +the event source is already enabled in that case and shall remain +enabled.) + +This heavily borrows and is inspired from Michal Sekletár's #17284 +refactoring. 
+ +(cherry picked from commit ddfde737b546c17e54182028153aa7f7e78804e3) + +Related: #1819868 + +[commit d7ad6ad123200f562081ff09f7bed3c6d969ac0a from +https://github.com/systemd-rhel/rhel-8/ + +LZ: Dropped SOURCE_INOTIFY related parts because it hasn't been added +in this systemd version.] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 224 +++++++++++++++-------------- + 1 file changed, 118 insertions(+), 106 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 2f5ff23..2e07478 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -1606,153 +1606,165 @@ _public_ int sd_event_source_get_enabled(sd_event_source *s, int *m) { + return 0; + } + +-_public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { ++static int event_source_disable(sd_event_source *s) { + int r; + +- assert_return(s, -EINVAL); +- assert_return(m == SD_EVENT_OFF || m == SD_EVENT_ON || m == SD_EVENT_ONESHOT, -EINVAL); +- assert_return(!event_pid_changed(s->event), -ECHILD); ++ assert(s); ++ assert(s->enabled != SD_EVENT_OFF); + +- /* If we are dead anyway, we are fine with turning off +- * sources, but everything else needs to fail. */ +- if (s->event->state == SD_EVENT_FINISHED) +- return m == SD_EVENT_OFF ? 
0 : -ESTALE; ++ /* Unset the pending flag when this event source is disabled */ ++ if (!IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) { ++ r = source_set_pending(s, false); ++ if (r < 0) ++ return r; ++ } + +- if (s->enabled == m) +- return 0; ++ s->enabled = SD_EVENT_OFF; + +- if (m == SD_EVENT_OFF) { ++ switch (s->type) { + +- /* Unset the pending flag when this event source is disabled */ +- if (!IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) { +- r = source_set_pending(s, false); +- if (r < 0) +- return r; +- } ++ case SOURCE_IO: ++ source_io_unregister(s); ++ break; + +- switch (s->type) { ++ case SOURCE_TIME_REALTIME: ++ case SOURCE_TIME_BOOTTIME: ++ case SOURCE_TIME_MONOTONIC: ++ case SOURCE_TIME_REALTIME_ALARM: ++ case SOURCE_TIME_BOOTTIME_ALARM: ++ event_source_time_prioq_reshuffle(s); ++ break; + +- case SOURCE_IO: +- r = source_io_unregister(s); +- if (r < 0) +- return r; ++ case SOURCE_SIGNAL: ++ event_gc_signal_data(s->event, &s->priority, s->signal.sig); ++ break; + +- s->enabled = m; +- break; ++ case SOURCE_CHILD: ++ assert(s->event->n_enabled_child_sources > 0); ++ s->event->n_enabled_child_sources--; + +- case SOURCE_TIME_REALTIME: +- case SOURCE_TIME_BOOTTIME: +- case SOURCE_TIME_MONOTONIC: +- case SOURCE_TIME_REALTIME_ALARM: +- case SOURCE_TIME_BOOTTIME_ALARM: +- s->enabled = m; +- event_source_time_prioq_reshuffle(s); +- break; ++ event_gc_signal_data(s->event, &s->priority, SIGCHLD); ++ break; + +- case SOURCE_SIGNAL: +- s->enabled = m; ++ case SOURCE_EXIT: ++ prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index); ++ break; + +- event_gc_signal_data(s->event, &s->priority, s->signal.sig); +- break; ++ case SOURCE_DEFER: ++ case SOURCE_POST: ++ break; + +- case SOURCE_CHILD: +- s->enabled = m; ++ default: ++ assert_not_reached("Wut? 
I shouldn't exist."); ++ } + +- assert(s->event->n_enabled_child_sources > 0); +- s->event->n_enabled_child_sources--; ++ return 0; ++} + +- event_gc_signal_data(s->event, &s->priority, SIGCHLD); +- break; ++static int event_source_enable(sd_event_source *s, int m) { ++ int r; + +- case SOURCE_EXIT: +- s->enabled = m; +- prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index); +- break; ++ assert(s); ++ assert(IN_SET(m, SD_EVENT_ON, SD_EVENT_ONESHOT)); ++ assert(s->enabled == SD_EVENT_OFF); + +- case SOURCE_DEFER: +- case SOURCE_POST: +- s->enabled = m; +- break; ++ /* Unset the pending flag when this event source is enabled */ ++ if (!IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) { ++ r = source_set_pending(s, false); ++ if (r < 0) ++ return r; ++ } + +- default: +- assert_not_reached("Wut? I shouldn't exist."); +- } ++ s->enabled = m; + +- } else { ++ switch (s->type) { + +- /* Unset the pending flag when this event source is enabled */ +- if (s->enabled == SD_EVENT_OFF && !IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) { +- r = source_set_pending(s, false); +- if (r < 0) +- return r; ++ case SOURCE_IO: ++ r = source_io_register(s, m, s->io.events); ++ if (r < 0) { ++ s->enabled = SD_EVENT_OFF; ++ return r; + } + +- switch (s->type) { ++ break; + +- case SOURCE_IO: +- r = source_io_register(s, m, s->io.events); +- if (r < 0) +- return r; ++ case SOURCE_TIME_REALTIME: ++ case SOURCE_TIME_BOOTTIME: ++ case SOURCE_TIME_MONOTONIC: ++ case SOURCE_TIME_REALTIME_ALARM: ++ case SOURCE_TIME_BOOTTIME_ALARM: ++ event_source_time_prioq_reshuffle(s); ++ break; + +- s->enabled = m; +- break; ++ case SOURCE_SIGNAL: ++ r = event_make_signal_data(s->event, s->signal.sig, NULL); ++ if (r < 0) { ++ s->enabled = SD_EVENT_OFF; ++ event_gc_signal_data(s->event, &s->priority, s->signal.sig); ++ return r; ++ } + +- case SOURCE_TIME_REALTIME: +- case SOURCE_TIME_BOOTTIME: +- case SOURCE_TIME_MONOTONIC: +- case SOURCE_TIME_REALTIME_ALARM: +- case SOURCE_TIME_BOOTTIME_ALARM: +- s->enabled = m; 
+- event_source_time_prioq_reshuffle(s); +- break; ++ break; + +- case SOURCE_SIGNAL: ++ case SOURCE_CHILD: ++ s->event->n_enabled_child_sources++; + +- s->enabled = m; ++ r = event_make_signal_data(s->event, SIGCHLD, NULL); ++ if (r < 0) { ++ s->enabled = SD_EVENT_OFF; ++ s->event->n_enabled_child_sources--; ++ event_gc_signal_data(s->event, &s->priority, SIGCHLD); ++ return r; ++ } + +- r = event_make_signal_data(s->event, s->signal.sig, NULL); +- if (r < 0) { +- s->enabled = SD_EVENT_OFF; +- event_gc_signal_data(s->event, &s->priority, s->signal.sig); +- return r; +- } + +- break; ++ break; + +- case SOURCE_CHILD: ++ case SOURCE_EXIT: ++ prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index); ++ break; + +- if (s->enabled == SD_EVENT_OFF) +- s->event->n_enabled_child_sources++; ++ case SOURCE_DEFER: ++ case SOURCE_POST: ++ break; + +- s->enabled = m; ++ default: ++ assert_not_reached("Wut? I shouldn't exist."); ++ } + +- r = event_make_signal_data(s->event, SIGCHLD, NULL); +- if (r < 0) { +- s->enabled = SD_EVENT_OFF; +- s->event->n_enabled_child_sources--; +- event_gc_signal_data(s->event, &s->priority, SIGCHLD); +- return r; +- } ++ return 0; ++} + +- break; ++_public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { ++ int r; + +- case SOURCE_EXIT: +- s->enabled = m; +- prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index); +- break; ++ assert_return(s, -EINVAL); ++ assert_return(IN_SET(m, SD_EVENT_OFF, SD_EVENT_ON, SD_EVENT_ONESHOT), -EINVAL); ++ assert_return(!event_pid_changed(s->event), -ECHILD); + +- case SOURCE_DEFER: +- case SOURCE_POST: +- s->enabled = m; +- break; ++ /* If we are dead anyway, we are fine with turning off sources, but everything else needs to fail. */ ++ if (s->event->state == SD_EVENT_FINISHED) ++ return m == SD_EVENT_OFF ? 0 : -ESTALE; + +- default: +- assert_not_reached("Wut? I shouldn't exist."); ++ if (s->enabled == m) /* No change? 
*/ ++ return 0; ++ ++ if (m == SD_EVENT_OFF) ++ r = event_source_disable(s); ++ else { ++ if (s->enabled != SD_EVENT_OFF) { ++ /* Switching from "on" to "oneshot" or back? If that's the case, we can take a shortcut, the ++ * event source is already enabled after all. */ ++ s->enabled = m; ++ return 0; + } ++ ++ r = event_source_enable(s, m); + } ++ if (r < 0) ++ return r; + + event_source_pp_prioq_reshuffle(s); +- + return 0; + } + +-- +2.17.1 + diff --git a/base/systemd/centos/patches/908-sd-event-use-prioq_ensure_allocated-where-possible.patch b/base/systemd/centos/patches/908-sd-event-use-prioq_ensure_allocated-where-possible.patch new file mode 100644 index 000000000..54ae04b64 --- /dev/null +++ b/base/systemd/centos/patches/908-sd-event-use-prioq_ensure_allocated-where-possible.patch @@ -0,0 +1,73 @@ +From 5e365321f3006d44f57bb27ff9de96ca01c1104a Mon Sep 17 00:00:00 2001 +From: Evgeny Vereshchagin +Date: Sun, 22 Nov 2015 06:41:31 +0000 +Subject: [PATCH 08/20] sd-event: use prioq_ensure_allocated where possible + +[commit c983e776c4e7e2ea6e1990123d215e639deb353b from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 30 +++++++++++------------------- + 1 file changed, 11 insertions(+), 19 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 2e07478..7074520 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -442,11 +442,9 @@ _public_ int sd_event_new(sd_event** ret) { + e->original_pid = getpid(); + e->perturb = USEC_INFINITY; + +- e->pending = prioq_new(pending_prioq_compare); +- if (!e->pending) { +- r = -ENOMEM; ++ r = prioq_ensure_allocated(&e->pending, pending_prioq_compare); ++ if (r < 0) + goto fail; +- } + + e->epoll_fd = epoll_create1(EPOLL_CLOEXEC); + if (e->epoll_fd < 0) { +@@ -1096,17 +1094,13 @@ _public_ int sd_event_add_time( + d = event_get_clock_data(e, type); + assert(d); + +- if 
(!d->earliest) { +- d->earliest = prioq_new(earliest_time_prioq_compare); +- if (!d->earliest) +- return -ENOMEM; +- } ++ r = prioq_ensure_allocated(&d->earliest, earliest_time_prioq_compare); ++ if (r < 0) ++ return r; + +- if (!d->latest) { +- d->latest = prioq_new(latest_time_prioq_compare); +- if (!d->latest) +- return -ENOMEM; +- } ++ r = prioq_ensure_allocated(&d->latest, latest_time_prioq_compare); ++ if (r < 0) ++ return r; + + if (d->fd < 0) { + r = event_setup_timer_fd(e, d, clock); +@@ -1357,11 +1351,9 @@ _public_ int sd_event_add_exit( + assert_return(e->state != SD_EVENT_FINISHED, -ESTALE); + assert_return(!event_pid_changed(e), -ECHILD); + +- if (!e->exit) { +- e->exit = prioq_new(exit_prioq_compare); +- if (!e->exit) +- return -ENOMEM; +- } ++ r = prioq_ensure_allocated(&e->exit, exit_prioq_compare); ++ if (r < 0) ++ return r; + + s = source_new(e, !ret, SOURCE_EXIT); + if (!s) +-- +2.17.1 + diff --git a/base/systemd/centos/patches/909-sd-event-split-clock-data-allocation-out-of-sd_event.patch b/base/systemd/centos/patches/909-sd-event-split-clock-data-allocation-out-of-sd_event.patch new file mode 100644 index 000000000..4e0114122 --- /dev/null +++ b/base/systemd/centos/patches/909-sd-event-split-clock-data-allocation-out-of-sd_event.patch @@ -0,0 +1,79 @@ +From 77b772bce846db28dc447420fd380a51eadcde15 Mon Sep 17 00:00:00 2001 +From: Lennart Poettering +Date: Mon, 23 Nov 2020 11:40:24 +0100 +Subject: [PATCH 09/20] sd-event: split clock data allocation out of + sd_event_add_time() + +Just some simple refactoring, that will make things easier for us later. +But it looks better this way even without the later function reuse. 
+ +(cherry picked from commit 41c63f36c3352af8bebf03b6181f5d866431d0af) + +Related: #1819868 + +[commit 6cc0022115afbac9ac66c456b140601d90271687 from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 34 ++++++++++++++++++++---------- + 1 file changed, 23 insertions(+), 11 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 7074520..8e6536f 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -1065,6 +1065,28 @@ static int time_exit_callback(sd_event_source *s, uint64_t usec, void *userdata) + return sd_event_exit(sd_event_source_get_event(s), PTR_TO_INT(userdata)); + } + ++static int setup_clock_data(sd_event *e, struct clock_data *d, clockid_t clock) { ++ int r; ++ ++ assert(d); ++ ++ if (d->fd < 0) { ++ r = event_setup_timer_fd(e, d, clock); ++ if (r < 0) ++ return r; ++ } ++ ++ r = prioq_ensure_allocated(&d->earliest, earliest_time_prioq_compare); ++ if (r < 0) ++ return r; ++ ++ r = prioq_ensure_allocated(&d->latest, latest_time_prioq_compare); ++ if (r < 0) ++ return r; ++ ++ return 0; ++} ++ + _public_ int sd_event_add_time( + sd_event *e, + sd_event_source **ret, +@@ -1094,20 +1116,10 @@ _public_ int sd_event_add_time( + d = event_get_clock_data(e, type); + assert(d); + +- r = prioq_ensure_allocated(&d->earliest, earliest_time_prioq_compare); +- if (r < 0) +- return r; +- +- r = prioq_ensure_allocated(&d->latest, latest_time_prioq_compare); ++ r = setup_clock_data(e, d, clock); + if (r < 0) + return r; + +- if (d->fd < 0) { +- r = event_setup_timer_fd(e, d, clock); +- if (r < 0) +- return r; +- } +- + s = source_new(e, !ret, type); + if (!s) + return -ENOMEM; +-- +2.17.1 + diff --git a/base/systemd/centos/patches/910-sd-event-split-out-code-to-add-remove-timer-event-so.patch b/base/systemd/centos/patches/910-sd-event-split-out-code-to-add-remove-timer-event-so.patch new file mode 100644 index 
000000000..f92e6a173 --- /dev/null +++ b/base/systemd/centos/patches/910-sd-event-split-out-code-to-add-remove-timer-event-so.patch @@ -0,0 +1,120 @@ +From dad1d000b493f98f4f5eaf4bfa34c8617f41970f Mon Sep 17 00:00:00 2001 +From: Lennart Poettering +Date: Mon, 23 Nov 2020 15:25:35 +0100 +Subject: [PATCH 10/20] sd-event: split out code to add/remove timer event + sources to earliest/latest prioq + +Just some refactoring that makes code prettier, and will come handy +later, because we can reuse these functions at more places. + +(cherry picked from commit 1e45e3fecc303e7ae9946220c742f69675e99c34) + +Related: #1819868 + +[commit 88b2618e4de850060a1c5c22b049e6de0578fbb5 from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 57 +++++++++++++++++++++--------- + 1 file changed, 41 insertions(+), 16 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 8e6536f..e0e0eaa 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -809,6 +809,19 @@ static void event_source_time_prioq_reshuffle(sd_event_source *s) { + d->needs_rearm = true; + } + ++static void event_source_time_prioq_remove( ++ sd_event_source *s, ++ struct clock_data *d) { ++ ++ assert(s); ++ assert(d); ++ ++ prioq_remove(d->earliest, s, &s->time.earliest_index); ++ prioq_remove(d->latest, s, &s->time.latest_index); ++ s->time.earliest_index = s->time.latest_index = PRIOQ_IDX_NULL; ++ d->needs_rearm = true; ++} ++ + static void source_disconnect(sd_event_source *s) { + sd_event *event; + +@@ -833,13 +846,8 @@ static void source_disconnect(sd_event_source *s) { + case SOURCE_TIME_REALTIME_ALARM: + case SOURCE_TIME_BOOTTIME_ALARM: { + struct clock_data *d; +- +- d = event_get_clock_data(s->event, s->type); +- assert(d); +- +- prioq_remove(d->earliest, s, &s->time.earliest_index); +- prioq_remove(d->latest, s, &s->time.latest_index); +- d->needs_rearm = true; ++ 
assert_se(d = event_get_clock_data(s->event, s->type)); ++ event_source_time_prioq_remove(s, d); + break; + } + +@@ -1087,6 +1095,30 @@ static int setup_clock_data(sd_event *e, struct clock_data *d, clockid_t clock) + return 0; + } + ++static int event_source_time_prioq_put( ++ sd_event_source *s, ++ struct clock_data *d) { ++ ++ int r; ++ ++ assert(s); ++ assert(d); ++ ++ r = prioq_put(d->earliest, s, &s->time.earliest_index); ++ if (r < 0) ++ return r; ++ ++ r = prioq_put(d->latest, s, &s->time.latest_index); ++ if (r < 0) { ++ assert_se(prioq_remove(d->earliest, s, &s->time.earliest_index) > 0); ++ s->time.earliest_index = PRIOQ_IDX_NULL; ++ return r; ++ } ++ ++ d->needs_rearm = true; ++ return 0; ++} ++ + _public_ int sd_event_add_time( + sd_event *e, + sd_event_source **ret, +@@ -1113,8 +1145,7 @@ _public_ int sd_event_add_time( + type = clock_to_event_source_type(clock); + assert_return(type >= 0, -ENOTSUP); + +- d = event_get_clock_data(e, type); +- assert(d); ++ assert_se(d = event_get_clock_data(e, type)); + + r = setup_clock_data(e, d, clock); + if (r < 0) +@@ -1131,13 +1162,7 @@ _public_ int sd_event_add_time( + s->userdata = userdata; + s->enabled = SD_EVENT_ONESHOT; + +- d->needs_rearm = true; +- +- r = prioq_put(d->earliest, s, &s->time.earliest_index); +- if (r < 0) +- goto fail; +- +- r = prioq_put(d->latest, s, &s->time.latest_index); ++ r = event_source_time_prioq_put(s, d); + if (r < 0) + goto fail; + +-- +2.17.1 + diff --git a/base/systemd/centos/patches/911-sd-event-rename-PASSIVE-PREPARED-to-INITIAL-ARMED.patch b/base/systemd/centos/patches/911-sd-event-rename-PASSIVE-PREPARED-to-INITIAL-ARMED.patch new file mode 100644 index 000000000..7ae6e8326 --- /dev/null +++ b/base/systemd/centos/patches/911-sd-event-rename-PASSIVE-PREPARED-to-INITIAL-ARMED.patch @@ -0,0 +1,126 @@ +From 6dc0338be9020eebcbfafe078a46bc7be8e4a2ff Mon Sep 17 00:00:00 2001 +From: Tom Gundersen +Date: Sat, 14 Mar 2015 11:47:35 +0100 +Subject: [PATCH 11/20] sd-event: rename 
PASSIVE/PREPARED to INITIAL/ARMED + +[commit 2b0c9ef7352dae53ee746c32033999c1346633b3 from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 22 +++++++++++----------- + src/systemd/sd-event.h | 4 ++-- + 2 files changed, 13 insertions(+), 13 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index e0e0eaa..299312a 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -2423,7 +2423,7 @@ static int dispatch_exit(sd_event *e) { + + r = source_dispatch(p); + +- e->state = SD_EVENT_PASSIVE; ++ e->state = SD_EVENT_INITIAL; + sd_event_unref(e); + + return r; +@@ -2492,7 +2492,7 @@ _public_ int sd_event_prepare(sd_event *e) { + assert_return(e, -EINVAL); + assert_return(!event_pid_changed(e), -ECHILD); + assert_return(e->state != SD_EVENT_FINISHED, -ESTALE); +- assert_return(e->state == SD_EVENT_PASSIVE, -EBUSY); ++ assert_return(e->state == SD_EVENT_INITIAL, -EBUSY); + + if (e->exit_requested) + goto pending; +@@ -2526,15 +2526,15 @@ _public_ int sd_event_prepare(sd_event *e) { + if (event_next_pending(e) || e->need_process_child) + goto pending; + +- e->state = SD_EVENT_PREPARED; ++ e->state = SD_EVENT_ARMED; + + return 0; + + pending: +- e->state = SD_EVENT_PREPARED; ++ e->state = SD_EVENT_ARMED; + r = sd_event_wait(e, 0); + if (r == 0) +- e->state = SD_EVENT_PREPARED; ++ e->state = SD_EVENT_ARMED; + + return r; + } +@@ -2547,7 +2547,7 @@ _public_ int sd_event_wait(sd_event *e, uint64_t timeout) { + assert_return(e, -EINVAL); + assert_return(!event_pid_changed(e), -ECHILD); + assert_return(e->state != SD_EVENT_FINISHED, -ESTALE); +- assert_return(e->state == SD_EVENT_PREPARED, -EBUSY); ++ assert_return(e->state == SD_EVENT_ARMED, -EBUSY); + + if (e->exit_requested) { + e->state = SD_EVENT_PENDING; +@@ -2643,7 +2643,7 @@ _public_ int sd_event_wait(sd_event *e, uint64_t timeout) { + r = 0; + + finish: +- e->state = 
SD_EVENT_PASSIVE; ++ e->state = SD_EVENT_INITIAL; + + return r; + } +@@ -2666,14 +2666,14 @@ _public_ int sd_event_dispatch(sd_event *e) { + + e->state = SD_EVENT_RUNNING; + r = source_dispatch(p); +- e->state = SD_EVENT_PASSIVE; ++ e->state = SD_EVENT_INITIAL; + + sd_event_unref(e); + + return r; + } + +- e->state = SD_EVENT_PASSIVE; ++ e->state = SD_EVENT_INITIAL; + + return 1; + } +@@ -2684,7 +2684,7 @@ _public_ int sd_event_run(sd_event *e, uint64_t timeout) { + assert_return(e, -EINVAL); + assert_return(!event_pid_changed(e), -ECHILD); + assert_return(e->state != SD_EVENT_FINISHED, -ESTALE); +- assert_return(e->state == SD_EVENT_PASSIVE, -EBUSY); ++ assert_return(e->state == SD_EVENT_INITIAL, -EBUSY); + + r = sd_event_prepare(e); + if (r > 0) +@@ -2704,7 +2704,7 @@ _public_ int sd_event_loop(sd_event *e) { + + assert_return(e, -EINVAL); + assert_return(!event_pid_changed(e), -ECHILD); +- assert_return(e->state == SD_EVENT_PASSIVE, -EBUSY); ++ assert_return(e->state == SD_EVENT_INITIAL, -EBUSY); + + sd_event_ref(e); + +diff --git a/src/systemd/sd-event.h b/src/systemd/sd-event.h +index 4957f3a..ffde7c8 100644 +--- a/src/systemd/sd-event.h ++++ b/src/systemd/sd-event.h +@@ -51,8 +51,8 @@ enum { + }; + + enum { +- SD_EVENT_PASSIVE, +- SD_EVENT_PREPARED, ++ SD_EVENT_INITIAL, ++ SD_EVENT_ARMED, + SD_EVENT_PENDING, + SD_EVENT_RUNNING, + SD_EVENT_EXITING, +-- +2.17.1 + diff --git a/base/systemd/centos/patches/912-sd-event-refuse-running-default-event-loops-in-any-o.patch b/base/systemd/centos/patches/912-sd-event-refuse-running-default-event-loops-in-any-o.patch new file mode 100644 index 000000000..a8d2074d6 --- /dev/null +++ b/base/systemd/centos/patches/912-sd-event-refuse-running-default-event-loops-in-any-o.patch @@ -0,0 +1,39 @@ +From 01c94571660c44c415ba8bcba62176f45bf84be5 Mon Sep 17 00:00:00 2001 +From: Lennart Poettering +Date: Wed, 30 Oct 2019 20:26:50 +0100 +Subject: [PATCH 12/20] sd-event: refuse running default event loops in any + other thread than the 
one they are default for + +(cherry picked from commit e544601536ac13a288d7476f4400c7b0f22b7ea1) + +Related: #1819868 + +[commit 4c5fdbde7e745126f31542a70b45cc4faec094d2 from +https://github.com/systemd-rhel/rhel-8/ + +LZ: Dropped the part that won't affect code to simplify the merging.] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 5 +++++ + 1 file changed, 5 insertions(+) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 299312a..a2f7868 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -2494,6 +2494,11 @@ _public_ int sd_event_prepare(sd_event *e) { + assert_return(e->state != SD_EVENT_FINISHED, -ESTALE); + assert_return(e->state == SD_EVENT_INITIAL, -EBUSY); + ++ /* Let's check that if we are a default event loop we are executed in the correct thread. We only do ++ * this check here once, since gettid() is typically not cached, and thus want to minimize ++ * syscalls */ ++ assert_return(!e->default_event_ptr || e->tid == gettid(), -EREMOTEIO); ++ + if (e->exit_requested) + goto pending; + +-- +2.17.1 + diff --git a/base/systemd/centos/patches/913-sd-event-remove-earliest_index-latest_index-into-com.patch b/base/systemd/centos/patches/913-sd-event-remove-earliest_index-latest_index-into-com.patch new file mode 100644 index 000000000..ef0ba6034 --- /dev/null +++ b/base/systemd/centos/patches/913-sd-event-remove-earliest_index-latest_index-into-com.patch @@ -0,0 +1,106 @@ +From f72ca8a711fc406dc52f18c7dbc3bfc5397b26ea Mon Sep 17 00:00:00 2001 +From: Lennart Poettering +Date: Mon, 23 Nov 2020 17:49:27 +0100 +Subject: [PATCH 13/20] sd-event: remove earliest_index/latest_index into + common part of event source objects + +So far we used these fields to organize the earliest/latest timer event +priority queue. 
In a follow-up commit we want to introduce ratelimiting +to event sources, at which point we want any kind of event source to be +able to trigger time wakeups, and hence they all need to be included in +the earliest/latest prioqs. Thus, in preparation let's make this +generic. + +No change in behaviour, just some shifting around of struct members from +the type-specific to the generic part. + +(cherry picked from commit f41315fceb5208c496145cda2d6c865a5458ce44) + +Related: #1819868 + +[commit 97f599bf57fdaee688ae5750e9b2b2587e2b597a from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 25 +++++++++++++------------ + 1 file changed, 13 insertions(+), 12 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index a2f7868..82cb9ad 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -94,6 +94,9 @@ struct sd_event_source { + + LIST_FIELDS(sd_event_source, sources); + ++ unsigned earliest_index; ++ unsigned latest_index; ++ + union { + struct { + sd_event_io_handler_t callback; +@@ -105,8 +108,6 @@ struct sd_event_source { + struct { + sd_event_time_handler_t callback; + usec_t next, accuracy; +- unsigned earliest_index; +- unsigned latest_index; + } time; + struct { + sd_event_signal_handler_t callback; +@@ -804,8 +805,8 @@ static void event_source_time_prioq_reshuffle(sd_event_source *s) { + /* Called whenever the event source's timer ordering properties changed, i.e. time, accuracy, + * pending, enable state. Makes sure the two prioq's are ordered properly again. 
*/ + assert_se(d = event_get_clock_data(s->event, s->type)); +- prioq_reshuffle(d->earliest, s, &s->time.earliest_index); +- prioq_reshuffle(d->latest, s, &s->time.latest_index); ++ prioq_reshuffle(d->earliest, s, &s->earliest_index); ++ prioq_reshuffle(d->latest, s, &s->latest_index); + d->needs_rearm = true; + } + +@@ -816,9 +817,9 @@ static void event_source_time_prioq_remove( + assert(s); + assert(d); + +- prioq_remove(d->earliest, s, &s->time.earliest_index); +- prioq_remove(d->latest, s, &s->time.latest_index); +- s->time.earliest_index = s->time.latest_index = PRIOQ_IDX_NULL; ++ prioq_remove(d->earliest, s, &s->earliest_index); ++ prioq_remove(d->latest, s, &s->latest_index); ++ s->earliest_index = s->latest_index = PRIOQ_IDX_NULL; + d->needs_rearm = true; + } + +@@ -1104,14 +1105,14 @@ static int event_source_time_prioq_put( + assert(s); + assert(d); + +- r = prioq_put(d->earliest, s, &s->time.earliest_index); ++ r = prioq_put(d->earliest, s, &s->earliest_index); + if (r < 0) + return r; + +- r = prioq_put(d->latest, s, &s->time.latest_index); ++ r = prioq_put(d->latest, s, &s->latest_index); + if (r < 0) { +- assert_se(prioq_remove(d->earliest, s, &s->time.earliest_index) > 0); +- s->time.earliest_index = PRIOQ_IDX_NULL; ++ assert_se(prioq_remove(d->earliest, s, &s->earliest_index) > 0); ++ s->earliest_index = PRIOQ_IDX_NULL; + return r; + } + +@@ -1158,7 +1159,7 @@ _public_ int sd_event_add_time( + s->time.next = usec; + s->time.accuracy = accuracy == 0 ? 
DEFAULT_ACCURACY_USEC : accuracy; + s->time.callback = callback; +- s->time.earliest_index = s->time.latest_index = PRIOQ_IDX_NULL; ++ s->earliest_index = s->latest_index = PRIOQ_IDX_NULL; + s->userdata = userdata; + s->enabled = SD_EVENT_ONESHOT; + +-- +2.17.1 + diff --git a/base/systemd/centos/patches/914-sd-event-update-state-at-the-end-in-event_source_ena.patch b/base/systemd/centos/patches/914-sd-event-update-state-at-the-end-in-event_source_ena.patch new file mode 100644 index 000000000..20ca0f5d5 --- /dev/null +++ b/base/systemd/centos/patches/914-sd-event-update-state-at-the-end-in-event_source_ena.patch @@ -0,0 +1,125 @@ +From ad89da1e00919c510596dac78741c98052b1e2f7 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Zbigniew=20J=C4=99drzejewski-Szmek?= +Date: Tue, 10 Nov 2020 10:38:37 +0100 +Subject: [PATCH 14/20] sd-event: update state at the end in + event_source_enable + +Coverity in CID#1435966 was complaining that s->enabled is not "restored" in +all cases. But the code was actually correct, since it should only be +"restored" in the error paths. But let's still make this prettier by not setting +the state before all operations that may fail are done. + +We need to set .enabled for the prioq reshuffling operations, so move those down. + +No functional change intended. 
+ +(cherry picked from commit d2eafe61ca07f8300dc741a0491a914213fa2b6b) + +Related: #1819868 + +[commit deb9e6ad3a1d7cfbc3b53d1e74cda6ae398a90fd from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 51 +++++++++++++++++------------- + 1 file changed, 29 insertions(+), 22 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 82cb9ad..3ff15a2 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -1691,11 +1691,11 @@ static int event_source_disable(sd_event_source *s) { + return 0; + } + +-static int event_source_enable(sd_event_source *s, int m) { ++static int event_source_enable(sd_event_source *s, int enable) { + int r; + + assert(s); +- assert(IN_SET(m, SD_EVENT_ON, SD_EVENT_ONESHOT)); ++ assert(IN_SET(enable, SD_EVENT_ON, SD_EVENT_ONESHOT)); + assert(s->enabled == SD_EVENT_OFF); + + /* Unset the pending flag when this event source is enabled */ +@@ -1705,31 +1705,16 @@ static int event_source_enable(sd_event_source *s, int m) { + return r; + } + +- s->enabled = m; +- + switch (s->type) { +- + case SOURCE_IO: +- r = source_io_register(s, m, s->io.events); +- if (r < 0) { +- s->enabled = SD_EVENT_OFF; ++ r = source_io_register(s, enable, s->io.events); ++ if (r < 0) + return r; +- } +- +- break; +- +- case SOURCE_TIME_REALTIME: +- case SOURCE_TIME_BOOTTIME: +- case SOURCE_TIME_MONOTONIC: +- case SOURCE_TIME_REALTIME_ALARM: +- case SOURCE_TIME_BOOTTIME_ALARM: +- event_source_time_prioq_reshuffle(s); + break; + + case SOURCE_SIGNAL: + r = event_make_signal_data(s->event, s->signal.sig, NULL); + if (r < 0) { +- s->enabled = SD_EVENT_OFF; + event_gc_signal_data(s->event, &s->priority, s->signal.sig); + return r; + } +@@ -1750,10 +1735,12 @@ static int event_source_enable(sd_event_source *s, int m) { + + break; + ++ case SOURCE_TIME_REALTIME: ++ case SOURCE_TIME_BOOTTIME: ++ case SOURCE_TIME_MONOTONIC: ++ case 
SOURCE_TIME_REALTIME_ALARM: ++ case SOURCE_TIME_BOOTTIME_ALARM: + case SOURCE_EXIT: +- prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index); +- break; +- + case SOURCE_DEFER: + case SOURCE_POST: + break; +@@ -1762,6 +1749,26 @@ static int event_source_enable(sd_event_source *s, int m) { + assert_not_reached("Wut? I shouldn't exist."); + } + ++ s->enabled = enable; ++ ++ /* Non-failing operations below */ ++ switch (s->type) { ++ case SOURCE_TIME_REALTIME: ++ case SOURCE_TIME_BOOTTIME: ++ case SOURCE_TIME_MONOTONIC: ++ case SOURCE_TIME_REALTIME_ALARM: ++ case SOURCE_TIME_BOOTTIME_ALARM: ++ event_source_time_prioq_reshuffle(s); ++ break; ++ ++ case SOURCE_EXIT: ++ prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index); ++ break; ++ ++ default: ++ break; ++ } ++ + return 0; + } + +-- +2.17.1 + diff --git a/base/systemd/centos/patches/915-sd-event-increase-n_enabled_child_sources-just-once.patch b/base/systemd/centos/patches/915-sd-event-increase-n_enabled_child_sources-just-once.patch new file mode 100644 index 000000000..c0eb0d9eb --- /dev/null +++ b/base/systemd/centos/patches/915-sd-event-increase-n_enabled_child_sources-just-once.patch @@ -0,0 +1,44 @@ +From 04e2ffb437b301963804e6d199be1196d1b4307b Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Zbigniew=20J=C4=99drzejewski-Szmek?= +Date: Tue, 10 Nov 2020 12:57:34 +0100 +Subject: [PATCH 15/20] sd-event: increase n_enabled_child_sources just once + +Neither source_child_pidfd_register() nor event_make_signal_data() look at +n_enabled_child_sources. 
+ +(cherry picked from commit ac9f2640cb9c107b43f47bba7e068d3b92b5337b) + +Related: #1819868 + +[commit 188465c472996b426a1f22a9fc46d031b722c3b4 from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 3ff15a2..e34fd0b 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -1722,8 +1722,6 @@ static int event_source_enable(sd_event_source *s, int enable) { + break; + + case SOURCE_CHILD: +- s->event->n_enabled_child_sources++; +- + r = event_make_signal_data(s->event, SIGCHLD, NULL); + if (r < 0) { + s->enabled = SD_EVENT_OFF; +@@ -1732,6 +1730,7 @@ static int event_source_enable(sd_event_source *s, int enable) { + return r; + } + ++ s->event->n_enabled_child_sources++; + + break; + +-- +2.17.1 + diff --git a/base/systemd/centos/patches/916-sd-event-don-t-provide-priority-stability.patch b/base/systemd/centos/patches/916-sd-event-don-t-provide-priority-stability.patch new file mode 100644 index 000000000..93f25687a --- /dev/null +++ b/base/systemd/centos/patches/916-sd-event-don-t-provide-priority-stability.patch @@ -0,0 +1,97 @@ +From 2d07173304abd3f1d3fae5e0f01bf5874b1f04db Mon Sep 17 00:00:00 2001 +From: David Herrmann +Date: Tue, 29 Sep 2015 20:56:17 +0200 +Subject: [PATCH 16/20] sd-event: don't provide priority stability + +Currently, we guarantee that if two event-sources with the same priority +fire at the same time, they're always dispatched in the same order. While +this might sound nice in theory, there is little benefit in providing +stability on that level. We have no control over the order the events are +reported, hence, we cannot guarantee that we get notified about both at +the same time. + +By dropping the stability guarantee, we lose roughly 10% of the heap swaps in +the prioq on a desktop cold-boot. 
Krzysztof Kotlenga even reported up to +20% on his tests. This sounds worth optimizing, so drop the stability +guarantee. + +[commit 6fe869c251790a0e3cef5b243169dda363723f49 from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 30 ------------------------------ + 1 file changed, 30 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index e34fd0b..6304991 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -243,12 +243,6 @@ static int pending_prioq_compare(const void *a, const void *b) { + if (x->pending_iteration > y->pending_iteration) + return 1; + +- /* Stability for the rest */ +- if (x < y) +- return -1; +- if (x > y) +- return 1; +- + return 0; + } + +@@ -278,12 +272,6 @@ static int prepare_prioq_compare(const void *a, const void *b) { + if (x->priority > y->priority) + return 1; + +- /* Stability for the rest */ +- if (x < y) +- return -1; +- if (x > y) +- return 1; +- + return 0; + } + +@@ -311,12 +299,6 @@ static int earliest_time_prioq_compare(const void *a, const void *b) { + if (x->time.next > y->time.next) + return 1; + +- /* Stability for the rest */ +- if (x < y) +- return -1; +- if (x > y) +- return 1; +- + return 0; + } + +@@ -344,12 +326,6 @@ static int latest_time_prioq_compare(const void *a, const void *b) { + if (x->time.next + x->time.accuracy > y->time.next + y->time.accuracy) + return 1; + +- /* Stability for the rest */ +- if (x < y) +- return -1; +- if (x > y) +- return 1; +- + return 0; + } + +@@ -371,12 +347,6 @@ static int exit_prioq_compare(const void *a, const void *b) { + if (x->priority > y->priority) + return 1; + +- /* Stability for the rest */ +- if (x < y) +- return -1; +- if (x > y) +- return 1; +- + return 0; + } + +-- +2.17.1 + diff --git a/base/systemd/centos/patches/917-sd-event-when-determining-the-last-allowed-time-a-ti.patch 
b/base/systemd/centos/patches/917-sd-event-when-determining-the-last-allowed-time-a-ti.patch new file mode 100644 index 000000000..eb0534c09 --- /dev/null +++ b/base/systemd/centos/patches/917-sd-event-when-determining-the-last-allowed-time-a-ti.patch @@ -0,0 +1,53 @@ +From cf0a396c411c78d0d477d2226f89884df207aec2 Mon Sep 17 00:00:00 2001 +From: Lennart Poettering +Date: Mon, 1 Feb 2016 00:19:14 +0100 +Subject: [PATCH 17/20] sd-event: when determining the last allowed time a time + event may elapse, deal with overflows + +[commit 1bce0ffa66f329bd50d8bfaa943a755caa65b269 from +https://github.com/systemd-rhel/rhel-8/] + +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 10 +++++++--- + 1 file changed, 7 insertions(+), 3 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 6304991..63f77ac 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -302,6 +302,10 @@ static int earliest_time_prioq_compare(const void *a, const void *b) { + return 0; + } + ++static usec_t time_event_source_latest(const sd_event_source *s) { ++ return usec_add(s->time.next, s->time.accuracy); ++} ++ + static int latest_time_prioq_compare(const void *a, const void *b) { + const sd_event_source *x = a, *y = b; + +@@ -321,9 +325,9 @@ static int latest_time_prioq_compare(const void *a, const void *b) { + return 1; + + /* Order by time */ +- if (x->time.next + x->time.accuracy < y->time.next + y->time.accuracy) ++ if (time_event_source_latest(x) < time_event_source_latest(y)) + return -1; +- if (x->time.next + x->time.accuracy > y->time.next + y->time.accuracy) ++ if (time_event_source_latest(x) > time_event_source_latest(y)) + return 1; + + return 0; +@@ -2014,7 +2018,7 @@ static int event_arm_timer( + b = prioq_peek(d->latest); + assert_se(b && b->enabled != SD_EVENT_OFF); + +- t = sleep_between(e, a->time.next, b->time.next + b->time.accuracy); ++ t = sleep_between(e, a->time.next, 
time_event_source_latest(b)); + if (d->next == t) + return 0; + +-- +2.17.1 + diff --git a/base/systemd/centos/patches/918-sd-event-permit-a-USEC_INFINITY-timeout-as-an-altern.patch b/base/systemd/centos/patches/918-sd-event-permit-a-USEC_INFINITY-timeout-as-an-altern.patch new file mode 100644 index 000000000..6472d2f1e --- /dev/null +++ b/base/systemd/centos/patches/918-sd-event-permit-a-USEC_INFINITY-timeout-as-an-altern.patch @@ -0,0 +1,60 @@ +From c0521bcf58da1857a2077cd3b3abc330bab33598 Mon Sep 17 00:00:00 2001 +From: Lennart Poettering +Date: Mon, 1 Feb 2016 00:20:18 +0100 +Subject: [PATCH 18/20] sd-event: permit a USEC_INFINITY timeout as an + alternative to a disabling an event source + +This should simplify handling of time events in clients and is in-line with the USEC_INFINITY macro we already have. +This way setting a timeout to 0 indicates "elapse immediately", and a timeout of USEC_INFINITY "elapse never". + +[commit 393003e1debf7c7f75beaacbd532b92c3e3dc729 from +https://github.com/systemd-rhel/rhel-8/ + +LZ: Dropped the part that won't affect code to simplify the merging.] 
+ +Signed-off-by: Li Zhou +--- + src/libsystemd/sd-event/sd-event.c | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 63f77ac..69dd02b 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -1109,7 +1109,6 @@ _public_ int sd_event_add_time( + int r; + + assert_return(e, -EINVAL); +- assert_return(usec != (uint64_t) -1, -EINVAL); + assert_return(accuracy != (uint64_t) -1, -EINVAL); + assert_return(e->state != SD_EVENT_FINISHED, -ESTALE); + assert_return(!event_pid_changed(e), -ECHILD); +@@ -1791,7 +1790,6 @@ _public_ int sd_event_source_get_time(sd_event_source *s, uint64_t *usec) { + _public_ int sd_event_source_set_time(sd_event_source *s, uint64_t usec) { + + assert_return(s, -EINVAL); +- assert_return(usec != (uint64_t) -1, -EINVAL); + assert_return(EVENT_SOURCE_IS_TIME(s->type), -EDOM); + assert_return(s->event->state != SD_EVENT_FINISHED, -ESTALE); + assert_return(!event_pid_changed(s->event), -ECHILD); +@@ -1909,6 +1907,8 @@ static usec_t sleep_between(sd_event *e, usec_t a, usec_t b) { + + if (a <= 0) + return 0; ++ if (a >= USEC_INFINITY) ++ return USEC_INFINITY; + + if (b <= a + 1) + return a; +@@ -1998,7 +1998,7 @@ static int event_arm_timer( + d->needs_rearm = false; + + a = prioq_peek(d->earliest); +- if (!a || a->enabled == SD_EVENT_OFF) { ++ if (!a || a->enabled == SD_EVENT_OFF || a->time.next == USEC_INFINITY) { + + if (d->fd < 0) + return 0; +-- +2.17.1 + diff --git a/base/systemd/centos/patches/919-sd-event-add-ability-to-ratelimit-event-sources.patch b/base/systemd/centos/patches/919-sd-event-add-ability-to-ratelimit-event-sources.patch new file mode 100644 index 000000000..647303479 --- /dev/null +++ b/base/systemd/centos/patches/919-sd-event-add-ability-to-ratelimit-event-sources.patch @@ -0,0 +1,841 @@ +From 69266c451910d2b57313b2fe7561e07cd5400d27 Mon Sep 17 00:00:00 2001 +From: Lennart Poettering 
+Date: Mon, 23 Nov 2020 18:02:40 +0100 +Subject: [PATCH 19/20] sd-event: add ability to ratelimit event sources + +Let's add a concept of "rate limiting" to event sources: if specific event +sources fire too often in some time interval temporarily take them +offline, and take them back online once the interval passed. + +This is a simple scheme of avoiding starvation of event sources if some +event source fires too often. + +This introduces the new conceptual states of "offline" and "online" for +event sources: an event source is "online" only when enabled *and* not +ratelimited, and offline in all other cases. An event source that is +online hence has its fds registered in the epoll, its signals in the +signalfd and so on. + +(cherry picked from commit b6d5481b3d9f7c9b1198ab54b54326ec73e855bf) + +Related: #1819868 + +[commit 395eb7753a9772f505102fbbe3ba3261b57abbe9 from +https://github.com/systemd-rhel/rhel-8/ + +LZ: Moved the changes in libsystemd.sym to libsystemd.sym.m4 from the +file changing history; patch ratelimit.h in its old path; dropped +SOURCE_INOTIFY related parts in sd-event.c because it hasn't been +added in this systemd version.] 
+ +Signed-off-by: Li Zhou +--- + src/libsystemd/libsystemd.sym.m4 | 7 + + src/libsystemd/sd-event/sd-event.c | 427 +++++++++++++++++++++++------ + src/shared/ratelimit.h | 8 + + src/systemd/sd-event.h | 3 + + 4 files changed, 365 insertions(+), 80 deletions(-) + +diff --git a/src/libsystemd/libsystemd.sym.m4 b/src/libsystemd/libsystemd.sym.m4 +index b1c2b43..ceb5d7f 100644 +--- a/src/libsystemd/libsystemd.sym.m4 ++++ b/src/libsystemd/libsystemd.sym.m4 +@@ -169,6 +169,13 @@ global: + sd_journal_has_persistent_files; + } LIBSYSTEMD_219; + ++LIBSYSTEMD_248 { ++global: ++ sd_event_source_set_ratelimit; ++ sd_event_source_get_ratelimit; ++ sd_event_source_is_ratelimited; ++} LIBSYSTEMD_229; ++ + m4_ifdef(`ENABLE_KDBUS', + LIBSYSTEMD_FUTURE { + global: +diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c +index 69dd02b..a3ade40 100644 +--- a/src/libsystemd/sd-event/sd-event.c ++++ b/src/libsystemd/sd-event/sd-event.c +@@ -32,6 +32,7 @@ + #include "util.h" + #include "time-util.h" + #include "missing.h" ++#include "ratelimit.h" + #include "set.h" + #include "list.h" + +@@ -67,7 +68,24 @@ typedef enum WakeupType { + _WAKEUP_TYPE_INVALID = -1, + } WakeupType; + +-#define EVENT_SOURCE_IS_TIME(t) IN_SET((t), SOURCE_TIME_REALTIME, SOURCE_TIME_BOOTTIME, SOURCE_TIME_MONOTONIC, SOURCE_TIME_REALTIME_ALARM, SOURCE_TIME_BOOTTIME_ALARM) ++#define EVENT_SOURCE_IS_TIME(t) \ ++ IN_SET((t), \ ++ SOURCE_TIME_REALTIME, \ ++ SOURCE_TIME_BOOTTIME, \ ++ SOURCE_TIME_MONOTONIC, \ ++ SOURCE_TIME_REALTIME_ALARM, \ ++ SOURCE_TIME_BOOTTIME_ALARM) ++ ++#define EVENT_SOURCE_CAN_RATE_LIMIT(t) \ ++ IN_SET((t), \ ++ SOURCE_IO, \ ++ SOURCE_TIME_REALTIME, \ ++ SOURCE_TIME_BOOTTIME, \ ++ SOURCE_TIME_MONOTONIC, \ ++ SOURCE_TIME_REALTIME_ALARM, \ ++ SOURCE_TIME_BOOTTIME_ALARM, \ ++ SOURCE_SIGNAL, \ ++ SOURCE_DEFER) + + struct sd_event_source { + WakeupType wakeup; +@@ -85,6 +103,7 @@ struct sd_event_source { + bool pending:1; + bool dispatching:1; + bool floating:1; ++ bool 
ratelimited:1; + + int64_t priority; + unsigned pending_index; +@@ -94,6 +113,10 @@ struct sd_event_source { + + LIST_FIELDS(sd_event_source, sources); + ++ RateLimit rate_limit; ++ ++ /* These are primarily fields relevant for time event sources, but since any event source can ++ * effectively become one when rate-limited, this is part of the common fields. */ + unsigned earliest_index; + unsigned latest_index; + +@@ -188,7 +211,7 @@ struct sd_event { + Hashmap *signal_data; /* indexed by priority */ + + Hashmap *child_sources; +- unsigned n_enabled_child_sources; ++ unsigned n_online_child_sources; + + Set *post_sources; + +@@ -219,8 +242,19 @@ struct sd_event { + + static void source_disconnect(sd_event_source *s); + ++static bool event_source_is_online(sd_event_source *s) { ++ assert(s); ++ return s->enabled != SD_EVENT_OFF && !s->ratelimited; ++} ++ ++static bool event_source_is_offline(sd_event_source *s) { ++ assert(s); ++ return s->enabled == SD_EVENT_OFF || s->ratelimited; ++} ++ + static int pending_prioq_compare(const void *a, const void *b) { + const sd_event_source *x = a, *y = b; ++ int r; + + assert(x->pending); + assert(y->pending); +@@ -231,23 +265,23 @@ static int pending_prioq_compare(const void *a, const void *b) { + if (x->enabled == SD_EVENT_OFF && y->enabled != SD_EVENT_OFF) + return 1; + ++ /* Non rate-limited ones first. 
*/ ++ r = CMP(!!x->ratelimited, !!y->ratelimited); ++ if (r != 0) ++ return r; ++ + /* Lower priority values first */ +- if (x->priority < y->priority) +- return -1; +- if (x->priority > y->priority) +- return 1; ++ r = CMP(x->priority, y->priority); ++ if (r != 0) ++ return r; + + /* Older entries first */ +- if (x->pending_iteration < y->pending_iteration) +- return -1; +- if (x->pending_iteration > y->pending_iteration) +- return 1; +- +- return 0; ++ return CMP(x->pending_iteration, y->pending_iteration); + } + + static int prepare_prioq_compare(const void *a, const void *b) { + const sd_event_source *x = a, *y = b; ++ int r; + + assert(x->prepare); + assert(y->prepare); +@@ -258,29 +292,46 @@ static int prepare_prioq_compare(const void *a, const void *b) { + if (x->enabled == SD_EVENT_OFF && y->enabled != SD_EVENT_OFF) + return 1; + ++ /* Non rate-limited ones first. */ ++ r = CMP(!!x->ratelimited, !!y->ratelimited); ++ if (r != 0) ++ return r; ++ + /* Move most recently prepared ones last, so that we can stop + * preparing as soon as we hit one that has already been + * prepared in the current iteration */ +- if (x->prepare_iteration < y->prepare_iteration) +- return -1; +- if (x->prepare_iteration > y->prepare_iteration) +- return 1; ++ r = CMP(x->prepare_iteration, y->prepare_iteration); ++ if (r != 0) ++ return r; + + /* Lower priority values first */ +- if (x->priority < y->priority) +- return -1; +- if (x->priority > y->priority) +- return 1; ++ return CMP(x->priority, y->priority); ++} + +- return 0; ++static usec_t time_event_source_next(const sd_event_source *s) { ++ assert(s); ++ ++ /* We have two kinds of event sources that have elapsation times associated with them: the actual ++ * time based ones and the ones for which a ratelimit can be in effect (where we want to be notified ++ * once the ratelimit time window ends). Let's return the next elapsing time depending on what we are ++ * looking at here. 
*/ ++ ++ if (s->ratelimited) { /* If rate-limited the next elapsation is when the ratelimit time window ends */ ++ assert(s->rate_limit.begin != 0); ++ assert(s->rate_limit.interval != 0); ++ return usec_add(s->rate_limit.begin, s->rate_limit.interval); ++ } ++ ++ /* Otherwise this must be a time event source, if not ratelimited */ ++ if (EVENT_SOURCE_IS_TIME(s->type)) ++ return s->time.next; ++ ++ return USEC_INFINITY; + } + + static int earliest_time_prioq_compare(const void *a, const void *b) { + const sd_event_source *x = a, *y = b; + +- assert(EVENT_SOURCE_IS_TIME(x->type)); +- assert(x->type == y->type); +- + /* Enabled ones first */ + if (x->enabled != SD_EVENT_OFF && y->enabled == SD_EVENT_OFF) + return -1; +@@ -294,24 +345,30 @@ static int earliest_time_prioq_compare(const void *a, const void *b) { + return 1; + + /* Order by time */ +- if (x->time.next < y->time.next) +- return -1; +- if (x->time.next > y->time.next) +- return 1; +- +- return 0; ++ return CMP(time_event_source_next(x), time_event_source_next(y)); + } + + static usec_t time_event_source_latest(const sd_event_source *s) { +- return usec_add(s->time.next, s->time.accuracy); ++ assert(s); ++ ++ if (s->ratelimited) { /* For ratelimited stuff the earliest and the latest time shall actually be the ++ * same, as we should avoid adding additional inaccuracy on an inaccuracy time ++ * window */ ++ assert(s->rate_limit.begin != 0); ++ assert(s->rate_limit.interval != 0); ++ return usec_add(s->rate_limit.begin, s->rate_limit.interval); ++ } ++ ++ /* Must be a time event source, if not ratelimited */ ++ if (EVENT_SOURCE_IS_TIME(s->type)) ++ return usec_add(s->time.next, s->time.accuracy); ++ ++ return USEC_INFINITY; + } + + static int latest_time_prioq_compare(const void *a, const void *b) { + const sd_event_source *x = a, *y = b; + +- assert(EVENT_SOURCE_IS_TIME(x->type)); +- assert(x->type == y->type); +- + /* Enabled ones first */ + if (x->enabled != SD_EVENT_OFF && y->enabled == SD_EVENT_OFF) + 
return -1; +@@ -722,12 +779,12 @@ static void event_gc_signal_data(sd_event *e, const int64_t *priority, int sig) + * the signalfd for it. */ + + if (sig == SIGCHLD && +- e->n_enabled_child_sources > 0) ++ e->n_online_child_sources > 0) + return; + + if (e->signal_sources && + e->signal_sources[sig] && +- e->signal_sources[sig]->enabled != SD_EVENT_OFF) ++ event_source_is_online(e->signal_sources[sig])) + return; + + /* +@@ -774,11 +831,17 @@ static void event_source_time_prioq_reshuffle(sd_event_source *s) { + struct clock_data *d; + + assert(s); +- assert(EVENT_SOURCE_IS_TIME(s->type)); + + /* Called whenever the event source's timer ordering properties changed, i.e. time, accuracy, + * pending, enable state. Makes sure the two prioq's are ordered properly again. */ +- assert_se(d = event_get_clock_data(s->event, s->type)); ++ ++ if (s->ratelimited) ++ d = &s->event->monotonic; ++ else { ++ assert(EVENT_SOURCE_IS_TIME(s->type)); ++ assert_se(d = event_get_clock_data(s->event, s->type)); ++ } ++ + prioq_reshuffle(d->earliest, s, &s->earliest_index); + prioq_reshuffle(d->latest, s, &s->latest_index); + d->needs_rearm = true; +@@ -819,12 +882,18 @@ static void source_disconnect(sd_event_source *s) { + case SOURCE_TIME_BOOTTIME: + case SOURCE_TIME_MONOTONIC: + case SOURCE_TIME_REALTIME_ALARM: +- case SOURCE_TIME_BOOTTIME_ALARM: { +- struct clock_data *d; +- assert_se(d = event_get_clock_data(s->event, s->type)); +- event_source_time_prioq_remove(s, d); ++ case SOURCE_TIME_BOOTTIME_ALARM: ++ /* Only remove this event source from the time event source here if it is not ratelimited. If ++ * it is ratelimited, we'll remove it below, separately. Why? 
Because the clock used might ++ * differ: ratelimiting always uses CLOCK_MONOTONIC, but timer events might use any clock */ ++ ++ if (!s->ratelimited) { ++ struct clock_data *d; ++ assert_se(d = event_get_clock_data(s->event, s->type)); ++ event_source_time_prioq_remove(s, d); ++ } ++ + break; +- } + + case SOURCE_SIGNAL: + if (s->signal.sig > 0) { +@@ -839,9 +908,9 @@ static void source_disconnect(sd_event_source *s) { + + case SOURCE_CHILD: + if (s->child.pid > 0) { +- if (s->enabled != SD_EVENT_OFF) { +- assert(s->event->n_enabled_child_sources > 0); +- s->event->n_enabled_child_sources--; ++ if (event_source_is_online(s)) { ++ assert(s->event->n_online_child_sources > 0); ++ s->event->n_online_child_sources--; + } + + (void) hashmap_remove(s->event->child_sources, INT_TO_PTR(s->child.pid)); +@@ -872,6 +941,9 @@ static void source_disconnect(sd_event_source *s) { + if (s->prepare) + prioq_remove(s->event->prepare, s, &s->prepare_index); + ++ if (s->ratelimited) ++ event_source_time_prioq_remove(s, &s->event->monotonic); ++ + event = s->event; + + s->type = _SOURCE_EVENT_SOURCE_TYPE_INVALID; +@@ -1259,11 +1331,11 @@ _public_ int sd_event_add_child( + return r; + } + +- e->n_enabled_child_sources ++; ++ e->n_online_child_sources++; + + r = event_make_signal_data(e, SIGCHLD, NULL); + if (r < 0) { +- e->n_enabled_child_sources--; ++ e->n_online_child_sources--; + source_free(s); + return r; + } +@@ -1476,7 +1548,7 @@ _public_ int sd_event_source_set_io_fd(sd_event_source *s, int fd) { + if (s->io.fd == fd) + return 0; + +- if (s->enabled == SD_EVENT_OFF) { ++ if (event_source_is_offline(s)) { + s->io.fd = fd; + s->io.registered = false; + } else { +@@ -1524,7 +1596,7 @@ _public_ int sd_event_source_set_io_events(sd_event_source *s, uint32_t events) + if (s->io.events == events && !(events & EPOLLET)) + return 0; + +- if (s->enabled != SD_EVENT_OFF) { ++ if (event_source_is_online(s)) { + r = source_io_register(s, s->enabled, events); + if (r < 0) + return r; +@@ 
-1572,7 +1644,7 @@ _public_ int sd_event_source_set_priority(sd_event_source *s, int64_t priority) + if (s->priority == priority) + return 0; + +- if (s->type == SOURCE_SIGNAL && s->enabled != SD_EVENT_OFF) { ++ if (s->type == SOURCE_SIGNAL && event_source_is_online(s)) { + struct signal_data *old, *d; + + /* Move us from the signalfd belonging to the old +@@ -1609,20 +1681,29 @@ _public_ int sd_event_source_get_enabled(sd_event_source *s, int *m) { + return 0; + } + +-static int event_source_disable(sd_event_source *s) { ++static int event_source_offline( ++ sd_event_source *s, ++ int enabled, ++ bool ratelimited) { ++ ++ bool was_offline; + int r; + + assert(s); +- assert(s->enabled != SD_EVENT_OFF); ++ assert(enabled == SD_EVENT_OFF || ratelimited); + + /* Unset the pending flag when this event source is disabled */ +- if (!IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) { ++ if (s->enabled != SD_EVENT_OFF && ++ enabled == SD_EVENT_OFF && ++ !IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) { + r = source_set_pending(s, false); + if (r < 0) + return r; + } + +- s->enabled = SD_EVENT_OFF; ++ was_offline = event_source_is_offline(s); ++ s->enabled = enabled; ++ s->ratelimited = ratelimited; + + switch (s->type) { + +@@ -1643,8 +1724,10 @@ static int event_source_disable(sd_event_source *s) { + break; + + case SOURCE_CHILD: +- assert(s->event->n_enabled_child_sources > 0); +- s->event->n_enabled_child_sources--; ++ if (!was_offline) { ++ assert(s->event->n_online_child_sources > 0); ++ s->event->n_online_child_sources--; ++ } + + event_gc_signal_data(s->event, &s->priority, SIGCHLD); + break; +@@ -1661,26 +1744,42 @@ static int event_source_disable(sd_event_source *s) { + assert_not_reached("Wut? 
I shouldn't exist."); + } + +- return 0; ++ return 1; + } + +-static int event_source_enable(sd_event_source *s, int enable) { ++static int event_source_online( ++ sd_event_source *s, ++ int enabled, ++ bool ratelimited) { ++ ++ bool was_online; + int r; + + assert(s); +- assert(IN_SET(enable, SD_EVENT_ON, SD_EVENT_ONESHOT)); +- assert(s->enabled == SD_EVENT_OFF); ++ assert(enabled != SD_EVENT_OFF || !ratelimited); + + /* Unset the pending flag when this event source is enabled */ +- if (!IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) { ++ if (s->enabled == SD_EVENT_OFF && ++ enabled != SD_EVENT_OFF && ++ !IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) { + r = source_set_pending(s, false); + if (r < 0) + return r; + } + ++ /* Are we really ready for onlining? */ ++ if (enabled == SD_EVENT_OFF || ratelimited) { ++ /* Nope, we are not ready for onlining, then just update the precise state and exit */ ++ s->enabled = enabled; ++ s->ratelimited = ratelimited; ++ return 0; ++ } ++ ++ was_online = event_source_is_online(s); ++ + switch (s->type) { + case SOURCE_IO: +- r = source_io_register(s, enable, s->io.events); ++ r = source_io_register(s, enabled, s->io.events); + if (r < 0) + return r; + break; +@@ -1698,13 +1797,13 @@ static int event_source_enable(sd_event_source *s, int enable) { + r = event_make_signal_data(s->event, SIGCHLD, NULL); + if (r < 0) { + s->enabled = SD_EVENT_OFF; +- s->event->n_enabled_child_sources--; ++ s->event->n_online_child_sources--; + event_gc_signal_data(s->event, &s->priority, SIGCHLD); + return r; + } + +- s->event->n_enabled_child_sources++; +- ++ if (!was_online) ++ s->event->n_online_child_sources++; + break; + + case SOURCE_TIME_REALTIME: +@@ -1721,7 +1820,8 @@ static int event_source_enable(sd_event_source *s, int enable) { + assert_not_reached("Wut? 
I shouldn't exist."); + } + +- s->enabled = enable; ++ s->enabled = enabled; ++ s->ratelimited = ratelimited; + + /* Non-failing operations below */ + switch (s->type) { +@@ -1741,7 +1841,7 @@ static int event_source_enable(sd_event_source *s, int enable) { + break; + } + +- return 0; ++ return 1; + } + + _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { +@@ -1759,7 +1859,7 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { + return 0; + + if (m == SD_EVENT_OFF) +- r = event_source_disable(s); ++ r = event_source_offline(s, m, s->ratelimited); + else { + if (s->enabled != SD_EVENT_OFF) { + /* Switching from "on" to "oneshot" or back? If that's the case, we can take a shortcut, the +@@ -1768,7 +1868,7 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) { + return 0; + } + +- r = event_source_enable(s, m); ++ r = event_source_online(s, m, s->ratelimited); + } + if (r < 0) + return r; +@@ -1900,6 +2000,96 @@ _public_ void *sd_event_source_set_userdata(sd_event_source *s, void *userdata) + return ret; + } + ++static int event_source_enter_ratelimited(sd_event_source *s) { ++ int r; ++ ++ assert(s); ++ ++ /* When an event source becomes ratelimited, we place it in the CLOCK_MONOTONIC priority queue, with ++ * the end of the rate limit time window, much as if it was a timer event source. */ ++ ++ if (s->ratelimited) ++ return 0; /* Already ratelimited, this is a NOP hence */ ++ ++ /* Make sure we can install a CLOCK_MONOTONIC event further down. */ ++ r = setup_clock_data(s->event, &s->event->monotonic, CLOCK_MONOTONIC); ++ if (r < 0) ++ return r; ++ ++ /* Timer event sources are already using the earliest/latest queues for the timer scheduling. Let's ++ * first remove them from the prioq appropriate for their own clock, so that we can use the prioq ++ * fields of the event source then for adding it to the CLOCK_MONOTONIC prioq instead. 
*/ ++ if (EVENT_SOURCE_IS_TIME(s->type)) ++ event_source_time_prioq_remove(s, event_get_clock_data(s->event, s->type)); ++ ++ /* Now, let's add the event source to the monotonic clock instead */ ++ r = event_source_time_prioq_put(s, &s->event->monotonic); ++ if (r < 0) ++ goto fail; ++ ++ /* And let's take the event source officially offline */ ++ r = event_source_offline(s, s->enabled, /* ratelimited= */ true); ++ if (r < 0) { ++ event_source_time_prioq_remove(s, &s->event->monotonic); ++ goto fail; ++ } ++ ++ event_source_pp_prioq_reshuffle(s); ++ ++ log_debug("Event source %p (%s) entered rate limit state.", s, strna(s->description)); ++ return 0; ++ ++fail: ++ /* Reinstall time event sources in the priority queue as before. This shouldn't fail, since the queue ++ * space for it should already be allocated. */ ++ if (EVENT_SOURCE_IS_TIME(s->type)) ++ assert_se(event_source_time_prioq_put(s, event_get_clock_data(s->event, s->type)) >= 0); ++ ++ return r; ++} ++ ++static int event_source_leave_ratelimit(sd_event_source *s) { ++ int r; ++ ++ assert(s); ++ ++ if (!s->ratelimited) ++ return 0; ++ ++ /* Let's take the event source out of the monotonic prioq first. */ ++ event_source_time_prioq_remove(s, &s->event->monotonic); ++ ++ /* Let's then add the event source to its native clock prioq again — if this is a timer event source */ ++ if (EVENT_SOURCE_IS_TIME(s->type)) { ++ r = event_source_time_prioq_put(s, event_get_clock_data(s->event, s->type)); ++ if (r < 0) ++ goto fail; ++ } ++ ++ /* Let's try to take it online again. 
*/ ++ r = event_source_online(s, s->enabled, /* ratelimited= */ false); ++ if (r < 0) { ++ /* Do something roughly sensible when this failed: undo the two prioq ops above */ ++ if (EVENT_SOURCE_IS_TIME(s->type)) ++ event_source_time_prioq_remove(s, event_get_clock_data(s->event, s->type)); ++ ++ goto fail; ++ } ++ ++ event_source_pp_prioq_reshuffle(s); ++ ratelimit_reset(&s->rate_limit); ++ ++ log_debug("Event source %p (%s) left rate limit state.", s, strna(s->description)); ++ return 0; ++ ++fail: ++ /* Do something somewhat reasonable when we cannot move an event sources out of ratelimited mode: ++ * simply put it back in it, maybe we can then process it more successfully next iteration. */ ++ assert_se(event_source_time_prioq_put(s, &s->event->monotonic) >= 0); ++ ++ return r; ++} ++ + static usec_t sleep_between(sd_event *e, usec_t a, usec_t b) { + usec_t c; + assert(e); +@@ -1998,7 +2188,7 @@ static int event_arm_timer( + d->needs_rearm = false; + + a = prioq_peek(d->earliest); +- if (!a || a->enabled == SD_EVENT_OFF || a->time.next == USEC_INFINITY) { ++ if (!a || a->enabled == SD_EVENT_OFF || time_event_source_next(a) == USEC_INFINITY) { + + if (d->fd < 0) + return 0; +@@ -2018,7 +2208,7 @@ static int event_arm_timer( + b = prioq_peek(d->latest); + assert_se(b && b->enabled != SD_EVENT_OFF); + +- t = sleep_between(e, a->time.next, time_event_source_latest(b)); ++ t = sleep_between(e, time_event_source_next(a), time_event_source_latest(b)); + if (d->next == t) + return 0; + +@@ -2097,10 +2287,22 @@ static int process_timer( + + for (;;) { + s = prioq_peek(d->earliest); +- if (!s || +- s->time.next > n || +- s->enabled == SD_EVENT_OFF || +- s->pending) ++ if (!s || time_event_source_next(s) > n) ++ break; ++ ++ if (s->ratelimited) { ++ /* This is an event sources whose ratelimit window has ended. Let's turn it on ++ * again. 
*/ ++ assert(s->ratelimited); ++ ++ r = event_source_leave_ratelimit(s); ++ if (r < 0) ++ return r; ++ ++ continue; ++ } ++ ++ if (s->enabled == SD_EVENT_OFF || s->pending) + break; + + r = source_set_pending(s, true); +@@ -2146,7 +2348,7 @@ static int process_child(sd_event *e) { + if (s->pending) + continue; + +- if (s->enabled == SD_EVENT_OFF) ++ if (event_source_is_offline(s)) + continue; + + zero(s->child.siginfo); +@@ -2242,11 +2444,26 @@ static int process_signal(sd_event *e, struct signal_data *d, uint32_t events) { + } + + static int source_dispatch(sd_event_source *s) { ++ _cleanup_(sd_event_unrefp) sd_event *saved_event = NULL; + int r = 0; + + assert(s); + assert(s->pending || s->type == SOURCE_EXIT); + ++ /* Similar, store a reference to the event loop object, so that we can still access it after the ++ * callback might have invalidated/disconnected the event source. */ ++ saved_event = sd_event_ref(s->event); ++ ++ /* Check if we hit the ratelimit for this event source, if so, let's disable it. 
*/ ++ assert(!s->ratelimited); ++ if (!ratelimit_below(&s->rate_limit)) { ++ r = event_source_enter_ratelimited(s); ++ if (r < 0) ++ return r; ++ ++ return 1; ++ } ++ + if (s->type != SOURCE_DEFER && s->type != SOURCE_EXIT) { + r = source_set_pending(s, false); + if (r < 0) +@@ -2356,7 +2573,7 @@ static int event_prepare(sd_event *e) { + sd_event_source *s; + + s = prioq_peek(e->prepare); +- if (!s || s->prepare_iteration == e->iteration || s->enabled == SD_EVENT_OFF) ++ if (!s || s->prepare_iteration == e->iteration || event_source_is_offline(s)) + break; + + s->prepare_iteration = e->iteration; +@@ -2393,7 +2610,7 @@ static int dispatch_exit(sd_event *e) { + assert(e); + + p = prioq_peek(e->exit); +- if (!p || p->enabled == SD_EVENT_OFF) { ++ if (!p || event_source_is_offline(p)) { + e->state = SD_EVENT_FINISHED; + return 0; + } +@@ -2419,7 +2636,7 @@ static sd_event_source* event_next_pending(sd_event *e) { + if (!p) + return NULL; + +- if (p->enabled == SD_EVENT_OFF) ++ if (event_source_is_offline(p)) + return NULL; + + return p; +@@ -2879,3 +3096,53 @@ _public_ int sd_event_get_iteration(sd_event *e, uint64_t *ret) { + *ret = e->iteration; + return 0; + } ++ ++_public_ int sd_event_source_set_ratelimit(sd_event_source *s, uint64_t interval, unsigned burst) { ++ int r; ++ ++ assert_return(s, -EINVAL); ++ ++ /* Turning on ratelimiting on event source types that don't support it, is a loggable offense. Doing ++ * so is a programming error. */ ++ assert_return(EVENT_SOURCE_CAN_RATE_LIMIT(s->type), -EDOM); ++ ++ /* When ratelimiting is configured we'll always reset the rate limit state first and start fresh, ++ * non-ratelimited. 
*/ ++ r = event_source_leave_ratelimit(s); ++ if (r < 0) ++ return r; ++ ++ RATELIMIT_INIT(s->rate_limit, interval, burst); ++ return 0; ++} ++ ++_public_ int sd_event_source_get_ratelimit(sd_event_source *s, uint64_t *ret_interval, unsigned *ret_burst) { ++ assert_return(s, -EINVAL); ++ ++ /* Querying whether an event source has ratelimiting configured is not a loggable offsense, hence ++ * don't use assert_return(). Unlike turning on ratelimiting it's not really a programming error */ ++ if (!EVENT_SOURCE_CAN_RATE_LIMIT(s->type)) ++ return -EDOM; ++ ++ if (!ratelimit_configured(&s->rate_limit)) ++ return -ENOEXEC; ++ ++ if (ret_interval) ++ *ret_interval = s->rate_limit.interval; ++ if (ret_burst) ++ *ret_burst = s->rate_limit.burst; ++ ++ return 0; ++} ++ ++_public_ int sd_event_source_is_ratelimited(sd_event_source *s) { ++ assert_return(s, -EINVAL); ++ ++ if (!EVENT_SOURCE_CAN_RATE_LIMIT(s->type)) ++ return false; ++ ++ if (!ratelimit_configured(&s->rate_limit)) ++ return false; ++ ++ return s->ratelimited; ++} +diff --git a/src/shared/ratelimit.h b/src/shared/ratelimit.h +index 58efca7..434089e 100644 +--- a/src/shared/ratelimit.h ++++ b/src/shared/ratelimit.h +@@ -55,3 +55,11 @@ typedef struct RateLimit { + } while (false) + + bool ratelimit_test(RateLimit *r); ++ ++static inline void ratelimit_reset(RateLimit *rl) { ++ rl->num = rl->begin = 0; ++} ++ ++static inline bool ratelimit_configured(RateLimit *rl) { ++ return rl->interval > 0 && rl->burst > 0; ++} +diff --git a/src/systemd/sd-event.h b/src/systemd/sd-event.h +index ffde7c8..f297c6a 100644 +--- a/src/systemd/sd-event.h ++++ b/src/systemd/sd-event.h +@@ -130,6 +130,9 @@ int sd_event_source_set_time_accuracy(sd_event_source *s, uint64_t usec); + int sd_event_source_get_time_clock(sd_event_source *s, clockid_t *clock); + int sd_event_source_get_signal(sd_event_source *s); + int sd_event_source_get_child_pid(sd_event_source *s, pid_t *pid); ++int sd_event_source_set_ratelimit(sd_event_source *s, 
uint64_t interval_usec, unsigned burst);
++int sd_event_source_get_ratelimit(sd_event_source *s, uint64_t *ret_interval_usec, unsigned *ret_burst);
++int sd_event_source_is_ratelimited(sd_event_source *s);
+ 
+ _SD_END_DECLARATIONS;
+ 
+-- 
+2.17.1
+
diff --git a/base/systemd/centos/patches/920-core-prevent-excessive-proc-self-mountinfo-parsing.patch b/base/systemd/centos/patches/920-core-prevent-excessive-proc-self-mountinfo-parsing.patch
new file mode 100644
index 000000000..3ad4114ef
--- /dev/null
+++ b/base/systemd/centos/patches/920-core-prevent-excessive-proc-self-mountinfo-parsing.patch
@@ -0,0 +1,37 @@
+From dc3e079395816ce251c4794992f1816a61c1215d Mon Sep 17 00:00:00 2001
+From: =?UTF-8?q?Michal=20Sekleta=CC=81r?=
+Date: Thu, 9 Jul 2020 18:16:44 +0200
+Subject: [PATCH 20/20] core: prevent excessive /proc/self/mountinfo parsing
+
+(cherry picked from commit d586f642fd90e3bb378f7b6d3e3a64a753e51756)
+
+Resolves: #1819868
+
+[commit 51737206afaa10d902c86ec9b5ec97cf425039c2 from
+https://github.com/systemd-rhel/rhel-8/]
+
+Signed-off-by: Li Zhou
+---
+ src/core/mount.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+diff --git a/src/core/mount.c b/src/core/mount.c
+index c7aed23..48427b7 100644
+--- a/src/core/mount.c
++++ b/src/core/mount.c
+@@ -1692,6 +1692,12 @@ static int mount_enumerate(Manager *m) {
+                 r = sd_event_source_set_priority(m->mount_utab_event_source, -10);
+                 if (r < 0)
+                         goto fail;
++
++                r = sd_event_source_set_ratelimit(m->mount_event_source, 1 * USEC_PER_SEC, 5);
++                if (r < 0) {
++                        log_error_errno(r, "Failed to enable rate limit for mount events: %m");
++                        goto fail;
++                }
+         }
+ 
+         r = mount_load_proc_self_mountinfo(m, false);
+-- 
+2.17.1
+
diff --git a/base/systemd/centos/patches/921-systemd-Fix-compiling-errors-when-merging-1819868.patch b/base/systemd/centos/patches/921-systemd-Fix-compiling-errors-when-merging-1819868.patch
new file mode 100644
index 000000000..8498b6b1a
--- /dev/null
+++ b/base/systemd/centos/patches/921-systemd-Fix-compiling-errors-when-merging-1819868.patch
@@ -0,0 +1,64 @@
+From 15ac2f7ffd502cdc6f4ba47d0dd70fc39c48d8d7 Mon Sep 17 00:00:00 2001
+From: Li Zhou
+Date: Wed, 31 Mar 2021 16:08:18 +0800
+Subject: [PATCH 21/21] systemd: Fix compiling errors when merging #1819868
+
+A series of patches are merged in for the issue:
+https://bugzilla.redhat.com/show_bug.cgi?id=1819868
+This commit is for fixing the compiling errors caused by context
+conflict.
+
+Signed-off-by: Li Zhou
+---
+ src/libsystemd/sd-event/sd-event.c | 25 ++++++++++++++++++++++++-
+ 1 file changed, 24 insertions(+), 1 deletion(-)
+
+diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
+index 9dc1a27..282b38f 100644
+--- a/src/libsystemd/sd-event/sd-event.c
++++ b/src/libsystemd/sd-event/sd-event.c
+@@ -37,9 +37,32 @@
+ #include "list.h"
+ 
+ #include "sd-event.h"
++#include "event-util.h"
+ 
+ #define DEFAULT_ACCURACY_USEC (250 * USEC_PER_MSEC)
+ 
++#define CMP(a, b) __CMP(UNIQ, (a), UNIQ, (b))
++#define __CMP(aq, a, bq, b)                             \
++        ({                                              \
++                const typeof(a) UNIQ_T(A, aq) = (a);    \
++                const typeof(b) UNIQ_T(B, bq) = (b);    \
++                UNIQ_T(A, aq) < UNIQ_T(B, bq) ? -1 :    \
++                UNIQ_T(A, aq) > UNIQ_T(B, bq) ? 1 : 0;  \
++        })
++
++static inline usec_t usec_add(usec_t a, usec_t b) {
++        usec_t c;
++
++        /* Adds two time values, and makes sure USEC_INFINITY as input results as USEC_INFINITY in output, and doesn't
++         * overflow. */
++
++        c = a + b;
++        if (c < a || c < b) /* overflow check */
++                return USEC_INFINITY;
++
++        return c;
++}
++
+ typedef enum EventSourceType {
+         SOURCE_IO,
+         SOURCE_TIME_REALTIME,
+@@ -2456,7 +2479,7 @@ static int source_dispatch(sd_event_source *s) {
+ 
+         /* Check if we hit the ratelimit for this event source, if so, let's disable it. */
+         assert(!s->ratelimited);
+-        if (!ratelimit_below(&s->rate_limit)) {
++        if (!ratelimit_test(&s->rate_limit)) {
+                 r = event_source_enter_ratelimited(s);
+                 if (r < 0)
+                         return r;
+-- 
+2.17.1
+
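Note on the mechanism used by patches 919-921: the mount event source is limited to 5 dispatches per 1 * USEC_PER_SEC window, and once the burst is used up the source stays offline until begin + interval (computed with the saturating usec_add()). The following is a minimal standalone sketch of that idea, not the systemd implementation. The names sat_add, window_limiter, limiter_allow and limiter_window_end are hypothetical, and the caller passes the current time in explicitly instead of reading CLOCK_MONOTONIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t usec_t;
#define USEC_INFINITY ((usec_t) UINT64_MAX)
#define USEC_PER_SEC  ((usec_t) 1000000ULL)

/* Saturating add, same idea as usec_add() in patch 921 (hypothetical name). */
static usec_t sat_add(usec_t a, usec_t b) {
        usec_t c = a + b; /* unsigned overflow wraps, which is well defined in C */

        if (c < a || c < b) /* wrapped around: clamp to "never" */
                return USEC_INFINITY;

        return c;
}

/* Hypothetical stand-in for RateLimit; field names mirror the ones the patch touches. */
struct window_limiter {
        usec_t interval; /* window length in usec */
        unsigned burst;  /* dispatches allowed per window */
        unsigned num;    /* dispatches seen in the current window */
        usec_t begin;    /* start of the current window, 0 = none yet */
};

/* Returns true if one more dispatch is allowed at time 'now'. */
static bool limiter_allow(struct window_limiter *l, usec_t now) {
        assert(l);

        if (l->interval == 0 || l->burst == 0)
                return true; /* not configured: never limit */

        /* No window yet, or the previous window has ended: start a fresh one. */
        if (l->begin == 0 || sat_add(l->begin, l->interval) < now) {
                l->begin = now;
                l->num = 0;
        }

        if (l->num >= l->burst)
                return false; /* burst used up, caller should take the source offline */

        l->num++;
        return true;
}

/* Time at which a limited source may be brought back online. */
static usec_t limiter_window_end(const struct window_limiter *l) {
        assert(l);
        return sat_add(l->begin, l->interval);
}
```

With interval = 1 * USEC_PER_SEC and burst = 5, the sixth call inside one window is refused, and limiter_window_end() gives the wakeup time that the patched sd-event code places in the CLOCK_MONOTONIC priority queue.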