kernel-modules: IRQ affinity hint fix-ups

This commit modifies a number of out-of-tree kernel modules to ensure
that the irqaffinity= kernel command line option is honored by the
interrupts set up and serviced by the device drivers. For further
information about the rationale for the changes, please see the
following change: Ibf47fd301a460638f3bb4c49865adc3b2429e06d

Here is a summary of the changes made by this commit:

- i40e: Replicate mainline commit d34c54d1739c ("i40e: Use
  irq_update_affinity_hint()").

- iavf: Replicate mainline commit 0f9744f4ed53 ("iavf: Use
  irq_update_affinity_hint()").

- ice: The device driver is made to use the irq_update_affinity_hint
  function instead of the irq_set_affinity_hint function. Please note
  that this driver was not modified in the mainline kernel, and hence
  this modification is a StarlingX-specific change.

- mlx5: Diverge from mainline commit 7451e9ea8e20 ("net/mlx5: Use
  irq_set_affinity_and_hint()") by using irq_update_affinity_hint
  instead of irq_set_affinity_and_hint, so that StarlingX users can rely
  on the irqaffinity= kernel command line argument to set the affinities
  of the interrupts serviced by mlx5.

  Please note that, due to the way the Mellanox module build works,
  this change is delivered as a patch that adds a second patch file for
  the Mellanox build system to apply; otherwise, patch application fails
  at build time.
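
The distinction that the summary above relies on can be illustrated with
a small userspace model. This is only a sketch of the semantics described
in the patch descriptions (irq_update_affinity_hint records the hint,
whereas irq_set_affinity_and_hint and the deprecated
irq_set_affinity_hint also set the IRQ affinity); it is not kernel code.
The struct irq_model type and the model_* function names are hypothetical
and exist only for this illustration, and CPU masks are assumed to fit in
64 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for per-IRQ state; not a kernel type. */
struct irq_model {
	uint64_t affinity; /* CPUs the IRQ may currently be delivered to */
	uint64_t hint;     /* CPUs suggested by the driver; 0 means no hint */
};

/* Record the hint; overwrite the affinity only when asked to and only
 * when a non-NULL mask is given. */
void model_apply_hint(struct irq_model *irq, const uint64_t *mask,
		      bool set_affinity)
{
	assert(irq != NULL);
	irq->hint = mask ? *mask : 0;
	if (mask && set_affinity)
		irq->affinity = *mask;
}

/* Models irq_update_affinity_hint: the hint changes, the affinity does not. */
void model_update_affinity_hint(struct irq_model *irq, const uint64_t *mask)
{
	model_apply_hint(irq, mask, false);
}

/* Models irq_set_affinity_and_hint (and the deprecated
 * irq_set_affinity_hint): both the hint and the affinity change. */
void model_set_affinity_and_hint(struct irq_model *irq, const uint64_t *mask)
{
	model_apply_hint(irq, mask, true);
}
```

In this model, an IRQ that starts with the default affinity taken from
irqaffinity= keeps that affinity after model_update_affinity_hint, which
is the behaviour the driver changes in this commit aim for.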

The reasons for not modifying the remaining modules are as follows:

- igb_uio: This driver does not use the deprecated API function.

- intel-opae-fpga: This driver does not use the deprecated API function.

- qat17: A patch to make use of the irq_update_affinity_hint function
  was prepared, but after some consideration, it was decided to not
  publish the patch due to concerns about unintended side-effects.

Testing:
- An ISO image was built successfully with this patch, via a monolithic
  build.

- The built ISO image was successfully installed onto and bootstrapped
  on an All-in-One simplex (physical) server with network interfaces
  handled by the i40e and ice (as well as ixgbe) device drivers. The
  low-latency profile was used during the tests.

  The following test steps were carried out with and without this patch
  on the aforementioned All-in-One simplex server:
  - After bootstrap, /etc/rc.d/init.d/affine-platform.sh and
    /usr/bin/affine-interrupts.sh were modified so that they do not
    manipulate IRQ affinities, which makes the effect of the changes in
    this patch clearly observable.
  - The system configuration was changed so that all CPUs other than
    platform CPUs and two additional "application" CPUs were made
    "application-isolated" CPUs.
  - The system was unlocked, and, after the reboot, virtual function
    interfaces were set up.
  - The CPU affinities of all IRQs and IRQ threads were collected with a
    script.
  - iperf3 was used as a sanity test between a virtual function and
    physical function pair.

  Without the patch, the IRQ affinities of the network interfaces were
  spread across all CPUs, regardless of the isolated CPUs and the value
  of the irqaffinity= kernel command line argument.

  With the patch, it was observed that the IRQ affinities of the network
  interfaces aligned with the irqaffinity= command line argument's
  value, which coincides with the aforementioned "application" CPUs.

  iperf3 ran without issues in all tests. Note, however, that with this
  patch iperf3's throughput varied more, depending on which CPUs the
  scheduler placed iperf3 on and which IRQ threads the device driver
  used at the time.

- The device driver changes were also verified using an All-in-One
  simplex server that has Mellanox network interfaces managed by the
  mlx5 driver, but without changes to init scripts. (Tests with mlx5
  were carried out earlier, and we had not thought of modifying the init
  scripts at the time.) The tests were carried out with the low-latency
  profile.

  The mlx5 driver appears to initialize all interrupts of an interface
  before it is brought up, which allows the init scripts to set the CPU
  affinities of all IRQs correctly, with and without this patch,
  assuming that the virtual functions are not set up after the
  completion of the init scripts.

  With this patch, we observed that some IRQ threads' CPU affinities
  would initially be set to the platform CPUs instead of aligning with
  the irqaffinity= argument's value, until each IRQ thread serviced at
  least one interrupt. Running iperf3, or bringing the interface down
  and up, resulted in correct IRQ thread CPU affinities.
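
The script used to collect the IRQ affinities during testing is not part
of this commit. As a rough sketch of how such a check could be done, the
C helpers below read /proc/irq/<n>/smp_affinity_list, parse the CPU list
into a mask, and count the IRQs whose affinity includes CPUs outside the
irqaffinity= value. The function names are hypothetical, masks are
assumed to fit in 64 bits (at most 64 CPUs), and treating "aligned" as
"subset of the irqaffinity= mask" is an assumption made for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a CPU list such as "0-1,4" into a 64-bit mask.
 * Returns 0 on success, or -1 on malformed input or CPU numbers >= 64
 * (in which case *mask may be partially filled). */
int parse_cpu_list(const char *s, uint64_t *mask)
{
	assert(s != NULL && mask != NULL);
	*mask = 0;
	while (*s != '\0' && *s != '\n') {
		char *end;
		unsigned long lo, hi;

		if (!isdigit((unsigned char)*s))
			return -1;
		lo = strtoul(s, &end, 10);
		hi = lo;
		if (*end == '-') {
			s = end + 1;
			if (!isdigit((unsigned char)*s))
				return -1;
			hi = strtoul(s, &end, 10);
		}
		if (hi < lo || hi >= 64)
			return -1;
		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			*mask |= UINT64_C(1) << cpu;
		s = end;
		if (*s == ',')
			s++;
	}
	return 0;
}

/* Read the current affinity of one IRQ from procfs.
 * Returns 0 on success, -1 if the file cannot be read or parsed. */
int read_irq_affinity(unsigned int irq, uint64_t *mask)
{
	char path[64], line[256];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/irq/%u/smp_affinity_list", irq);
	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	if (fgets(line, sizeof(line), f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_cpu_list(line, mask);
}

/* Count the IRQ masks that include at least one CPU outside 'allowed'. */
size_t count_irqs_outside(const uint64_t *irq_masks, size_t n,
			  uint64_t allowed)
{
	size_t outside = 0;

	assert(irq_masks != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (irq_masks[i] & ~allowed)
			outside++;
	return outside;
}
```

A caller would parse the irqaffinity= value with parse_cpu_list, call
read_irq_affinity for each IRQ of interest, and pass the collected masks
to count_irqs_outside. The affinities of the IRQ threads themselves would
need a separate per-thread query, which this sketch does not cover.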

Change-Id: I9574de83738eaed1f03d79cbeb62e9e949cb85ac
Closes-Bug: 1958417
Link: https://lore.kernel.org/netdev/20210903152430.244937-1-nitesh@redhat.com/t/#u
Depends-On: Ibf47fd301a460638f3bb4c49865adc3b2429e06d
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
M. Vefa Bicakci 2022-01-14 17:10:31 -05:00
parent 8baef94639
commit 7ded004316
9 changed files with 323 additions and 0 deletions

@@ -32,6 +32,7 @@ Source11: modules-load.conf
Patch01: i40e-Enable-getting-link-status-from-VF.patch
Patch02: i40e-add-more-debug-info-for-VFs-still-in-reset.patch
Patch03: i40e_main-Use-irq_update_affinity_hint.patch
%define kversion %(rpm -q kernel%{?bt_ext}-devel | sort --version-sort | tail -1 | sed 's/kernel%{?bt_ext}-devel-//')

@@ -0,0 +1,61 @@
From 2ae84a0ff5b9d12aac1394965ff21d636fc3162b Mon Sep 17 00:00:00 2001
From: "M. Vefa Bicakci" <vefa.bicakci@windriver.com>
Date: Fri, 14 Jan 2022 17:25:25 -0500
Subject: [PATCH] i40e_main: Use irq_update_affinity_hint
This commit makes i40e_main use irq_update_affinity_hint instead of
irq_set_affinity_hint to set the CPU affinity hints. This is done
because the latter function sets the IRQ CPU affinities, whereas the
former does not, and this allows the use of the default IRQ affinity CPU
mask provided via the irqaffinity= kernel command line option.
This commit essentially replicates the i40e patch in the following
patch series:
https://lore.kernel.org/netdev/20210903152430.244937-1-nitesh@redhat.com/t/#u
The i40e patch has been mainlined as of this writing:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d34c54d1739c2cdf2e4437b74e6da269147f4987
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
---
src/i40e_main.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/i40e_main.c b/src/i40e_main.c
index 874644bc0c1a..0bb06d3172b7 100644
--- a/src/i40e_main.c
+++ b/src/i40e_main.c
@@ -4761,10 +4761,10 @@ int i40e_vsi_request_irq_msix(struct i40e_vsi *vsi, char *basename)
*
* get_cpu_mask returns a static constant mask with
* a permanent lifetime so it's ok to pass to
- * irq_set_affinity_hint without making a copy.
+ * irq_update_affinity_hint without making a copy.
*/
cpu = cpumask_local_spread(q_vector->v_idx, -1);
- irq_set_affinity_hint(irq_num, get_cpu_mask(cpu));
+ irq_update_affinity_hint(irq_num, get_cpu_mask(cpu));
#endif /* HAVE_IRQ_AFFINITY_HINT */
}
@@ -4779,7 +4779,7 @@ free_queue_irqs:
irq_set_affinity_notifier(irq_num, NULL);
#endif
#ifdef HAVE_IRQ_AFFINITY_HINT
- irq_set_affinity_hint(irq_num, NULL);
+ irq_update_affinity_hint(irq_num, NULL);
#endif
free_irq(irq_num, &vsi->q_vectors[vector]);
}
@@ -5594,7 +5594,7 @@ static void i40e_vsi_free_irq(struct i40e_vsi *vsi)
#endif
#ifdef HAVE_IRQ_AFFINITY_HINT
/* remove our suggested affinity mask for this IRQ */
- irq_set_affinity_hint(irq_num, NULL);
+ irq_update_affinity_hint(irq_num, NULL);
#endif
synchronize_irq(irq_num);
free_irq(irq_num, vsi->q_vectors[i]);
--
2.29.2

@@ -30,6 +30,8 @@ Source0: %{kmod_name}-%{version}.tar.gz
Source5: GPL-v2.0.txt
Source11: modules-load.conf
Patch01: iavf_main-Use-irq_update_affinity_hint.patch
%define kversion %(rpm -q kernel%{?bt_ext}-devel | sort --version-sort | tail -1 | sed 's/kernel%{?bt_ext}-devel-//')
%package -n kmod-iavf%{?bt_ext}

@@ -0,0 +1,61 @@
From 1b24525e2971c01eafe7ac0f950dfb3a012035cf Mon Sep 17 00:00:00 2001
From: "M. Vefa Bicakci" <vefa.bicakci@windriver.com>
Date: Fri, 14 Jan 2022 17:39:52 -0500
Subject: [PATCH] iavf_main: Use irq_update_affinity_hint
This commit makes iavf_main use irq_update_affinity_hint instead of
irq_set_affinity_hint to set the CPU affinity hints. This is done
because the latter function sets the IRQ CPU affinities, whereas the
former does not, and this allows the use of the default IRQ affinity CPU
mask provided via the irqaffinity= kernel command line option.
This commit essentially replicates the iavf patch in the following
patch series:
https://lore.kernel.org/netdev/20210903152430.244937-1-nitesh@redhat.com/t/#u
The iavf patch has been mainlined as of this writing:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0f9744f4ed539f2e847d7ed41993b243e3ba5cff
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
---
src/iavf_main.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/iavf_main.c b/src/iavf_main.c
index 8af856576e34..65bf4e939ea3 100644
--- a/src/iavf_main.c
+++ b/src/iavf_main.c
@@ -435,10 +435,10 @@ iavf_request_traffic_irqs(struct iavf_adapter *adapter, char *basename)
#ifdef HAVE_IRQ_AFFINITY_HINT
/* Spread the IRQ affinity hints across online CPUs. Note that
* get_cpu_mask returns a mask with a permanent lifetime so
- * it's safe to use as a hint for irq_set_affinity_hint.
+ * it's safe to use as a hint for irq_update_affinity_hint.
*/
cpu = cpumask_local_spread(q_vector->v_idx, -1);
- irq_set_affinity_hint(irq_num, get_cpu_mask(cpu));
+ irq_update_affinity_hint(irq_num, get_cpu_mask(cpu));
#endif /* HAVE_IRQ_AFFINITY_HINT */
}
@@ -452,7 +452,7 @@ free_queue_irqs:
irq_set_affinity_notifier(irq_num, NULL);
#endif
#ifdef HAVE_IRQ_AFFINITY_HINT
- irq_set_affinity_hint(irq_num, NULL);
+ irq_update_affinity_hint(irq_num, NULL);
#endif
free_irq(irq_num, &adapter->q_vectors[vector]);
}
@@ -508,7 +508,7 @@ static void iavf_free_traffic_irqs(struct iavf_adapter *adapter)
irq_set_affinity_notifier(irq_num, NULL);
#endif
#ifdef HAVE_IRQ_AFFINITY_HINT
- irq_set_affinity_hint(irq_num, NULL);
+ irq_update_affinity_hint(irq_num, NULL);
#endif
free_irq(irq_num, &adapter->q_vectors[vector]);
}
--
2.29.2

@@ -32,6 +32,7 @@ Source11: modules-load.conf
Patch1: 0001-ice_xsk-Avoid-dependency-on-napi_busy_loop-with-PREE.patch
Patch2: 0002-ice_main-ice_lib-Use-irq_update_affinity_hint.patch
%define kversion %(rpm -q kernel%{?bt_ext}-devel | sort --version-sort | tail -1 | sed 's/kernel%{?bt_ext}-devel-//')
%define find() %(for f in %*; do if [ -e $f ]; then echo $f; break; fi; done)

@@ -0,0 +1,70 @@
From 2c0df5cef9bfdeb934102d18df38e4024381298f Mon Sep 17 00:00:00 2001
From: "M. Vefa Bicakci" <vefa.bicakci@windriver.com>
Date: Fri, 14 Jan 2022 17:50:39 -0500
Subject: [PATCH] ice_main, ice_lib: Use irq_update_affinity_hint
This commit makes the ice device driver use the irq_update_affinity_hint
function instead of the irq_set_affinity_hint function. This is done
because the latter function sets the IRQ CPU affinities, whereas the
former does not, and this allows the use of the default IRQ affinity CPU
mask provided via the irqaffinity= kernel command line option.
Please note that this patch was not cherry-picked from an upstream
commit. The changes have been inspired by the i40e and iavf device
driver patches in the following patch series:
https://lore.kernel.org/netdev/20210903152430.244937-1-nitesh@redhat.com/t/#u
The aforementioned patches have been mainlined as of this writing with
the following merge commit by Linus Torvalds:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=147cc5838c0f5c76e908b816e924ca378e0d4735
And the i40e and iavf patches are accessible at:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d34c54d1739c2cdf2e4437b74e6da269147f4987
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0f9744f4ed539f2e847d7ed41993b243e3ba5cff
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
---
src/ice_lib.c | 2 +-
src/ice_main.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/ice_lib.c b/src/ice_lib.c
index 889972052be7..6e50a9dc9ef4 100644
--- a/src/ice_lib.c
+++ b/src/ice_lib.c
@@ -2883,7 +2883,7 @@ void ice_vsi_free_irq(struct ice_vsi *vsi)
irq_set_affinity_notifier(irq_num, NULL);
/* clear the affinity_mask in the IRQ descriptor */
- irq_set_affinity_hint(irq_num, NULL);
+ irq_update_affinity_hint(irq_num, NULL);
synchronize_irq(irq_num);
devm_free_irq(ice_pf_to_dev(pf), irq_num, vsi->q_vectors[i]);
}
diff --git a/src/ice_main.c b/src/ice_main.c
index 97e754bc5e11..802d4912a574 100644
--- a/src/ice_main.c
+++ b/src/ice_main.c
@@ -3384,8 +3384,8 @@ static int ice_vsi_req_irq_msix(struct ice_vsi *vsi, char *basename)
irq_set_affinity_notifier(irq_num, affinity_notify);
}
- /* assign the mask for this irq */
- irq_set_affinity_hint(irq_num, &q_vector->affinity_mask);
+ /* assign the affinity hint for this irq */
+ irq_update_affinity_hint(irq_num, &q_vector->affinity_mask);
}
vsi->irqs_ready = true;
@@ -3397,7 +3397,7 @@ free_q_irqs:
irq_num = pf->msix_entries[base + vector].vector;
if (!IS_ENABLED(CONFIG_RFS_ACCEL))
irq_set_affinity_notifier(irq_num, NULL);
- irq_set_affinity_hint(irq_num, NULL);
+ irq_update_affinity_hint(irq_num, NULL);
devm_free_irq(dev, irq_num, &vsi->q_vectors[vector]);
}
return err;
--
2.29.2

@@ -1,3 +1,4 @@
Support-TiS-system.patch
Introduce-devtoolset-8.patch
Fix-compile-issues-when-using-kernel-5.10.57.patch
mlx5-pci_irq-Use-irq_update_affinity_hint.patch

@@ -0,0 +1,35 @@
From a2cd8e1c28d9231611e738487ae9e1904942b094 Mon Sep 17 00:00:00 2001
From: "M. Vefa Bicakci" <vefa.bicakci@windriver.com>
Date: Fri, 14 Jan 2022 17:05:36 -0500
Subject: [PATCH] mlx5: pci_irq: Use irq_update_affinity_hint
(Please see the patch file for a description.)
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
---
SPECS/mlnx-ofa_kernel.spec | 2 ++
1 file changed, 2 insertions(+)
diff --git a/SPECS/mlnx-ofa_kernel.spec b/SPECS/mlnx-ofa_kernel.spec
index 0dde85950881..446edacf92f5 100644
--- a/SPECS/mlnx-ofa_kernel.spec
+++ b/SPECS/mlnx-ofa_kernel.spec
@@ -110,6 +110,7 @@ Group: System Environment/Base
Source: %{_basename}-%{_version}.tgz
Source100: modules-load.conf
Patch01: 0001-implicit-declaration-of-function-__is_constexpr.patch
+Patch02: 0002-mlx5-pci_irq-Use-irq_update_affinity_hint.patch
BuildRoot: %{?build_root:%{build_root}}%{!?build_root:/var/tmp/OFED}
Vendor: Mellanox Technologies
Obsoletes: kernel-ib
@@ -301,6 +302,7 @@ sed -s -i -e '1s|python\>|python3|' `grep -rl '^#!.*python' source/ofed_scripts`
mkdir obj
%patch01 -p1
+%patch02 -p1
%build
%if 0%{?rhel} == 7
--
2.29.2

@@ -0,0 +1,91 @@
From e5d69db1083481aef4ea64b504c294929c7422d9 Mon Sep 17 00:00:00 2001
From: "M. Vefa Bicakci" <vefa.bicakci@windriver.com>
Date: Fri, 14 Jan 2022 16:26:29 -0500
Subject: [PATCH] mlx5: pci_irq: Use irq_update_affinity_hint
This commit applies a patch that modifies the mlx5 driver so that it
uses the irq_update_affinity_hint function instead of the
irq_set_affinity_hint function. The former only sets the hint, whereas
the latter sets both the hint and the IRQ affinity.
The intent of the patch is to allow the user-specified IRQ affinity (via
the irqaffinity= command line argument) to take effect for the IRQs set
up by the mlx5 device driver.
(Please see the description of the applied patch for more information.)
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
---
...pci_irq-Use-irq_update_affinity_hint.patch | 60 +++++++++++++++++++
1 file changed, 60 insertions(+)
create mode 100644 source/backports/1000-mlx5-pci_irq-Use-irq_update_affinity_hint.patch
diff --git a/source/backports/1000-mlx5-pci_irq-Use-irq_update_affinity_hint.patch b/source/backports/1000-mlx5-pci_irq-Use-irq_update_affinity_hint.patch
new file mode 100644
index 000000000000..f1122af264eb
--- /dev/null
+++ b/source/backports/1000-mlx5-pci_irq-Use-irq_update_affinity_hint.patch
@@ -0,0 +1,60 @@
+From 3884feaf05e9b1003ab83ab76fbfdf9a188c4a19 Mon Sep 17 00:00:00 2001
+From: "M. Vefa Bicakci" <vefa.bicakci@windriver.com>
+Date: Fri, 14 Jan 2022 16:26:29 -0500
+Subject: [PATCH] mlx5: pci_irq: Use irq_update_affinity_hint
+
+The StarlingX kernel was patched to deprecate irq_set_affinity_hint
+by cherry-picking the patches at:
+ https://lore.kernel.org/netdev/20210903152430.244937-1-nitesh@redhat.com/t/#u
+
+These patches have been mainlined as of this writing, with the following
+merge commit by Linus Torvalds:
+ https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=147cc5838c0f5c76e908b816e924ca378e0d4735
+
+This commit modifies the mlx5 driver so that it uses the
+irq_update_affinity_hint function instead of the irq_set_affinity_hint
+function. The former only sets the hint, whereas the latter sets both
+the hint and the IRQ affinity.
+
+Please note that this is a divergence from the aforementioned patch
+series, which makes mlx5 use irq_set_affinity_and_hint, which currently
+behaves in the same way as irq_set_affinity_hint. The intent with
+diverging from mainline is to allow the user-specified IRQ affinity (via
+the irqaffinity= command line argument) to take effect for the IRQs
+set up by the mlx5 device driver.
+
+The mlx5 commit in mainline is accessible at:
+ https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7451e9ea8e2055af39afe7ff39a5f68d8ec6b98d
+
+Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
+---
+ drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+index 09d0ce8061f3..db7472d10fb6 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+@@ -237,8 +237,8 @@ static int set_comp_irq_affinity_hint(struct mlx5_core_dev *mdev, int i)
+ cpumask_set_cpu(cpumask_local_spread(i, mdev->priv.numa_node),
+ irq->mask);
+ if (IS_ENABLED(CONFIG_SMP) &&
+- irq_set_affinity_hint(irqn, irq->mask))
+- mlx5_core_warn(mdev, "irq_set_affinity_hint failed, irq 0x%.4x",
++ irq_update_affinity_hint(irqn, irq->mask))
++ mlx5_core_warn(mdev, "irq_update_affinity_hint failed, irq 0x%.4x",
+ irqn);
+
+ return 0;
+@@ -261,7 +261,7 @@ static void clear_comp_irq_affinity_hint(struct mlx5_core_dev *mdev, int i)
+ msix = priv->msix_arr;
+ irqn = msix[vecidx].vector;
+ #endif
+- irq_set_affinity_hint(irqn, NULL);
++ irq_update_affinity_hint(irqn, NULL);
+ free_cpumask_var(irq->mask);
+ }
+
+--
+2.29.2
+
--
2.29.2