79c4324644
Change-Id: I2d302dda68298877c65c99147f5bf22186a59aac
From dcebeb0f7acf549620faff1badf73baba04b2068 Mon Sep 17 00:00:00 2001
From: tangbinzy <tangbin_yewu@cmss.chinamobile.com>
Date: Fri, 17 Nov 2023 10:15:09 +0000
Subject: [PATCH] e1000: set RX descriptor status in a separate operation
mainline inclusion
commit 034d00d4858161e1d4cff82d8d230bce874a04d3
category: bugfix

---------------------------------------------------------------

Setting the RX descriptor status field as part of the full descriptor
write may have worked previously, but with newer glibc versions it
causes two issues when a guest uses DPDK to receive packets:

  1. DPDK occasionally reads a wrong buffer_addr.

     The impact may not be obvious, for example a packet lost once
     in a while.

  2. DPDK may consume a packet twice when it scans the RX descriptor
     queue again.

     This leads to an infinite wait in QEMU, because RDT (the tail
     pointer) is unexpectedly advanced until it equals RDH, which
     QEMU treats as the RX descriptor queue being full.

Writing the whole RX descriptor with the DD flag set is not quite
correct: when the underlying memcpy implementation uses XMM registers
to copy e1000_rx_desc (when AVX or another such CPU feature is
available), the order in which the descriptor's bytes reach memory is
indeterminate.

Issue 2 can be reproduced with the full-scale test case at
https://github.com/BASM/qemu_dpdk_e1000_test (thanks to Leonid Myravjev).

I also wrote a PoC test case at https://github.com/cdkey/e1000_poc
that reproduces both issues and makes it easy to verify the effect of
the patch.

A hardware watchpoint also shows that when QEMU writes the 16-byte
e1000_rx_desc with XMM instructions concurrently with DPDK writing the
1-byte status with movb, the final value in memory is one of the two.
If it is the one written by QEMU, with the DD flag set, DPDK will
consume the descriptor again.

Setting the DD status in a separate operation prevents the effects of
out-of-order memory writes by memcpy, and also avoids unexpected data
when QEMU and the guest's DPDK write the status concurrently.

Links: https://lore.kernel.org/qemu-devel/20200102110504.GG121208@stefanha-x1.localdomain/T/

Reported-by: Leonid Myravjev <asm@asm.pp.ru>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Tested-by: Jing Zhang <zhangjing@sangfor.com.cn>
Reviewed-by: Frank Lee <lifan38153@sangfor.com.cn>
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>

Signed-off-by: tangbinzy <tangbin_yewu@cmss.chinamobile.com>
---
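Note: the write ordering described in the commit message can be
sketched outside QEMU roughly as follows. This is only an illustration
under stated assumptions: struct rx_desc is a stand-in for
struct e1000_rx_desc, publish_rx_desc() is a hypothetical helper, and a
plain memcpy() into a byte array stands in for pci_dma_write(). The
first copy carries everything except DD; the second, one-byte store is
what makes the descriptor visible as done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RXD_STAT_DD 0x01 /* "descriptor done" bit (value of E1000_RXD_STAT_DD) */

/* Stand-in for the 16-byte legacy RX descriptor (struct e1000_rx_desc). */
struct rx_desc {
    uint64_t buffer_addr;
    uint16_t length;
    uint16_t csum;
    uint8_t  status;
    uint8_t  errors;
    uint16_t special;
};

/*
 * Hypothetical helper: guest_slot models the descriptor's location in
 * guest memory. memcpy() stands in for pci_dma_write().
 */
static void publish_rx_desc(uint8_t *guest_slot, struct rx_desc *desc,
                            uint8_t extra_status)
{
    assert(sizeof(*desc) == 16);

    /* Step 1: write the whole descriptor with DD cleared, so a reordered
     * or partially visible copy can never look "done" to the guest. */
    desc->status &= (uint8_t)~RXD_STAT_DD;
    memcpy(guest_slot, desc, sizeof(*desc));

    /* Step 2: set DD (plus any extra status bits) with a separate
     * one-byte write to the status field only. */
    desc->status |= (uint8_t)(extra_status | RXD_STAT_DD);
    memcpy(guest_slot + offsetof(struct rx_desc, status),
           &desc->status, sizeof(desc->status));
}
```

In this model the status byte is the last thing written, which mirrors
the second pci_dma_write() added by the patch.
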
 hw/net/e1000.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

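Note: issue 2 in the commit message hinges on how RDH and RDT are
compared. The sketch below is a simplified model, not QEMU code:
rx_descs_available() is a hypothetical name, and it assumes the usual
ring convention that the device may fill the entries from RDH up to,
but not including, RDT. Under that assumption, RDH == RDT leaves the
device nothing to fill, so it waits for the guest to advance RDT.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): number of RX descriptors
 * the device may still fill in a ring of ring_len entries, given the
 * head (RDH) and tail (RDT) indices.
 */
static uint32_t rx_descs_available(uint32_t head, uint32_t tail,
                                   uint32_t ring_len)
{
    if (head == tail) {
        return 0;                  /* nothing to fill: device must wait */
    }
    if (head < tail) {
        return tail - head;        /* contiguous range */
    }
    return ring_len - head + tail; /* range wraps past the ring end */
}
```

If the guest consumes a descriptor twice, it advances RDT once more
than it should, which is how RDT can end up equal to RDH as described
above.
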
diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index f5bc81296d..e26e0a64c1 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -979,7 +979,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
         base = rx_desc_base(s) + sizeof(desc) * s->mac_reg[RDH];
         pci_dma_read(d, base, &desc, sizeof(desc));
         desc.special = vlan_special;
-        desc.status |= (vlan_status | E1000_RXD_STAT_DD);
+        desc.status &= ~E1000_RXD_STAT_DD;
         if (desc.buffer_addr) {
             if (desc_offset < size) {
                 size_t iov_copy;
@@ -1013,6 +1013,9 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
             DBGOUT(RX, "Null RX descriptor!!\n");
         }
         pci_dma_write(d, base, &desc, sizeof(desc));
+        desc.status |= (vlan_status | E1000_RXD_STAT_DD);
+        pci_dma_write(d, base + offsetof(struct e1000_rx_desc, status),
+                      &desc.status, sizeof(desc.status));
 
         if (++s->mac_reg[RDH] * sizeof(desc) >= s->mac_reg[RDLEN])
             s->mac_reg[RDH] = 0;
--
2.27.0