Re: Kernel panic due to netback.c

From: Roger Pau Monné <roger.pau_at_citrix.com>
Date: Tue, 21 Mar 2023 12:07:34 UTC
On Mon, Mar 20, 2023 at 09:25:17PM +0100, Janis Abens wrote:
> I'm sorry, it's unreadable. I sen't it from the new webmail, that has a default setting to HTML. Fixing my error and resending previous message as text.
>  
> Hello,
> 
> From time to time a kernel panic occurs. Xen-kernel-4.15, dom0, FreeBSD 13.0-RELEASE.
> 
> "Fatal trap 12: page fault while in kernel mode"
> 
> I can not repeat it reliably, but eventually it happens. I have captured a stack trace (always the same on crash), relevant part is:
> ..
> #9  xnb_txpkt2gnttab (pkt=<optimized out>, pkt@entry=0xfffffe00c49fdac8, mbufc=<optimized out>, mbufc@entry=0xfffff8002f958500, gnttab=gnttab@entry=0xfffffe019ae94a70,
>     txb=txb@entry=0xfffffe019ae95480, otherend_id=6) at /usr/src/sys/dev/xen/netback/netback.c:1715
> #10 0xffffffff80a8d72a in xnb_recv (txb=0xfffffe019ae95480, otherend=6, mbufc=<optimized out>, ifnet=0xfffff80170f81000, gnttab=0xfffffe019ae94a70)
>     at /usr/src/sys/dev/xen/netback/netback.c:1851
> #11 xnb_intr (arg=0xfffffe019ae94000) at /usr/src/sys/dev/xen/netback/netback.c:1446
> ..
> 
> It seems netback.c has not changed in ages, same lines are valid in 13.2 RC3 as well.
> 
> relevant code around /usr/src/sys/dev/xen/netback/netback.c:1715
> ..
> xnb_txpkt2gnttab(const struct xnb_pkt *pkt, struct mbuf *mbufc,
> ..
>   while (size_remaining > 0) {
>     const netif_tx_request_t *txq = RING_GET_REQUEST(txb, r_idx);
>     const size_t mbuf_space = M_TRAILINGSPACE(mbuf) - m_ofs; /* PANIC happens here! */
>     
> ..
> 
> By analyzing the trace i've come to conclusion that mbuf is NULL, thus macro:
> #define M_TRAILINGSPACE(m) ((m)->m_maxlen - (m)->m_len)
> introduces panic.
> 
> The only way mbuf can become NULL is within this same loop at line:1751 mbuf = mbuf->m_next;
> It can not be NULL at the function call, because xnb_recv ensures that it is not NULL, before call.
> 
> The problem definiteley is because while condition is on size_remaining, but contents are accessed based on mbuf->m_next;
> 
> So my questions are:
> 1) would it be possible to add some function before the PANIC line (or mbuf->m_next) that dumps offending packet in error logs or something similar? The goal for this would be to find a way to reliably repeat this case and understand what is the cause? If there is no such a function, which variables would be relevant and hellpful in this case?



> 2) How could this code be modified so that it does not panic in this case, but just drops offending packet instead?

Likely, that would be a more graceful failure rather than a pointer
dereference.  I believe this is not supposed to happen in the first
place, and thus the deref is a result of a bug elsewhere.

I'm attaching a patch below that will print the relevant values from the
previous loop when m_next is NULL and there's still data from the
ring packet to copy.  I've also added a break in that case, but I'm
unsure that the rest of the logic can cope with this situation, it's
quite possible that you will get a deref or a panic elsewhere in
netback.

Let me know what output you get with this patch.

> A code snippet in xnb_recv has caught my eye:
>   if (*mbufc == NULL) {
>     /*
>      * Couldn't allocate mbufs.  Respond and drop the packet.  Do
>      * not consume the requests
>      */
>     xnb_txpkt2rsp(&pkt, txb, 1);
>     DPRINTF("xnb_intr: Couldn't allocate mbufs, num_consumed=%d\n",
>         num_consumed);
>     if_inc_counter(ifnet, IFCOUNTER_IQDROPS, 1);
>     return ENOMEM;
>   }
> 
> Could it be used in function xnb_txpkt2gnttab to avoid panic in this particular case as well?

Hm, not really, at least not without understanding what causes this
mismatch.

Regards, Roger.
---
diff --git a/sys/dev/xen/netback/netback.c b/sys/dev/xen/netback/netback.c
index ddd5218a8936..89b9de2a3c98 100644
--- a/sys/dev/xen/netback/netback.c
+++ b/sys/dev/xen/netback/netback.c
@@ -1749,6 +1749,11 @@ xnb_txpkt2gnttab(const struct xnb_pkt *pkt, struct mbuf *mbufc,
 			/* Must move to the next mbuf */
 			m_ofs = 0;
 			mbuf = mbuf->m_next;
+			if (mbuf == NULL && size_remaining > 0) {
+				printf("next mbuf == NULL size_remaining: %d r_ofs %d m_ofs %d mbuf_space %zu req_size %zu pkt_space %zu space %zu",
+				    size_remaining, r_ofs, m_ofs, mbuf_space, req_size, pkt_space, space);
+				break;
+			}
 		}
 	}