Re: panic: syncache: mbuf too small

From: Drew Gallatin <gallatin_at_netflix.com>
Date: Tue, 08 Feb 2022 22:34:10 UTC
I don't think the size has changed recently.  However, there is a size
difference for pkthdrs (and hence MHLEN) on 32-bit platforms vs 64-bit
platforms.

There are a number of bad ways to handle this: e.g., don't permit IPv6 on
these interfaces; make these interfaces chain their headers, assuming they
can do s/g DMA; make them copy to a contiguous buffer; or make mbufs bigger.
All of the things I can think of are ugly.
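
As a rough cross-check of the header-space arithmetic quoted further down,
here is a minimal Python sketch.  It assumes the panic fires when max_linkhdr
plus the IPv6 header plus a maximum-size TCP header (TCP_MAXHLEN) no longer
fits in MHLEN; that inequality is an assumption based on the numbers in this
thread, not a copy of the kernel code.  The function names are made up for
illustration, and the 16-byte link header is a hypothetical comparison value.

```python
# Values from the ddb session quoted below; ddb's "x" prints them in hex.
MAX_LINKHDR = 0x58    # max_linkhdr  = 88
MAX_HDR = 0x94        # max_hdr      = 148
MAX_DATALEN = 0x14    # max_datalen  = 20
TCP_MAXHLEN = 0x3c    # max_protohdr = 60, same as TCP_MAXHLEN in the thread
IP6_HDR_LEN = 40      # ipv6_hdr size as given in the thread


def mhlen(max_hdr, max_datalen):
    """MHLEN as derived in the thread: max_hdr + max_datalen."""
    return max_hdr + max_datalen


def syn_ack_budget(max_linkhdr, mhlen_bytes):
    """Return (bytes needed, fits?) under the ASSUMED check
    max_linkhdr + IPv6 header + TCP_MAXHLEN <= MHLEN."""
    needed = max_linkhdr + IP6_HDR_LEN + TCP_MAXHLEN
    return needed, needed <= mhlen_bytes


if __name__ == "__main__":
    size = mhlen(MAX_HDR, MAX_DATALEN)
    for linkhdr in (MAX_LINKHDR, 16):  # 16 is a hypothetical Ethernet-sized value
        needed, fits = syn_ack_budget(linkhdr, size)
        print(f"max_linkhdr={linkhdr} needed={needed} MHLEN={size} fits={fits}")
```

With the values from this thread the sketch needs 188 bytes against an MHLEN
of 168, so the reply is 20 bytes over; with the hypothetical 16-byte link
header it needs 116 bytes and fits.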

On Tue, Feb 8, 2022 at 5:14 PM Bjoern A. Zeeb <
bzeeb-lists@lists.zabbadoz.net> wrote:

> On Tue, 8 Feb 2022, Drew Gallatin wrote:
>
> > I suspect that it's ic->ic_headroom, which seems to be driver dependent.
> > And that it's going kaboom because of the combo of IPv6 plus some driver
> > with a large ic_headroom.
>
> Yeah, one of the Realtek drivers I was looking at sets it to 40/48
> depending on chipset.
>
> Other vendor drivers are on the order of 26/28-ish max which would be
> an exact fit (without UDP tunneling)...
>
> > It would be really unfortunate if we had to expand mbufs because of some
> > wifi driver.   Perhaps they could be taught to chain headers..
>
> Realtek is doing a few "funny" things there; a lot of it is single
> segment DMAs up to 12k-ish, which is not helpful at all.
>
> I'll go and see if I can figure it out for this one specifically
> then *sigh*.  For as long as no other drivers do similar things
> I am happy to work around it.
>
>
> Hmm  bwi(4)  is probably not much used anymore as from a quick glance
> that is also going big (82 by manual counting) and bwn(4) even more?
>
> So either our size massively shrunk in mbufs or that problem was there
> a decade ago already ... and we didn't notice?
>
>
> /bz
>
>
> > On Tue, Feb 8, 2022 at 2:45 PM Bjoern A. Zeeb <
> > bzeeb-lists@lists.zabbadoz.net> wrote:
> >
> >> On Tue, 8 Feb 2022, Bjoern A. Zeeb wrote:
> >>
> >>> On Tue, 8 Feb 2022, Drew Gallatin wrote:
> >>>
> >>>> Can you examine max_linkhdr?
> >>>
> >>> Yes, was still sitting in ddb (thankfully watchdog got disabled):
> >>>
> >>> db> x max_linkhdr
> >>> max_linkhdr:    58
> >>>
> >>> And for consistency checks:
> >>>
> >>> db> x max_hdr
> >>> max_hdr:        94
> >>> db> x max_datalen
> >>> max_datalen:    14
> >>> db> x max_protohdr
> >>> max_protohdr:   3c
> >>
> >> If I do the maths correctly:
> >>
> >> MHLEN = 168             (0x94  + 0x14)
> >>
> >> TCP_MAXHLEN = 60 - 24 = 36 TCP_MAXOLEN
> >>
> >> max_linkhdr =           88
> >>
> >> 168 - 88 - 36 = 44
> >>
> >> ipv6_hdr size = 40
> >>
> >> Leaves us with 4 for the tcp_header again?  Which would be 24?
> >>
> >>
> >> Why would this not go kaboom all the time?
> >>
> >> Hmm I assume it's ieee80211_proto.c .. it changes max_linkhdr ..
> >>
> >>
> >>
> >>
> >>
> >>> db> show reg
> >>> cs                        0x20
> >>> ds                        0x3b
> >>> es                        0x3b
> >>> fs                        0x13
> >>> gs                        0x1b
> >>> ss                        0x28
> >>> rax                       0x12
> >>> rcx                        0x1
> >>> rdx         0xffffffff811f6d0a
> >>> rbx         0xffffffff812e614c
> >>> rsp         0xfffffe0007fa15a0
> >>> rbp         0xfffffe0007fa15b0
> >>> rsi                       0x80
> >>> rdi         0xffffffff81e8cec0  cnputs_mtx
> >>> r8                        0x10
> >>> r9                       0x1d0
> >>> r10         0xffffffff81cfa820  vga_conssoftc
> >>> r11                       0x10
> >>> r12         0xffffffff812961ab
> >>> r13                       0x28
> >>> r14                      0x100
> >>> r15         0xfffffe000937a740
> >>> rip         0xffffffff80c545a7  kdb_enter+0x37
> >>> rflags                    0x86
> >>> kdb_enter+0x37: movq    $0,0x1283a5e(%rip)
> >>>
> >>> Found a console log;  the system was idle, right after a boot for a few
> >>> minutes.
> >>> It's a lab machine having booted off IPv4 (grml) but also having IPv6 on
> >>> the network.
> >>>
> >>> According to terminal backlogs it was an incoming IPv6 ssh session likely
> >>> to have triggered this.  Always great if things are "idle" and only few
> >>> people to ask.
> >>>
> >>> it is amd64;  main @ 773e3a71b2f11d422694495aca988d4c7143601b from Jan 31st.
> >>>
> >>> /bz
> >>>
> >>>
> >>>> Drew
> >>>>
> >>>> On Tue, Feb 8, 2022 at 1:58 PM Bjoern A. Zeeb <
> >>>> bzeeb-lists@lists.zabbadoz.net> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I just came to a console finding this.  The tree is from a few days ago;
> >>>>> is this known or should I investigate if it happens again?   I sadly
> >>>>> cannot dump on this machine.
> >>>>>
> >>>>> /bz
> >>>>>
> >>>>> db> show panic
> >>>>> panic: syncache: mbuf too small
> >>>>> db> where
> >>>>> Tracing pid 0 tid 100014 td 0xfffffe000937a740
> >>>>> kdb_enter() at kdb_enter+0x37/frame 0xfffffe0007fa15b0
> >>>>> vpanic() at vpanic+0x1b0/frame 0xfffffe0007fa1600
> >>>>> panic() at panic+0x43/frame 0xfffffe0007fa1660
> >>>>> syncache_respond() at syncache_respond+0x777/frame 0xfffffe0007fa1730
> >>>>> syncache_add() at syncache_add+0xa71/frame 0xfffffe0007fa18c0
> >>>>> tcp_input_with_port() at tcp_input_with_port+0x14f5/frame 0xfffffe0007fa1a20
> >>>>> tcp6_input_with_port() at tcp6_input_with_port+0x69/frame 0xfffffe0007fa1a50
> >>>>> tcp6_input() at tcp6_input+0xb/frame 0xfffffe0007fa1a60
> >>>>> ip6_input() at ip6_input+0xc2f/frame 0xfffffe0007fa1b40
> >>>>> netisr_dispatch_src() at netisr_dispatch_src+0xaf/frame 0xfffffe0007fa1ba0
> >>>>> ether_demux() at ether_demux+0x16e/frame 0xfffffe0007fa1bd0
> >>>>> ether_nh_input() at ether_nh_input+0x3fc/frame 0xfffffe0007fa1c30
> >>>>> netisr_dispatch_src() at netisr_dispatch_src+0xaf/frame 0xfffffe0007fa1c90
> >>>>> ether_input() at ether_input+0x99/frame 0xfffffe0007fa1cf0
> >>>>> iflib_rxeof() at iflib_rxeof+0xcb3/frame 0xfffffe0007fa1e00
> >>>>> _task_fn_rx() at _task_fn_rx+0x7a/frame 0xfffffe0007fa1e40
> >>>>> gtaskqueue_run_locked() at gtaskqueue_run_locked+0xa7/frame 0xfffffe0007fa1ec0
> >>>>> gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc2/frame 0xfffffe0007fa1ef0
> >>>>> fork_exit() at fork_exit+0x80/frame 0xfffffe0007fa1f30
> >>>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0007fa1f30
> >>>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>
> --
> Bjoern A. Zeeb                                                     r15:7
>