[Bug 259458] iflib_rxeof NULL pointer crash with vmxnet3 driver

From: <bugzilla-noreply_at_freebsd.org>
Date: Mon, 01 Nov 2021 13:03:01 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=259458

--- Comment #17 from Andriy Gapon <avg@FreeBSD.org> ---
Some additional observations from another crash of exactly the same kind.

There are 8 receive queues, each with 2 free lists.
As far as I can tell, all free lists had been initialized, each with its
initial 128 credits.  Just a single packet had been received.  It was on rxq0
and its descriptor matched free list number 0.  So, a single credit was
consumed from that free list and its cidx was advanced to 1.  After that the
free list was topped up with credits.  The code was then also topping up the
other free list of rxq0, and that's when the problem happened.
iflib_fl_refill() was called with a count of 1919 (2048 - 128 - 1) and it was
able to fill 1247 credits before the free list's bitmap somehow became full...
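
As a sanity check of the arithmetic above, here is a minimal sketch of a
bitmap-backed free list.  It is a toy model, not the iflib code: the helper
names first_clear_from() and refill() are made up for illustration, and the
only values taken from this report are the ring size (2048), the initial 128
credits and the starting fragment index (128).  In this model a request for
1919 buffers is satisfied in full and leaves 2047 bits set, so a bitmap that
is already full after 1247 fills is not what the model predicts.

```python
SIZE = 2048      # ifl_size from the dump
INITIAL = 128    # initial credits per free list, per the report

def first_clear_from(bitmap, start, size):
    """Return the first clear bit at or after start, wrapping once; -1 if none."""
    for off in range(size):
        idx = (start + off) % size
        if not (bitmap >> idx) & 1:
            return idx
    return -1

def refill(bitmap, fragidx, count, size=SIZE):
    """Set up to count clear bits; return (bitmap, fragidx, filled)."""
    filled = 0
    while filled < count:
        idx = first_clear_from(bitmap, fragidx, size)
        if idx < 0:
            break  # bitmap is full, nothing left to hand out
        bitmap |= 1 << idx
        fragidx = (idx + 1) % size
        filled += 1
    return bitmap, fragidx, filled

count = SIZE - INITIAL - 1            # the count passed to the refill
start_bitmap = (1 << INITIAL) - 1     # bits 0..127 set, as on the idle queues
bitmap, fragidx, filled = refill(start_bitmap, INITIAL, count)
set_bits = bin(bitmap).count("1")
print(count, filled, set_bits)
```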

Free lists for receive queues other than zero look like this:
(kgdb) p *ctx->ifc_rxqs[$i++].ifr_fl@2
$52 = {{ifl_cidx = 0, ifl_pidx = 128, ifl_credits = 128, ifl_gen = 0 '\000',
ifl_rxd_size = 0 '\000', ifl_rx_bitmap = 0xfffff80002fb0c00, ifl_fragidx = 128,
ifl_size = 2048, ifl_buf_size = 2048, ifl_cltype = 1, 
    ifl_zone = 0xfffff800029c6000, ifl_sds = {ifsd_map = 0xfffffe00eac18000,
ifsd_m = 0xfffffe00eabfc000, ifsd_cl = 0xfffffe00eac10000, ifsd_ba =
0xfffffe00eac14000}, ifl_rxq = 0xfffffe00ea9f5300, ifl_id = 0 '\000', 
    ifl_buf_tag = 0xfffff80002fb0e00, ifl_ifdi = 0xfffff80002fc56a8,
ifl_bus_addrs = {6106355712, 6106349568, 6106351616, 6106345472, 6106347520,
6106361856, 6106363904, 6106357760, 6106359808, 6106353664, 6106376192, 
      6106370048, 6106372096, 6106365952, 6106368000, 6106382336, 6106384384,
6106378240, 6106380288, 6106374144, 6106396672, 6106390528, 6106392576,
6106386432, 6106388480, 6106402816, 6106404864, 6106398720, 6106400768, 
      6106394624, 6104950784, 6106415104}, ifl_rxd_idxs = {96, 97, 98, 99, 100,
101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116,
117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127}},
{ ifl_cidx = 0, ifl_pidx = 128, ifl_credits = 128, ifl_gen = 0 '\000',
ifl_rxd_size = 0 '\000', ifl_rx_bitmap = 0xfffff80002fb0b00, ifl_fragidx = 128,
ifl_size = 2048, ifl_buf_size = 4096, ifl_cltype = 3, 
    ifl_zone = 0xfffff800029c5000, ifl_sds = {ifsd_map = 0xfffffe00eac28000,
ifsd_m = 0xfffffe00eac1c000, ifsd_cl = 0xfffffe00eac20000, ifsd_ba =
0xfffffe00eac24000}, ifl_rxq = 0xfffffe00ea9f5300, ifl_id = 1 '\001', 
    ifl_buf_tag = 0xfffff80002fb0d00, ifl_ifdi = 0xfffff80002fc56d0,
ifl_bus_addrs = {8338677760, 8338681856, 8338685952, 8338690048, 8338620416,
8338624512, 8338628608, 8338632704, 6105874432, 6105878528, 6105882624, 
      6105886720, 6105890816, 6105894912, 6105899008, 6105903104, 6105907200,
6105911296, 6105915392, 6105919488, 6105923584, 6105927680, 6105931776,
6105935872, 6105939968, 6105944064, 6105948160, 6105952256, 6105956352, 
      6105960448, 6105964544, 6105968640}, ifl_rxd_idxs = {96, 97, 98, 99, 100,
101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116,
117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127}}}

And their bitmaps are all like this:
$62 = {0xffffffffffffffff, 0xffffffffffffffff, 0x0 <repeats 30 times>}
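
The bitmap above can be cross-checked against the free list fields with a
small script.  This is only a sketch; it assumes 64-bit bitmap words with bit
i stored in word i / 64 at bit position i % 64, and set_bit_indices() is a
hypothetical helper.  For the idle queues the set bits should be exactly the
descriptors between ifl_cidx and ifl_pidx, and their number should equal
ifl_credits.

```python
WORD_BITS = 64  # assumption: 64-bit bitmap words, least significant bit first

def set_bit_indices(words):
    """Return the indices of all set bits in a list of bitmap words."""
    return [w * WORD_BITS + b
            for w, word in enumerate(words)
            for b in range(WORD_BITS)
            if (word >> b) & 1]

# Values copied from the dump of the queues other than rxq0.
words = [0xffffffffffffffff] * 2 + [0x0] * 30
ifl_cidx, ifl_pidx, ifl_credits = 0, 128, 128

indices = set_bit_indices(words)
credits_match = len(indices) == ifl_credits
range_match = indices == list(range(ifl_cidx, ifl_pidx))
print(credits_match, range_match)
```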

Here are the free lists of rxq0 at the time of the crash:
$51 = {{ifl_cidx = 1, ifl_pidx = 0, ifl_credits = 2047, ifl_gen = 0 '\000',
ifl_rxd_size = 0 '\000', ifl_rx_bitmap = 0xfffff80002faf400, ifl_fragidx = 0,
ifl_size = 2048, ifl_buf_size = 2048, ifl_cltype = 1, 
    ifl_zone = 0xfffff800029c6000, ifl_sds = {ifsd_map = 0xfffffe00eabd8000,
ifsd_m = 0xfffffe00eabcc000, ifsd_cl = 0xfffffe00eabd0000, ifsd_ba =
0xfffffe00eabd4000}, ifl_rxq = 0xfffffe00ea9f5000, ifl_id = 0 '\000', 
    ifl_buf_tag = 0xfffff80002faf600, ifl_ifdi = 0xfffff80002fc5728,
ifl_bus_addrs = {6101612544, 6101651456, 6101649408, 6101639168, 6101641216,
6101626880, 6101628928, 6101635072, 6101637120, 6101659648, 6101661696, 
      6101647360, 6101624832, 6101618688, 6101620736, 6101614592, 6101616640,
6101671936, 6101673984, 6101667840, 6101669888, 6101622784, 6101678080,
6101680128, 6101682176, 6101655552, 6101657600, 6101786624, 6101788672, 
      6101766144, 6101768192, 6101676032}, ifl_rxd_idxs = {2016, 2017, 2018,
2019, 2020, 2021, 2022, 2023, 2024, 2025, 2026, 2027, 2028, 2029, 2030, 2031,
2032, 2033, 2034, 2035, 2036, 2037, 2038, 2039, 2040, 2041, 2042, 2043, 
      2044, 2045, 2046, 2047}},
{ifl_cidx = 0, ifl_pidx = 1344, ifl_credits = 1344, ifl_gen = 0 '\000',
ifl_rxd_size = 0 '\000', ifl_rx_bitmap = 0xfffff80002faf300, ifl_fragidx = 128,
ifl_size = 2048, ifl_buf_size = 4096, 
    ifl_cltype = 3, ifl_zone = 0xfffff800029c5000, ifl_sds = {ifsd_map =
0xfffffe00eabe8000, ifsd_m = 0xfffffe00eabdc000, ifsd_cl = 0xfffffe00eabe0000,
ifsd_ba = 0xfffffe00eabe4000}, ifl_rxq = 0xfffffe00ea9f5000, 
    ifl_id = 1 '\001', ifl_buf_tag = 0xfffff80002faf500, ifl_ifdi =
0xfffff80002fc5750, ifl_bus_addrs = {8347934720, 8347938816, 8347942912,
8347824128, 8347828224, 8347832320, 8347836416, 8347840512, 8347844608,
8347848704, 
      8347852800, 8347856896, 8347860992, 8347865088, 8347869184, 8347873280,
8347877376, 8347881472, 8347758592, 8347762688, 8347770880, 8347774976,
8347779072, 8347783168, 8347787264, 8347791360, 8347795456, 8347799552, 
      8347803648, 8347807744, 8347811840, 8347930624}, ifl_rxd_idxs = {1344,
1345, 1346, 1347, 1348, 1349, 1350, 1351, 1352, 1353, 1354, 1355, 1356, 1357,
1358, 1359, 1360, 1361, 1362, 1363, 1364, 1365, 1366, 1367, 1368, 1369, 
      1370, 1371, 1372, 1373, 2047, 1343}}}

And their bitmaps:
(kgdb) set $i=0
(kgdb) p/x *ctx->ifc_rxqs[$i/2].ifr_fl[$i++%2].ifl_rx_bitmap@32
$60 = {0xfffffffffffffffe, 0xffffffffffffffff <repeats 31 times>}
(kgdb) 
$61 = {0xffffffffffffffff <repeats 32 times>}
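
To put numbers on the two rxq0 bitmaps, the sketch below counts the set bits
and compares them with ifl_credits of the corresponding free list.  The word
layout (64-bit words, least significant bit first) is an assumption, and
popcount() and clear_bits() are hypothetical helpers.  Free list 0 comes out
consistent: 2047 bits set for 2047 credits, with only bit 0 clear.  Free list
1 has all 2048 bits set against 1344 credits, that is 704 more bits than
credits, which is far more than one partially filled batch could explain.

```python
WORD_BITS = 64  # assumption: 64-bit bitmap words, least significant bit first

def popcount(words):
    """Count the set bits across all bitmap words."""
    return sum(bin(w).count("1") for w in words)

def clear_bits(words):
    """Return the indices of all clear bits."""
    return [w * WORD_BITS + b
            for w, word in enumerate(words)
            for b in range(WORD_BITS)
            if not (word >> b) & 1]

# Bitmap words and ifl_credits copied from the rxq0 dump.
free_lists = {
    "fl0": ([0xfffffffffffffffe] + [0xffffffffffffffff] * 31, 2047),
    "fl1": ([0xffffffffffffffff] * 32, 1344),
}

report = {}
for name, (words, credits) in free_lists.items():
    bits = popcount(words)
    report[name] = (bits, credits, bits - credits, clear_bits(words))
    print(name, "set bits:", bits, "credits:", credits,
          "surplus:", bits - credits)
```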

I am out of ideas as to what could have caused the full bitmap for fl 1 of
rxq0 after receiving just one packet.  All other fields in the free list
appear to be neither corrupt nor inconsistent.  Only ifl_rx_bitmap and the
ifl_rxd_idxs entry at position 30 look wrong.
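
The odd entry at position 30 can be singled out mechanically.  The sketch
below rests on assumptions inferred from the dumps rather than from the iflib
source: that ifl_rxd_idxs is a 32-entry batch array written from position 0,
that ifl_pidx is advanced only when a whole batch is flushed, and that entries
not yet overwritten still hold the values of the previous batch.  Under those
assumptions slot i should hold either ifl_pidx + i (written in the current
batch) or ifl_pidx - 32 + i (left over from the previous one).  classify() is
a hypothetical helper.

```python
SIZE = 2048  # ifl_size from the dump

def classify(rxd_idxs, pidx, size=SIZE):
    """Label each batch slot as 'fresh', 'stale' or 'unexpected'."""
    batch = len(rxd_idxs)
    labels = []
    for i, value in enumerate(rxd_idxs):
        fresh = (pidx + i) % size          # written in the current batch
        stale = (pidx - batch + i) % size  # left over from the previous batch
        if value == fresh:
            labels.append("fresh")
        elif value == stale:
            labels.append("stale")
        else:
            labels.append("unexpected")
    return labels

# rxq0, free list 1: ifl_pidx = 1344
fl1 = classify(list(range(1344, 1374)) + [2047, 1343], 1344)
# rxq0, free list 0: ifl_pidx = 0
fl0 = classify(list(range(2016, 2048)), 0)
# free lists of the other queues: ifl_pidx = 128
idle = classify(list(range(96, 128)), 128)

odd = [i for i, label in enumerate(fl1) if label == "unexpected"]
print(odd)
```

With the values from the dumps, every slot of the other free lists is
explained by this model, and position 30 of fl 1 on rxq0 (value 2047) is the
only one that is not.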
