panic in RELENG_5 UMA - two new stack traces

Gleb Smirnoff glebius at FreeBSD.org
Fri Jul 1 20:38:07 GMT 2005


On Fri, Jul 01, 2005 at 04:32:38PM -0400, Gary Mulder wrote:
G> >G> I can reproduce the crash within 60 seconds of firing off 30+ ping/arp 
G> >G> -d scripts, all running in parallel.
G> >G> 
G> >G> debug.mpsafenet=0 seems to have solved the problem. I'm running 100+ 
G> >G> instances of the above script and the system has been stable for over
G> >G> an hour.
G> >
G> >Thanks! We can definitely see that the bug is a race, not broken logic. I am
G> >almost sure that you are experiencing the same bug as I described at
G> >the beginning of the thread.
G> >
G> >Although there is no fix available yet for the race between 'arp -d' and
G> >an outgoing packet, there is one for the race between an incoming ARP reply
G> >and an outgoing packet. We will probably commit it soon, after more review.
G> 
G> Is this bug specific to only using "arp -d", or does it look like the 
G> "arp -d" tests identify a bug that might cause TCP/IP related crashes 
G> with other types of real-world network traffic?
G> 
G> To rephrase: Does it look like fixing this bug may fix a lot of the 
G> network-related crashes a number of people have reported?

See above in the thread. We have two races: one that can fire at any time
at runtime, which we are going to fix, and another with 'arp -d', which is
not fixed yet.

I am not sure how many of the reported network-related panics were related to
this race. Let's fix it and see. You can apply the patch to your boxes
and see whether they are more stable at runtime.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE

