kern/172166: Deadlock in the networking code, possible due to a bug in the SCHED_ULE

Andriy Gapon avg at FreeBSD.org
Sun Sep 30 13:50:08 UTC 2012


The following reply was made to PR kern/172166; it has been noted by GNATS.

From: Andriy Gapon <avg at FreeBSD.org>
To: bug-followup at FreeBSD.org, eugen at eg.sd.rdtc.ru
Cc:  
Subject: Re: kern/172166: Deadlock in the networking code, possible due to
 a bug in the SCHED_ULE
Date: Sun, 30 Sep 2012 16:44:09 +0300

 on 30/09/2012 16:42 Andriy Gapon said the following:
 > on 30/09/2012 14:54 Andriy Gapon said the following:
 >>
 >> It looks like CPUs 0 - 4 are idle, but CPU 5 has load of three.
 >> One of those threads is the syslogd thread that holds the lock, but the
 >> currently running thread is 'ipmi0: kcs' thread with tid 100118.
 >> It would be interesting to examine what it is doing.
 >>
 > 
 > Looks like the kcs thread busy-loops here: kcs_loop -> kcs_read_byte ->
 > kcs_wait_for_obf.
 > Since this is a 6-CPU machine, steal threshold is set to 3 so other CPUs don't
 > try to take any work from CPU5. Not sure if this is smart actually.  Maybe it
 > would make sense to have a lower threshold or to allow stealing of real-time
 > threads at a lower threshold.
 > 
 > Since the kcs thread is a kernel thread with real-time priority (68) it doesn't
 > allow any other lower priority thread to run while it's not sleeping.
 > 
 > Also, it looks like rwlock does not take care to propagate waiters' priorities
 > in all cases.  Maybe priority propagation could have helped here, but not sure...
 > 
 
 In any case, the original trigger for this problem seems to be something in IPMI
 that keeps that thread running.
 
 -- 
 Andriy Gapon
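
The starvation described in the message can be illustrated with a toy model.
This is a minimal sketch under stated assumptions, not code from sched_ule.c:
toy_thread, toy_pick and toy_run are hypothetical names, and it models only
strict priority selection on a single CPU's run queue, with no stealing and no
priority propagation. As in FreeBSD, a lower numeric value means a higher
priority.

```c
#include <stddef.h>

/* Toy model only; none of these names exist in the FreeBSD kernel. */
struct toy_thread {
	const char *name;
	int prio;	/* lower value = higher priority */
	int runnable;	/* 1 while the thread is not sleeping */
	int ticks_run;	/* how many scheduling slots the thread received */
};

/* Strict-priority pick: the runnable thread with the lowest prio value wins. */
struct toy_thread *
toy_pick(struct toy_thread *q, size_t n)
{
	struct toy_thread *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (q[i].runnable && (best == NULL || q[i].prio < best->prio))
			best = &q[i];
	}
	return (best);
}

/* Make `ticks` scheduling decisions on one CPU's queue. */
void
toy_run(struct toy_thread *q, size_t n, int ticks)
{
	for (int t = 0; t < ticks; t++) {
		struct toy_thread *td = toy_pick(q, n);

		if (td != NULL)
			td->ticks_run++;
	}
}
```

If the queue holds a thread at priority 68 that never sleeps (standing in for
the busy-waiting kcs thread) plus two runnable threads at a lower priority,
say 120 (an assumed value, not taken from the PR), then toy_run() gives every
slot to the first thread and none to the others. Once the first thread's
runnable flag is cleared, the lower-priority threads start receiving slots
again.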

