blockable sleep lock (sleep mutex) 16

Wed Feb 4 08:05:06 PST 2009

On 2 Feb 2009, at 19:09, Julian Elischer wrote:

>>> It says "non-sleepable locks", yet it classifies click_instance
>>> as sleep mutex. I think witness code should emit messages which
>>> are more clear.
>> It is confusing, but you can't do an M_WAITOK malloc while holding  
>> a mutex.  Basically, sleeping actually means calling "*sleep()  
>> (such as mtx_sleep()) or cv_*wait*()".  Blocking on a mutex is not  
>> sleeping, it's "blocking".  Some locks (such as sx(9)) do "sleep"  
>> when you contest them.  In the scheduler, sleeping and blocking are  
>> actually quite different (blocking uses turnstiles that handle  
>> priority inversions via priority propagation, sleeping uses sleep  
>> queues which do not do any of that).  The underlying idea is that  
>> mutexes should be held for "short" periods of time, and that any  
>> sleeps are potentially unbounded.  Holding a mutex while sleeping  
>> could result in a mutex being held for a long time.
>
>
> the locking overview page
> man 9 locking
> tries to explain this..
> I've been pestering John to proofread it and make suggestions for a  
> while now.

Thanks John and Julian. I agree, man pages should be more clear :)

I've switched from using mtx to sx locks, since they can be held
while sleeping.

Unfortunately, I've run into something really weird now, when I unload
the module:
---8<---
#0  doadump () at pcpu.h:195
#1  0xffffffff8049ef98 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xffffffff8049f429 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xffffffff8075cd26 in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:764
#4  0xffffffff8075da62 in trap (frame=0xffffffff87699940) at /usr/src/sys/amd64/amd64/trap.c:290
#5  0xffffffff80743bfe in calltrap () at /usr/src/sys/amd64/amd64/exception.S:209
#6  0xffffffff8052a411 in strcmp (s1=0xffffffff80824a0c "sigacts",
     s2=0xffffffff877cd3a9 <Address 0xffffffff877cd3a9 out of bounds>) at /usr/src/sys/libkern/strcmp.c:45
#7  0xffffffff804d7c61 in enroll (description=0xffffffff80824a0c "sigacts", lock_class=0xffffffff80a19fe0)
     at /usr/src/sys/kern/subr_witness.c:1439
#8  0xffffffff804d7fb1 in witness_init (lock=0xffffff00016f4ca8) at /usr/src/sys/kern/subr_witness.c:618
#9  0xffffffff8049fd31 in sigacts_alloc () at /usr/src/sys/kern/kern_sig.c:3280
#10 0xffffffff80481121 in fork1 (td=0xffffff0001384a50, flags=20, pages=Variable "pages" is not available.
) at /usr/src/sys/kern/kern_fork.c:453
#11 0xffffffff80481450 in fork (td=0xffffff0001384a50, uap=Variable "uap" is not available.
) at /usr/src/sys/kern/kern_fork.c:106
#12 0xffffffff8075d260 in syscall (frame=0xffffffff87699c80) at /usr/src/sys/amd64/amd64/trap.c:907
#13 0xffffffff80743e0b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:330
#14 0x0000000800ca0a6c in ?? ()
--->8---

and in frame 7:
(kgdb) p *w
$5 = {w_name = 0xffffffff877cd3a9 <Address 0xffffffff877cd3a9 out of bounds>, w_class = 0xffffffff80a19fe0, w_list = {
     stqe_next = 0xffffffff80accce0}, w_typelist = {stqe_next = 0xffffffff80accce0}, w_children = 0x0,
   w_file = 0xffffffff877d1fa0 <Address 0xffffffff877d1fa0 out of bounds>, w_line = 307, w_level = 0, w_refcount = 2,
   w_Giant_squawked = 0 '\0', w_other_squawked = 0 '\0', w_same_squawked = 0 '\0', w_displayed = 0 '\0'}
(kgdb) p *w->w_class
$6 = {lc_name = 0xffffffff808564e0 "sleep mutex", lc_flags = 9, lc_ddb_show = 0xffffffff80492e6b <db_show_mtx>,
   lc_lock = 0xffffffff804938be <lock_mtx>, lc_unlock = 0xffffffff804933fc <unlock_mtx>}

This happens after modevent exits.

What puzzles me here is the w_refcount of 2 while w_name is out of
bounds. I properly destroyed the locks I created (at least I think I
did :)).
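
To make the question concrete, here is a small userland model of the bookkeeping that the w_refcount / w_name pair suggests. It is only a sketch under assumptions, not the witness code: struct lock_type, model_enroll(), model_destroy() and model_live_types() are hypothetical names, and the model assumes the registry borrows the name pointer instead of copying the string. Under that assumption, an entry whose refcount is still above zero keeps pointing at its owner's string, and every later lookup strcmp()s against it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userland model; field names loosely mirror the kgdb dump. */
#define MAX_TYPES 8

struct lock_type {
	const char *name;	/* borrowed pointer, NOT a copy (assumption) */
	int refcount;		/* number of live locks with this name */
};

static struct lock_type types[MAX_TYPES];

/*
 * Find the entry for 'name' or create one.  Every live entry's name is
 * passed to strcmp(), so a live entry whose string storage has gone away
 * would be dereferenced here.
 */
static struct lock_type *
model_enroll(const char *name)
{
	struct lock_type *free_slot = NULL;

	for (size_t i = 0; i < MAX_TYPES; i++) {
		if (types[i].refcount == 0) {
			if (free_slot == NULL)
				free_slot = &types[i];
			continue;
		}
		if (strcmp(types[i].name, name) == 0) {
			types[i].refcount++;
			return (&types[i]);
		}
	}
	if (free_slot == NULL)
		return (NULL);		/* table full */
	free_slot->name = name;
	free_slot->refcount = 1;
	return (free_slot);
}

/* Drop one reference; the last one releases the borrowed name pointer. */
static void
model_destroy(struct lock_type *t)
{
	assert(t != NULL && t->refcount > 0);
	if (--t->refcount == 0)
		t->name = NULL;
}

/* Count entries that still borrow a name pointer from their owner. */
static int
model_live_types(void)
{
	int n = 0;

	for (size_t i = 0; i < MAX_TYPES; i++)
		if (types[i].refcount > 0)
			n++;
	return (n);
}
```

In this model, two locks enrolled under the same name and never destroyed leave an entry with refcount 2, and the owner's string must stay valid for as long as model_live_types() is non-zero. Whether the real panic has the same cause is a guess; it would be worth checking that every lock initialized at the line reported in w_line has a matching destroy before the module's memory is released.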

Cheers,
Nikola