Reference count race window

Gumpula, Suresh Suresh.Gumpula at netapp.com
Fri Jan 3 02:38:25 UTC 2014


Hi Alfred,
I agree that there could have been an extra/invalid crfree() that decremented the count, so that a valid-looking crhold() (acquire) from the socket code panicked in my case. As per your suggestion, if we replace the assert with an if condition in release, we will end up panicking when the actual crfree() happens. But we still may not know who called crfree() in the first, invalid place. Am I correct? I will try your suggestion.

Can you please explain your array trick a bit more?

Thanks
Suresh



-----Original Message-----
From: owner-freebsd-hackers at freebsd.org [mailto:owner-freebsd-hackers at freebsd.org] On Behalf Of Alfred Perlstein
Sent: Thursday, January 02, 2014 8:21 PM
To: Gumpula, Suresh; Julian Elischer; freebsd-hackers at freebsd.org
Subject: Re: Reference count race window


On 1/2/14, 3:53 PM, Gumpula, Suresh wrote:
>>> Without changing the return-value semantics of refcount_acquire, we 
>>> have introduced a panic if we detected a race as below.
>>> static __inline void
>>> refcount_acquire(volatile u_int *count) {
>>>           u_int old;
>>>
>>>           old = atomic_fetchadd_int(count, 1);
>>>           if (old == 0) {
>>>             panic("refcount_acquire race condition detected!\n");
>>>           }
>>> }
>> so what is the stacktrace of the panic?
> It's from the socket code calling crhold(). It's a non-debug build (no INVARIANTS).
>
> #4  0xffffffff80331d34 in panic (fmt=0xffffffff805c1e60 
> "refcount_acquire race condition detected!\n") at 
> ../../../../sys/kern/kern_shutdown.c:1009
> #5  0xffffffff80326662 in refcount_acquire (count=<optimized out>) at 
> ../../../../sys/sys/refcount.h:65
> #6  crhold (cr=<optimized out>) at 
> ../../../../sys/kern/kern_prot.c:1814
> #7  0xffffffff803aa0d9 in socreate (dom=<optimized out>, 
> aso=0xffffff80345c1b00, type=<optimized out>, proto=0, 
> cred=0xffffff0017d7aa00, td=0xffffff000b294410) at 
> ../../../../sys/kern/uipc_socket.c:441
> #8  0xffffffff803b2e5c in socket (td=0xffffff000b294410, 
> uap=0xffffff80345c1be0) at ../../../../sys/kern/uipc_syscalls.c:201
> #9  0xffffffff80539ecb in syscall (frame=0xffffff80345c1c80) at 
> ../../../../sys/amd64/amd64/trap.c:1260
>
If it's a non-debug build then how do you know that someone isn't incorrectly lowering the refcount?

Please try a build with INVARIANTS, or at least manually turn on the one KASSERT I mentioned.
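
For illustration, an always-on underflow check in the release path might look roughly like the sketch below. It is a userland approximation, not the kernel code: C11 atomics stand in for atomic_fetchadd_int(), abort() stands in for panic(), and the names checked_acquire/checked_release are hypothetical. Note that the check fires on whichever release observes a count of zero, which is not necessarily the release that was wrong.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userland stand-in for refcount_acquire(). */
static inline void
checked_acquire(atomic_uint *count)
{
	atomic_fetch_add(count, 1);
}

/*
 * Hypothetical userland stand-in for refcount_release() with the
 * underflow check compiled in unconditionally.
 * Returns nonzero when the caller dropped the last reference.
 */
static inline int
checked_release(atomic_uint *count)
{
	unsigned int old = atomic_fetch_sub(count, 1);

	if (old == 0) {
		/* The count was already zero: some earlier release was extra. */
		fprintf(stderr, "refcount underflow on %p\n", (void *)count);
		abort();
	}
	return (old == 1);
}
```

With a count initialized to 1, one acquire followed by two releases returns 0 and then 1; a third release would hit the abort().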

Another trick would be to add an array of char* + int entries recording the last few places that decremented; you can use the returned refcount as an index into that array to track who may be doing the extra frees.
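
A minimal sketch of that array idea, under some assumptions: the char* + int pair is taken to be the caller's file name and line number, the array lives next to the counter, and C11 atomics replace the kernel primitives so the example builds in userland. All names (traced_ref, REL_TRACE_SLOTS, and so on) are hypothetical. Indexing by the value returned from the decrement means concurrent releases normally see different old values and so write to different slots.

```c
#include <stdatomic.h>
#include <stdio.h>

#define REL_TRACE_SLOTS 8	/* hypothetical: number of release sites kept */

struct rel_site {
	const char *file;	/* assumed meaning of the char*: caller's file */
	int line;		/* assumed meaning of the int: caller's line */
};

struct traced_ref {
	atomic_uint count;
	struct rel_site last[REL_TRACE_SLOTS];
};

static inline void
traced_ref_init(struct traced_ref *r, unsigned int initial)
{
	atomic_init(&r->count, initial);
	for (int i = 0; i < REL_TRACE_SLOTS; i++) {
		r->last[i].file = NULL;
		r->last[i].line = 0;
	}
}

static inline void
traced_ref_acquire(struct traced_ref *r)
{
	atomic_fetch_add(&r->count, 1);
}

/* Decrement and remember who did it, indexed by the old count. */
static inline int
traced_ref_release_at(struct traced_ref *r, const char *file, int line)
{
	unsigned int old = atomic_fetch_sub(&r->count, 1);
	struct rel_site *s = &r->last[old % REL_TRACE_SLOTS];

	/* Plain stores: debugging aid only, not a synchronized log. */
	s->file = file;
	s->line = line;
	return (old == 1);
}

#define traced_ref_release(r) \
	traced_ref_release_at((r), __FILE__, __LINE__)

/* Print the recorded release sites, e.g. from a debugger or panic path. */
static inline void
traced_ref_dump(struct traced_ref *r)
{
	for (int i = 0; i < REL_TRACE_SLOTS; i++) {
		if (r->last[i].file != NULL)
			printf("slot %d: last released at %s:%d\n",
			    i, r->last[i].file, r->last[i].line);
	}
}
```

After a bad acquire is detected, slot 1 would hold the site that took the count from 1 to 0, and slot 0 the site of any release that found the count already at zero.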

-Alfred
