Advisory lock crashes.

Alfred Perlstein alfred at ixsystems.com
Tue Feb 19 08:07:17 UTC 2013


On 2/18/13 11:32 PM, Konstantin Belousov wrote:
> On Mon, Feb 18, 2013 at 11:08:34PM -0800, Alfred Perlstein wrote:
>> Hello Konstantin & Doug,
>>
>> We're getting a few crashes in what looks to be kern_lockf.c:
>>
>> fault address here is 0x360 which appears to mean that the "sx" owner
>> thread is NULL
> What is the version of FreeBSD ?
This is a releng 9.0 system.  (note, we have the most up to date version 
of this file with the exception of a cosmetic diff for MALLOC defines).

> What is the filesystem owning the file which was advlocked ?
I'm pretty sure that is going to be ZFS.

> Show the line number for lf_advlockasync+0x5d7.

> (kgdb) list *(lf_advlockasync+0x5d7)
> 0xffffffff80604fc7 is in lf_advlockasync (sx.h:152).
> 147     {
> 148             uintptr_t tid = (uintptr_t)td;
> 149             int error = 0;
> 150
> 151             if (!atomic_cmpset_acq_ptr(&sx->sx_lock, 
> SX_LOCK_UNLOCKED, tid))
> 152                     error = _sx_xlock_hard(sx, tid, opts, file, line);
> 153             else
> 154 LOCKSTAT_PROFILE_OBTAIN_LOCK_SUCCESS(LS_SX_XLOCK_ACQUIRE,
> 155                         sx, 0, 0, file, line);
> 156
That may not be helpful so I've included this:
/usr/home/alfred # bc
ibase=16
5D7
1495

(kgdb) disasse lf_advlockasync
Dump of assembler code for function lf_advlockasync:
0xffffffff806049f0 <lf_advlockasync+0>: push   %rbp
0xffffffff806049f1 <lf_advlockasync+1>: mov    %rdx,%rcx
> 0xffffffff80604f70 <lf_advlockasync+1408>:      mov    -0x80(%rbp),%rdi
> 0xffffffff80604f74 <lf_advlockasync+1412>:      xor %ecx,%ecx
> 0xffffffff80604f76 <lf_advlockasync+1414>:      xor %edx,%edx
> 0xffffffff80604f78 <lf_advlockasync+1416>:      mov %rbx,%rsi
> 0xffffffff80604f7b <lf_advlockasync+1419>:      callq 
> 0xffffffff806246d0 <_sx_xunlock_hard>
> 0xffffffff80604f80 <lf_advlockasync+1424>:      jmpq 
> 0xffffffff80604c53 <lf_advlockasync+611>
> 0xffffffff80604f85 <lf_advlockasync+1429>:      mov -0x58(%rbp),%rcx
> 0xffffffff80604f89 <lf_advlockasync+1433>:      xor %r12d,%r12d
> 0xffffffff80604f8c <lf_advlockasync+1436>:      mov 0x18(%rcx),%edi
> 0xffffffff80604f8f <lf_advlockasync+1439>:      callq 
> 0xffffffff80603b90 <lf_clearremotesys>
> 0xffffffff80604f94 <lf_advlockasync+1444>:      jmpq 
> 0xffffffff80604c70 <lf_advlockasync+640>
> 0xffffffff80604f99 <lf_advlockasync+1449>:      lea 0xc8(%r13),%rdi
> 0xffffffff80604fa0 <lf_advlockasync+1456>:      xor %r8d,%r8d
> 0xffffffff80604fa3 <lf_advlockasync+1459>:      xor %ecx,%ecx
> 0xffffffff80604fa5 <lf_advlockasync+1461>:      xor %edx,%edx
> 0xffffffff80604fa7 <lf_advlockasync+1463>:      mov %rbx,%rsi
> 0xffffffff80604faa <lf_advlockasync+1466>:      callq 
> 0xffffffff8060a1f0 <_mtx_lock_sleep>
> 0xffffffff80604faf <lf_advlockasync+1471>:      jmpq 
> 0xffffffff80604f2e <lf_advlockasync+1342>
> 0xffffffff80604fb4 <lf_advlockasync+1476>:      mov -0x80(%rbp),%rdi
> 0xffffffff80604fb8 <lf_advlockasync+1480>:      xor %r8d,%r8d
> 0xffffffff80604fbb <lf_advlockasync+1483>:      xor %ecx,%ecx
> 0xffffffff80604fbd <lf_advlockasync+1485>:      xor %edx,%edx
> 0xffffffff80604fbf <lf_advlockasync+1487>:      mov %rbx,%rsi
> 0xffffffff80604fc2 <lf_advlockasync+1490>:      callq 
> 0xffffffff80624210 <_sx_xlock_hard>
> 0xffffffff80604fc7 <lf_advlockasync+1495>:      jmpq 
> 0xffffffff80604f15 <lf_advlockasync+1317>
> 0xffffffff80604fcc <lf_advlockasync+1500>:      lea 0xc8(%r13),%rdi
> 0xffffffff80604fd3 <lf_advlockasync+1507>:      xor %ecx,%ecx
> 0xffffffff80604fd5 <lf_advlockasync+1509>:      xor %edx,%edx
> 0xffffffff80604fd7 <lf_advlockasync+1511>:      xor %esi,%esi
> 0xffffffff80604fd9 <lf_advlockasync+1513>:      callq 
> 0xffffffff8060a040 <_mtx_unlock_sleep>
> 0xffffffff80604fde <lf_advlockasync+1518>:      jmpq 
> 0xffffffff80604f5c <lf_advlockasync+1388>
> 0xffffffff80604fe3 <lf_advlockasync+1523>:      mov %r15,(%rcx)
> 0xffffffff80604fe6 <lf_advlockasync+1526>:      mov %r15,%r14
> 0xffffffff80604fe9 <lf_advlockasync+1529>:      mov %gs:0x0,%rax
> 0xffffffff80604ff2 <lf_advlockasync+1538>:      lock cmpxchg 
> %rbx,0xe0(%r13)


>
> No, I never saw nothing similar in last 3 years.

Yes, I'd suspect we'd all see more things here.  We're very much capable 
of adding instrumentation to the OS/kernel to help track this down if 
you have ideas.

-Alfred


>> db>  bt
>> Tracing pid 5099 tid 101614 td 0xfffffe005d54e8c0
>> _sx_xlock_hard() at _sx_xlock_hard+0xb3
>> lf_advlockasync() at lf_advlockasync+0x5d7
>> lf_advlock() at lf_advlock+0x47
>> vop_stdadvlock() at vop_stdadvlock+0xb3
>> VOP_ADVLOCK_APV() at VOP_ADVLOCK_APV+0x4a
>> closef() at closef+0x352
>> kern_close() at kern_close+0x172
>> amd64_syscall() at amd64_syscall+0x58a
>> Xfast_syscall() at Xfast_syscall+0xf7
>> --- syscall (6, FreeBSD ELF64, sys_close), rip = 0x8011651fc, rsp = 0x7fffffbfdd58, rbp = 0x807c3d6c0 ---
>>
>> (kgdb) list *(_sx_xlock_hard+0xb3)
>> 0xffffffff806242c3 is in _sx_xlock_hard
>> (/usr/home/jpaetzel/9.0.6-RELEASE-p1/FreeBSD/src/sys/kern/kern_sx.c:514).
>> 509                     x = sx->sx_lock;
>> 510                     if ((sx->lock_object.lo_flags & SX_NOADAPTIVE)
>> == 0) {
>> 511                             if ((x & SX_LOCK_SHARED) == 0) {
>> 512                                     x = SX_OWNER(x);
>> 513                                     owner = (struct thread *)x;
>> 514                                     if (TD_IS_RUNNING(owner)) {
>> 515                                             if
>> (LOCK_LOG_TEST(&sx->lock_object, 0))
>> 516 CTR3(KTR_LOCK,
>> 517                                                 "%s: spinning on %p
>> held by %p",
>> 518 __func__, sx, owner);
>>
>>
>> Another panic here, which we have less information is attached as an image.
>>
>> We're looking at using some INVARIANTS and WITNESS kernels, but was
>> wondering if y'all had any other suggestions to use please?
>>
>> thank you,
>> -Alfred
>



More information about the freebsd-fs mailing list