Duplicate free

David Cecil david.cecil at nokia.com
Sat Jun 23 09:43:01 UTC 2007


Hi Arne,

One of our QA engineers produced the problem.  Our code is currently 
6.1-based, so we can't do the same thing with 6.2 or 7.0.  I see nothing 
in 6.2 to indicate it's fixed, based on the commit log or changes to 
geom_io.c (I should have looked at g_mirror.c too).  We have not changed 
GEOM.

I was planning to ask him about reproducibility too.  As far as I know, 
the trigger was primarily network load, though only a relatively small 
amount of it.
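
To make the failure mode concrete, here is a toy user-space sketch of
the kind of check that produces a "Duplicate free" panic: the allocator
tracks which items are currently free and refuses a second free of the
same item.  This is only an illustration under my own assumptions, not
the UMA code; toy_zone, toy_bio, toy_zalloc and toy_zfree are
hypothetical names that do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_ZONE_ITEMS 8

/* Hypothetical stand-in for a bio; only the field discussed here. */
struct toy_bio {
	void *bio_from;
};

/* Hypothetical toy zone: fixed item array plus a per-item free flag. */
struct toy_zone {
	struct toy_bio items[TOY_ZONE_ITEMS];
	bool is_free[TOY_ZONE_ITEMS];
};

/* Mark every item as free and clear the item storage. */
static void
toy_zone_init(struct toy_zone *z)
{
	memset(z->items, 0, sizeof(z->items));
	for (size_t i = 0; i < TOY_ZONE_ITEMS; i++)
		z->is_free[i] = true;
}

/* Hand out the first free item, or NULL if the zone is exhausted. */
static struct toy_bio *
toy_zalloc(struct toy_zone *z)
{
	for (size_t i = 0; i < TOY_ZONE_ITEMS; i++) {
		if (z->is_free[i]) {
			z->is_free[i] = false;
			return (&z->items[i]);
		}
	}
	return (NULL);
}

/*
 * Free an item.  Returns 0 on success and -1 if the item is already
 * free (a duplicate free) or does not belong to this zone.  A debug
 * kernel would panic at this point instead of returning an error.
 */
static int
toy_zfree(struct toy_zone *z, struct toy_bio *bp)
{
	for (size_t i = 0; i < TOY_ZONE_ITEMS; i++) {
		if (&z->items[i] != bp)
			continue;
		if (z->is_free[i])
			return (-1);	/* second free of the same item */
		z->is_free[i] = true;
		return (0);
	}
	return (-1);			/* pointer not from this zone */
}
```

If two completion paths both end up calling toy_zfree() on the same
item, the first call returns 0 and the second returns -1.  The check
only tells us that a second free happened, not who did the first one,
which is why I am looking for a way to record the earlier caller.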

Thanks,
Dave

ext Arne Wörner <arne_woerner at yahoo.com> wrote:
> How do u do that?
> Any special test program?
>
> Does it happen in R6.2 and/or 7-CUR, too?
>
> R there any exceptional events shortly before it happens?
>
> -Arne
>
>
> --- David Cecil <david.cecil at nokia.com> wrote:
>
>   
>> A little more information that I hope might trigger some thoughts.
>>
>> I have also seen a trace very similar to the one in my original mail, 
>> but instead of bio_done calling g_disk_done, it calls g_mirror_done.  
>> This time it panics because it tries to dereference the bio_from field.  
>> The common thread is that in both cases, the bio is freed prior to the 
>> done handler referencing it (well, at least the bio_from field is NULL in 
>> the g_mirror case, so I'm assuming it was freed).  I wonder if these 
>> handlers are not playing nicely together?  Some locking required between 
>> them that is missing maybe?
>>
>>
>> Unread portion of the kernel message buffer:
>> ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=46231338
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 1; apic id = 01
>> fault virtual address   = 0x0
>> fault code              = supervisor read, page not present
>> instruction pointer     = 0x20:0x8053318b
>> stack pointer           = 0x28:0xe1466c4c
>> frame pointer           = 0x28:0xe1466c54
>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>                         = DPL 0, pres 1, def32 1, gran 1
>> processor eflags        = interrupt enabled, resume, IOPL = 0
>> current process         = 19 (swi6: task queue)
>> trap number             = 12
>> panic: page fault
>> cpuid = 1
>> KDB: stack backtrace:
>> db_trace_self_wrapper(8075f7be) at db_trace_self_wrapper+0x25
>> kdb_backtrace(100,863ba190,28,e1466c0c,c,...) at kdb_backtrace+0x29
>> panic(807403d4,8077f4a5,0,fffff,863be29b,...) at panic+0x124
>> trap_fatal(e1466c0c,0,863ba190,0,c,...) at trap_fatal+0x2ce
>> trap_pfault(e1466c0c,0,0) at trap_pfault+0x1e7
>> trap(e1460008,28,28,8a0cb840,89b1a6b4,...) at trap+0x36d
>> calltrap() at calltrap+0x5
>> --- trap 0xc, eip = 0x8053318b, esp = 0xe1466c4c, ebp = 0xe1466c54 ---
>> g_mirror_done(89b1a6b4) at g_mirror_done+0xb
>> biodone(89b1a6b4) at biodone+0x8b
>> ad_done(8a0cb840) at ad_done+0x2a
>> ata_completed(8a0cb840,0) at ata_completed+0x534
>> taskqueue_run(8644bac0,e1466cec,80566279,0,0,...) at taskqueue_run+0xbd
>> taskqueue_swi_run(0) at taskqueue_swi_run+0xe
>> ithread_execute_handlers(863be200,86433000) at ithread_execute_handlers+0x139
>> ithread_loop(86468460,e1466d38) at ithread_loop+0x64
>> fork_exit(805662ec,86468460,e1466d38) at fork_exit+0x71
>> fork_trampoline() at fork_trampoline+0x8
>> --- trap 0x1, eip = 0, esp = 0xe1466d6c, ebp = 0 ---
>>
>>
>>
>> ext David Cecil wrote:
>>     
>>> Hi,
>>>
>>> I've encountered a duplicate free in 6.1-RELEASE-based code.  I have 
>>> noticed that the same stack trace was reported about two years ago on 
>>> a number of occasions, and some code was added to try and help debug 
>>> the situation.  However, I don't see any resolution.  Does anyone have 
>>> any more information on this before I try and debug it further?
>>>
>>> Any hints for trying to find who freed it first?  Maybe I should add 
>>> the KTR debug that went into 1.66 of geom_io.c.
>>>
>>> db> bt
>>> Tracing pid 17 tid 100016 td 0x86badaf0
>>> kdb_enter(80750631) at kdb_enter+0x2b
>>> panic(807786cd,8a778d80,81856780,80747d25,807786b1,...) at panic+0x137
>>> uma_dbg_free(81856780,0,8a778d80) at uma_dbg_free+0x110
>>> uma_zfree_arg(81856780,8a778d80,0) at uma_zfree_arg+0x66
>>> g_destroy_bio(8a778d80,805319b4,8a778d80,e1ca3c60,805c9620,...) at g_destroy_bio+0x13
>>> g_disk_done(8a778d80) at g_disk_done+0x62
>>> biodone(8a778d80) at biodone+0x58
>>> ad_done(88042840) at ad_done+0x2a
>>> ata_completed(88042840,0,86c38cdc,0,80753b6d,...) at ata_completed+0x504
>>> taskqueue_run(86c38cc0,e1ca3cec,8056999a,0,0,...) at taskqueue_run+0x86
>>> taskqueue_swi_run(0) at taskqueue_swi_run+0xe
>>> ithread_execute_handlers(86c14000,86c29280) at ithread_execute_handlers+0xfa
>>> ithread_loop(86c534f0,e1ca3d38,86c534f0,80569a10,0,...) at ithread_loop+0x76
>>> fork_exit(80569a10,86c534f0,e1ca3d38) at fork_exit+0xa0
>>> fork_trampoline() at fork_trampoline+0x8
>>> --- trap 0x1, eip = 0, esp = 0xe1ca3d6c, ebp = 0 ---
>>>
>>> The panic string is:
>>> Duplicate free of item 0x8a778d80 from zone 0x81856780(g_bio)
>>>
>>> Thanks,
>>> Dave
>>> _______________________________________________
>>> freebsd-geom at freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-geom
>>> To unsubscribe, send any mail to "freebsd-geom-unsubscribe at freebsd.org"
>>>       
>> -- 
>> Software Engineer
>> Secure and Mobile Connectivity
>> Enterprise Solutions
>> Nokia
>> +61 7 5553 8307 (office)
>> +61 412 728 222 (cell)
>
>
>


