still mbuf leak in 9.0 / 9.1?

dennis berger db at bsdsystems.de
Fri May 17 09:37:28 UTC 2013


Hi List,
I can confirm that it is the bug you mentioned, Steven.
Here is how I found it.

I recorded zfskern and nfsd stats hourly, like this:

echo "PROCSTAT" >> "$reportname"
pgrep -S "(zfskern|nfsd)" | xargs procstat -kk >> "$reportname"
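
For anyone who wants to reproduce the logging, the two commands above could be wrapped in a small function and called hourly from cron. This is only a sketch: the function name record_stacks and the timestamp in the header are assumptions added for illustration, and only the pgrep/procstat calls come from what I actually ran.

```shell
#!/bin/sh
# Hypothetical helper: append one timestamped PROCSTAT section to a report file.
# Usage: record_stacks /path/to/report
record_stacks() {
    reportname=$1
    {
        # The timestamp is an addition; the original header was just "PROCSTAT".
        echo "PROCSTAT $(date -u '+%Y-%m-%d %H:%M:%S UTC')"
        # -S also matches system processes (kernel threads such as zfskern).
        for pid in $(pgrep -S "(zfskern|nfsd)"); do
            # -kk prints kernel stacks including function offsets.
            procstat -kk "$pid"
        done
    } >> "$reportname"
}
```

Looping over the PIDs instead of piping into xargs avoids running procstat with no arguments when nothing matches.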

Luckily it crashed last night and logged this:

 1910 101508 nfsd             nfsd: service    mi_switch+0x186 sleepq_wait+0x42 _sleep+0x376 arc_lowmem+0x77 kmem_malloc+0xc1 uma_large_malloc+0x4a malloc+0xd9 arc_get_data_buf+0xb5 arc_read_nolock+0x1ec arc_read+0x93 dbuf_prefetch+0x12c dmu_zfetch_dofetch+0x10b dmu_zfetch+0xaf8 dbuf_read+0x4a7 dmu_buf_hold_array_by_dnode+0x16b dmu_buf_hold_array+0x67 dmu_read_uio+0x3f zfs_freebsd_read+0x3e3 
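
To pick such threads out of a longer report, something like the following might help. It is a sketch under assumptions: the function name find_lowmem_waiters is made up, and it expects procstat -kk lines in the format shown above (PID, TID, command, thread name, then the stack).

```shell
#!/bin/sh
# Hypothetical helper: print "PID TID COMM" for every logged thread whose
# kernel stack contains an arc_lowmem frame.
# Usage: find_lowmem_waiters /path/to/report
find_lowmem_waiters() {
    # [+] matches a literal plus sign, so only real stack frames of the
    # form arc_lowmem+0x... are counted.
    awk '/ arc_lowmem[+]0x/ { print $1, $2, $3 }' "$1"
}
```

For the nfsd line above this prints "1910 101508 nfsd".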

Maybe it would be good to merge this fix into RELENG_9_1 and distribute it via freebsd-update. What do you think?

best,
-dennis


On 16.05.2013 at 11:42, dennis berger wrote:

> This is indeed a ZFS+NFS system, and I can see that istgt and nfs are stuck in some ZIO state. Maybe it's this.
> Thanks for pointing it out.
> 
> Is it this ZFS+NFS deadlock?
> 
> --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c 
> +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c 
> @@ -3720,8 +3720,16 @@ arc_lowmem(void *arg __unused, int howto __unused) 
> 	mutex_enter(&arc_reclaim_thr_lock); 
> 	needfree = 1; 
> 	cv_signal(&arc_reclaim_thr_cv); 
> -	while (needfree)
> -		msleep(&needfree, &arc_reclaim_thr_lock, 0, "zfs:lowmem", 0);
> +
> +	/*
> +	 * It is unsafe to block here in arbitrary threads, because we can come
> +	 * here from ARC itself and may hold ARC locks and thus risk a deadlock
> +	 * with ARC reclaim thread.
> +	 */
> +	if (curproc == pageproc) {
> +		while (needfree)
> +			msleep(&needfree, &arc_reclaim_thr_lock, 0, "zfs:lowmem", 0);
> +	}
> 	mutex_exit(&arc_reclaim_thr_lock); 
> 	mutex_exit(&arc_lowmem_lock); 
> }
> 
> I'll try to crash our test system. I assume that heavily stressing ZFS-backed NFS might trigger this bug?
> 
> -dennis
> 
> 
> On 16.05.2013 at 00:03, Steven Hartland wrote:
> 
>> ----- Original Message ----- From: "dennis berger" <db at nipsi.de>
>>> FreeBSD  9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243825: Tue Dec  4 09:23:10 UTC 2012
>>> 
>>>> 3. Regarding this:
>>>>>> A clean shutdown isn't possible though. It hangs after vnode
>>>>>> cleaning, normally you would see detaching of usb devices here, or
>>>>>> other devices maybe?
>>>> Please don't conflate this with your above issue.  This is almost
>>>> certainly unrelated.  Please start a new thread about that if desired.
>>> 
>>> Maybe this is a misunderstanding. Normally this system shuts down cleanly, of course.
>>> This hang only appears after the network problem above.
>> 
>> If this is a ZFS system, it's a known issue which is fixed in current,
>> stable-9, stable-8 and the upcoming 8.4 release.
>> 
>> If not, and you have USB devices, see if the following sysctl helps:
>> hw.usb.no_shutdown_wait=1
>> 
>>  Regards
>>  Steve
>> 
>> _______________________________________________
>> freebsd-stable at freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
> 