RELENG_7/i386: ZFS constant panic on file system writes

Dmitry Morozovsky marck at
Wed Apr 8 08:32:38 PDT 2009

On Wed, 8 Apr 2009, Dmitry Morozovsky wrote:

DM> PJD> > DM> could you please help me a bit with a *very* unpleasant situation: one of my
DM> PJD> > DM> servers with a very large ZFS reboots on most write requests to one (the largest,
DM> PJD> > DM> which effectively prohibits recreating it) ZFS file system with
DM> PJD> > DM> 
DM> PJD> > DM> panic: avl_find() succeeded inside avl_add()
DM> PJD> > 
DM> PJD> > Is there a way I can clear the directory in question? Even the latest -current 
DM> PJD> > panics when I try to access the directory containing this file.
DM> PJD> Could you try running 'zpool scrub' on this pool? Nothing better comes
DM> PJD> to my mind, it looks like some kind of internal inconsistency and
DM> PJD> hopefully scrub will be able to find it. Could you also show 'zpool status'
DM> PJD> output?
DM> zpool status is showing everything ok:
DM> marck at moose:~> zpool status
DM>   pool: m
DM>  state: ONLINE
DM>  scrub: none requested
DM> config:
DM> 	m           ONLINE       0     0     0
DM> 	  raidz1    ONLINE       0     0     0
DM> 	    ad4h    ONLINE       0     0     0
DM> 	    ad6h    ONLINE       0     0     0
DM> 	    ad8h    ONLINE       0     0     0
DM> 	    ad10h   ONLINE       0     0     0
DM> 	    ad12h   ONLINE       0     0     0
DM> errors: No known data errors
DM> will try scrub, thank you!

Unfortunately, it did not help:

 scrub: scrub completed with 0 errors on Wed Apr  8 19:04:51 2009

and then

root at moose:~# ls -la /ar/nfstat/nfc/.bad/200807
total 9089
drwxr-xr-x  3 rscript  wheel        4 Nov  5 21:01 ./
d---------  3 root     wheel        3 Apr  7 14:29 ../
drwxr-xr-x  2 rscript  wheel       36 Apr  2 22:12 daily/
-rw-r--r--  1 rscript  wheel  9207828 Aug  1  2008 total.200807
root at moose:~# ls -la /ar/nfstat/nfc/.bad/200807/daily/

panic: avl_find() succeeded inside avl_add()
cpuid = 2
[-- marck at localhost detached -- Wed Apr  8 19:28:13 2009]
[-- marck at localhost attached -- Wed Apr  8 19:28:15 2009]
[halt sent]
KDB: enter: Line break on console
[thread pid 153 tid 100152 ]
Stopped at      kdb_enter_why+0x3a:     movl    $0,kdb_why
db> reboot
cpu_reset: Restarting BSP
cpu_reset_proxy: Stopped CPU 1
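
For readers unfamiliar with the message: as its text suggests, the panic fires when an add operation first looks up the new node and finds that an equal one is already in the tree. The following is only a toy sketch of that "find before add" check, not the ZFS code. The names toy_set, toy_find and toy_add are hypothetical, a sorted array stands in for the AVL tree, and an error return stands in for panic().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy container: a small sorted array standing in for the tree. */
#define TOY_CAP 16

struct toy_set {
	int	keys[TOY_CAP];
	size_t	len;
};

/*
 * Look up key. Returns 1 if present, 0 otherwise; *where receives the
 * index at which the key is stored, or at which it would be inserted.
 */
static int
toy_find(const struct toy_set *s, int key, size_t *where)
{
	size_t i;

	for (i = 0; i < s->len && s->keys[i] < key; i++)
		;
	*where = i;
	return (i < s->len && s->keys[i] == key);
}

/*
 * Add a key that the caller promises is not yet present.
 * Returns 0 on success, -1 when the set is full, and -2 when the lookup
 * unexpectedly succeeds (the point where a kernel would panic).
 */
static int
toy_add(struct toy_set *s, int key)
{
	size_t where;

	if (toy_find(s, key, &where))
		return (-2);	/* "find succeeded inside add" */
	if (s->len == TOY_CAP)
		return (-1);
	/* Shift the tail up by one slot to keep the array sorted. */
	memmove(&s->keys[where + 1], &s->keys[where],
	    (s->len - where) * sizeof(s->keys[0]));
	s->keys[where] = key;
	s->len++;
	return (0);
}
```

In this sketch, adding 5 and then 3 to an empty set succeeds, while adding 5 a second time returns -2. The design point is that the caller, not the container, is responsible for uniqueness, so a successful lookup at this stage means the caller's own bookkeeping is already inconsistent.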

I can set up an account for you on the serial console for this server, if it can

