When will ZFS become stable?
Kris Kennaway <kris at FreeBSD.org>
Sun Jan  6 08:03:49 PST 2008
  
Henri Hennebert wrote:
> Kris Kennaway wrote:
>> Ivan Voras wrote:
>>> On 06/01/2008, Peter Schuller <peter.schuller at infidyne.com> wrote:
>>>>> This number is not so large. It seems to be easily crashed by rsync,
>>>>> for example (speaking from my own experience, and also some of my
>>>>> colleagues).
>>>> I can definitely say this is not *generally* true, as I do a lot of
>>>> rsyncing/rdiff-backup:ing and similar stuff (with many files / large
>>>> files) on ZFS without any stability issues. Problems for me have been
>>>> limited to 32bit and the memory exhaustion issue rather than "hard"
>>>> issues.
>>>
>>> It's not generally true since kmem problems with rsync are often hard
>>> to repeat - I have them on one machine, but not on another, similar
>>> machine. This nonrepeatability is also a part of the problem.
>>>
>>>> But perhaps that's all you are referring to.
>>>
>>> Mostly. I did have a ZFS crash with rsync that wasn't kmem related,
>>> but only once.
>>
>> kmem problems are just tuning.  They are not indicative of stability 
>> problems in ZFS.  Please report any further non-kmem panics you 
>> experience.
> 
> I have twice encountered a deadlock during high I/O activity (the last
> time during rsync + rm -r on a 5GB hierarchy, openoffice-2/work).
> 
> I was running with this patch:
> http://people.freebsd.org/~pjd/patches/zgd_done.patch
> db> show allpcpu
> Current CPU: 1
> 
> cpuid        = 0
> curthread    = 0xa5ebe440: pid 3422 "txg_thread_enter"
> curpcb       = 0xeb175d90
> fpcurthread  = none
> idlethread   = 0xa5529aa0: pid 12 "idle: cpu0"
> APIC ID      = 0
> currentldt   = 0x50
> 
> cpuid        = 1
> curthread    = 0xa56ab220: pid 47 "arc_reclaim_thread"
> curpcb       = 0xe6837d90
> fpcurthread  = none
> idlethread   = 0xa5529880: pid 11 "idle: cpu1"
> APIC ID      = 1
> currentldt   = 0x50
> 
> Both times, arc_reclaim_thread was `running`.

Backtraces of the affected processes (or just alltrace) are usually
required to proceed with debugging, and lock status is often vital too
(show alllocks, which requires a kernel built with WITNESS).  When threads
are actually running rather than deadlocked, it is often useful to
repeatedly break/continue and sample many backtraces to determine where
the threads are looping.
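
As a rough illustration of the sampling idea, here is a minimal Python
sketch that tallies how often each function shows up across several
captured backtraces.  It is not a ddb feature.  It assumes each frame
looks like "name(args) at name+0xoffset", and the function names and
sample text below are hypothetical placeholders, not output from a real
system.

```python
import re
from collections import Counter

# Assumed frame format: "function(args) at function+0xoffset".
FRAME_RE = re.compile(r"\bat\s+([A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-fA-F]+")


def count_frames(samples):
    """Count in how many backtrace samples each function name appears."""
    counts = Counter()
    for sample in samples:
        # A set ensures a function is counted once per sample,
        # even if it appears in several frames of the same trace.
        counts.update(set(FRAME_RE.findall(sample)))
    return counts


# Hypothetical samples, one string per break into the debugger.
SAMPLES = [
    "func_c(1,2) at func_c+0x10\nfunc_b() at func_b+0x24\nfunc_a() at func_a+0x8",
    "func_d() at func_d+0x3c\nfunc_b() at func_b+0x24\nfunc_a() at func_a+0x8",
    "func_c(1,2) at func_c+0x18\nfunc_b() at func_b+0x30\nfunc_a() at func_a+0x8",
]

if __name__ == "__main__":
    for name, n in count_frames(SAMPLES).most_common():
        print(f"{n}/{len(SAMPLES)} {name}")
```

Functions present in every sample are the likely callers of the loop;
functions that come and go near the top of the stack suggest what the
loop body is calling.
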
Kris
    
    
More information about the freebsd-current mailing list