zcolli (zcollide) state, what does znode dying means?

Attila Nagy bra at fsn.hu
Wed Sep 22 13:22:03 UTC 2010


  On 09/22/10 12:38, Attila Nagy wrote:
> I have a machine, which is heavily hammered with file system 
> operations, running a very recent 8-STABLE.
> The symptom is that everything works fine for a few minutes, then a 
> lot of processes get into zcolli state (according to top). At that 
> point there are two outcomes:
> 1. the disks calm down for a while (for long seconds there is no IO, 
> or very little, verified with gstat), top shows nearly 100% system, 
> a lot of processes are on the run queue (load is sky-high, between 
> 300 and 1000), and all operations stop. top still refreshes, but I 
> can't really execute new programs. Then suddenly the zcolli states 
> clear, the IO resumes and the run queue shrinks.
> 2. the system remains in this state. After 5-10 minutes there is still 
> no change and only a reset helps (it doesn't even react to 
> CTRL-ALT-DEL; running programs like top still refresh, but no disk IO 
> can be made).
It turned out that prefetch got enabled as a side effect of a restart. On 
this machine, which does mostly random reads, prefetch generated so much 
extra IO that the system could livelock itself. The only thing I don't 
understand is why the IO ceased during the mass zcollide period. That 
looks like a wait-for-something scenario (sometimes an endless one), 
which is bad.
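
For reference, a minimal sketch of how prefetch can be kept off across 
restarts. It assumes the ZFS file-level prefetch tunable 
vfs.zfs.prefetch_disable, set in /boot/loader.conf; the tunable name and 
the file are not stated above, so treat this as an illustration rather 
than the exact setting used on this machine:

```
# /boot/loader.conf
# Assumed tunable: 1 disables ZFS file-level prefetch, 0 enables it.
vfs.zfs.prefetch_disable="1"
```

Because the file is read at boot, a value placed there should survive a 
restart, unlike one set only at runtime.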

