panic in ffs (Re: hangs in nbufkv)

Mikhail Teterin Mikhail.Teterin at murex.com
Wed Oct 13 15:38:07 PDT 2004


=:I don't know how, but the bug seems to be triggered by upping
=:net.inet.udp.maxdgram from 9216 (the default) to 16384 (to match the NFS
=:client's wsize). Once I do that, the machine will either panic or just
=:hang a few minutes into the heavy NFS writing (Sybase database dumps
=:from a Solaris server). It has happened twice already...

=    Interesting. That's getting a bit outside the realm I can help
=    with. NFS and the network stack have had issues in FreeBSD
=    recently, so it's probably something related.
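
(For reference, the tuning in question amounted to nothing more than the
following; the Solaris-side mount options are from memory and purely
illustrative:)

	# on the FreeBSD NFS server
	sysctl net.inet.udp.maxdgram=16384

	# on the Solaris NFS client (the Sybase server) -- made-up paths,
	# mounting the staging filesystem with a matching write size
	mount -F nfs -o rw,rsize=16384,wsize=16384 freebsd-box:/staging /dumps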

Actually, that's not it. Even if I don't touch any sysctls and simply
keep loading the machine with our backup scripts, it will eventually
either hang (after many complaints about WRITE_DMA problems with the
disk the NFS clients write to) or panic with:

 initiate_write_inodeblock_ufs2: already started

(in /sys/ufs/ffs/ffs_softdep.c). As for the WRITE_DMA problems, after
going through two disks, two cables, and two different on-board SATA
connectors, we concluded that the problem is with the ata driver (hence
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/72451). As for the panics,
I set BKVASIZE back down to 16Kb, rebuilt the kernel, and recreated the
filesystem that used to have the 64Kb bsize.
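
(For completeness, the revert amounted to roughly the following; the exact
kernel-config line and the device name are from memory and illustrative
only:)

	# kernel config: drop the BKVASIZE bump that went with the 64Kb bsize
	# (it can also be changed by editing sys/param.h directly)
	options		BKVASIZE=16384		# back to the default

	# recreate the filesystem with the default 16Kb/2Kb block/fragment
	# sizes instead of the old 64Kb/8Kb (ad4s1d is a made-up device)
	newfs -U -b 16384 -f 2048 /dev/ad4s1d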

Machine still either panics or hangs under load.

Maybe I should give a bit more detail about the load. It is produced
by a script that tells the Sybase server to dump one database at a
time over NFS to the "staging" disk (a single SATA150 drive) and, as
each database is dumped, compresses it onto the RAID5 array for storage.
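
(In pseudo-shell, the script does something like this; dump_one_database
stands in for the actual Sybase dump invocation and the paths are made up:)

	for db in $DATABASES; do
		# ask the Sybase server on the Solaris box to dump this
		# database over NFS onto the staging disk
		dump_one_database "$db" /staging/"$db".dump

		# once the dump finishes, compress it onto the RAID5 array
		gzip -c /staging/"$db".dump > /raid5/dumps/"$db".dump.gz
		rm /staging/"$db".dump
	done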

When the thing is working properly, the Sybase server writes at or close
to wire speed (9-11MB/second). Unfortunately, the staging disk soon
starts throwing the above-mentioned WRITE_DMA errors. Fortunately, those
are usually recoverable. Unfortunately, the machine eventually hangs
anyway...

I changed the script to use the RAID5 partition as the staging area
as well (this is the filesystem that used to have the 64Kb bsize and
8Kb fsize -- it is over 1TB large), and it seems to work for now, but
the throughput is much lower than it used to be (limited by the RAID
controller's I/O).

Another observation I can make is that 'bufdaemon' often takes up
50-80% of the CPU time (on a 2.2GHz Opteron!) while this script is
running. Not sure whether that's normal.
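
(That figure is from watching the kernel threads with plain top; the -S
flag makes system processes such as bufdaemon show up in the display:)

	top -S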

	-mi


