hangs in nbufkv
dillon at apollo.backplane.com
Mon Oct 11 13:38:14 PDT 2004
:While investigating the server's hanging, I noticed some processes in
:the `nbufkv' state (even a graceful reboot becomes impossible: "some
:processes would not die..."). Quick search brought up links like:
:One of our file systems here does, indeed, use large block size (64K, I
:think, not sure, how to verify it) -- it is used for storing large
:database dumps. Are the bugs, Bruce and Matt are talking about, supposed
:to be gone by now (in which case, I can provide more debugging info), or
:does this remain a "known problem" and I should simply adopt the
:workaround suggested by Bruce in the first link above -- increase
:BKVASIZE? Should I also merge the patch posted by Bruce in the last of
:the links above, or are there good reasons, it is not in the official tree?
:In the former case, what would anyone need to know to help fix this
:In the latter -- what is a good BKVASIZE value for an amd64 opteron with
:2Gb of memory, intended, primarily, to keep database archives online and
Well, this sort of deadlock ought to be easy to debug if you can
obtain a kernel core (and have the associated kernel.debug), but
one of the FreeBSD developers would have to track it down, I'm
hip deep in other things.
The most likely scenario is that either vfs.lobufspace/vfs.hibufspace
needs tuning, or vfs.lofreebuffers/vfs.hifreebuffers needs tuning
to overcome the fragmentation issue. You could try reducing both
vfs.lobufspace and hibufspace somewhat plus increase their spread,
and you could also try increasing vfs.lofreebuffers and hifreebuffers
and increasing their spread. You can also try reducing
vfs.lodirtybuffers and vfs.hidirtybuffers but it is unlikely that those
are the cause unless they were specifically tuned up.
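
A minimal sketch of the "adjust both and increase their spread" arithmetic
described above, for working out candidate values before touching anything.
The helper name, the percentages, and the sample numbers are all hypothetical;
substitute whatever `sysctl vfs` reports on the machine in question. Whether a
given vfs.* knob can be changed at runtime depends on the FreeBSD version, so
this only computes numbers and does not apply them.

```python
def retune_watermarks(lo, hi, scale_pct, spread_pct):
    """Return a new (lo, hi) watermark pair.

    hi is scaled to scale_pct percent of its current value, and the gap
    between lo and hi is scaled to spread_pct percent of the current gap.
    Integer arithmetic only, since the sysctl values are integers.
    """
    if not 0 < lo < hi:
        raise ValueError("expected 0 < lo < hi")
    new_hi = hi * scale_pct // 100
    new_gap = (hi - lo) * spread_pct // 100
    # Never let the low watermark drop to zero or below.
    new_lo = max(new_hi - new_gap, 1)
    return new_lo, new_hi


# Hypothetical starting values, NOT defaults from any real system.
# Buffer space: reduce both somewhat (90%) and widen the spread (150%).
lobufspace, hibufspace = retune_watermarks(1000, 2000, 90, 150)

# Free buffers: increase both (150%) and widen the spread (200%).
lofreebuffers, hifreebuffers = retune_watermarks(100, 200, 150, 200)

for name, value in [
    ("vfs.lobufspace", lobufspace),
    ("vfs.hibufspace", hibufspace),
    ("vfs.lofreebuffers", lofreebuffers),
    ("vfs.hifreebuffers", hifreebuffers),
]:
    print(f"{name}={value}")
```

The point of keeping the gap as a separate parameter is that the low and high
marks can move in one direction while the distance between them grows, which
is the combination suggested for both pairs.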
But to be absolutely safe, I would follow Bruce's original suggestion
and increase BKVASIZE to 64K, for your particular system.
The only caveat with doing that is that it drastically reduces
the number of buffers available in the system. You can compensate
somewhat by increasing the number of buffers in the system (kern.nbuf
boot-time kernel environment variable), but then you may run the kernel
out of KVM (this is especially true on FreeBSD due to the fact that
kmem_map still exists).
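
The trade-off can be put in rough numbers. The sketch below assumes a
simplified model in which the kernel virtual address space reserved for the
buffer cache is about nbuf * BKVASIZE; that model, the 16K "current" BKVASIZE,
and the 112 MB budget are assumptions for illustration, not values taken from
this system.

```python
KIB = 1024
MIB = 1024 * KIB


def buffers_for_kva(kva_bytes, bkvasize):
    """Number of buffers that fit in a fixed KVA budget (simplified model)."""
    return kva_bytes // bkvasize


def kva_for_buffers(nbuf, bkvasize):
    """KVA needed to back nbuf buffers of bkvasize bytes each (simplified model)."""
    return nbuf * bkvasize


# Hypothetical KVA budget for the buffer cache.
budget = 112 * MIB

# Assumed current BKVASIZE of 16K versus the suggested 64K.
nbuf_16k = buffers_for_kva(budget, 16 * KIB)
nbuf_64k = buffers_for_kva(budget, 64 * KIB)
print(f"16K BKVASIZE: {nbuf_16k} buffers")
print(f"64K BKVASIZE: {nbuf_64k} buffers")

# Compensating via kern.nbuf to keep the old buffer count costs 4x the KVA.
needed = kva_for_buffers(nbuf_16k, 64 * KIB)
print(f"KVA to keep {nbuf_16k} buffers at 64K: {needed // MIB} MB")
```

Under these assumptions the buffer count drops by a factor of four, and
restoring it with kern.nbuf quadruples the KVA requirement, which is where the
risk of running the kernel out of KVM comes from.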
Ultimately these will become non-issues once the buffer cache is moved
to a default non-mapping mode for situations where no mapping is needed
(e.g. file data buffer -> DMA to/from disk), but you'd have to ask PHK
about that vis-a-vis FreeBSD. I have similar plans for DragonFly but
nothing is finished yet.
<dillon at backplane.com>
More information about the freebsd-current