taskqueue timeout

Matthew Dillon dillon at apollo.backplane.com
Tue Jul 15 17:11:43 UTC 2008


:Hi everyone,
:
:I'm wondering if the problems described in the following link have been 
:resolved:
:
:http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2008-02/msg00211.html
:
:I've got four 500GB SATA disks in a ZFS raidz pool, and all four of them 
:are experiencing the behavior.
:
:The problem only happens with extreme disk activity. The box becomes 
:unresponsive (can not SSH etc). Keyboard input is displayed on the 
:console, but the commands are not accepted.
:
:Is there anything I can do to either figure this out, or work around it?
:
:Steve

    If you are getting DMA timeouts, go to this URL:

    http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting

    Then I would suggest going into /usr/src/sys/dev/ata (I think, on
    FreeBSD), locate all instances where request->timeout is set to 5,
    and change them all to 10.

	cd /usr/src/sys/dev/ata
	fgrep 'request->timeout' *.c
	... change all assignments of 5 to 10 ...
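
    If you would rather not edit each file by hand, the change can be
    scripted.  This is only a sketch: it assumes the assignments are
    written literally as "request->timeout = 5;", which may not match
    every source tree, and the bump_timeouts function name is made up
    for illustration.  Check the fgrep output first.

```sh
#!/bin/sh
# Sketch: raise ATA request timeouts from 5 to 10 seconds.
# ASSUMPTION: assignments look like "request->timeout = 5;".
# bump_timeouts is a hypothetical helper, not part of FreeBSD.
bump_timeouts() {
    dir=${1:-/usr/src/sys/dev/ata}
    for f in "$dir"/*.c; do
        # Skip files that have no matching assignment.
        grep -q 'request->timeout[[:space:]]*=[[:space:]]*5;' "$f" 2>/dev/null || continue
        # On matching lines only, turn the first "5;" into "10;".
        # The untouched file is kept as "$f.orig".
        sed -i.orig '/request->timeout[[:space:]]*=[[:space:]]*5;/s/5;/10;/' "$f"
        echo "patched $f"
    done
}
```

    Call it as "bump_timeouts /usr/src/sys/dev/ata" and compare each
    file against its .orig copy.  The new timeout only takes effect
    once the kernel has been rebuilt and installed and the machine
    rebooted.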

    Try that first.  If it helps then it is a known issue.  Basically
    a combination of the on-disk write cache and possible ECC corrections,
    remappings, or excessive remapped sectors can cause the drive to take
    much longer than normal to complete a request.  The default 5-second
    timeout is insufficient.

    If it does help, post confirmation to prod the FBsd developers to
    change the timeouts.

    --

    If you are NOT getting DMA timeouts then the ZFS lockups may be due
    to buffer/memory deadlocks.  ZFS has knobs for adjusting its memory
    footprint size.  Lowering the footprint ought to solve (most of) those
    issues.  It's actually somewhat of a hard issue to solve.  Filesystems
    like UFS aren't complex enough to require the sort of dynamic memory
    allocations deep in the filesystem that ZFS and HAMMER need to do.
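
    The knobs are not named above.  On FreeBSD of this era they are
    usually the loader tunables below, set in /boot/loader.conf.  Treat
    this as a sketch: the tunable names are my assumption about which
    knobs are meant, and the values are placeholders to be sized
    against the machine's RAM, not recommendations.

```
# /boot/loader.conf -- illustrative values only (assumptions)
vm.kmem_size="512M"        # kernel memory available to ZFS and others
vm.kmem_size_max="512M"    # upper bound for the above
vfs.zfs.arc_max="160M"     # cap on the ZFS ARC (main cache) size
```

    Loader tunables are read at boot, so a reboot is needed before
    they apply.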

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>

