Giant deadlock related to twe
noackjr at alumni.rice.edu
Thu Aug 26 11:38:35 PDT 2004
Doug White wrote:
> On Mon, 23 Aug 2004, Vinod Kashyap wrote:
>>> Just got this on my amd64 box. A disk flaked out in my machine, which
>>> has a 3ware 8006-2LP with 2 80GB drives in a RAID0. My X session locked
>>> up and I was able to break to ddb. Some ddb twiddling follows. It looks
>>> like, at first glance, some sort of deadlock against softupdates.
>> The messages indicate timeouts due to the drive continuously returning
>> BUSY to the firmware on the controller. This could be caused by the
>> drive going bad, or even a one-time disturbance like tugging of
>> cables, etc.
> Right, and a failing drive it was, but it shouldn't lock up the entire
> system when it happens.
Why not? If the drive is continuously returning BUSY, wouldn't the
requests just keep getting retried while the process waits for them to
complete successfully? To the user, this would look like a lockup
because the process is blocked. X and company do a lot of
reading/writing of temporary files, so what you are seeing makes sense to
me. I see a similar lockup when the NFS server hosting my home directory
goes down (SMP -CURRENT so it's been a bit exciting lately...). As soon
as the NFS server comes back up X jumps to life again.
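
To make the retry argument concrete, here is a toy user-space sketch in C.
It is not the twe driver or any FreeBSD kernel code; every name in it
(toy_dev, toy_submit, submit_with_retry) is hypothetical and only models
the idea. With no retry limit, the caller stays in the loop for as long as
the device reports BUSY, which is what would look like a lockup. With a
limit, the caller gets an error back instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes; not taken from the twe driver. */
enum dev_status { DEV_OK, DEV_BUSY };

/* Toy device: reports BUSY for the next busy_polls polls, then OK. */
struct toy_dev {
    unsigned busy_polls;
};

/* Submit one request to the toy device and return its status. */
static enum dev_status toy_submit(struct toy_dev *d)
{
    if (d->busy_polls > 0) {
        d->busy_polls--;
        return DEV_BUSY;
    }
    return DEV_OK;
}

/*
 * Retry until the device accepts the request or max_tries attempts
 * have been made. max_tries == 0 means "retry forever", which models
 * a caller that blocks for as long as the device stays BUSY.
 * Returns true on success; *tries receives the number of attempts.
 */
static bool submit_with_retry(struct toy_dev *d, unsigned max_tries,
                              unsigned *tries)
{
    unsigned n = 0;

    assert(d != NULL && tries != NULL);
    for (;;) {
        n++;
        if (toy_submit(d) == DEV_OK) {
            *tries = n;
            return true;
        }
        if (max_tries != 0 && n >= max_tries) {
            *tries = n;
            return false;   /* give up and report the failure */
        }
    }
}
```

In this model a device that never leaves BUSY makes the unlimited variant
spin forever, while the bounded variant returns false after max_tries
attempts. Whether the real controller path behaves like either one is
exactly the open question here.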