Probably Hardware Trouble But What Is It?
Paul Pathiakis
pathiaki2 at yahoo.com
Sun Dec 7 17:08:20 UTC 2014
Drew,
Just trying to assist....
From the look of it, something is definitely failing, and it is either
the controller or the disk. FreeBSD is trying to stay alive. (I've had
something similar happen in the past. When I rebooted, a disk showed up
as faulted and inaccessible.)
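If you want a quick count of which device is racking up the timeouts,
something like this rough sketch should do it. It assumes the log lines
look like the CAM messages you quoted from /var/log/messages;
count_timeouts is just a name I made up for the helper.

```sh
# count_timeouts FILE (hypothetical helper, not a system tool):
# tally "CAM status: Command timeout" lines per device.
# Assumes lines shaped like:
#   ... kernel: (ada0:siisch0:0:0:0): CAM status: Command timeout
count_timeouts() {
    awk '/CAM status: Command timeout/ {
        # The device name sits between "(" and the first ":".
        if (match($0, /\([a-z]+[0-9]+:/)) {
            dev = substr($0, RSTART + 1, RLENGTH - 2)
            n[dev]++
        }
    }
    END { for (d in n) print d, n[d] }' "$1" | sort
}
```

Run it as: count_timeouts /var/log/messages. If every timeout lands on
one device, I'd look at that disk and its cable first; timeouts spread
over several devices would make me suspect the controller more.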
I'd theorize that the first line, about kern.maxfiles being exceeded by
root (assuming you haven't changed the setting), is a side effect of
the failure: file handles keep being allocated for requests that can't
complete, so the limit is eventually hit.
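To see how close the file table really is to the limit, you can compare
kern.openfiles with kern.maxfiles (both are FreeBSD sysctls). A small
sketch, where usage_pct is a made-up helper name:

```sh
# usage_pct OPEN MAX (hypothetical helper): integer percent of the
# kernel file table currently in use.
usage_pct() {
    echo $(( $1 * 100 / $2 ))
}
```

On the FreeBSD box that would be:
usage_pct "$(sysctl -n kern.openfiles)" "$(sysctl -n kern.maxfiles)"
A number stuck near 100 while the disk is timing out would fit the
theory, though it wouldn't prove it.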
If you have access to the console and another drive, you may want to
connect the second drive, configure it to mirror the first, and hope
that the mirror can complete. If it works, great. BTW, don't forget to
install bootblocks if this is your boot drive.
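For what it's worth, here is roughly what I mean, as a dry run that
only prints the commands so you can review them first. The pool and
disk names come from your zpool status output; ada1 is my guess at what
the new drive would be called, and mirror_plan is a made-up helper.
Whether the attach completes on a pool that is already throwing I/O
errors is anyone's guess.

```sh
# mirror_plan POOL OLD NEW (hypothetical helper): print, but do not
# run, the commands for turning a single-disk pool into a mirror.
# NEW is an assumed device name; check yours before running anything.
mirror_plan() {
    pool=$1
    old=$2
    new=$3
    echo "zpool attach $pool $old $new"
    echo "zpool status $pool"
    # For a boot drive, a gpart bootcode step would follow; its
    # arguments depend on your partition layout, so it is left out.
}
```

Calling mirror_plan zback ada0 ada1 prints the attach command followed
by a status check you can repeat to watch the resilver.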
Now, if it doesn't start to mirror the drive after being attached,
you're going to have to reboot. That's probably going to show you the
real failure. :-(
If the controller is onboard, there's not much you can do. If it's a
PCIe card, try to re-seat it. Sometimes cards get pulled on or bumped
inadvertently and aren't sitting in the slot correctly any more.
I agree with the other post about replacing the connecting cables
and/or re-seating them.
If, after all this, it doesn't work, it's probably the disk itself.
Now comes the part that takes patience. If it's the drive, it's
probably pretty hot from failing and trying to do its job. Don't laugh
at this; it's worked for me 5 out of 7 times. Remove it from the
machine and let it cool to room temperature on an anti-static bag. Once
cool, put it in the bag and put it in your freezer for at least three
hours. Re-insert it into the machine. (At this point, you should have
that other drive for the mirror connected.) If the drive isn't a
catastrophic loss, it will work for a short time. I recommend you allow
it to mirror. Ask the drive to do NOTHING but sit and mirror while in
single-user mode.
However, before going to that last 'iffy' part, check everything before
that.
P.
On 12/06/2014 19:58, Drew Tomlinson wrote:
> I'm running FBS 9.1 RELEASE that I built several years ago. It's
> mostly a Samba server and has "just worked" so I've never done much
> more with it. However recently, I find it "locked up" with thousands
> of these messages on the console:
>
> kernel: kern.maxfiles limit exceeded by uid 0, please see tuning(7)
>
> I've looked in /var/log/messages and also see lots of messages like
> these:
>
> Dec 6 13:55:53 vm kernel: siisch0: ... waiting for slots 18000000
> Dec 6 13:55:53 vm kernel: siisch0: Timeout on slot 28
> Dec 6 13:55:53 vm kernel: siisch0: siis_timeout is 00040000 ss
> 78000000 rs 78000000 es 00000000 sts 801b0000 serr 00000000
> Dec 6 13:55:53 vm kernel: siisch0: ... waiting for slots 08000000
> Dec 6 13:55:55 vm kernel: siisch0: Timeout on slot 27
> Dec 6 13:55:55 vm kernel: siisch0: siis_timeout is 00040000 ss
> 78000000 rs 78000000 es 00000000 sts 801b0000 serr 00000000
> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): FLUSHCACHE48. ACB: ea
> 00 00 00 00 40 00 00 00 00 00 00
> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): CAM status: Command
> timeout
> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): Retrying command
> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): READ_FPDMA_QUEUED.
> ACB: 60 01 fe d8 74 40 39 00 00 00 00 00
> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): CAM status: Command
> timeout
> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): Retrying command
> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): READ_FPDMA_QUEUED.
> ACB: 60 0a a5 7f 00 40 4c 00 00 00 00 00
>
> This machine uses zfs. I have two pools:
>
> # zpool list
> NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
> zback 1.81T 848G 1008G 45% 1.00x ONLINE -
> zroot 1.81T 1.16T 666G 64% 1.00x ONLINE -
>
> Then I tried this and my ssh window is now stuck:
>
> # zpool status
> pool: zback
> state: ONLINE
> status: One or more devices are faulted in response to IO failures.
> action: Make sure the affected devices are connected, then run 'zpool
> clear'.
> see: http://illumos.org/msg/ZFS-8000-HC
> scan: none requested
> config:
>
> NAME STATE READ WRITE CKSUM
> zback ONLINE 3 0 0
> ada0 ONLINE 4 0 0
>
> I opened another ssh window and tried 'zpool clear zback' as suggested
> but it appears stuck too.
>
> I'm sure I haven't provided all the relevant information so please ask
> and I will do so. I'd appreciate any guidance on how to take a proper
> backup of ada0 and what I should do next. I think this zback pool is
> just the one disk which is a 2TB drive. I'd like to know how to
> confirm that if possible since it seems the zpool commands aren't able
> to complete.
>
> I appreciate any suggestions or guidance.
>
> Thanks,
>
> Drew
>