Server hang : fsck

Polytropon freebsd at edvax.de
Fri Jan 10 21:22:15 UTC 2014


On Fri, 10 Jan 2014 12:34:25 +0530, eras mus wrote:
> Dear List,
> 
> I tried
> 
> fsck -yf /usr yesterday evening at 6 pm.
> Images are here
> http://picpaste.com/img1-4zq2ytTk.jpg
> http://picpaste.com/img2-uXfJ8REF.jpg
> 
> Left it running and morning 10 a.m today found the message
> 
> FILE SYSTEM DIRTY
> FILE SYSTEM MODIFIED
> rerun fsck
> 
> Then went into setup and changed boot, made APIC disabled,
> and went into boot option 2, boot ACPI disabled.

This is not what the message told you to do, so don't expect
a miracle. :-)

The TIMEOUT READ_DMA and FAILURE READ_DMA errors indicate
(as I had assumed) a _severe_ hardware error. The disk is
dying right now. Also see the "CANNOT READ BLK" messages
from fsck (the white messages are kernel-level errors).

You could try to use WD's tools to check the disk. The
UBCD - http://www.ultimatebootcd.com/ - probably has some
tools for that task, for example a S.M.A.R.T. diagnostic
tool as well as manufacturer tools for diagnostics and
low level format which, in _some_ cases, can revive the
disk, sometimes at the price of erasing it.

The errors you are seeing indicate that the disk is probably
out of "replacement sectors". When sectors become unreadable,
the disk internally re-arranges data without showing any
sign to the OS. When it can't do that anymore, the errors
start "bubbling up" as you can now see.

This kind of error is, in most cases, a physical one.
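
If smartmontools is available on the rescue system (an assumption;
it is not part of the base system), the disk's own counter for
remapped sectors can be read directly. A minimal sketch: the helper
name realloc_count is made up for illustration, and the attribute
name is the one most disks report, though yours may differ.

```shell
#!/bin/sh
# Hypothetical helper: read "smartctl -A" output on stdin and print
# the raw value (last column) of the Reallocated_Sector_Ct attribute.
realloc_count() {
	awk '$2 == "Reallocated_Sector_Ct" { print $NF }'
}

# Usage on the machine with the failing disk (needs root and smartmontools):
#   smartctl -A /dev/ad4 | realloc_count
```

A raw value that is large, or that keeps growing between runs, would
support the diagnosis. It does not repair anything.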



> It gave the following message:
> 
> The following filesystem HAD AN UNEXPECTED INCONSISTENCY
> ufs: /dev/ad4s1e(/usr)
> Automatic file system  check failed: help!
> Jan 10 16:16:59 init:/bin/sh on etc/rc terminated abnormally, going to
> single user mode

This is correct. The file system is not in a state where it
can be mounted.

Keep this in mind when attempting to rescue data from that
partition: If you use the standard dump | restore approach,
you might end up with defective data.

My suggestion would be: Gather from the disk what you absolutely
have to. Use forensic tools if you really _really_ have to.
Usually, you will copy data from /home. You should also have
a look at configuration files in /etc (/ partition). Maybe also
have a look at /var/db/pkg to make a list of what software you
have installed.
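
One way to get that list from the live CD, as a sketch: it assumes
the file system holding /var is mounted read-only under /mnt (a mount
point chosen here for illustration) and that /var/db/pkg contains one
directory per installed package, as with the traditional pkg_* tools.
The function name list_pkgs is made up.

```shell
#!/bin/sh
# Hypothetical helper: print one package name per line, taken from the
# directory names inside a package database directory.
list_pkgs() {
	pkgdb=$1
	for entry in "$pkgdb"/*/; do
		# Skip the unexpanded pattern if there are no subdirectories.
		[ -d "$entry" ] || continue
		basename "$entry"
	done
}

# Usage (write the list to some other medium, not the dying disk):
#   list_pkgs /mnt/var/db/pkg > /path/on/another/disk/pkglist.txt
```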

Of course you should have all of them in your backup, so you
can restore from that.

Destroy the disk physically and dispose of it properly.

Then install a new system on a new disk. Start from scratch,
it will probably be easier. You can use the list of software
mentioned above to install everything "as it was before". Then
bring your configuration back into place. Finally add your user
data.



> As advised by Polytropon, burnt a live CD and ran fsck manually.
>  # fsck -yfv /dev/ad4s1a
>  # fsck -yfv /dev/ad4s1d
> 
> are successful.

Very good!



> But when I ran
>  # fsck -yfv /dev/ad4s1e
> 
> it gave messages as in
> 
> http://picpaste.com/img3-It4JOaph.jpg

Correct. Say goodbye to your /usr partition. To be honest,
it probably doesn't contain anything important. You could
try to

	# mount -o ro -t ufs /dev/ad4s1e /mnt

and then copy what you _really_ need, for example configuration
files for locally added software.
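
If the read-only mount works, the copy itself could look like the
sketch below. The helper name rescue_copy and all target paths are
made up for illustration; /mnt/local/etc is where /usr/local/etc ends
up when /usr is mounted on /mnt as shown above. Plain cp is used
because it reports a file it cannot read and then carries on with the
remaining ones.

```shell
#!/bin/sh
# Hypothetical helper: copy a directory tree off the read-only mount
# and append the error messages for unreadable files to a log.
rescue_copy() {
	src=$1; dst=$2; log=$3
	mkdir -p "$dst" || return 1
	# The tree ends up in $dst/<last component of $src>.
	# cp's exit status is non-zero if any file could not be copied.
	cp -Rp "$src" "$dst" 2>>"$log"
}

# Usage (the target should be on a different, healthy disk):
#   rescue_copy /mnt/local/etc /path/on/another/disk/usr-rescue \
#       /path/on/another/disk/rescue-errors.log
```

Afterwards the log names the files that could not be read. On a disk
in this state each failing read can take a long time, so copy the
most important directories first.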



Again: The disk is dead. There's probably not much you can do
with it now. Get whatever data you need to get, and then get rid
of the disk. There's no way to magically repair a dead disk. :-)



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...

