ad0: FAILURE - WRITE_DMA

Mikhail P. miha at ghuug.org
Sat Oct 9 10:00:48 PDT 2004


On Saturday 09 October 2004 16:23, Dag-Erling Smørgrav wrote:
> "Mikhail P." <miha at ghuug.org> writes:
> > On Saturday 09 October 2004 15:01, Dag-Erling Smørgrav wrote:
> > > A lot of them, or just one or two?  Some ATA drives will spin down at
> > > regular intervals to recalibrate, and you'll get a harmless timeout if
> > > you try to write to the disk while it's doing that.
> >
> > Unfortunately, all the drives (so far - four 200GB drives).
>
> I meant "a lot of timeouts", not "a lot of drives".  If you only get
> one or two timeouts per drive at regular intervals (say, once a
> month), they're just recalibrating and there's nothing to worry about.
>

Well, there is no pattern. Often it just happens by itself: the system runs
fine for 3-10 days (no warnings, no timeouts), and then I start seeing lots
of these. To be more exact, say I have a user whose home dir is /home/user,
and who uses FTP to upload and download files under that directory. Let's
say he has 5k files in total (ranging in size from 1KB to 20MB). When the
user tries to access certain files (either to continue an upload or to
continue a download), the system spews lots of these timeouts and an
"input/output error" occurs. For example, yesterday it logged 360 of these
messages during a 12-hour period, and unfortunately the system locked up
while I was asleep - the last message in /var/log/messages was about the
ad0 failure.
I'm not exactly sure which files it timed out on yesterday, but I do know
which directory it happened under - it holds 20k files (not in a single
dir, but counting subdirs). Maybe someone knows a quick way to open every
file under that directory - that would probably help identify exactly
which files the timeouts happen on.
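
One way to do this, as a rough sketch rather than a tested diagnostic: walk
the tree, read every regular file to the end, and record each path where the
read fails. The function name scan_tree and the 1 MB chunk size are made up
for illustration. Note that a read may be served from the buffer cache, so
recently accessed files might not touch the disk at all.

```python
import os
import stat


def scan_tree(root, chunk_size=1 << 20):
    """Read every regular file under root; return (files_read, failures).

    scan_tree and chunk_size are illustrative names, not part of any tool.
    """
    files_read = 0
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so symlinks, FIFOs and devices are skipped, not followed.
                if not stat.S_ISREG(os.lstat(path).st_mode):
                    continue
                with open(path, "rb") as handle:
                    # Read to EOF in chunks so the whole file is requested.
                    while handle.read(chunk_size):
                        pass
                files_read += 1
            except OSError as exc:
                # EIO ("Input/output error") lands here, as do permission errors.
                failures.append((path, exc.strerror or str(exc)))
    return files_read, failures
```

Calling scan_tree("/home/user") and printing the second element of the result
would list the files that could not be read, together with the error text.
Comparing the timestamps of the kernel's ad0 messages with the order in which
the files were read may then narrow down the affected area.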

Before replacing the drives, I had that server up for 120 days, and it did
spew these messages (more and more every day, starting at about day 90 of
uptime). After I rebooted the system it asked for fsck, which I ran, but it
showed some softupdates inconsistencies and refused to mount /home
read-write.

By the way, I just ran fsck on /home while it was mounted rw (that's where
those timeouts occurred yesterday), and I have attached its output.

I also got another message off-list, whose author suggested playing with the
UDMA mode. I switched from UDMA100 to UDMA66. The system's uptime is now 12
hours, with no timeouts so far, but I'm quite sure they will come back in a
few days.

> BTW, are you using ataidle or anything similar?

nope, nothing.

>
> DES

regards,
M.
-------------- next part --------------
[root]@[beer]:/usr/local/etc/rc.d> fsck /home
** /dev/ad0s1g (NO WRITE)
** Last Mounted on /home
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
LINK COUNT FILE I=8715003  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715004  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715005  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715006  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715007  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715008  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715009  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715010  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715016  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715017  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715080  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715086  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715087  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715093  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715094  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715100  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715101  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715107  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715129  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715142  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715143  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715156  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715157  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

LINK COUNT FILE I=8715163  OWNER=noc MODE=0
SIZE=0 MTIME=Oct  9 09:50 2004  COUNT 0 SHOULD BE -1
ADJUST? no

** Phase 5 - Check Cyl groups
SUMMARY INFORMATION BAD
SALVAGE? no

BLK(S) MISSING IN BIT MAPS
SALVAGE? no

ALLOCATED FRAGS 34852132-34852134 MARKED FREE
ALLOCATED FRAGS 34852264-34852268 MARKED FREE
ALLOCATED FRAGS 34852344-34852347 MARKED FREE
ALLOCATED FRAGS 34852376-34852380 MARKED FREE
ALLOCATED FRAGS 34852452-34852453 MARKED FREE
ALLOCATED FRAGS 34852512-34852513 MARKED FREE
ALLOCATED FRAGS 34852536-34852540 MARKED FREE
ALLOCATED FRAGS 34852544-34852545 MARKED FREE
ALLOCATED FRAGS 34852548-34852549 MARKED FREE
ALLOCATED FRAG 34852567 MARKED FREE
ALLOCATED FRAG 34852583 MARKED FREE
ALLOCATED FRAGS 34852594-34852599 MARKED FREE
ALLOCATED FRAGS 34852616-34852620 MARKED FREE
ALLOCATED FRAGS 34852757-34852758 MARKED FREE
ALLOCATED FRAGS 34852818-34852820 MARKED FREE
ALLOCATED FRAGS 34852824-34852827 MARKED FREE
ALLOCATED FRAG 34852906 MARKED FREE
ALLOCATED FRAGS 34852925-34852927 MARKED FREE
ALLOCATED FRAGS 34853136-34853140 MARKED FREE
ALLOCATED FRAGS 34853144-34853148 MARKED FREE
ALLOCATED FRAGS 34853152-34853156 MARKED FREE
ALLOCATED FRAGS 34853160-34853164 MARKED FREE
ALLOCATED FRAGS 34853168-34853172 MARKED FREE
ALLOCATED FRAGS 34853245-34853246 MARKED FREE
ALLOCATED FRAGS 34853280-34853284 MARKED FREE
ALLOCATED FRAGS 34853288-34853292 MARKED FREE
ALLOCATED FRAGS 34853304-34853308 MARKED FREE
ALLOCATED FRAGS 34853352-34853356 MARKED FREE
ALLOCATED FRAGS 34853365-34853366 MARKED FREE
ALLOCATED FRAGS 34853368-34853372 MARKED FREE
ALLOCATED FRAGS 34853400-34853404 MARKED FREE
ALLOCATED FRAGS 34853490-34853494 MARKED FREE
ALLOCATED FRAGS 34853496-34853500 MARKED FREE
ALLOCATED FRAGS 34853536-34853545 MARKED FREE
ALLOCATED FRAGS 34853568-34853572 MARKED FREE
ALLOCATED FRAGS 34853868-34853870 MARKED FREE
ALLOCATED FRAGS 34853949-34853951 MARKED FREE
ALLOCATED FRAGS 34854074-34854075 MARKED FREE
ALLOCATED FRAGS 34854934-34854935 MARKED FREE
ALLOCATED FRAGS 34855504-34855508 MARKED FREE
ALLOCATED FRAGS 34855776-34855777 MARKED FREE
ALLOCATED FRAGS 34855920-34855924 MARKED FREE
ALLOCATED FRAGS 34856856-34856857 MARKED FREE
ALLOCATED FRAGS 34857067-34857068 MARKED FREE
ALLOCATED FRAGS 34871843-34871847 MARKED FREE
ALLOCATED FRAGS 34879373-34879374 MARKED FREE
ALLOCATED FRAGS 37584536-37584551 MARKED FREE
ALLOCATED FRAGS 37601008-37601014 MARKED FREE
471717 files, 47373681 used, 38091807 free (33239 frags, 4757321 blocks, 0.0% fragmentation)
[root]@[beer]:/usr/local/etc/rc.d>


More information about the freebsd-hackers mailing list