ATA tag queuing broken...

Sean Chittenden sean at chittenden.org
Fri Apr 25 15:59:10 PDT 2003


> > Buildworld + mild disk load as an NFS server and it does okay.
> > CVSup with mild disk load is also okay.  But if you toss the three
> > together, you're sure to get the box to panic.
> 
> Numbers would be better :) systat -vm is probably good enough.
> 
> If the box is really that flakey you should be able to panic it with
> some other dummy load. I'd bet real money (as high as $5) that NFS
> panics the box when the underlying disk *goes away* so I'd thrash
> the living daylights out of the disks. I'd also bet that you knew
> this.
> 
> Anyway I'm rather partial to
> #ls -R / > /dev/null
> 
> and different device combinations of
> 
> #dd of=/dev/null bs='63*512' if=/dev/ad0
> 
> Anyway I have KDE running, typing this email. The ls in one xterm,
> dd'ing one of the raw disks in another xterm, and systat -vm in a
> third xterm and it's solid as a rock.
> 
> FYI, with just one dd I get 1500tps 45MB/s 97% usage.

I bet you're onto something here, actually...  Local buildworlds zing
along when NFS is more idle than normal.  When NFS is busy and I'm
doing something locally (most commonly a nightly portsdb -Uu), the
machine typically reboots itself.  :-/ Anyway, the point being, I
wonder if it's an NFS + ATA tags issue.
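
For anyone who wants to reproduce the combined local load suggested in
the quoted message (a recursive ls plus a raw dd read running at the
same time), here is a minimal sketch.  The function name run_load and
its two arguments are hypothetical, not something from the thread; the
block size 32256 is just 63*512 written out.  It only generates local
disk load, so the NFS traffic would still have to come from a client.

```shell
#!/bin/sh
# Hypothetical helper: run a recursive listing and a sequential read
# in parallel, then report whether both finished cleanly.
#   $1 = directory to walk (the thread uses /)
#   $2 = file or device to read (the thread uses /dev/ad0)
run_load() {
    dir=$1
    src=$2

    # Walk the tree in the background; the output is discarded.
    ls -R "$dir" > /dev/null 2>&1 &
    ls_pid=$!

    # Read the source sequentially in 63*512 = 32256 byte blocks.
    dd if="$src" of=/dev/null bs=32256 2> /dev/null &
    dd_pid=$!

    # Wait for both jobs; return non-zero if either one failed.
    status=0
    wait "$ls_pid" || status=1
    wait "$dd_pid" || status=1
    return "$status"
}
```

On the box in question this would presumably be run as root as
"run_load / /dev/ad0", with systat -vm open in another terminal to
watch the numbers.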

> > Of all of those, it could be either a BIOS setting or RAM, if
> > hardware is the problem at all.  The machine has been running for
> > a year and a half and the panics have only started recently (the
> > last 6-9 months or so).
> 
> Well that's not an exhaustive list, you'd want to add
> 
> o	The disks
> o	The cables (esp length)
> o	The controller
> o	The PSU
> o	everything else
> 
> The point I was trying to make was that I didn't immediately post to
> the list saying ATA tags are broken (or vinum for that matter). I
> devoted time, energy and cash to narrowing down the problem,
> eliminating alternate causes etc.
> 
> From what you've posted the evidence is "anecdotal" and no-one else
> has come forward to support it. IMHO that doesn't justify labeling
> ATA tags as broken.  My copy of man 7 tuning says that this is "new
> experimental". Isn't that enough?

Well, I'm largely going off of a conversation on IRC regarding ATA
tags, in which des flat out said they were broken and not to use tag
queuing.  I wasn't able to get much more context than that about in
what way they're broken, but suffice it to say, I must concur with his
conclusion.  :)

> > > Find a display card and run memtest86 for an hour or so. Take a
> > > note of the memory throughput (for BIOS tuning).
> >
> > Eh, not so wild about the prospect of the machine dumping given that
> > I'm 700+mi away from this particular box, but next time I'm in the
> > data center I will... but like I said, I don't think it's hardware.
> 
> Not sure what you mean here, re dumping.
> 
> So you'll have a serial console setup then? I've not tried it but
> memtest has serial console support. Is there a floppy disk and
> someone on-site with two brain cells?

Unfortunately no and no, which is the problem.  :-/ Next time I'm down
in the bay, however, I'll run memtest86.  -sc

-- 
Sean Chittenden


More information about the freebsd-stable mailing list