ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599

Karl Denninger karl at denninger.net
Wed Aug 10 14:41:50 GMT 2005


On Wed, Aug 10, 2005 at 09:51:03AM -0400, Mike Tancsa wrote:
> At 09:31 AM 10/08/2005, Karl Denninger wrote:
> 
> >Also, I've yet to see a developer commit on the list that they WILL fix it
> >if such a controller board is forthcoming (and will return the board when
> >they're done) - I've got two of these cards here (choose between Adaptec and
> >Bustek) and would be happy to UPS one to someone IF I had a firm commitment
> >that 6.x would NOT go out without this being addressed and that the board
> >would be returned to me when work was complete.
> 
> You demand to see support for this chipset fixed, yet you can't pony up a
> measly hundred bucks to donate the card to the developer who is not being 
> paid to develop anything.
> 
>         ---Mike 

I have "demanded" nothing Mike.

I have stated VERY clearly, however, that I've been one of FreeBSD's loudest
evangelists, to the point that I won't support the code I both give away and
sell on ANYTHING OTHER THAN FreeBSD - at least so far.

That evangelism dates back to the mid 90s when I ran my own ISP, and refused
multiple "requests" (sweetened with various offers) to run my back office on
all sorts of things. One of those "requests" came from Microsoft (to run on 
NT, specifically for our back office functions.)  I assure you that their
offer, made in confidence in my conference room, was quite "sweet" - and 
was flatly turned down.

I do not believe that it is unreasonable in any way, shape or form to expect
that hardware that is in mass circulation (that is, NOT deprecated stuff
that nobody sells or makes anymore) will NOT be broken from one release to
the next when it was working just fine previously.

I <DO NOT> expect FreeBSD to magically support every piece of crap that
comes down the pipe, and am well-aware that there's a lot of that sort of
thing out there.

But this isn't the case in this instance.  This is hardware that worked 
perfectly well on 4.x (and in fact still does) but broke immediately and 
severely with the newer ATA code.  The FreeBSD team KNOWS THIS, in that I 
filed a PR on it in February, yet there is nothing in the Errata or hardware 
notes (as of AUGUST!) warning people that they risk severe data corruption 
if they attempt to use these controllers on releases 5.4 and beyond.

In addition, the "fix" propounded for this problem (buy a 3ware card),
which I finally did after watching my PR sit unattended for six months,
led to a SECOND surprise - that the much-touted reason for the ATA-ng code
in the first place, that is, a richer feature environment (specifically
hot plug support) IS NOT IN THE FIX PROPOUNDED!

That is, I get to go BACK to the 4.x feature set in pursuit of the fix!

Nonetheless I'm willing to SEND A BOARD to the developer IF I get in
return a commitment (in public, right here) from the development/release 
teams that 6.x WILL NOT GO OUT THE DOOR UNTIL THIS PROBLEM IS ADDRESSED.

I am requesting the board back after the fix is purportedly in the source
tree <SO I CAN VERIFY IT>, and then drop the warnings from my product
documentation.  UPSing the board back to me (or sending it first class US
mail) will cost all of a couple of bucks.

What I get in return for this offer (I own the board, I will pay the UPS
charge anywhere in the US to send it out) is a personal attack that I'm
too "miserly" to pony up MONEY - when I've already offered to send what
the money would BUY.

Is it REALLY true that the developers DO NOT HAVE a card that has this
problem?  

If so I have TWO which exhibit the problem (pick your brand, Adaptec or
Bustek) and have offered to send ONE of the two to a developer.

So why do they want MONEY, when I have and am offering better - a KNOWN 
TROUBLESOME BOARD?  What is the money going to BUY Mike?  A board - or 
beer?  If a board, I have one.  If beer, be honest enough to say so.

I ALSO (about two weeks ago) offered to give any developer who wanted to
work on this a login on my Sandbox machine, configured with the bad controller
and a disk exhibiting the problem in it, plus a boot disk on a "SAFE"
controller in the same box.  Gmirror appears to ensure (from my testing)
that if/when the disk disconnects and blows chunks, operating system
integrity is not compromised.

Therefore, said developer(s) could work SAFELY on this problem (without risk
to their development machines - it'd be at MY risk to MY environment) until 
they are satisfied that it's fixed.  At that point I will once again run my
validation tests on the box and see if it remains stable, as a further 
verification that it truly is fixed.  If it passes THAT test, then of 
course the final verification would come from the community at large.

That machine has 6.0-BETA on it and they are free to have at it at will 
via SSH.  It also has a full CVS set on it so the FULL development
environment is available to whoever wishes to work on it and other than
sharing a network connection with production machines is entirely isolated -
so there is no risk to my production machines as a consequence of what
someone does to this box.  (That is, I won't be pissed off if you blow 
it up and I have to reload it from scratch - that's its PURPOSE.)  If it
gets wedged to the point that it can't be rebooted remotely a quick email,
phone call or SMS message to my cell phone will elicit a push on the big 
red switch for them.  Said box is a Dell PowerEdge 400SC - plenty of beef
to get the job done (2.4GHz HT P4).

THAT offer drew no response - not a "no thanks", not a "sure, contact me to
set up a login and password" - nothing.

That says to me that there is no intent - or desire - to actually fix it.

Again, if the goal is to FIX IT, I've now offered TWO paths for the developers
to do so, at my expense - one a couple of weeks ago, and a second now (that 
I've gone out and bought a 3ware board, and thus have TWO of the troublesome 
cards around.)

What I get back is a request for MONEY instead of a known environment that
exhibits the problem?!  Uh, why do I think that's a request to throw money 
down a black hole instead of actually solving the problem?

I understand if the developers simply do not want to support the SiI chipset
AT ALL, and have declared it "broken".  That's cool, even though I believe
that such a step will lead to mass defections in the desktop and
small-office server marketplace for FreeBSD.

BUT IF THIS IS THE APPROACH BEING TAKEN THE DEVELOPMENT AND RELEASE TEAMS 
HAVE AN OBLIGATION TO PROPERLY AND HONESTLY REPRESENT THIS IN THE RELEASE 
NOTES SO THAT I, AND OTHERS, DO NOT GET A "SURPRISE" AND CHASE OUR TAILS 
ASSUMING WE HAVE DEFECTIVE DISK DRIVES WHEN THE REAL PROBLEM IS UNSUPPORTED 
HARDWARE!

-- 
Karl Denninger (karl at denninger.net) Internet Consultant & Kids Rights Activist
http://www.denninger.net	My home on the net - links to everything I do!
http://scubaforum.org		Your UNCENSORED place to talk about DIVING!
http://genesis3.blogspot.com	Musings Of A Sentient Mind




More information about the freebsd-stable mailing list