Tyan k8sr lockups

Sven Willenberger sven at dmv.com
Wed Mar 30 21:32:30 PST 2005



Vivek Khera presumably uttered the following on 03/30/05 20:22:
> 
> On Mar 29, 2005, at 7:39 PM, Sten Spans wrote:
> 
>> There was an amr panic related to the management ioctls
>> which was fixed and backported to RELENG_5. You should have
>> this fix. amr controllers are supported quite well on FreeBSD
>> thanks to Scott's great work.
>>
> 
> Yes, that was part of the reason I cvsup'd again last week... to ensure 
> I had the latest fixes to the amr driver.
> 
>> The only way to get closer to solving these problems
>> is to dig and try to narrow it down:
>>
>> - Have you tried running with debugging ?
> 
> 
> Any more than having the kernel debugger installed?  When the box locked 
> up I couldn't even drop into the kernel debugger from the serial 
> console. Neither the BREAK signal nor the alt key sequence invoked it.
> 
>>
>> - Have you tried using other network cards ?
>>   ( yeah that sucks I know )
> 
> 
> Nope.  Machine is brand spanking new.  Was in service a whole 5 days 
> before it locked up.  The twin of this machine also has issues with the 
> BIOS reporting "memory size changed" while the machine is running... so 
> I'm a bit concerned that there is some generic problem with the K8SR and 
> a megaraid controller.  But that one never had any complaints about the 
> ethernet, and the memory size error persisted across two motherboards.
> 
> I have yet to try the other ethernet port on this box as well.
> 
>>
>> - Are you absolutely sure that all the disks are working ?
>>   ( there have been reports of amr cards acting strange with
>>     silently failing disks )
> 
> 
> The megaraid bios showed all disks as active.  How would one tell if you 
> had a silently failing disk? :-(
> 
>> - Have you got the ufs fixes recently backported to releng_5 ?
> 
> 
> If it was prior to March 22, then yes I have them.  Where in cvsweb 
> might I look to test?
> 
>> These are the first I can think of. RELENG_5 seems to be
>> a bit of a moving target with some quite critical fixes
>> going in ( which is good of course :).
> 
> 
> Yes, it is good.... until you can't figure out if it is your hardware or 
> software flaking out on ya...
> 
> Thanks so much for responding.
> 
> For price-no-object, which vendor would you choose for an AMD system 
> today?  Same question if price is somewhat of a concern.  Thanks.
> 

As a piece of anecdotal evidence: we are running a dual Opteron K8S Pro 
with 8GB of RAM and the 320-2x MegaRAID controller (which controls all 
the hard drives). Except for a problem with fxp (which I haven't gone 
back to reinvestigate; I just used the Broadcom gigE instead), the 
system has so far been fairly stable (running Postgres under a medium 
load at the moment). We had one "spontaneous" reboot that left no 
indication of what happened in messages; this box is still in our 
testing area, so it is possible that it was the result of a 
power[cord] issue. This is running 5.4-PRERELEASE from 17 March. Also, 
we disabled the onboard Adaptec controllers in the BIOS (as we don't 
use them).

Sven


More information about the freebsd-amd64 mailing list