about thumper aka sun fire x4500

peter h peter at hk.ipsec.se
Wed Jan 18 19:26:01 UTC 2012


On Wednesday 18 January 2012 18.15, Adam McDougall wrote:
> On 01/17/12 17:09, Jeremy Chadwick wrote:
> > On Tue, Jan 17, 2012 at 06:59:08PM +0100, peter h wrote:
> >> I have been beating on one of these for a few days; I have used FreeBSD 9.0 and 8.2.
> >> Both fail when I engage more than 10 disks: the system crashes and the message
> >> "Hyper transport sync flood" gets into the BIOS error log (but nothing
> >> reaches syslog since the reboot is immediate).
> >>
> >> Using a ZFS raidz of 25 disks and typing "zpool scrub" will bring the system down in seconds.
> >>
> >> Is anyone using an x4500 who can confirm that it works? Or is this box broken?
> >
> 
> I've seen what is probably the same base issue, but on multiple x4100m2
> systems running FreeBSD 7 or 8 a few years ago.  For me the instant
> reboot and HT sync flood error happened when I fetched a ~200 MB file via
> HTTP using an onboard Intel NIC and wrote it out to a simple ZFS mirror
> on 2 disks.  I may have tried the NVIDIA Ethernet ports as an
> alternative, but that driver had its own issues at the time.  This was
> never a problem with FFS instead of ZFS.  I could repeat it fairly
> easily by running fetch in a loop (can't remember if writing the output
> to disk was necessary to trigger it).  The workaround I found that
> worked for me was to buy a cheap Intel PCIe NIC and use that instead of
> the onboard ports.  If a zpool scrub triggers it for you, I doubt my
> workaround will help, but I wanted to relate my experience.

The problem I had was most likely the disk I/O itself. It was always there
whenever a larger number of disks was in motion. It was never triggered by
heavy networking (I even used myri2000 to increase traffic, never a problem).

A scrub on the 20-or-so-disk zpool was all that was needed, and after rebooting
the scrub continued and whoops - a new reboot.

Sometimes the BIOS did not even report 16G of memory but 10.5 (which FreeBSD also noticed).

Right now I am torturing the box with the same load (minus myri2000) under sunk-os;
I'll report if it shows similar problems.


> 
> > Given this above diagram, I'm sure you can figure out how "flooding"
> > might occur.  :-)  I'm not sure what "sync flood" means (vs. I/O
> > flooding).
> 
> As I understand it, a sync flood is a purposeful reaction to an error 
> condition as somewhat of a last ditch effort to regain control over the 
> system (which ends up rebooting).  I'm pulling this out of my memory 
> from a few years ago.



-- 
        Peter Håkanson   

        There's never money to do it right, but always money to do it
        again ... and again ... and again ... and again.
        ( It is cheaper to do it right. It is expensive to fix mistakes. )

