about thumper aka sun fire x4500

Thu Jan 19 00:23:25 UTC 2012

Hi Peter

On 18.01.2012, at 20:25, peter h wrote:

> On Wednesday 18 January 2012 18.15, Adam McDougall wrote:
>> On 01/17/12 17:09, Jeremy Chadwick wrote:
>>> On Tue, Jan 17, 2012 at 06:59:08PM +0100, peter h wrote:
>>>> I have been beating on one of these for a few days; I have used FreeBSD 9.0 and 8.2.
>>>> Both fail when I engage > 10 disks: the system crashes and the message
>>>> "Hyper transport sync flood" gets into the BIOS error log (but nothing
>>>> reaches syslog since the reboot is immediate).
>>>> 
>>>> Using a ZFS raidz of 25 disks and typing "zpool scrub" will bring the system down in seconds.
>>>> 
>>>> Is anyone using an x4500 who can confirm that it works? Or is this box broken?
>>> 
>> 
>> I've seen what is probably the same base issue but on multiple x4100m2 
>> systems running FreeBSD 7 or 8 a few years ago.  For me the instant 
>> reboot and HT sync flood error happened when I fetched a ~200mb file via 
>> HTTP using an onboard intel nic and wrote it out to a simple zfs mirror 
>> on 2 disks.  I may have tried the nvidia ethernet ports as an 
>> alternative but that driver had its own issues at the time.  This was 
>> never a problem with FFS instead of ZFS.  I could repeat it fairly 
>> easily by running fetch in a loop (can't remember if writing the output 
>> to disk was necessary to trigger it).  The workaround I found that 
>> worked for me was to buy a cheap intel PCIE nic and use that instead of 
>> the onboard ports.  If a zpool scrub triggers it for you, I doubt my 
>> workaround will help but I wanted to relate my experience.
> 
> The problem I had was most likely the disk I/O itself. It was always there 
> whenever a larger number of disks was in motion. It was never there with 
> violent networking (I even used myri2000 to increase traffic, never a problem).
> 
> A scrub on the 20-or-so-disk zpool was all that was needed, and when rebooting 
> the scrub continued and whoops - a new reboot.
> 
> Sometimes the BIOS reported not even 16G of memory but 10.5 (which FreeBSD also noticed).
> 
> Right now I am torturing the box with the same load (minus myri2000) and sunk-os;
> I'll report if it shows similar problems.
> 
> 
>> 
>>> Given this above diagram, I'm sure you can figure out how "flooding"
>>> might occur.  :-)  I'm not sure what "sync flood" means (vs. I/O
>>> flooding).
>> 
>> As I understand it, a sync flood is a purposeful reaction to an error 
>> condition as somewhat of a last ditch effort to regain control over the 
>> system (which ends up rebooting).  I'm pulling this out of my memory 
>> from a few years ago.

As Adam has pointed out, a sync flood is a way to signal an error condition on the HyperTransport bus. As I understand it, it's used as a last resort when less fatal means of error reporting are no longer possible because of a problem on the transport or on a device attached to it. The transport will not recover from this state until it's reset. On Sun AMD systems a reboot is triggered immediately when a sync flood is detected. The fact that it happened is mentioned during POST, but it should also appear in the machine's error logs (IPMI/ILOM), so if you haven't checked them already, it might be worth doing so. Maybe you'll find additional information there.

You should be able to disable the automatic reset on sync flood in your BIOS settings. We did this on our Sun X4200M2 machines when we experienced sync flood errors. It allowed the kernel to catch an MCE (machine check exception), panic and print out information about the MCE. This might help you get more information about the cause.

Our problems with the X4200M2 have some similarities with your case, though in our case high I/O alone (i.e. zpool scrub) did not reliably (read: within minutes or hours) trigger the MCE/sync flood. If we put load on the zpool _and_ the network (em), we could trigger it easily. Another similarity: another OS (in our case Linux) did not show the symptoms. Even other FreeBSD branches did not trigger the sync flood. You'll find the thread here:

http://lists.freebsd.org/pipermail/freebsd-stable/2010-July/057670.html

It's a rather long thread. Short version: if the RAID controller (mpt) interrupts were routed to the first CPU (cpu0), everything worked; if not, a sync flood (or MCE) happened under heavy I/O. It so happens that Linux, and even older and newer FreeBSD versions (7.x, 9.x), assigned different interrupt routes for mpt0 than the FreeBSD 8.1 we were testing on. So what looked like a bug in a specific FreeBSD version, because it didn't happen with other FreeBSD versions or Linux, turned out to be a hardware problem after all. IIRC a change in some hardware clock code caused an additional IRQ to be registered on boot (or one less), which reshuffled interrupt assignments compared to the older FreeBSD versions we had used successfully on those machines. So we fixed it by setting a tunable which restored the old clock behavior and thus the old interrupt assignments.

It's impossible to tell whether you have the same problem. But if you don't see any problems with other operating systems, it may be worth playing around with interrupt assignments. Luckily, the routing can be changed at runtime with cpuset(1). For example:

# cpuset -c -l 0 -x 58

IRQ58 was used by mpt0. Rerouting it to cpu0 made all problems go away. Hope this helps you in some way.
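
If you first need to find out which IRQ a device is using, something along these lines might do. It is only a rough sketch: the helper names irq_for_device and pin_device_irq are made up for illustration, and it assumes that "vmstat -i" prints lines of the form "irq58: mpt0  <total>  <rate>". Check the output on your box before trusting it.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of FreeBSD.

# Read `vmstat -i` style lines on stdin and print the IRQ number of the
# line that lists the given device name (e.g. "irq58: mpt0  12345  67").
irq_for_device() {
    awk -v dev="$1" '
        $1 ~ /^irq[0-9]+:$/ {
            # Device names sit between the "irqN:" label and the two
            # trailing counter columns (total, rate).
            for (i = 2; i <= NF - 2; i++) {
                name = $i
                sub(/\+$/, "", name)   # tolerate a trailing "+" if present
                if (name == dev) {
                    irq = $1
                    gsub(/[^0-9]/, "", irq)
                    print irq
                    exit
                }
            }
        }'
}

# Look up the IRQ of a device and bind it to a single CPU (needs root).
pin_device_irq() {
    dev=$1 cpu=$2
    irq=$(vmstat -i | irq_for_device "$dev")
    [ -n "$irq" ] || { echo "no IRQ found for $dev" >&2; return 1; }
    cpuset -g -x "$irq"           # show the current CPU mask of the IRQ
    cpuset -l "$cpu" -x "$irq"    # restrict the IRQ to the chosen CPU
}
```

Running "pin_device_irq mpt0 0" as root would then do the lookup and the rebinding in one go. The binding is not persistent, so it has to be redone after every boot.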

Good luck,

-- 
Markus