8.1-RC2 - PCI fatal error or MCE triggered by USB/ehci on Sun X4100M2?

Markus Gebert markus.gebert at hostpoint.ch
Mon Jul 12 16:08:56 UTC 2010

On 12.07.2010, at 17:43, Adam McDougall wrote:

> I also get MCE on x4100m2 when causing significant disk activity in mpt
> while also downloading through em0 or em1.

Could you reproduce this on 6.x or 7.x? Whatever we have tried here, we haven't been able to so far. A short test with Ubuntu also didn't show any sign of problems.

>  I was not able to trigger it
> while using nfe, however nfe locked up on me during normal DNS server
> traffic so that was a wash.

We had issues with nfe before 8.x, which is why we have been using the em NICs, and those now seem to be part of the problem in 8.x.

>  What seemed to work for me was to add an
> Intel PCIE nic to the server and use it instead of the onboard NICS.

Thanks for the hint.

> For whatever reason I never experienced this problem until using ZFS.

We were able to reproduce it with UFS on 8.x with just one disk (no gmirror), but I guess it's easier to trigger with ZFS, especially in a mirror setup.

> I triggered it by downloading a 200m tgz file via http repeatedly
> over gigabit and it would reliably crash within a minute or two.

Our test case is basically:

1. fetch a large file using wget over em0 (a 100 Mbit link seems to be enough)
2. cp a large file locally to stress mpt
3. wait for MCE
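
For anyone who wants to script this, the steps above could be sketched roughly as follows. This is only an illustration, not the exact commands we ran: the URL, the file paths and the helper names (fetch_cmd, copy_cmd, loop, stress) are placeholders I made up, and it assumes wget and cp are in the PATH.

```python
import subprocess
import threading


def fetch_cmd(url):
    # Step 1: download over the NIC under test; output is discarded so the
    # download itself does not add disk load.
    return ["wget", "-q", "-O", "/dev/null", url]


def copy_cmd(src, dst):
    # Step 2: local copy of a large file to generate disk (mpt) activity.
    return ["cp", src, dst]


def loop(cmd, stop, run=subprocess.call):
    # Re-run cmd until stop is set; returns how many times it was run.
    count = 0
    while not stop.is_set():
        run(cmd)
        count += 1
    return count


def stress(url, src, dst, seconds):
    # Run the download and the copy concurrently for roughly `seconds`.
    # Step 3 (waiting for the MCE) is simply the machine crashing meanwhile.
    stop = threading.Event()
    cmds = (fetch_cmd(url), copy_cmd(src, dst))
    workers = [threading.Thread(target=loop, args=(c, stop)) for c in cmds]
    for w in workers:
        w.start()
    stop.wait(seconds)  # nothing sets the event, so this is a plain timeout
    stop.set()
    for w in workers:
        w.join()
```

Something like stress("http://example.invalid/big.bin", "/var/tmp/big.bin", "/var/tmp/big.copy", 600) would then keep both loads running for about ten minutes, with the network path going through whichever interface routes to the (hypothetical) download host.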

> I ordered a dozen nics for probably around $20 each and was satisfied
> with this workaround given the age of the servers.  I'm pinched on time
> for work so I often don't get around to reporting issues where I've
> found a workaround, I'm glad you can get that started. 

"Glad" we're not the only ones :-)


