mfi timeouts

Jeremy Chadwick freebsd at jdc.parodius.com
Thu Oct 27 23:04:55 UTC 2011


On Thu, Oct 27, 2011 at 11:52:51PM +0100, Vincent Hoffman wrote:
>     I've recently installed a new NAS at work which uses a rebranded LSI
> MegaRAID SAS controller:
> [root@banshee ~]# mfiutil show adapter
> mfi0 Adapter:
>     Product Name: Supermicro SMC2108
>    Serial Number:
>         Firmware: 12.12.0-0047
>      RAID Levels: JBOD, RAID0, RAID1, RAID5, RAID6, RAID10, RAID50
>   Battery Backup: present
>            NVRAM: 32K
>   Onboard Memory: 512M
>   Minimum Stripe: 8k
>   Maximum Stripe: 1M
> 
> I'm running 8-STABLE as of 2011-10-23 (for ZFS v28, as it's got 26 3TB
> drives).
> 
> I'm seeing a lot of messages like:
> mfi0: COMMAND 0xffffff8000b216c8 TIMEOUT AFTER 60 SECONDS
> mfi0: COMMAND 0xffffff8000b216c8 TIMEOUT AFTER 90 SECONDS
> mfi0: COMMAND 0xffffff8000b216c8 TIMEOUT AFTER 120 SECONDS
> mfi0: COMMAND 0xffffff8000b216c8 TIMEOUT AFTER 150 SECONDS
> mfi0: COMMAND 0xffffff8000b216c8 TIMEOUT AFTER 180 SECONDS
> mfi0: COMMAND 0xffffff8000b216c8 TIMEOUT AFTER 210 SECONDS
> mfi0: COMMAND 0xffffff8000b216c8 TIMEOUT AFTER 240 SECONDS
> mfi0: COMMAND 0xffffff8000b216c8 TIMEOUT AFTER 271 SECONDS
> mfi0: COMMAND 0xffffff8000b216c8 TIMEOUT AFTER 301 SECONDS
> mfi0: COMMAND 0xffffff8000b216c8 TIMEOUT AFTER 331 SECONDS
> mfi0: COMMAND 0xffffff8000b216c8 TIMEOUT AFTER 361 SECONDS
> mfi0: COMMAND 0xffffff8000b216c8 TIMEOUT AFTER 391 SECONDS
> mfi0: COMMAND 0xffffff8000b21b08 TIMEOUT AFTER 55 SECONDS
> mfi0: COMMAND 0xffffff8000b21b08 TIMEOUT AFTER 85 SECONDS
> 
> When this happens, I/O stalls on the array connected to the mfi
> adapter. This can continue for 20 minutes or so before resuming,
> seemingly at random (a little more on this later on).
> 
> From pciconf -lv:
> mfi0@pci0:5:0:0:        class=0x010400 card=0x070015d9 chip=0x00791000 rev=0x04 hdr=0x00
>     vendor     = 'LSI Logic (Was: Symbios Logic, NCR)'
>     class      = mass storage
>     subclass   = RAID
> 
> From dmesg:
> mfi0: <LSI MegaSAS Gen2> port 0xe000-0xe0ff mem 0xfbd9c000-0xfbd9ffff,0xfbdc0000-0xfbdfffff irq 32 at device 0.0 on pci5
> mfi0: Megaraid SAS driver Ver 3.00
> mfi0: 12330 (372962922s/0x0020/info) - Shutdown command received from host
> mfi0: 12331 (boot + 4s/0x0020/info) - Firmware initialization started (PCI ID 0079/1000/0700/15d9)
> mfi0: 12332 (boot + 4s/0x0020/info) - Firmware version 2.120.53-1235
> mfi0: 12333 (boot + 7s/0x0008/info) - Battery Present
> mfi0: 12334 (boot + 7s/0x0020/info) - Package version 12.12.0-0047
> mfi0: 12335 (boot + 7s/0x0020/info) - Board Revision
> 
> I found this thread after a bit of googling, but it doesn't end too well:
> http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html
> Was this ever taken further?
> 
> One thing I have noticed is that the stall (and the timeout messages)
> seems to go away if I query the card using mfiutil. I currently have a
> cron job doing this every 2 minutes to see whether this is a
> coincidence or not.
> 
> 
> Any suggestions are welcome, and I'm happy to provide more info if I
> can. I don't have a duplicate machine to do much debugging on, but I'm
> happy to try patches.
> 
> Is this worth filing a PR?

Can you please provide uname -a output?  The version of FreeBSD you're
using matters greatly here.
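
For anyone following along, here is a minimal sketch of gathering that
information in one go. The function name collect_mfi_info and the output
layout are assumptions made for illustration and are not part of any
FreeBSD tool; only uname -a and "mfiutil show adapter" come from this
thread.

```shell
#!/bin/sh
# Hypothetical helper: print the details requested in this thread.
# The function name and section headers are illustrative assumptions.
collect_mfi_info() {
    echo "== uname -a =="
    uname -a

    # mfiutil is only present on FreeBSD systems with the mfi(4) tools,
    # so skip the controller query elsewhere instead of failing.
    if command -v mfiutil >/dev/null 2>&1; then
        echo "== mfiutil show adapter =="
        mfiutil show adapter || echo "mfiutil failed (are you root?)"
    else
        echo "mfiutil not found; skipping controller query"
    fi
}

collect_mfi_info
```

Run as root on the affected machine and paste the output into a reply.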

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |


