mptutil(8) segfault on IBM xSeries 3550

John Baldwin jhb at freebsd.org
Wed Mar 17 15:27:54 UTC 2010


On Monday 15 March 2010 4:03:23 pm Charles Owens wrote:
> John Baldwin wrote:
> > On Friday 19 February 2010 1:01:38 pm Charles Owens wrote:
> >   
> >> John Baldwin wrote:
> >>     
> >>> On Monday 15 February 2010 5:25:15 pm Charles Owens wrote:
> >>>       
> >>>> Charles Owens wrote:
> >>>>         
> >>>>> Howdy,
> >>>>>
> >>>>> We're working with IBM hardware (xSeries 3550) that has an
> >>>>> mpt-based RAID controller... after initial success with testing the
> >>>>> mptutil utility, now operations other than "show adapter" and "show
> >>>>> volume" are resulting in segfaults.
> >>>>>
> >>>>> While it was working properly we created and removed volumes several
> >>>>> times, force-failed drives, and just generally put it through its
> >>>>> paces... and all seemed fine.  Then, after a reboot, it suddenly
> >>>>> started failing with segfault as described, and nothing we do has
> >>>>> helped to get it out of this state (including trying to use the LSI
> >>>>> in-BIOS manager to create/delete volumes -- which in and of itself
> >>>>> works fine).
> >>>>>
> >>>>> We found a recent thread,
> >>>>> http://docs.freebsd.org/cgi/mid.cgi?4B56CD4C.80503, and hoped that it
> >>>>> might somehow relate... and even tried the patch that John Baldwin
> >>>>> posted, but to no avail.
> >>>>>
> >>>>> Has anyone seen this behavior and/or have a suggested fix or
> >>>>> workaround?
> >>>>>
> >>>>>
> >>>>> Here's the output of "mptutil show adapter":
> >>>>>
> >>>>> mpt0 Adapter:
> >>>>>        Board Name: SR-BR10i
> >>>>>    Board Assembly: L3-25116-01H
> >>>>>         Chip Name: C1068E
> >>>>>     Chip Revision: UNUSED
> >>>>>       RAID Levels: RAID0, RAID1, RAID1E
> >>>>>     RAID0 Stripes: 64K
> >>>>>    RAID1E Stripes: 64K
> >>>>>  RAID0 Drives/Vol: 1-10
> >>>>>  RAID1 Drives/Vol: 2
> >>>>> RAID1E Drives/Vol: 3-10
> >>>>>
> >>>>>
> >>>>> This work is being done using FreeBSD 8.0-RELEASE-p2 + PAE.
> >>>>>   
> >>>>>           
> >>>> I should add that the RAID controller in question is the IBM
> >>>> ServeRAID-BR10i SAS/SATA Controller which is based on the LSI 1068E
> >>>> processor, as described here:
> >>>> http://www-01.ibm.com/common/ssi/rep_ca/4/872/ENUSAG09-0104/index.html
> >>>>         
> >>> Try this updated patch.  It should fix the problems with 'mptutil show
> >>> drives' displaying all daX devices in the system rather than just the
> >>> ones for the mptX bus.  I had incorrectly interpreted the XPT matches as
> >>> being an AND rather than an OR.  This changes the code to first do a
> >>> lookup for the logical "path" (SCSI bus) for mptX devices and then do a
> >>> second lookup to fetch any daX devices on that path.  I tested it on a
> >>> machine with an mpt controller and a USB disk.  Unfortunately I wasn't
> >>> able to test any of the RAID stuff, just 'show drives'.  This mpt(4)
> >>> controller doesn't support RAID either, so I was also able to verify the
> >>> fix you had already tested for cleaning up 'show adapter' output in that
> >>> case.
> >>>
> >>> [patch omitted]
> >>>       
> >> John,
> >>
> >> The patch appears to have resolved the problem.   We're still banging on
> >> it, but so far it looks very good!
> >>
> >> Thanks very much!
> >>     
> >
> > Excellent, thanks!  I've committed it to HEAD and will MFC it in a week or
> > so.  It is probably too late to make 7.3 however.
> >   
> 
> Again, thanks for the patch... overall it is working well... we're now
> able to successfully do what we need to do with the RAID system.  We are,
> though, seeing some sort of error messages:
> 
> # mptutil show volumes
> mpt0 Volumes:
>   Id     Size    Level   Stripe  State  Write-Cache  Name
> mptutil: mpt_query_disk got 4 matches, expected 2
>      0 (  279G) RAID-1          OPTIMAL   Disabled  
> 
> # mptutil show config 
> mpt0 Configuration: 1 volumes, 2 drives
> mptutil: mpt_query_disk got 4 matches, expected 2
>     volume 0 (279G) RAID-1 OPTIMAL spans:
>         drive 1 (279G) ONLINE <WD3000BLFS-23YBU 4V04> SATA
>         drive 0 (279G) ONLINE <WD3000BLFS-23YBU 4V04> SATA
>         spare pools: 0

Are you sure this is a fixed binary?  The new binary doesn't print out that
message anymore; it only says 'got %d matches, expected 1'.  Also, the 4
instead of 2 is consistent with the old bug, in that the two Linux virtual
floppies (da1 and da2) would be reported as extra for 'mptutil show drives' in
this case, I think.

> We can certainly live with this, but I wanted to let you know in case
> you thought it was worth digging into.  Let me know if you need any
> additional debug info beyond this:
> 
> # camcontrol devlist
> <LSILOGIC Logical Volume 3000>     at scbus0 target 0 lun 0 (pass0,da0)
> <ATA WD3000BLFS-23YBU 4V04>        at scbus1 target 1 lun 0 (pass1)
> <Linux Virtual CD/DVD 0316>        at scbus2 target 0 lun 0 (pass2,cd0)
> <Linux Virtual Floppy 0316>        at scbus3 target 0 lun 0 (da1,pass3)
> <Linux Virtual Floppy 0316>        at scbus3 target 0 lun 1 (da2,pass4)
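
To illustrate the AND-versus-OR point against the devlist above, here is a
rough toy sketch in Python.  It is not the mptutil source and does not use
the real CAM/XPT API; the helper names (match_any, on_path, named,
one_pass_lookup, two_stage_lookup) are made up for illustration, and the
mapping of mpt0 to scbus0 is an assumption, since the devlist does not say
which bus belongs to which controller.  The pass devices are left out to
keep it short.

```python
# Toy model of CAM pattern matching. This is NOT the real CAM/XPT API or
# the mptutil source; it only mimics the AND-versus-OR behaviour.

# Peripherals taken from the camcontrol devlist (pass devices omitted).
PERIPHS = [
    {"name": "da", "unit": 0, "path": 0},  # LSILOGIC Logical Volume
    {"name": "cd", "unit": 0, "path": 2},  # Linux Virtual CD/DVD
    {"name": "da", "unit": 1, "path": 3},  # Linux Virtual Floppy
    {"name": "da", "unit": 2, "path": 3},  # Linux Virtual Floppy
]

# ASSUMPTION: the RAID volume bus of mpt0 is scbus0. The devlist does not
# state which scbus belongs to which controller.
CONTROLLER_PATHS = {"mpt0": 0}


def match_any(periphs, patterns):
    """OR semantics: keep an entry if at least one pattern accepts it."""
    return [p for p in periphs if any(pat(p) for pat in patterns)]


def on_path(path):
    """Hypothetical pattern: entry sits on the given path (SCSI bus)."""
    return lambda p: p["path"] == path


def named(name):
    """Hypothetical pattern: entry has the given peripheral name."""
    return lambda p: p["name"] == name


def label(p):
    return f"{p['name']}{p['unit']}"


def one_pass_lookup(controller):
    """Old idea: hand over both patterns at once, hoping they are ANDed."""
    path = CONTROLLER_PATHS[controller]
    return [label(p) for p in match_any(PERIPHS, [on_path(path), named("da")])]


def two_stage_lookup(controller):
    """New idea: resolve the path first, then keep daX devices on it."""
    path = CONTROLLER_PATHS[controller]            # stage 1: find the bus
    on_bus = match_any(PERIPHS, [on_path(path)])   # stage 2: that bus only
    return [label(p) for p in on_bus if p["name"] == "da"]


if __name__ == "__main__":
    print("one pass :", one_pass_lookup("mpt0"))
    print("two stage:", two_stage_lookup("mpt0"))
```

In this toy model the one-pass lookup also returns da1 and da2, because the
"da" pattern alone is enough to match them, while the two-stage lookup
returns only da0.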

-- 
John Baldwin

