Dell PowerEdge 1750 and mpt

Thu Oct 16 20:19:26 PDT 2003

On Thu, Oct 16, 2003 at 09:20:33 -0400, David Sze wrote:
> At 11:31 PM 15/10/2003 -0600, Kenneth D. Merry wrote this to All:
> >On Wed, Oct 15, 2003 at 21:41:30 -0400, David Sze wrote:
> >> Notice how this snippet of code never directly sends XPT_GET_TRAN_SETTINGS,
> >> so the source of the junk pointer/CCB cannot be me.
> >
> >libcam sends the XPT_GET_TRAN_SETTINGS CCB, to fill in sync rate/bus width
> >fields in the cam_device structure.
> 
> Right, I see where that is in libcam.  So if I just want the serial number,
> I should just open the appropriate /dev/passX and send an XPT_GDEV_TYPE CCB,
> and that shouldn't tickle the panic with mpt(4).

Yes, that would do it.

> >> Removing all traces of this serial # gathering code from our application
> >> has gotten rid of the panics.
> >
> >Since it works on other drivers and fails with the mpt(4) driver, it may
> >be a problem with the mpt(4) driver.
> 
> Or possibly with the hardware/firmware revision of the 53c1030 in the Dell 
> 1750.  I have three IBM eServer 345 boxes that also use mpt(4), and so far 
> they aren't showing the panic problem when running the same code.
> 
> 
> >> int main() {
> >>     struct cam_device   device;
> >>     char                kpcSerials[sizeof(device.serial_num)*DEVICE_MAX+1];
> >>     unsigned int        unLen = 0;
> >>
> >>     for (int n = 0; n < DEVICE_MAX; ++n) {
> >>         if (NULL == ::cam_open_spec_device("pass", n, O_RDWR, &device))
> >>             break;
> >
> >You'd probably be better off going back to your original code that uses an
> >XPT_DEV_MATCH CCB.
> >
> >With the above code, you'll run into problems if you've got sparse unit
> >numbers, e.g. if you've got a device hardwired, or if you rescan a bus or
> >device and it goes away (say you've got pass0, pass1, pass2, and pass4).
> >The loop breaks at the first unit that fails to open, so it would never
> >reach pass4.
> 
> Not to mention that a ::cam_open_spec_device() followed by a 
> ::cam_close_spec_device() seems to result in a descriptor leak.  I wrapped 
> the previous code in a while(1){}, and /dev/xpt0 wasn't being closed.

I don't see anywhere in the normal code path that would result in xpt0 not
being closed, although it looks like there is one error path where it won't
get closed.  I've attached a patch for -stable to fix that.

The loop above only stops when cam_open_spec_device() fails, so you will
probably hit that error case each time it runs, which would result in the
fd leak.
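
As a rough illustration of that kind of error path (a sketch only, not the
libcam source): lookup_unit(), fake_lookup(), and count_open_fds() are
made-up names, /dev/null stands in for /dev/xpt0, and the "lookup" is
simulated so that only units 0 through 2 exist. The point is just that the
failure branch has to close the descriptor before returning.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for the lookup ioctl: only units 0..2 "exist". */
static int fake_lookup(int fd, int unit)
{
    (void)fd;
    return (unit >= 0 && unit < 3) ? 0 : -1;
}

/*
 * Open a control device, look up a unit, and close the device again.
 * Returns 0 if the unit was found, -1 otherwise.
 */
int lookup_unit(int unit)
{
    int fd = open("/dev/null", O_RDWR);   /* stand-in for /dev/xpt0 */

    if (fd < 0)
        return -1;

    if (fake_lookup(fd, unit) == -1) {
        /* Without this close(), every failed lookup leaks one descriptor. */
        close(fd);
        return -1;
    }

    close(fd);
    return 0;
}

/* Count descriptors currently open by probing each one with F_GETFD. */
int count_open_fds(void)
{
    long max = sysconf(_SC_OPEN_MAX);
    int n = 0;

    if (max < 0 || max > 1024)
        max = 1024;                        /* keep the probe cheap */
    for (int fd = 0; fd < max; fd++)
        if (fcntl(fd, F_GETFD) != -1)
            n++;
    return n;
}
```

Calling count_open_fds() before and after a loop that runs lookup_unit()
until it fails is one way to check for this sort of leak from userland.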

Anyway, try this patch for camlib (if you want) and see if it fixes that
problem.

Ken
-- 
Kenneth Merry
ken at kdm.org
-------------- next part --------------
==== //depot/FreeBSD-ken-RELENG_4/src/lib/libcam/camlib.c#3 - /usr/home/ken/perforce/FreeBSD-ken-RELENG_4/src/lib/libcam/camlib.c ====
*** /tmp/tmp.1088.0	Thu Oct 16 21:15:43 2003
--- /usr/home/ken/perforce/FreeBSD-ken-RELENG_4/src/lib/libcam/camlib.c	Thu Oct 16 21:15:19 2003
***************
*** 514,519 ****
--- 514,521 ----
  			 "%s: %s%s", func_name, func_name, strerror(errno),
  			 (errno == ENOENT) ? tmpstr : "");

+ 		close(fd);
+ 
  		return(NULL);
  	}