Strange CAM errors

Willem Jan Withagen wjw at digiware.nl
Mon Dec 17 21:45:07 UTC 2012


On 17-12-2012 20:16, Jim Harris wrote:
> 
> 
> On Mon, Dec 17, 2012 at 9:26 AM, Willem Jan Withagen <wjw at digiware.nl> wrote:
> 
>     On 2012-12-17 15:38, Steven Hartland wrote:
>     > Check the SMART results of each disk in the array; you may have a
>     > failing disk.
>     > ----- Original Message ----- From: "Willem Jan Withagen"
>     > <wjw at digiware.nl>
>     > To: "FreeBSD Stable Users" <freebsd-stable at freebsd.org>
>     > Sent: Monday, December 17, 2012 10:58 AM
>     > Subject: Strange CAM errors
>     >
>     >
>     >> Hi,
>     >>
>     >> I have not noticed this before, but my system rebooted this morning and
>     >> in the following security report I found a lot of messages in the
>     >> dmesg part like:
>     >>
>     >> +(probe0:arcmsr0:0:16:1): INQUIRY. CDB: 12 20 0 0 24 0
>     >> +(probe0:arcmsr0:0:16:1): CAM status: Command timeout
>     >> +(probe0:arcmsr0:0:16:1): Retrying command
>     >> +(probe0:arcmsr0:0:16:1): INQUIRY. CDB: 12 20 0 0 24 0
>     >> +(probe0:arcmsr0:0:16:1): CAM status: Command timeout
>     >> +(probe0:arcmsr0:0:16:1): Retrying command
>     >>
>     >> And it seems that target 16 is:
>     >> +pass6 at arcmsr0 bus 0 scbus0 target 16 lun 0
>     >> +pass6: <Areca RAID controller R001> Fixed Processor SCSI-0 device
>     >>
>     >> The system has been running
>     >> FreeBSD zfs.digiware.nl 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #3: Wed Nov 14 13:25:55 CET 2012
>     >> root at zfs.digiware.nl:/usr/obj/usr/srcs/src9/src/sys/ZFS  amd64
>     >> for a while already.
>     >>
>     >> Any suggestions as to why I get these messages?
>     >>
>     >> They occur during the boot sequence, so smartd is not talking to the
>     >> disks at that moment.
>     >>
>     >> --WjW
>     >>
>     >> ps: dmesg, config, etc. at:
>     >> http://www.tegenbosch28.nl/FreeBSD/Systems/ZFS
>     >> ps2: upgrading to the most recent 9.1
> 
>     'mmm,
> 
>     Smartd seems to think otherwise...
> 
>     'camcontrol rescan all' actually delivers the same pack of errors.
> 
>     --WjW
> 
> 
> The timeouts are occurring on inquiry commands to non-zero LUNs.
> arcmsr(4) is returning CAM_SEL_TIMEOUT instead of CAM_DEV_NOT_THERE for
> inquiry commands to this device and LUN > 0.  CAM_DEV_NOT_THERE is
> preferred because it removes these types of warnings, and similar patches
> have gone in for other SCSI drivers recently.
> 
> Can you try this patch?
> 
> Index: sys/dev/arcmsr/arcmsr.c
> ===================================================================
> --- sys/dev/arcmsr/arcmsr.c     (revision 244190)
> +++ sys/dev/arcmsr/arcmsr.c     (working copy)
> @@ -2439,7 +2439,7 @@
>                 char *buffer=pccb->csio.data_ptr;
>  
>                 if (pccb->ccb_h.target_lun) {
> -                       pccb->ccb_h.status |= CAM_SEL_TIMEOUT;
> +                       pccb->ccb_h.status |= CAM_DEV_NOT_THERE;
>                         xpt_done(pccb);
>                         return;
>                 }
> 

Hi Jim,

The noise has gone down by a factor of 5; now I get:

(probe6:arcmsr0:0:16:1): INQUIRY. CDB: 12 20 0 0 24 0
(probe6:arcmsr0:0:16:1): CAM status: Unable to terminate I/O CCB request
(probe6:arcmsr0:0:16:1): Error 5, Unretryable error
(probe6:arcmsr0:0:16:2): INQUIRY. CDB: 12 40 0 0 24 0

That status text is defined in sys/cam/cam.c as CAM_UA_TERMIO, but that
error is not set anywhere in the arcmsr code.

So I clearly do not know enough yet to help with this.

--WjW


The same messages appear for all of the ports on the adapter.

