kern/145714: [pmp][siis] removed SATA device on port multiplier resets entire channel, losing all other devices (8.0-stable)

Daniel Black daniel.subs at internode.on.net
Thu Apr 15 07:30:02 UTC 2010


>Number:         145714
>Category:       kern
>Synopsis:       [pmp][siis] removed SATA device on port multiplier resets entire channel, losing all other devices (8.0-stable)
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Apr 15 07:30:01 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Daniel Black
>Release:        8.0
>Organization:
OVEE
>Environment:
FreeBSD brm00.smartcars.in.nicta.com.au 8.0-STABLE FreeBSD 8.0-STABLE #0: Fri Apr 16 01:53:45 EST 2010     root at brm00.smartcars.in.nicta.com.au:/usr/obj/usr/src/sys/BRM  amd64

cvsup of stable as of a few hours ago
>Description:
A SATA hard drive was physically removed from one port of a Silicon Image 3726 port multiplier. According to the kernel log, the entire port multiplier is then reset, losing the 4 other devices attached to it. Even after the reset, the other devices do not recover.

# pciconf -lvc
atapci1 at pci0:0:31:2:	class=0x01018a card=0xb0021458 chip=0x3a208086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'SATA2(4Port2) (ICH10 Family)'
    class      = mass storage
    subclass   = ATA
    cap 01[70] = powerspec 3  supports D0 D3  current D0
    cap 13[b0] = PCI Advanced Features: FLR TP
none1 at pci0:0:31:3:	class=0x0c0500 card=0x50011458 chip=0x3a308086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'SMB controller  (50011458)'
    class      = serial bus
    subclass   = SMBus
atapci2 at pci0:0:31:5:	class=0x010185 card=0xb0021458 chip=0x3a268086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'SATA2(2Port2) (ICH10 Family)'
    class      = mass storage
    subclass   = ATA
    cap 01[70] = powerspec 3  supports D0 D3  current D0
    cap 13[b0] = PCI Advanced Features: FLR TP
siis0 at pci0:5:0:0:	class=0x010400 card=0x71321095 chip=0x31321095 rev=0x01 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'PCI Express (1x) to 2 Port SATA300 (SiI 3132)'
    class      = mass storage
    subclass   = RAID
    cap 01[54] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 05[5c] = MSI supports 1 message, 64 bit 
    cap 10[70] = PCI-Express 1 legacy endpoint max data 128(1024) link x1(x1)
siis1 at pci0:6:0:0:	class=0x010400 card=0x71321095 chip=0x31321095 rev=0x01 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'PCI Express (1x) to 2 Port SATA300 (SiI 3132)'
    class      = mass storage
    subclass   = RAID
    cap 01[54] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 05[5c] = MSI supports 1 message, 64 bit 
    cap 10[70] = PCI-Express 1 legacy endpoint max data 128(1024) link x1(x1)
atapci0 at pci0:7:0:0:	class=0x010185 card=0xb0001458 chip=0x2368197b rev=0x00 hdr=0x00
    vendor     = 'JMicron Technology Corp.'
    device     = 'JMB368 IDE Controller'
    class      = mass storage
    subclass   = ATA
    cap 01[68] = powerspec 2  supports D0 D3  current D0
    cap 10[50] = PCI-Express 1 legacy endpoint IRQ 2 max data 128(128) link x1(x1)

# camcontrol devlist
<ST32000542AS CC34>                at scbus0 target 0 lun 0 (pass0,ada0)
<ST32000542AS CC34>                at scbus0 target 1 lun 0 (pass1,ada1)
<ST32000542AS CC34>                at scbus0 target 2 lun 0 (pass2,ada2)
<ST32000542AS CC34>                at scbus0 target 3 lun 0 (pass3,ada3)
<ST32000542AS CC34>                at scbus0 target 4 lun 0 (pass4,ada4)
<Port Multiplier 37261095 1706>    at scbus0 target 15 lun 0 (pass5,pmp2)
<ST32000542AS CC34>                at scbus3 target 0 lun 0 (pass12,ada10)
<ST32000542AS CC34>                at scbus3 target 1 lun 0 (pass13,ada11)
<ST32000542AS CC34>                at scbus3 target 2 lun 0 (pass14,ada12)
<ST32000542AS CC34>                at scbus3 target 3 lun 0 (pass15,ada13)
<ST32000542AS CC34>                at scbus3 target 4 lun 0 (pass16,ada14)
<Port Multiplier 37261095 1706>    at scbus3 target 15 lun 0 (pass17,pmp1)


# vmstat -i
interrupt                          total       rate
irq1: atkbd0                           2          0
irq8: rtc                         649492        127
irq14: ata0                        62691         12
irq16: uhci0 siis0+               452808         89
irq17: siis1                     6932183       1365
irq18: uhci2 ehci0+                   18          0
cpu0: timer                      5072836        999
irq256: re0                        13586          2
cpu1: timer                      5071093        999
cpu2: timer                      5070729        999
cpu3: timer                      5070492        999
Total                           28395930       5595



dmesg:
Disk ada7 was removed. ada5, ada6, ada8 and ada9 (on the same channel, siisch2) were lost in the reset that followed.

Apr 16 03:53:42 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada7 offset=262144 size=8192 error=6
Apr 16 03:53:42 brm00 kernel: (ada7:siisch2:0:2:0): lost device
Apr 16 03:53:42 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada7 offset=2000398319616 size=8192 error=6
Apr 16 03:53:42 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada7 offset=2000398581760 size=8192 error=6
Apr 16 03:53:52 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 03:53:52 brm00 kernel: siisch2: device ready timeout
Apr 16 03:53:52 brm00 kernel: siisch2: trying full port reset ...
Apr 16 03:53:52 brm00 kernel: (ada9:siisch2:0:
Apr 16 03:53:52 brm00 kernel: 4:0): lost device
Apr 16 03:53:52 brm00 kernel: 
Apr 16 03:53:52 brm00 kernel: (ada8:siisch2:0:3:0): lost device
Apr 16 03:53:52 brm00 kernel: (ada6:siisch2:0:1:0): lost device
Apr 16 03:53:52 brm00 kernel: (ada5:siisch2:0:0:0): lost device
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada9 offset=262144 size=8192 error=6
Apr 16 03:53:52 brm00 kernel: (ada7:siisch2:0:2:0): Synchronize cache failed
Apr 16 03:53:52 brm00 kernel: 
Apr 16 03:53:52 brm00 kernel: (ada7:siisch2:0:2:0): removing device entry
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada9 offset=2000398319616 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada9 offset=2000398581760 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada8 offset=262144 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada8 offset=2000398319616 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada8 offset=2000398581760 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada6 offset=262144 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada6 offset=2000398319616 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada6 offset=2000398581760 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada5 offset=262144 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada5 offset=2000398319616 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ada5 offset=2000398581760 size=8192 error=6
Apr 16 03:53:52 brm00 root: ZFS: zpool I/O failure, zpool=tank error=6
Apr 16 03:53:52 brm00 last message repeated 6 times
Apr 16 03:53:52 brm00 kernel: (pmp0:siisch2:0:15:0): lost device
Apr 16 03:53:52 brm00 root: ZFS: zpool I/O failure, zpool=tank error=6
Apr 16 03:53:53 brm00 root: ZFS: vdev failure, zpool=tank type=vdev.no_replicas
Apr 16 03:55:52 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 03:55:52 brm00 kernel: siisch2: device ready timeout
Apr 16 03:55:52 brm00 kernel: siisch2: trying full port reset ...
Apr 16 03:57:30 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 03:57:30 brm00 kernel: siisch2: device ready timeout
Apr 16 03:57:30 brm00 kernel: siisch2: trying full port reset ...
Apr 16 03:58:05 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 03:58:05 brm00 kernel: siisch2: device ready timeout
Apr 16 03:58:05 brm00 kernel: siisch2: trying full port reset ...
Apr 16 03:59:11 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 03:59:11 brm00 kernel: siisch2: device ready timeout
Apr 16 03:59:11 brm00 kernel: siisch2: trying full port reset ...
Apr 16 04:05:11 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 04:05:11 brm00 kernel: siisch2: device ready timeout
Apr 16 04:05:11 brm00 kernel: siisch2: trying full port reset ...
Apr 16 04:07:40 brm00 kernel: siisch2: port is not ready (timeout 10000ms) status = 001f2000
Apr 16 04:07:40 brm00 kernel: siisch2: device ready timeout
Apr 16 04:07:40 brm00 kernel: siisch2: trying full port reset ...


# zpool status -v
(the command froze; truss showed it making no system calls)
>How-To-Repeat:
1. Install 5 disks on a port multiplier.
2. Put them in use (e.g. in a raidz2 configuration).
3. Physically remove one disk while the system is running.
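
The steps above can be sketched as a small script. This is only an illustration, not the exact commands that were run: the disk names (ada5 to ada9) and the pool name (tank) are taken from the log in the Description, while the DISKS, POOL and RUN variables and the run/repro_steps helpers are hypothetical names introduced here. It is a dry run by default and only prints the commands; setting RUN=1 as root would actually create the pool and destroy any data on the listed disks.

```shell
#!/bin/sh
# Reproduction sketch (assumptions: disk names and pool name from the log above).
# Dry run by default: commands are printed, not executed.
# RUN=1 executes them for real and DESTROYS data on $DISKS.
DISKS="${DISKS:-ada5 ada6 ada7 ada8 ada9}"
POOL="${POOL:-tank}"

# Print the command, or execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

repro_steps() {
    # Step 1: confirm all 5 disks are visible behind the port multiplier.
    run camcontrol devlist
    # Step 2: put the disks in use as a raidz2 pool.
    run zpool create "$POOL" raidz2 $DISKS
    run zpool status "$POOL"
    # Step 3: manual action, cannot be scripted.
    echo "# now physically remove one disk, then check:"
    run camcontrol devlist
    run zpool status -v "$POOL"
}

repro_steps
```

With the bug present, the second camcontrol devlist would be expected to show the whole channel gone rather than only the removed disk, and the final zpool status -v may hang as described above.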
>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:

