kern/144026: confusion and failure to relogin to FC targets after loop down

Matt Jacob mjacob at FreeBSD.org
Wed Feb 17 01:40:02 UTC 2010


>Number:         144026
>Category:       kern
>Synopsis:       confusion and failure to relogin to FC targets after loop down
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Feb 17 01:40:01 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Matt Jacob
>Release:        FreeBSD 8.0-RC1 i386
>Organization:
Feral Software
>Environment:
Any recent FreeBSD system.

>Description:
This is for the isp(4) HBA driver.

In a SAN topology, if the cable between the isp HBA and the switch is pulled and
then restored after the Loop Down Timeout has expired, the driver can no longer talk
to the devices that were there before.
Apparently the QLogic firmware still sees them as logged in.

Here is an annotated set of messages that shows the problem:

1. Contents of the fabric:
....
isp1: Chan 0 WWPN 0x2101001b32af095d PortID 0x020900 N-Port Handle 0, Connection 'F Port'
isp1: Chan 0 PortID 0x0205da handle 0x1 role Target arrived at tgt 0 WWPN 0x21000004cf20f854
isp1: Chan 0 PortID 0x0205dc handle 0x2 role Target arrived at tgt 1 WWPN 0x2100002037fd9c23
isp1: Chan 0 PortID 0x0205e0 handle 0x3 role Target arrived at tgt 2 WWPN 0x2100002037fd8d49
isp1: Chan 0 PortID 0x0205e1 handle 0x4 role Target arrived at tgt 3 WWPN 0x2100002037fd94b4
isp1: Chan 0 PortID 0x0205e2 handle 0x5 role Target arrived at tgt 4 WWPN 0x2100002037fd8d5f
isp1: Chan 0 PortID 0x0205e4 handle 0x6 role Target arrived at tgt 5 WWPN 0x2100002037fd8e7e
isp1: Chan 0 PortID 0x0205e8 handle 0x7 role Target arrived at tgt 6 WWPN 0x2100002037fd8b22
isp1: Chan 0 PortID 0x0205ef handle 0x8 role Target arrived at tgt 7 WWPN 0x21000004cf0099b8
isp1: Chan 0 PortID 0xfffffe handle 0x7fe role (none) stayed WWPN 0x200900c0dd0c87bd

2. A *short* cable pull (less than Loop Down Timeout):
...
isp1: Chan 0 Name Server Database Changed
isp1: Chan 0 PortID 0x0205da handle 0x1 role Target gone zombie at tgt 0 WWPN 0x21000004cf20f854
isp1: Chan 0 PortID 0x0205dc handle 0x2 role Target gone zombie at tgt 1 WWPN 0x2100002037fd9c23
isp1: Chan 0 PortID 0x0205e0 handle 0x3 role Target gone zombie at tgt 2 WWPN 0x2100002037fd8d49
isp1: Chan 0 PortID 0x0205e1 handle 0x4 role Target gone zombie at tgt 3 WWPN 0x2100002037fd94b4
isp1: Chan 0 PortID 0x0205e2 handle 0x5 role Target gone zombie at tgt 4 WWPN 0x2100002037fd8d5f
isp1: Chan 0 PortID 0x0205e4 handle 0x6 role Target gone zombie at tgt 5 WWPN 0x2100002037fd8e7e
isp1: Chan 0 PortID 0x0205e8 handle 0x7 role Target gone zombie at tgt 6 WWPN 0x2100002037fd8b22
isp1: Chan 0 PortID 0x0205ef handle 0x8 role Target gone zombie at tgt 7 WWPN 0x21000004cf0099b8
isp1: Chan 0 PortID 0xfffffe handle 0x7fe role (none) stayed WWPN 0x200900c0dd0c87bd
isp1: Chan 0 Name Server Database Changed
isp1: Chan 0 PortID 0x0205da handle 0x9 role Target stayed at tgt 0 WWPN 0x21000004cf20f854
isp1: Chan 0 PortID 0x0205dc handle 0xa role Target stayed at tgt 1 WWPN 0x2100002037fd9c23
isp1: Chan 0 PortID 0x0205e0 handle 0xb role Target stayed at tgt 2 WWPN 0x2100002037fd8d49
isp1: Chan 0 PortID 0x0205e1 handle 0xc role Target stayed at tgt 3 WWPN 0x2100002037fd94b4
isp1: Chan 0 PortID 0x0205e2 handle 0xd role Target stayed at tgt 4 WWPN 0x2100002037fd8d5f
isp1: Chan 0 PortID 0x0205e4 handle 0xe role Target stayed at tgt 5 WWPN 0x2100002037fd8e7e
isp1: Chan 0 PortID 0x0205e8 handle 0xf role Target stayed at tgt 6 WWPN 0x2100002037fd8b22
isp1: Chan 0 PortID 0x0205ef handle 0x10 role Target stayed at tgt 7 WWPN 0x21000004cf0099b8
isp1: Chan 0 PortID 0xfffffe handle 0x7fe role (none) stayed WWPN 0x200900c0dd0c87bd

Everything is fine.

3. A *long* cable pull (longer than Loop Down Timeout):

Devices go away....
.....
isp1: Chan 0: LIP Received
isp1: Chan 0: LOOP Down
isp1: Chan 0 PortID 0x0205da Departed from Target 0 because of Loop Down Timeout
(pass0:isp1:0:0:0): lost device
(pass0:isp1:0:0:0): removing device entry
(da0:isp1:0:0:0): lost device
(da0:isp1:0:0:0): removing device entry
isp1: Chan 0 PortID 0x0205dc Departed from Target 1 because of Loop Down Timeout
(pass1:isp1:0:1:0): lost device
(pass1:isp1:0:1:0): removing device entry
(da1:isp1:0:1:0): lost device
(da1:isp1:0:1:0): removing device entry
isp1: Chan 0 PortID 0x0205e0 Departed from Target 2 because of Loop Down Timeout
(pass2:isp1:0:2:0): lost device
(pass2:isp1:0:2:0): removing device entry
(da2:isp1:0:2:0): lost device
(da2:isp1:0:2:0): removing device entry
isp1: Chan 0 PortID 0x0205e1 Departed from Target 3 because of Loop Down Timeout
(pass3:isp1:0:3:0): lost device
(pass3:isp1:0:3:0): removing device entry
(da3:isp1:0:3:0): lost device
(da3:isp1:0:3:0): removing device entry
isp1: Chan 0 PortID 0x0205e2 Departed from Target 4 because of Loop Down Timeout
(pass4:isp1:0:4:0): lost device
(pass4:isp1:0:4:0): removing device entry
(da4:isp1:0:4:0): lost device
(da4:isp1:0:4:0): removing device entry
isp1: Chan 0 PortID 0x0205e4 Departed from Target 5 because of Loop Down Timeout
(pass5:isp1:0:5:0): lost device
(pass5:isp1:0:5:0): removing device entry
(da5:isp1:0:5:0): lost device
(da5:isp1:0:5:0): removing device entry
isp1: Chan 0 PortID 0x0205e8 Departed from Target 6 because of Loop Down Timeout
(pass6:isp1:0:6:0): lost device
(pass6:isp1:0:6:0): removing device entry
(da6:isp1:0:6:0): lost device
(da6:isp1:0:6:0): removing device entry
isp1: Chan 0 PortID 0x0205ef Departed from Target 7 because of Loop Down Timeout
(pass7:isp1:0:7:0): lost device
(pass7:isp1:0:7:0): removing device entry
(da7:isp1:0:7:0): lost device
(da7:isp1:0:7:0): removing device entry
isp1: Chan 0 Firmware State <Config Wait->Loss Of Sync>

....
Loop comes back up:

isp1: Chan 0: LOOP Reset
isp1: Chan 0: LIP Received
isp1: Chan 0: LIP Received
isp1: Chan 0: LOOP Reset
isp1: Chan 0: LIP Received
isp1: Chan 0 Loop UP
isp1: Chan 0 Port Database Changed
isp1: Chan 0 Firmware State <Config Wait->Ready>
isp1: Chan 0 2Gb link speed

Fabric begins re-evaluating:
....
isp1: Chan 0 WWPN 0x2101001b32af095d PortID 0x020900 N-Port Handle 0, Connection 'F Port'
isp1: Chan 0 PLOGX PortID 0x0205da to N-Port handle 0x11: already logged in with N-Port handle 0x9
isp1: Chan 0 new device 0x0205da at 0x9 disappeared
isp1: Chan 0 PLOGX PortID 0x0205dc to N-Port handle 0x11: already logged in with N-Port handle 0xa
isp1: Chan 0 Port Database Changed
isp1: Chan 0 Port Database Changed
isp1: Chan 0 PLOGX PortID 0x0205da to N-Port handle 0x11: already logged in with N-Port handle 0x9
isp1: Chan 0 new device 0x0205da at 0x9 disappeared
isp1: Chan 0 PLOGX PortID 0x0205dc to N-Port handle 0x11: already logged in with N-Port handle 0xa
isp1: Chan 0 new device 0x0205dc at 0xa disappeared
isp1: Chan 0 PLOGX PortID 0x0205e0 to N-Port handle 0x11: already logged in with N-Port handle 0xb
isp1: Chan 0 new device 0x0205e0 at 0xb disappeared
isp1: Chan 0 PLOGX PortID 0x0205e1 to N-Port handle 0x11: already logged in with N-Port handle 0xc
isp1: Chan 0 new device 0x0205e1 at 0xc disappeared
isp1: Chan 0 PLOGX PortID 0x0205e2 to N-Port handle 0x11: already logged in with N-Port handle 0xd
isp1: Chan 0 new device 0x0205e2 at 0xd disappeared
isp1: Chan 0 PLOGX PortID 0x0205e4 to N-Port handle 0x11: already logged in with N-Port handle 0xe
isp1: Chan 0 new device 0x0205e4 at 0xe disappeared
isp1: Chan 0 PLOGX PortID 0x0205e8 to N-Port handle 0x11: already logged in with N-Port handle 0xf
isp1: Chan 0 new device 0x0205e8 at 0xf disappeared
isp1: Chan 0 PLOGX PortID 0x0205ef to N-Port handle 0x11: already logged in with N-Port handle 0x10
isp1: Chan 0 new device 0x0205ef at 0x10 disappeared
isp1: Chan 0 PortID 0xfffffe handle 0x7fe role (none) stayed WWPN 0x200900c0dd0c87bd

4. Lossage

At this point *nothing* is entered in the isp driver's port database except the fabric name server.
The devices cannot be reattached.
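
The handle mismatch can be checked mechanically from the messages above. The
following is only a sketch for cross-checking the logs, not part of the driver:
the function names and regular expressions are made up for this illustration,
and the log lines are copied from sections 2 and 3. It compares the N-Port
handle each target held after the short cable pull with the handle the firmware
reports in the "already logged in" PLOGX messages after the long pull.

```python
import re

# "stayed" lines from section 2 (handles assigned after the short cable pull).
SHORT_PULL_LOG = """\
isp1: Chan 0 PortID 0x0205da handle 0x9 role Target stayed at tgt 0 WWPN 0x21000004cf20f854
isp1: Chan 0 PortID 0x0205dc handle 0xa role Target stayed at tgt 1 WWPN 0x2100002037fd9c23
isp1: Chan 0 PortID 0x0205e0 handle 0xb role Target stayed at tgt 2 WWPN 0x2100002037fd8d49
isp1: Chan 0 PortID 0x0205e1 handle 0xc role Target stayed at tgt 3 WWPN 0x2100002037fd94b4
isp1: Chan 0 PortID 0x0205e2 handle 0xd role Target stayed at tgt 4 WWPN 0x2100002037fd8d5f
isp1: Chan 0 PortID 0x0205e4 handle 0xe role Target stayed at tgt 5 WWPN 0x2100002037fd8e7e
isp1: Chan 0 PortID 0x0205e8 handle 0xf role Target stayed at tgt 6 WWPN 0x2100002037fd8b22
isp1: Chan 0 PortID 0x0205ef handle 0x10 role Target stayed at tgt 7 WWPN 0x21000004cf0099b8
"""

# PLOGX lines from section 3 (after the loop came back up), one per target.
RELOGIN_LOG = """\
isp1: Chan 0 PLOGX PortID 0x0205da to N-Port handle 0x11: already logged in with N-Port handle 0x9
isp1: Chan 0 PLOGX PortID 0x0205dc to N-Port handle 0x11: already logged in with N-Port handle 0xa
isp1: Chan 0 PLOGX PortID 0x0205e0 to N-Port handle 0x11: already logged in with N-Port handle 0xb
isp1: Chan 0 PLOGX PortID 0x0205e1 to N-Port handle 0x11: already logged in with N-Port handle 0xc
isp1: Chan 0 PLOGX PortID 0x0205e2 to N-Port handle 0x11: already logged in with N-Port handle 0xd
isp1: Chan 0 PLOGX PortID 0x0205e4 to N-Port handle 0x11: already logged in with N-Port handle 0xe
isp1: Chan 0 PLOGX PortID 0x0205e8 to N-Port handle 0x11: already logged in with N-Port handle 0xf
isp1: Chan 0 PLOGX PortID 0x0205ef to N-Port handle 0x11: already logged in with N-Port handle 0x10
"""

# Hypothetical patterns written for this sketch; they match only the message
# formats quoted in this report.
STAYED_RE = re.compile(
    r"PortID (0x[0-9a-f]+) handle (0x[0-9a-f]+) role Target stayed")
PLOGX_RE = re.compile(
    r"PLOGX PortID (0x[0-9a-f]+) to N-Port handle (0x[0-9a-f]+): "
    r"already logged in with N-Port handle (0x[0-9a-f]+)")


def handles_before_loop_down(log):
    """Map PortID -> handle the driver last reported for that target."""
    return {m.group(1): int(m.group(2), 16) for m in STAYED_RE.finditer(log)}


def handles_claimed_by_firmware(log):
    """Map PortID -> handle the firmware says is still logged in."""
    return {m.group(1): int(m.group(3), 16) for m in PLOGX_RE.finditer(log)}


def compare(before, claimed):
    """Return PortID -> (old handle, firmware handle, whether they match)."""
    return {
        portid: (before.get(portid), handle, before.get(portid) == handle)
        for portid, handle in claimed.items()
    }


if __name__ == "__main__":
    result = compare(handles_before_loop_down(SHORT_PULL_LOG),
                     handles_claimed_by_firmware(RELOGIN_LOG))
    for portid, (old, claimed, same) in sorted(result.items()):
        print(f"{portid}: driver had {old:#x}, firmware claims {claimed:#x}, "
              f"match={same}")
```

For the lines quoted here, every handle the firmware claims is the one the
target held before the loop went down, which is consistent with the firmware
never having logged those ports out.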

>How-To-Repeat:
Pull the cable for 30 seconds. Put it back in.
>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:

