svn commit: r270249 - head/sys/cam/ata
Warner Losh
imp at bsdimp.com
Fri Aug 22 18:56:41 UTC 2014
On Aug 22, 2014, at 12:07 PM, Neel Natu <neelnatu at gmail.com> wrote:
> Hi Warner,
>
> On Fri, Aug 22, 2014 at 6:13 AM, Warner Losh <imp at bsdimp.com> wrote:
>>
>> On Aug 21, 2014, at 11:58 PM, Neel Natu <neelnatu at gmail.com> wrote:
>>
>>> Hi Warner,
>>>
>>> On Thu, Aug 21, 2014 at 10:34 PM, Warner Losh <imp at bsdimp.com> wrote:
>>>>
>>>> On Aug 21, 2014, at 10:31 PM, Neel Natu <neelnatu at gmail.com> wrote:
>>>>
>>>>> Hi Warner,
>>>>>
>>>>> On Wed, Aug 20, 2014 at 3:58 PM, Warner Losh <imp at freebsd.org> wrote:
>>>>>> Author: imp
>>>>>> Date: Wed Aug 20 22:58:12 2014
>>>>>> New Revision: 270249
>>>>>> URL: http://svnweb.freebsd.org/changeset/base/270249
>>>>>>
>>>>>> Log:
>>>>>> Turns out that IDENTIFY DEVICE and IDENTIFY PACKET DEVICE return data
>>>>>> that's only mostly similar. Specifically word 78 bits are defined for
>>>>>> IDENTIFY DEVICE as
>>>>>> 5 Supports Hardware Feature Control
>>>>>> while a IDENTIFY PACKET DEVICE defines them as
>>>>>> 5 Asynchronous notification supported
>>>>>> Therefore, only pay attention to bit 5 when we're talking to ATAPI
>>>>>> devices (we don't use the hardware feature control at this time).
>>>>>> Ignore it for ATA devices. Remove kludge that papered over this issue
>>>>>> for Samsung SATA SSDs, since Micron drives also have the bit set and
>>>>>> the error was caused by this bad interpretation of the spec (which is
>>>>>> quite easy to do, since bits aren't normally overlapping like this).
>>>>>>
>>>>>> Modified:
>>>>>> head/sys/cam/ata/ata_xpt.c
>>>>>>
>>>>>> Modified: head/sys/cam/ata/ata_xpt.c
>>>>>> ==============================================================================
>>>>>> --- head/sys/cam/ata/ata_xpt.c Wed Aug 20 22:39:26 2014 (r270248)
>>>>>> +++ head/sys/cam/ata/ata_xpt.c Wed Aug 20 22:58:12 2014 (r270249)
>>>>>> @@ -458,12 +458,18 @@ negotiate:
>>>>>> 0, 0x02);
>>>>>> break;
>>>>>> case PROBE_SETAN:
>>>>>> - /* Remember what transport thinks about AEN. */
>>>>>> - if (softc->caps & CTS_SATA_CAPS_H_AN)
>>>>>> + /*
>>>>>> + * Only ATAPI defines this bit to mean AEN, but remember
>>>>>> + * what transport thinks about AEN.
>>>>>> + */
>>>>>> + if ((softc->caps & CTS_SATA_CAPS_H_AN) &&
>>>>>> + periph->path->device->protocol == PROTO_ATAPI)
>>>>>> path->device->inq_flags |= SID_AEN;
>>>>>> else
>>>>>> path->device->inq_flags &= ~SID_AEN;
>>>>>> xpt_async(AC_GETDEV_CHANGED, path, NULL);
>>>>>> + if (periph->path->device->protocol != PROTO_ATAPI)
>>>>>> + break;
>>>>>> cam_fill_ataio(ataio,
>>>>>> 1,
>>>>>> probedone,
>>>>>> @@ -750,14 +756,6 @@ out:
>>>>>> goto noerror;
>>>>>>
>>>>>> /*
>>>>>> - * Some Samsung SSDs report supported Asynchronous Notification,
>>>>>> - * but return ABORT on attempt to enable it.
>>>>>> - */
>>>>>> - } else if (softc->action == PROBE_SETAN &&
>>>>>> - status == CAM_ATA_STATUS_ERROR) {
>>>>>> - goto noerror;
>>>>>> -
>>>>>> - /*
>>>>>> * SES and SAF-TE SEPs have different IDENTIFY commands,
>>>>>> * but SATA specification doesn't tell how to identify them.
>>>>>> * Until better way found, just try another if first fail.
>>>>>>
>>>>>
>>>>> This change causes a panic for me on boot. Here is the boot log:
>>>>>
>>>>> ahci0: <Intel Patsburg AHCI SATA controller> port
>>>>> 0xf050-0xf057,0xf040-0xf043,0xf030-0xf037,0xf020-0xf023,0xf000-0xf01f
>>>>> mem 0xfbb21000-0xfbb217ff irq 18 at device 31.2 on pci0
>>>>> ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
>>>>> ahcich0: <AHCI channel> at channel 0 on ahci0
>>>>> ahcich1: <AHCI channel> at channel 1 on ahci0
>>>>> ahcich2: <AHCI channel> at channel 2 on ahci0
>>>>> ahcich3: <AHCI channel> at channel 3 on ahci0
>>>>> ahcich4: <AHCI channel> at channel 4 on ahci0
>>>>> ahcich5: <AHCI channel> at channel 5 on ahci0
>>>>> ahciem0: <AHCI enclosure management bridge> on ahci0
>>>>> ...
>>>>> xpt_action_default: CCB type 0xdeadc0de not supported
>>>>> ...
>>>>> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
>>>>> run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
>>>>> run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
>>>>> run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
>>>>> run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config
>>>>> panic: run_interrupt_driven_config_hooks: waited too long
>>>>> cpuid = 0
>>>>> KDB: stack backtrace:
>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff81d92920
>>>>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffffff81d929d0
>>>>> vpanic() at vpanic+0x189/frame 0xffffffff81d92a50
>>>>> kassert_panic() at kassert_panic+0x139/frame 0xffffffff81d92ac0
>>>>> boot_run_interrupt_driven_config_hooks() at
>>>>> boot_run_interrupt_driven_config_hooks+0x111/frame 0xffffffff81d92b50
>>>>> mi_startup()fffff81d92b70
>>>>> btext() at btext+0x2c
>>>>> KDB: enter: panic
>>>>> [ thread pid 0 tid 100000 ]
>>>>> Stopped at kdb_enter+0x3e: movq $0,kdb_why
>>>>> db>
>>>>>
>>>>> The peripheral in question is a SATA attached CDROM:
>>>>>
>>>>> % camcontrol devlist
>>>>> <INTEL SSDSC2CW240A3 400i> at scbus0 target 0 lun 0 (pass0,ada0)
>>>>> <ATAPI iHAS524 C LL23> at scbus2 target 0 lun 0 (cd0,pass1)
>>>>> <WDC WD1000CHTZ-04JCPV0 04.06A00> at scbus3 target 0 lun 0 (pass2,ada1)
>>>>> <Corsair Neutron GTX SSD M306> at scbus4 target 0 lun 0 (pass3,ada2)
>>>>> <AHCI SGPIO Enclosure 1.00 0001> at scbus6 target 0 lun 0 (ses0,pass4)
>>>>>
>>>>> pass1 at ahcich2 bus 0 scbus2 target 0 lun 0
>>>>> pass1: <ATAPI iHAS524 C LL23> Removable CD-ROM SCSI-0 device
>>>>> pass1: Serial Number 3524472 2N8225501140
>>>>> pass1: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
>>>>>
>>>>> The following patch fixes the panic.
>>>>>
>>>>> Index: sys/cam/ata/ata_xpt.c
>>>>> ===================================================================
>>>>> --- sys/cam/ata/ata_xpt.c (revision 270249)
>>>>> +++ sys/cam/ata/ata_xpt.c (working copy)
>>>>> @@ -468,7 +468,8 @@
>>>>> else
>>>>> path->device->inq_flags &= ~SID_AEN;
>>>>> xpt_async(AC_GETDEV_CHANGED, path, NULL);
>>>>> - if (periph->path->device->protocol != PROTO_ATAPI)
>>>>> + if (periph->path->device->protocol != PROTO_ATAPI &&
>>>>> + periph->path->device->protocol != PROTO_SCSI)
>>>>> break;
>>>>> cam_fill_ataio(ataio,
>>>>> 1,
>>>>
>>>> I think the more proper test is == PROTO_ATA elsewhere, since that’s what
>>>> distinguishes the ATA_IDENTIFY from the ATAPI_IDENTIFY.
>>>>
>>>>> However, there seem to be a couple of issues with the original patch:
>>>>>
>>>>> 1. The 'periph->path->device->protocol' is not initialized to
>>>>> PROTO_ATAPI anywhere in the tree so the not-equal-to test is a no-op.
>>>>
>>>> We test here to determine which identify command to send:
>>>>
>>>> if (periph->path->device->protocol == PROTO_ATA)
>>>> ata_28bit_cmd(ataio, ATA_ATA_IDENTIFY, 0, 0, 0);
>>>> else
>>>> ata_28bit_cmd(ataio, ATA_ATAPI_IDENTIFY, 0, 0, 0);
>>>>
>>>> and that is working to send the right command.
>>>>
>>>
>>> Yes, but PROTO_ATA != PROTO_ATAPI :-)
>>>
>>> Since we never initialize 'periph->path->device->protocol' to
>>> 'PROTO_ATAPI' in -current:
>>
>> But this code appears to:
>>
>> case PROBE_RESET:
>> {
>> int sign = (done_ccb->ataio.res.lba_high << 8) +
>> done_ccb->ataio.res.lba_mid;
>> CAM_DEBUG(path, CAM_DEBUG_PROBE,
>> ("SIGNATURE: %04x\n", sign));
>> if (sign == 0x0000 &&
>> done_ccb->ccb_h.target_id != 15) {
>> path->device->protocol = PROTO_ATA;
>> PROBE_SET_ACTION(softc, PROBE_IDENTIFY);
>> } else if (sign == 0x9669 &&
>> done_ccb->ccb_h.target_id == 15) {
>> /* Report SIM that PM is present. */
>> bzero(&cts, sizeof(cts));
>> xpt_setup_ccb(&cts.ccb_h, path, CAM_PRIORITY_NONE);
>> cts.ccb_h.func_code = XPT_SET_TRAN_SETTINGS;
>> cts.type = CTS_TYPE_CURRENT_SETTINGS;
>> cts.xport_specific.sata.pm_present = 1;
>> cts.xport_specific.sata.valid = CTS_SATA_VALID_PM;
>> xpt_action((union ccb *)&cts);
>> path->device->protocol = PROTO_SATAPM;
>> PROBE_SET_ACTION(softc, PROBE_PM_PID);
>> } else if (sign == 0xc33c &&
>> done_ccb->ccb_h.target_id != 15) {
>> path->device->protocol = PROTO_SEMB;
>> PROBE_SET_ACTION(softc, PROBE_IDENTIFY_SES);
>> } else if (sign == 0xeb14 &&
>> done_ccb->ccb_h.target_id != 15) {
>> path->device->protocol = PROTO_SCSI;
>> PROBE_SET_ACTION(softc, PROBE_IDENTIFY);
>> } else {
>> if (done_ccb->ccb_h.target_id != 15) {
>> xpt_print(path,
>> "Unexpected signature 0x%04x\n", sign);
>> }
>> goto device_fail;
>> }
>>
>> what am I missing?
>>
>
> In the snippet above 'protocol' is set to one of PROTO_ATA,
> PROTO_SATAPM, PROTO_SEMB or PROTO_SCSI - none of which is PROTO_ATAPI
> :-)
>
> $ find sys -type f -exec grep -nH -w PROTO_ATAPI {} \;
> sys/cam/scsi/scsi_pass.c:355: if (cgd->protocol == PROTO_SCSI ||
> cgd->protocol == PROTO_ATAPI)
> sys/cam/cam_ccb.h:249: PROTO_ATAPI, /* AT Attachment Packetized Interface */
Uggg. OK. I’ll look...
> best
> Neel
>
>>> if (protocol != PROTO_ATAPI) equates to if (1)
>>> if (protocol == PROTO_ATAPI) equates to if (0)
>>>
>>> I was trying to say that any code that compares 'protocol' to
>>> PROTO_ATAPI probably deserves a second look (e.g., the original patch
>>> that triggered this panic).
>>
>> Yes, but I think you’re analysis was incorrect on this point :)
>>
>>>>> 2. It seems not right to break out of switch in 'probestart()' without
>>>>> providing a way for 'probedone()' to be called. I believe that this
>>>>> stops the state machine from making forward progress and results in
>>>>> 'xpt_config()' not completing.
>>>>
>>>> That’s a problem, you’re right. Let me rework.
>>>>
>>>>> If you need more information to debug this some more or test a proper
>>>>> fix then I am happy to help.
>>>>
>>>> Please try the one included here. I think it will address things. I’ve tried it on one system, and am trying it on others in parallel to sending this.
>>>>
>>>
>>> Yup, works fine. Thanks for the quick fix!
>>
>> Will push it in. Thanks.
>>
>> Warner
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 842 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <http://lists.freebsd.org/pipermail/svn-src-head/attachments/20140822/09e9d1f4/attachment.sig>
More information about the svn-src-head
mailing list