smartmontools panics 9.1-RELEASE on sunfire 240

Alexander Motin mav at FreeBSD.org
Tue Jan 8 09:39:08 UTC 2013


On 06.01.2013 05:12, Marius Strobl wrote:
> On Sat, Jan 05, 2013 at 06:19:23PM -0800, John-Mark Gurney wrote:
>> Marius Strobl wrote this message on Sat, Jan 05, 2013 at 02:52 +0100:
>>> On Fri, Jan 04, 2013 at 08:32:24PM -0500, Kurt Lidl wrote:
>>>> On Sat, Jan 05, 2013 at 12:53:36AM +0100, Marius Strobl wrote:
>>>>> Uhm, probably an userland buffer which isn't even 16-bit aligned.
>>>>> If that's the cause, the attached patch hopefully should at least
>>>>> prevent the panic. If it does, smartmontools still need to be fixed
>>>>> though.
>>>>
>>>> You patch prevents the panic from happening.
>>>> When I try to start smartd now, it reports:
>>>>
>>>> root at host-98: /usr/local/etc/rc.d/smartd onestart
>>>> Starting smartd.
>>>> smartd: cam_send_ccb: Invalid argument
>>>> /usr/local/etc/rc.d/smartd: WARNING: failed to start smartd
>>>>
>>>> I had updated the kernel on the machine to 9-STABLE, and
>>>> verified that without this patch, the crash still happened with
>>>> a 9-STABLE kernel, in addition to 9.1-RELEASE kernel.
>>>>
>>>> My kernel now identifies itself as:
>>>> FreeBSD 9.1-STABLE (GENERIC) #1 r245044:245048M: Fri Jan  4 20:19:50 EST 2013
>>>>
>>>> -Kurt
>>>>
>>>>> Index: cam_periph.c
>>>>> ===================================================================
>>>>> --- cam_periph.c	(revision 245046)
>>>>> +++ cam_periph.c	(working copy)
>>>>> @@ -744,6 +744,9 @@ cam_periph_mapmem(union ccb *ccb, struct cam_perip
>>>>>  		if ((ccb->ccb_h.flags & CAM_DIR_MASK) == CAM_DIR_NONE)
>>>>>  			return(0);
>>>>>  
>>>>> +		if ((uintptr_t)ccb->ataio.data_ptr % sizeof(uint16_t) != 0)
>>>>> +			return (EINVAL);
>>>>> +
>>>>>  		data_ptrs[0] = &ccb->ataio.data_ptr;
>>>>>  		lengths[0] = ccb->ataio.dxfer_len;
>>>>>  		dirs[0] = ccb->ccb_h.flags & CAM_DIR_MASK;
>>>>
>>>
>>> Alexander, are you okay with this approach or do you have a better
>>> idea how to handle this? In any case, it doesn't seem to make sense
>>> to teach the kernel how to cope with arbitrarily aligned buffers for
>>> ATA.
>>
>> Shouldn't we make it dependant on the __NO_STRICT_ALIGNMENT define so
>> that it won't immediately break other arches?
>>
> 
> No, not doing so tremendously helps ensuring that the software is
> properly written (apart from compact flash, ATA devices really
> only support 16-bit and 32-bit accesses) and judging the history
> of the patches in the smartmontools port it apparently already
> has to care about proper alignment for SCSI anyway. It would also
> not be the first time the smartmontools port is blown out of the
> water :)

That patch would do the trick, but I can't say that I like it. Yes,
there are many things tied to 16-bit in ATA world: both legacy ATA DMA
and AHCI require 16-bit aligned data, legacy ATA DMA also require even
transfer size. On the other side, Silicon Image siis(4) chips have no
such limitations, that makes it chip-specific, not system-specific
problem. Also for other DMA cases alignment is handled by busdma code
via bounce buffers -- not free, but transparent for user. For PIO
transfers data alignment is just outside of the ATA specification and
depends on implementation.

I think better solution would be to implement support for misaligned PIO
in ata(4) driver. It would cost just about dozen or two lines and should
be quite trivial. Probably I should do it last time this alignment
problem appeared.

smartmontools is a good testing tool for ATA stack, as it uses so many
kinds of ATA commands, that many of them are never used by kernel.

-- 
Alexander Motin


More information about the freebsd-sparc64 mailing list