smartmontools panics 9.1-RELEASE on sunfire 240

Marius Strobl marius at alchemy.franken.de
Sun Jan 13 15:55:19 UTC 2013


On Tue, Jan 08, 2013 at 11:39:02AM +0200, Alexander Motin wrote:
> On 06.01.2013 05:12, Marius Strobl wrote:
> > On Sat, Jan 05, 2013 at 06:19:23PM -0800, John-Mark Gurney wrote:
> >> Marius Strobl wrote this message on Sat, Jan 05, 2013 at 02:52 +0100:
> >>> On Fri, Jan 04, 2013 at 08:32:24PM -0500, Kurt Lidl wrote:
> >>>> On Sat, Jan 05, 2013 at 12:53:36AM +0100, Marius Strobl wrote:
> >>>>> Uhm, probably an userland buffer which isn't even 16-bit aligned.
> >>>>> If that's the cause, the attached patch hopefully should at least
> >>>>> prevent the panic. If it does, smartmontools still need to be fixed
> >>>>> though.
> >>>>
> >>>> You patch prevents the panic from happening.
> >>>> When I try to start smartd now, it reports:
> >>>>
> >>>> root at host-98: /usr/local/etc/rc.d/smartd onestart
> >>>> Starting smartd.
> >>>> smartd: cam_send_ccb: Invalid argument
> >>>> /usr/local/etc/rc.d/smartd: WARNING: failed to start smartd
> >>>>
> >>>> I had updated the kernel on the machine to 9-STABLE, and
> >>>> verified that without this patch, the crash still happened with
> >>>> a 9-STABLE kernel, in addition to 9.1-RELEASE kernel.
> >>>>
> >>>> My kernel now identifies itself as:
> >>>> FreeBSD 9.1-STABLE (GENERIC) #1 r245044:245048M: Fri Jan  4 20:19:50 EST 2013
> >>>>
> >>>> -Kurt
> >>>>
> >>>>> Index: cam_periph.c
> >>>>> ===================================================================
> >>>>> --- cam_periph.c	(revision 245046)
> >>>>> +++ cam_periph.c	(working copy)
> >>>>> @@ -744,6 +744,9 @@ cam_periph_mapmem(union ccb *ccb, struct cam_perip
> >>>>>  		if ((ccb->ccb_h.flags & CAM_DIR_MASK) == CAM_DIR_NONE)
> >>>>>  			return(0);
> >>>>>  
> >>>>> +		if ((uintptr_t)ccb->ataio.data_ptr % sizeof(uint16_t) != 0)
> >>>>> +			return (EINVAL);
> >>>>> +
> >>>>>  		data_ptrs[0] = &ccb->ataio.data_ptr;
> >>>>>  		lengths[0] = ccb->ataio.dxfer_len;
> >>>>>  		dirs[0] = ccb->ccb_h.flags & CAM_DIR_MASK;
> >>>>
> >>>
> >>> Alexander, are you okay with this approach or do you have a better
> >>> idea how to handle this? In any case, it doesn't seem to make sense
> >>> to teach the kernel how to cope with arbitrarily aligned buffers for
> >>> ATA.
> >>
> >> Shouldn't we make it dependant on the __NO_STRICT_ALIGNMENT define so
> >> that it won't immediately break other arches?
> >>
> > 
> > No, not doing so tremendously helps ensuring that the software is
> > properly written (apart from compact flash, ATA devices really
> > only support 16-bit and 32-bit accesses) and judging the history
> > of the patches in the smartmontools port it apparently already
> > has to care about proper alignment for SCSI anyway. It would also
> > not be the first time the smartmontools port is blown out of the
> > water :)
> 
> That patch would do the trick, but I can't say that I like it. Yes,
> there are many things tied to 16-bit in ATA world: both legacy ATA DMA
> and AHCI require 16-bit aligned data, legacy ATA DMA also require even
> transfer size. On the other side, Silicon Image siis(4) chips have no
> such limitations, that makes it chip-specific, not system-specific
> problem.

My reading of 3.2.9 of the ATA-7 specification actually is different
as it says that apart from CFA devices, both DMA and PIO transfers
are 16-bit and generally there's no guarantee that there's anything
like a DMA engine or busdma code that can handle unaligned transfers
in between, especially for PIO.

> Also for other DMA cases alignment is handled by busdma code
> via bounce buffers -- not free, but transparent for user.

The main purpose of DMA bounce buffers is address range translation
in case a device can't access the whole system memory, thus the
sparc64 busdma code doesn't use bounce buffers as that job is done
by the IOMMU. AFAICT, no MD bus_dmamap_load_uio() implementation
currently handles unaligned transfers as this is not just about
unaligned buffers in RAM but also the appropriate cache flushing
necessary for these and just happens to work on x86.

> For PIO
> transfers data alignment is just outside of the ATA specification and
> depends on implementation.

I don't agree here as it is hard to imagine how an ATA controller
should be able to handle unaligned PIO (assuming your statement
regarding siis(4) driven chips refers to DMA only and unaligned PIO
hasn't actually been implemented in some ATA controllers).

> 
> I think better solution would be to implement support for misaligned PIO
> in ata(4) driver. It would cost just about dozen or two lines and should
> be quite trivial. Probably I should do it last time this alignment
> problem appeared.

In the end I don't really care as long as it works but from a sheer
performance point of view it seems wrong to bounce PIO in the kernel
if it is trivial for applications to care about the alignment required
by the protocol they want to talk in the first place.

Marius



More information about the freebsd-sparc64 mailing list