svn commit: r291716 - in head: share/man/man4 sys/cam sys/cam/ata sys/cam/scsi sys/dev/md sys/geom sys/kern sys/pc98/include sys/sys usr.sbin usr.sbin/camdd
Roger Pau Monné
royger at FreeBSD.org
Mon Jan 11 17:29:27 UTC 2016
On 03/12/15 at 21:54, Kenneth D. Merry wrote:
> Author: ken
> Date: Thu Dec 3 20:54:55 2015
> New Revision: 291716
> URL: https://svnweb.freebsd.org/changeset/base/291716
>
> Log:
> Add asynchronous command support to the pass(4) driver, and the new
> camdd(8) utility.
>
> CCBs may be queued to the driver via the new CAMIOQUEUE ioctl, and
> completed CCBs may be retrieved via the CAMIOGET ioctl. User
> processes can use poll(2) or kevent(2) to get notification when
> I/O has completed.
>
> While the existing CAMIOCOMMAND blocking ioctl interface only
> supports user virtual data pointers in a CCB (generally only
> one per CCB), the new CAMIOQUEUE ioctl supports user virtual and
> physical address pointers, as well as user virtual and physical
> scatter/gather lists. This allows user applications to have more
> flexibility in their data handling operations.
>
> Kernel memory for data transferred via the queued interface is
> allocated from the zone allocator in MAXPHYS sized chunks, and user
> data is copied in and out. This is likely faster than the
> vmapbuf()/vunmapbuf() method used by the CAMIOCOMMAND ioctl in
> configurations with many processors (there are more TLB shootdowns
> caused by the mapping/unmapping operation) but may not be as fast
> as running with unmapped I/O.
>
> The new memory handling model for user requests also allows
> applications to send CCBs with request sizes that are larger than
> MAXPHYS. The pass(4) driver now limits queued requests to the I/O
> size listed by the SIM driver in the maxio field in the Path
> Inquiry (XPT_PATH_INQ) CCB.
>
> There are some things that would be good to add:
>
> 1. Come up with a way to do unmapped I/O on multiple buffers.
> Currently the unmapped I/O interface operates on a struct bio,
> which includes only one address and length. It would be nice
> to be able to send an unmapped scatter/gather list down to
> busdma. This would allow eliminating the copy we currently do
> for data.
>
> 2. Add an ioctl to list currently outstanding CCBs in the various
> queues.
>
> 3. Add an ioctl to cancel a request, or use the XPT_ABORT CCB to do
> that.
>
> 4. Test physical address support. Virtual pointers and virtual
> scatter/gather lists have been tested, but I have not yet tested
> physical addresses or physical scatter/gather lists.
>
> 5. Investigate multiple queue support. At the moment there is one
> queue of commands per pass(4) device. If multiple processes
> open the device, they will submit I/O into the same queue and
> get events for the same completions. This is probably the right
> model for most applications, but it is something that could be
> changed later on.
>
> Also, add a new utility, camdd(8) that uses the asynchronous pass(4)
> driver interface.
>
> This utility is intended to be a basic data transfer/copy utility,
> a simple benchmark utility, and an example of how to use the
> asynchronous pass(4) interface.
>
> It can copy data to and from pass(4) devices using any target queue
> depth, starting offset and blocksize for the input and output devices.
> It currently only supports SCSI devices, but could be easily extended
> to support ATA devices.
>
> It can also copy data to and from regular files, block devices, tape
> devices, pipes, stdin, and stdout. It does not support queueing
> multiple commands to any of those targets, since it uses the standard
> read(2)/write(2)/writev(2)/readv(2) system calls.
>
> The I/O is done by two threads, one for the reader and one for the
> writer. The reader thread sends completed read requests to the
> writer thread in strictly sequential order, even if they complete
> out of order. That could be modified later on for random I/O patterns
> or slightly out of order I/O.
>
> camdd(8) uses kqueue(2)/kevent(2) to get I/O completion events from
> the pass(4) driver and also to send request notifications internally.
>
> For pass(4) devices, camdd(8) uses a single buffer (CAM_DATA_VADDR)
> per CAM CCB on the reading side, and a scatter/gather list
> (CAM_DATA_SG) on the writing side. In addition to testing both
> interfaces, this makes any potential reblocking of I/O easier. No
> data is copied between the reader and the writer, but rather the
> reader's buffers are split into multiple I/O requests or combined
> into a single I/O request depending on the input and output blocksize.
>
> For the file I/O path, camdd(8) also uses a single buffer (read(2),
> write(2), pread(2) or pwrite(2)) on reads, and a scatter/gather list
> (readv(2), writev(2), preadv(2), pwritev(2)) on writes.
>
> Things that would be nice to do for camdd(8) eventually:
>
> 1. Add support for I/O pattern generation. Patterns like all
> zeros, all ones, LBA-based patterns, random patterns, etc. Right
> now you can use /dev/zero, /dev/random, etc.
>
> 2. Add support for a "sink" mode, so we do only reads with no
> writes. Right now, you can use /dev/null.
>
> 3. Add support for automatic queue depth probing, so that we can
> figure out the right queue depth on the input and output side
> for maximum throughput. At the moment it defaults to 6.
>
> 4. Add support for SATA device passthrough I/O.
>
> 5. Add support for random LBAs and/or lengths on the input and
> output sides.
>
> 6. Track average per-I/O latency and busy time. The busy time
> and latency could also feed in to the automatic queue depth
> determination.
>
> sys/cam/scsi/scsi_pass.h:
> Define two new ioctls, CAMIOQUEUE and CAMIOGET, that queue
> and fetch asynchronous CAM CCBs respectively.
>
> Although these ioctls do not have a declared argument, they
> both take a union ccb pointer. If we declare a size here,
> the ioctl code in sys/kern/sys_generic.c will malloc and free
> a buffer for either the CCB or the CCB pointer (depending on
> how it is declared). Since we have to keep a copy of the
> CCB (which is fairly large) anyway, having the ioctl malloc
> and free a CCB for each call is wasteful.
>
> sys/cam/scsi/scsi_pass.c:
> Add asynchronous CCB support.
>
> Add two new ioctls, CAMIOQUEUE and CAMIOGET.
>
> CAMIOQUEUE adds a CCB to the incoming queue. The CCB is
> executed immediately (and moved to the active queue) if it
> is an immediate CCB, but otherwise it will be executed
> in passstart() when a CCB is available from the transport layer.
>
> When CCBs are completed (immediately for immediate CCBs, or
> in passdone() for queued CCBs), they are put on the done
> queue.
>
> If we get the final close on the device before all pending
> I/O is complete, all active I/O is moved to the abandoned
> queue and we increment the peripheral reference count so
> that the peripheral driver instance doesn't go away before
> all pending I/O is done.
>
> The new passcreatezone() function is called on the first
> call to the CAMIOQUEUE ioctl on a given device to allocate
> the UMA zones for I/O requests and S/G list buffers. This
> may be good to move off to a taskqueue at some point.
> The new passmemsetup() function allocates memory and
> scatter/gather lists to hold the user's data, and copies
> in any data that needs to be written. For virtual pointers
> (CAM_DATA_VADDR), the kernel buffer is malloced from the
> new pass(4) driver malloc bucket. For virtual
> scatter/gather lists (CAM_DATA_SG), buffers are allocated
> from a new per-pass(4) UMA zone in MAXPHYS-sized chunks.
> Physical pointers are passed in unchanged. We have support
> for up to 16 scatter/gather segments (for the user and
> kernel S/G lists) in the default struct pass_io_req, so
> requests with longer S/G lists require an extra kernel malloc.
>
> The new passcopysglist() function copies a user scatter/gather
> list to a kernel scatter/gather list. The number of elements
> in each list may be different, but (obviously) the amount of data
> stored has to be identical.
>
> The new passmemdone() function copies data out for the
> CAM_DATA_VADDR and CAM_DATA_SG cases.
>
> The new passiocleanup() function restores data pointers in
> user CCBs and frees memory.
>
> Add new functions to support kqueue(2)/kevent(2):
>
> passreadfilt() tells kevent whether or not the done
> queue is empty.
>
> passkqfilter() adds a knote to our list.
>
> passreadfiltdetach() removes a knote from our list.
>
> Add a new function, passpoll(), for poll(2)/select(2)
> to use.
>
> Add devstat(9) support for the queued CCB path.
>
> sys/cam/ata/ata_da.c:
> Add support for the BIO_VLIST bio type.
>
> sys/cam/cam_ccb.h:
> Add a new enumeration for the xflags field in the CCB header.
> (This doesn't change the CCB header, just adds an enumeration to
> use.)
>
> sys/cam/cam_xpt.c:
> Add a new function, xpt_setup_ccb_flags(), that allows specifying
> CCB flags.
>
> sys/cam/cam_xpt.h:
> Add a prototype for xpt_setup_ccb_flags().
>
> sys/cam/scsi/scsi_da.c:
> Add support for BIO_VLIST.
>
> sys/dev/md/md.c:
> Add BIO_VLIST support to md(4).
>
> sys/geom/geom_disk.c:
> Add BIO_VLIST support to the GEOM disk class. Re-factor the I/O size
> limiting code in g_disk_start() a bit.
>
> sys/kern/subr_bus_dma.c:
> Change _bus_dmamap_load_vlist() to take a starting offset and
> length.
>
> Add a new function, _bus_dmamap_load_pages(), that will load a list
> of physical pages starting at an offset.
(I cannot comment inline because the diff is truncated, sorry.)
This is bogus, and defeats the purpose of _bus_dmamap_load_ma by
reintroducing the mechanics already found in bus_dmamap_load_ma_triv.
See r290610, which was the commit that introduced a proper
implementation of bus_dmamap_load_ma on x86.
The problem with the approach taken both in _bus_dmamap_load_pages and
bus_dmamap_load_ma_triv is that they split the request buffer into
arbitrary chunks based on page boundaries, creating segments that no
longer have a size that's a multiple of the sector size. This breaks
drivers like blkfront (and probably other stuff).
The following patch solves the problem AFAICT, and I would like to
commit it ASAP:
Roger.
---
diff --git a/sys/kern/subr_bus_dma.c b/sys/kern/subr_bus_dma.c
index ae30276..cdf176b 100644
--- a/sys/kern/subr_bus_dma.c
+++ b/sys/kern/subr_bus_dma.c
@@ -131,28 +131,6 @@ _bus_dmamap_load_mbuf_sg(bus_dma_tag_t dmat, bus_dmamap_t map,
 }
 
 /*
- * Load tlen data starting at offset within a region specified by a list of
- * physical pages.
- */
-static int
-_bus_dmamap_load_pages(bus_dma_tag_t dmat, bus_dmamap_t map,
-    vm_page_t *pages, bus_size_t tlen, int offset, int *nsegs, int flags)
-{
-        vm_paddr_t paddr;
-        bus_size_t len;
-        int error, i;
-
-        for (i = 0, error = 0; error == 0 && tlen > 0; i++, tlen -= len) {
-                len = min(PAGE_SIZE - offset, tlen);
-                paddr = VM_PAGE_TO_PHYS(pages[i]) + offset;
-                error = _bus_dmamap_load_phys(dmat, map, paddr, len,
-                    flags, NULL, nsegs);
-                offset = 0;
-        }
-        return (error);
-}
-
-/*
  * Load from block io.
  */
 static int
@@ -168,8 +146,8 @@ _bus_dmamap_load_bio(bus_dma_tag_t dmat, bus_dmamap_t map, struct bio *bio,
         }
 
         if ((bio->bio_flags & BIO_UNMAPPED) != 0)
-                return (_bus_dmamap_load_pages(dmat, map, bio->bio_ma,
-                    bio->bio_bcount, bio->bio_ma_offset, nsegs, flags));
+                return (_bus_dmamap_load_ma(dmat, map, bio->bio_ma,
+                    bio->bio_bcount, bio->bio_ma_offset, flags, NULL, nsegs));
 
         return (_bus_dmamap_load_buffer(dmat, map, bio->bio_data,
             bio->bio_bcount, kernel_pmap, flags, NULL, nsegs));