speeding up ugen by an order of magnitude.
Julian Elischer
julian at elischer.org
Tue Jul 6 16:32:34 PDT 2004
So, we had a device that we access through ugen.
The manufacturer said the transaction should complete in 3 seconds,
and Windows and Linux managed that, but FreeBSD took 15 seconds.
Since the code is the same, I suspect NetBSD would get the same result.
Looking at it, I noticed that ugen does everything in 1K chunks,
which is OK for USB1, but a bit silly for USB2.
Here is the proof-of-concept change that got FreeBSD down to 2.8 seconds.
teraserver# cvs diff -u ugen.c
Index: ugen.c
===================================================================
RCS file: /repos/projects/mirrored/freebsd/src/sys/dev/usb/ugen.c,v
retrieving revision 1.38.2.10
diff -u -r1.38.2.10 ugen.c
--- ugen.c 2004/03/01 00:07:22 1.38.2.10
+++ ugen.c 2004/07/06 23:23:17
@@ -572,19 +572,21 @@
return (0);
}
+#define RBFSIZ 131072
Static int
ugen_do_read(struct ugen_softc *sc, int endpt, struct uio *uio, int flag)
{
struct ugen_endpoint *sce = &sc->sc_endpoints[endpt][IN];
u_int32_t n, tn;
- char buf[UGEN_BBSIZE];
+ char * buf;
usbd_xfer_handle xfer;
usbd_status err;
int s;
int error = 0;
u_char buffer[UGEN_CHUNK];
+
DPRINTFN(5, ("%s: ugenread: %d\n", USBDEVNAME(sc->sc_dev), endpt));
if (sc->sc_dying)
@@ -605,6 +607,8 @@
return (EIO);
}
+ buf = malloc(RBFSIZ, M_TEMP, M_WAITOK);
+
switch (sce->edesc->bmAttributes & UE_XFERTYPE) {
case UE_INTERRUPT:
/* Block until activity occurred. */
@@ -612,6 +616,7 @@
while (sce->q.c_cc == 0) {
if (flag & IO_NDELAY) {
splx(s);
+ free(buf, M_TEMP);
return (EWOULDBLOCK);
}
sce->state |= UGEN_ASLP;
@@ -645,9 +650,11 @@
break;
case UE_BULK:
xfer = usbd_alloc_xfer(sc->sc_udev);
- if (xfer == 0)
+ if (xfer == 0) {
+ free(buf, M_TEMP);
return (ENOMEM);
- while ((n = min(UGEN_BBSIZE, uio->uio_resid)) != 0) {
+ }
+ while ((n = min(RBFSIZ, uio->uio_resid)) != 0) {
DPRINTFN(1, ("ugenread: start transfer %d bytes\n",n));
tn = n;
err = usbd_bulk_transfer(
@@ -676,6 +683,7 @@
while (sce->cur == sce->fill) {
if (flag & IO_NDELAY) {
splx(s);
+ free(buf, M_TEMP);
return (EWOULDBLOCK);
}
sce->state |= UGEN_ASLP;
@@ -711,8 +719,10 @@
default:
+ free(buf, M_TEMP);
return (ENXIO);
}
+ free(buf, M_TEMP);
return (error);
}
Notice that do_read and do_write use a STACK buffer.
Not good when we are trying to shrink kernel stacks.
Probably each pipe on a device should get a buffer allocated for its
use, but bigger than 1K :-)
Anyone have thoughts about what form the final patch should take?
I doubt that calling malloc once per transfer is optimal; however,
devices that have a lot of endpoints may well want a lot of
simultaneous xfers. Who should allocate the buffers?
I see the same problem in do_write(), but I have not looked at other
device drivers. I will go look at uscanner.c next, just in case it does
the same thing.
Julian