From imp at bsdimp.com Sat Jan 3 06:29:12 2009
From: imp at bsdimp.com (M. Warner Losh)
Date: Sat Jan 3 06:29:19 2009
Subject: fwcontrol patches
Message-ID: <20090102.232630.1387162816.imp@bsdimp.com>

Here's some minor patches to fwcontrol so that the DV capture mode is
"nicer".  I'd like to make the EAGAIN handling better, as well as
offer automated rewind + play + error recovery for at least my camera.
The first step in that is to have it print an elapsed time instead of
the single digit per frame.  The second chunk is just removing dead code.

Comments?

Warner

Index: fwdv.c
===================================================================
--- fwdv.c	(revision 186639)
+++ fwdv.c	(working copy)
@@ -202,7 +202,9 @@
 			    (dv->payload[0] & DV_DSF_12) == 0)
 				dv->payload[0] |= DV_DSF_12;
 			nb = nblocks[system];
-			fprintf(stderr, "%d", k%10);
+			fprintf(stderr, "%d:%02d:%02d %2d\r",
+			    k / (1800 * 60), (k / 1800) % 60,
+			    (k / 30) % 60, k % 30);
 #if FIX_FRAME
 			if (m > 0 && m != nb) {
 				/* padding bad frame */
@@ -221,10 +223,6 @@
 			}
 #endif
 			k++;
-			if (k % frame_rate[system] == 0) {
-				/* every second */
-				fprintf(stderr, "\n");
-			}
 			fflush(stderr);
 			m = 0;
 		}
Index: fwmpegts.c
===================================================================
--- fwmpegts.c	(revision 186639)
+++ fwmpegts.c	(working copy)
@@ -195,10 +195,9 @@
 		if (len < 0) {
 			if (errno == EAGAIN) {
 				fprintf(stderr, "(EAGAIN) - push 'Play'?\n");
-				if (len <= 0)
-					continue;
-			} else
-				err(1, "read failed");
+				continue;
+			}
+			err(1, "read failed");
 		}
 
 		ptr = (uint32_t *) buf;

From bugmaster at FreeBSD.org Mon Jan 5 11:06:51 2009
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Jan 5 11:07:51 2009
Subject: Current problem reports assigned to freebsd-firewire@FreeBSD.org
Message-ID: <200901051106.n05B6oXr002761@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/113785  firewire   [firewire] dropouts when playing DV on firewire
o kern/74238   firewire   [firewire] fw_rcv: unknown response; firewire ad-hoc w

2 problems total.

From sean.bruno at dsl-only.net Mon Jan 5 19:23:20 2009
From: sean.bruno at dsl-only.net (Sean Bruno)
Date: Mon Jan 5 19:23:52 2009
Subject: kern/118093: [firewire] firewire bus reset hogs CPU, causing data to be lost
Message-ID: <1231181797.21260.8.camel@localhost.localdomain>

Well ... this has really sent me down the rabbit hole the last couple of
days.

There is a need to audit all locking in the firewire stack right now and
I have started that task.  Essentially, threads, callouts, interrupts
and task queues are all jumping around causing context to be switched
from one thread to another.  It's kind of bad in there and I need to
sort it out.

I'm working with a stable/7 tree, so I've started a patch for it.  This
patch has quite a few printf -> device_printf changes in it, so try to
ignore those for now.

The meat of the patch is a judicious implementation of FW_GLOCK() in
certain code areas.  Note that sometimes the code is just trying to get
the lock and then drops it immediately.  This is not very optimal, but
it does the trick.

I'm still seeing a high level of broken log messages
in /var/log/messages but this may help the issue you were seeing.  Give
it a spin.

Sean
-------------- next part --------------
A non-text attachment was scrubbed...
Name: firewire.diff
Type: text/x-patch
Size: 9157 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-firewire/attachments/20090105/b632981f/firewire.bin

From freebsd at sopwith.solgatos.com Tue Jan 6 05:19:01 2009
From: freebsd at sopwith.solgatos.com (Dieter)
Date: Tue Jan 6 05:19:07 2009
Subject: kern/118093: [firewire] firewire bus reset hogs CPU, causing data to be lost
In-Reply-To: Your message of "Mon, 05 Jan 2009 10:56:37 PST." <1231181797.21260.8.camel@localhost.localdomain>
Message-ID: <200901060152.BAA21475@sopwith.solgatos.com>

In message <1231181797.21260.8.camel@localhost.localdomain>, Sean Bruno writes:

> @@ -1388,6 +1387,7 @@
>  	struct fw_device *fwdev;
>
>  	s = splfw();
> +	FW_GLOCK(fc);
>  	fc->status = FWBUSEXPLORE;
>
>  	/* Invalidate all devices, just after bus reset. */
> @@ -1396,6 +1396,7 @@
>  		fwdev->status = FWDEVINVAL;
>  		fwdev->rcnt = 0;
>  	}
> +	FW_GUNLOCK(fc);
>  	splx(s);
>
>  	wakeup((void *)fc);

If the (null these days) spl calls have been replaced with FW_GLOCK/FW_GUNLOCK
calls, can the spl calls be removed now?

> @@ -1922,6 +1924,8 @@
>  	u_int i;
>  	struct firewire_comm *fc = (struct firewire_comm *)sc;
>
> +	FW_GLOCK(fc);
> +	FW_GUNLOCK(fc);
>  	if (stat & OHCI_INT_DMA_IR) {
>  		irstat = atomic_readandclear_int(&sc->irstat);
>  		for(i = 0; i < fc->nisodma ; i++){

This needs to be reviewed by someone who understands what is protected by
getting a lock and then immediately releasing it.  I have no clue.
> @@ -1969,8 +1973,8 @@
>  	OWRITE(sc, OHCI_LNKCTLCLR, OHCI_CNTL_CYCTIMER);
>  #endif
>  	OWRITE(sc, FWOHCI_INTMASKCLR, OHCI_INT_CYC_LOST);
> -	device_printf(fc->dev, "too many cycle lost, "
> -	 "no cycle master presents?\n");
> +	device_printf(fc->dev, "%s: too many cycle lost, "
> +	 "no cycle master presents?\n", __func__);
>  }

Should this be "too many cycles lost, " and "no cycle master present?\n"

>  	uint32_t fun;
>
> +	FW_GLOCK(fc);
>  	device_printf(fc->dev, "Initiate bus reset\n");
>  	sc = (struct fwohci_softc *)fc;
>
> @@ -2487,6 +2495,7 @@
>  	fun |= FW_PHY_ISBR | FW_PHY_RHB;
>  	fun = fwphy_wrdata(sc, FW_PHY_ISBR_REG, fun);
>  #endif
> +	FW_GUNLOCK(fc);
>  }

Does the lock need to protect the printf?

From sean.bruno at dsl-only.net Tue Jan 6 06:20:06 2009
From: sean.bruno at dsl-only.net (Sean Bruno)
Date: Tue Jan 6 06:20:12 2009
Subject: kern/118093: [firewire] firewire bus reset hogs CPU, causing data to be lost
In-Reply-To: <200901060152.BAA21475@sopwith.solgatos.com>
References: <200901060152.BAA21475@sopwith.solgatos.com>
Message-ID: <1231222803.21260.17.camel@localhost.localdomain>

On Mon, 2009-01-05 at 17:52 +0000, Dieter wrote:
> In message <1231181797.21260.8.camel@localhost.localdomain>, Sean Bruno writes:
>
> > @@ -1388,6 +1387,7 @@
> >  	struct fw_device *fwdev;
> >
> >  	s = splfw();
> > +	FW_GLOCK(fc);
> >  	fc->status = FWBUSEXPLORE;
> >
> >  	/* Invalidate all devices, just after bus reset. */
> > @@ -1396,6 +1396,7 @@
> >  		fwdev->status = FWDEVINVAL;
> >  		fwdev->rcnt = 0;
> >  	}
> > +	FW_GUNLOCK(fc);
> >  	splx(s);
> >
> >  	wakeup((void *)fc);
>
> If the (null these days) spl calls have been replaced with FW_GLOCK/FW_GUNLOCK
> calls, can the spl calls be removed now?
>

They most definitely can be removed.  However, they are a roadmap to
where locks SHOULD be.  I'm using them as guideposts through the code to
see if I can figure out what's missing.
> > @@ -1922,6 +1924,8 @@
> >  	u_int i;
> >  	struct firewire_comm *fc = (struct firewire_comm *)sc;
> >
> > +	FW_GLOCK(fc);
> > +	FW_GUNLOCK(fc);
> >  	if (stat & OHCI_INT_DMA_IR) {
> >  		irstat = atomic_readandclear_int(&sc->irstat);
> >  		for(i = 0; i < fc->nisodma ; i++){
>
> This needs to be reviewed by someone who understands what is protected by
> getting a lock and then immediately releasing it.  I have no clue.
>

Indeed, it's gross.  I only sent this patch over as a "test" of sorts to
validate that the issue I'm looking into (locking in the firewire driver)
is actually something related to what you reported in this issue.

> > @@ -1969,8 +1973,8 @@
> >  	OWRITE(sc, OHCI_LNKCTLCLR, OHCI_CNTL_CYCTIMER);
> >  #endif
> >  	OWRITE(sc, FWOHCI_INTMASKCLR, OHCI_INT_CYC_LOST);
> > -	device_printf(fc->dev, "too many cycle lost, "
> > -	 "no cycle master presents?\n");
> > +	device_printf(fc->dev, "%s: too many cycle lost, "
> > +	 "no cycle master presents?\n", __func__);
> >  }
>
> Should this be "too many cycles lost, " and "no cycle master present?\n"
>

Who knows.  I haven't reviewed this specific chunk of code to understand
what it really means as per the FW specs.

> >  	uint32_t fun;
> >
> > +	FW_GLOCK(fc);
> >  	device_printf(fc->dev, "Initiate bus reset\n");
> >  	sc = (struct fwohci_softc *)fc;
> >
> > @@ -2487,6 +2495,7 @@
> >  	fun |= FW_PHY_ISBR | FW_PHY_RHB;
> >  	fun = fwphy_wrdata(sc, FW_PHY_ISBR_REG, fun);
> >  #endif
> > +	FW_GUNLOCK(fc);
> >  }
>
> Does the lock need to protect the printf?

These are gross, overpowered, way too heavy-handed locks that I'm playing
with.  I need to prevent pre-emption of certain events while they are in
progress.  One of these events is the firewire's assertion of "bus
reset" on the firewire device.  I see the h/w interrupt firing before
this code can actually complete, causing the driver to be confused on
occasion.

Thanks for surveying this code with me, it helps to see what other
people's eyes can pick up.  I hope to have this in working order
soon-ish.

Sean

From freebsd at sopwith.solgatos.com Wed Jan 7 05:13:17 2009
From: freebsd at sopwith.solgatos.com (Dieter)
Date: Wed Jan 7 05:13:22 2009
Subject: kern/118093: [firewire] firewire bus reset hogs CPU, causing data to be lost
In-Reply-To: Your message of "Mon, 05 Jan 2009 22:20:03 PST." <1231222803.21260.17.camel@localhost.localdomain>
Message-ID: <200901061733.RAA08944@sopwith.solgatos.com>

In message <1231222803.21260.17.camel@localhost.localdomain>, Sean Bruno writes:

> > >  	uint32_t fun;
> > >
> > > +	FW_GLOCK(fc);
> > >  	device_printf(fc->dev, "Initiate bus reset\n");
> > >  	sc = (struct fwohci_softc *)fc;
> > >
> > > @@ -2487,6 +2495,7 @@
> > >  	fun |= FW_PHY_ISBR | FW_PHY_RHB;
> > >  	fun = fwphy_wrdata(sc, FW_PHY_ISBR_REG, fun);
> > >  #endif
> > > +	FW_GUNLOCK(fc);
> > >  }
> >
> > Does the lock need to protect the printf?
>
> These are gross, overpowered, way too heavy-handed locks that I'm playing
> with.  I need to prevent pre-emption of certain events while they are in
> progress.  One of these events is the firewire's assertion of "bus
> reset" on the firewire device.  I see the h/w interrupt firing before
> this code can actually complete, causing the driver to be confused on
> occasion.

I understand the basic concept of locking, or at least I did many many years
ago when spls really were levels.  Since then they changed spls to not be levels
(at least that's what the man page says) and later replaced with mutex.  I assume
these changes are at least in part to better support SMP.

What I don't understand is things like getting a lock and immediately releasing
it, which appears to me to protect nothing.  Or why the printf needs to be inside
the locked section of code.  I thought the goal was to hold a lock for as short a
time as possible, and it is not clear to me why the printf needs to be protected.
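
[Editorial sketch, not part of the original message.]  The question above
about what an empty lock/unlock pair protects can be explored in userland.
The sketch below assumes FW_GLOCK/FW_GUNLOCK wrap an ordinary mutex and uses
POSIX threads instead of kernel primitives; struct fake_fc, fake_intr() and
run_demo() are hypothetical names, not driver code.  It shows that the pair
acts only as a barrier: the caller waits until whoever currently holds the
lock has left its critical section, and nothing after the unlock is
protected.

```c
#include <pthread.h>

/* Hypothetical stand-in for struct firewire_comm; not the driver's type. */
struct fake_fc {
	pthread_mutex_t mtx;	/* plays the role of the FW_GLOCK mutex */
	int reset_done;		/* set when the "bus reset" writes finish */
	int seen;		/* what the "interrupt handler" observed */
};

/* "Interrupt handler": an empty lock/unlock pair, then unlocked work. */
void *
fake_intr(void *arg)
{
	struct fake_fc *fc = arg;

	pthread_mutex_lock(&fc->mtx);	/* waits out any current holder */
	pthread_mutex_unlock(&fc->mtx);
	fc->seen = fc->reset_done;	/* unlocked, but ordered after holder */
	return (NULL);
}

/* "Bus reset initiator": the handler fires while the lock is still held. */
int
run_demo(void)
{
	struct fake_fc fc;
	pthread_t t;

	pthread_mutex_init(&fc.mtx, NULL);
	fc.reset_done = 0;
	fc.seen = -1;

	pthread_mutex_lock(&fc.mtx);		/* begin the reset sequence */
	pthread_create(&t, NULL, fake_intr, &fc); /* "interrupt" arrives early */
	fc.reset_done = 1;			/* register writes would go here */
	pthread_mutex_unlock(&fc.mtx);		/* handler may now proceed */

	pthread_join(t, NULL);
	pthread_mutex_destroy(&fc.mtx);
	return (fc.seen);
}
```

Because the handler thread is created while the initiator already holds the
mutex, run_demo() returns 1: the handler cannot read reset_done until the
initiator has finished.  The ordering holds only against a holder that took
the lock first; if the handler wins the race for the lock, the empty pair
buys nothing.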

From sean.bruno at dsl-only.net Sat Jan 10 11:07:56 2009
From: sean.bruno at dsl-only.net (Sean Bruno)
Date: Sat Jan 10 11:08:03 2009
Subject: kern/118093: [firewire] firewire bus reset hogs CPU, causing data to be lost
In-Reply-To: <200901061733.RAA08944@sopwith.solgatos.com>
References: <200901061733.RAA08944@sopwith.solgatos.com>
Message-ID: <1231613677.17580.6.camel@localhost.localdomain>

On Tue, 2009-01-06 at 09:33 +0000, Dieter wrote:
> In message <1231222803.21260.17.camel@localhost.localdomain>, Sean Bruno writes:
>
> > > >  	uint32_t fun;
> > > >
> > > > +	FW_GLOCK(fc);
> > > >  	device_printf(fc->dev, "Initiate bus reset\n");
> > > >  	sc = (struct fwohci_softc *)fc;
> > > >
> > > > @@ -2487,6 +2495,7 @@
> > > >  	fun |= FW_PHY_ISBR | FW_PHY_RHB;
> > > >  	fun = fwphy_wrdata(sc, FW_PHY_ISBR_REG, fun);
> > > >  #endif
> > > > +	FW_GUNLOCK(fc);
> > > >  }
> > >
> > > Does the lock need to protect the printf?
> >
> > These are gross, overpowered, way too heavy-handed locks that I'm playing
> > with.  I need to prevent pre-emption of certain events while they are in
> > progress.  One of these events is the firewire's assertion of "bus
> > reset" on the firewire device.  I see the h/w interrupt firing before
> > this code can actually complete, causing the driver to be confused on
> > occasion.
>
> I understand the basic concept of locking, or at least I did many many years
> ago when spls really were levels.  Since then they changed spls to not be levels
> (at least that's what the man page says) and later replaced with mutex.  I assume
> these changes are at least in part to better support SMP.
>
> What I don't understand is things like getting a lock and immediately releasing
> it, which appears to me to protect nothing.  Or why the printf needs to be inside
> the locked section of code.  I thought the goal was to hold a lock for as short a
> time as possible, and it is not clear to me why the printf needs to be protected.

The printf DOES NOT need protection at all.
You are absolutely correct.  The lock should be moved lower in the
function for sure.  This is just my attempt to see if I am looking at
the same problem you are reporting.

Conceptually, I am trying to keep the h/w interrupt which fires during a
bus reset from preempting this function and executing before this
function is finished.  The acquisition of the lock and release keeps the
interrupt handler from executing before the setting of bus reset is
finished.

I hope to fine tune the locks after I have confirmed operational
goodness.

Sean

From bugmaster at FreeBSD.org Mon Jan 12 03:06:51 2009
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Jan 12 03:07:48 2009
Subject: Current problem reports assigned to freebsd-firewire@FreeBSD.org
Message-ID: <200901121106.n0CB6olk091963@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/113785  firewire   [firewire] dropouts when playing DV on firewire
o kern/74238   firewire   [firewire] fw_rcv: unknown response; firewire ad-hoc w

2 problems total.

From freebsd at sopwith.solgatos.com Mon Jan 12 21:07:24 2009
From: freebsd at sopwith.solgatos.com (Dieter)
Date: Mon Jan 12 21:07:30 2009
Subject: kern/118093: [firewire] firewire bus reset hogs CPU, causing data to be lost
In-Reply-To: Your message of "Sat, 10 Jan 2009 10:54:37 PST." <1231613677.17580.6.camel@localhost.localdomain>
Message-ID: <200901122304.XAA14763@sopwith.solgatos.com>

Sean, I tried your patch, but bad news: a bus reset still tromps on Ethernet.
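
[Editorial sketch, not part of the original messages.]  Sean's remark in this
thread that the lock "should be moved lower in the function" can be
illustrated with a userland analogue.  Everything below is hypothetical:
struct fake_softc, fake_ibr() and the FAKE_PHY_* bit values are illustrative
stand-ins, and a POSIX mutex takes the place of whatever FW_GLOCK wraps in
the driver.  The log message is emitted before the lock is taken, so the
critical section covers only the read-modify-write of the pretend PHY
register.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define FAKE_PHY_ISBR	0x40	/* hypothetical bit values, for illustration */
#define FAKE_PHY_RHB	0x80

/* Hypothetical stand-in for the controller softc. */
struct fake_softc {
	pthread_mutex_t mtx;	/* plays the role of the FW_GLOCK mutex */
	uint32_t phy_reg;	/* pretend PHY register */
};

/* Userland analogue of a bus-reset initiator with a narrowed lock scope. */
void
fake_ibr(struct fake_softc *sc)
{
	uint32_t fun;

	/* Logging touches no shared state, so it stays outside the lock. */
	fprintf(stderr, "Initiate bus reset\n");

	pthread_mutex_lock(&sc->mtx);
	fun = sc->phy_reg;			/* read */
	fun |= FAKE_PHY_ISBR | FAKE_PHY_RHB;	/* modify */
	sc->phy_reg = fun;			/* write back */
	pthread_mutex_unlock(&sc->mtx);
}
```

A caller would initialize the mutex with pthread_mutex_init(), set phy_reg,
and call fake_ibr(); only the three register steps are serialized against
other holders of the same mutex.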

From bugmaster at FreeBSD.org Mon Jan 19 03:06:57 2009
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Jan 19 03:07:40 2009
Subject: Current problem reports assigned to freebsd-firewire@FreeBSD.org
Message-ID: <200901191106.n0JB6ubZ062937@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/113785  firewire   [firewire] dropouts when playing DV on firewire
o kern/74238   firewire   [firewire] fw_rcv: unknown response; firewire ad-hoc w

2 problems total.

From sean.bruno at dsl-only.net Mon Jan 19 19:50:14 2009
From: sean.bruno at dsl-only.net (Sean Bruno)
Date: Mon Jan 19 19:50:21 2009
Subject: Firewire patch
Message-ID: <1232423412.8966.2.camel@localhost.localdomain>

Sorry it's taking me so long to wrap my brain around this code folks.

Here is a small-ish patch I'd like to get committed soon.  This removes
the filter implementation of the bus reset interrupt.  It moves some
malloc's into a more appropriate area and fixes a potential deadlock in
SBP.

Take a peek and give it a spin on stable/7

Sean
-------------- next part --------------
A non-text attachment was scrubbed...
Name: firewire.diff
Type: text/x-patch
Size: 9258 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-firewire/attachments/20090120/f001787a/firewire.bin

From freebsd at sopwith.solgatos.com Tue Jan 20 21:16:11 2009
From: freebsd at sopwith.solgatos.com (Dieter)
Date: Tue Jan 20 21:16:18 2009
Subject: Firewire patch
In-Reply-To: Your message of "Mon, 19 Jan 2009 19:50:12 PST." <1232423412.8966.2.camel@localhost.localdomain>
Message-ID: <200901210010.AAA00037@sopwith.solgatos.com>

> Hrm ... I don't seem to have sent this to you.  Weird.

That's ok, I'm subscribed to the -firewire@ list and saw it there.
BTW, your ISP doesn't like me so I can't email you off-list.
Probably the usual brain dead throw-the-baby-out-with-the-bathwater
attempts at spam reduction.

> Here is a small-ish patch I'd like to get committed soon.  This removes
> the filter implementation of the bus reset interrupt.  It moves some
> malloc's into a more appropriate area and fixes a potential deadlock in
> SBP.
>
> Take a peek and give it a spin on stable/7

I looked at it, but I don't understand it well enough to count
as a code reviewer.

I applied this patch to 7.0.  (Still waiting for the 7.1 bloatfest
to finish downloading...)  Patch rejected the FWOHCI_INTFILT stuff,
but I think the code does what you want.  However I'm not sure what
all has changed between 7.0 and whatever bleeding-edge bits you're
working with, there could be something significant.

Results:

The hack to put the NEC controller into non-CYCLEMASTER mode
still works, and video plays ok.  So it looks like your patch
probably doesn't break anything for me.

The patch did not fix PR 118093 (bus reset still blocks Ethernet),
or PR 113785 (the VIA controller still refuses to go into
non-CYCLEMASTER mode).

While I had the camcorder plugged into the VIA, I booted the kernel
with your previous patch (2009-01-05), but no joy.  I wonder if
the VIA controller might be faster or slower than the NEC, so fixing
the locking might fix PR 113785.

If you'd like me to do further testing, such as with 7.1 once I get
it installed, and/or I can add my workaround for PR 118093 (change
some printfs to log(9)) and leave your patch in and test it for more
than a minute or two, let me know.

From bugmaster at FreeBSD.org Mon Jan 26 03:06:55 2009
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Jan 26 03:07:41 2009
Subject: Current problem reports assigned to freebsd-firewire@FreeBSD.org
Message-ID: <200901261106.n0QB6slj024232@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/113785  firewire   [firewire] dropouts when playing DV on firewire
o kern/74238   firewire   [firewire] fw_rcv: unknown response; firewire ad-hoc w

2 problems total.

From sean.bruno at dsl-only.net Fri Jan 30 23:44:04 2009
From: sean.bruno at dsl-only.net (Sean Bruno)
Date: Fri Jan 30 23:44:11 2009
Subject: Updates to -current
Message-ID: <1233387842.5501.12.camel@localhost.localdomain>

Here is an update to -current that I am working on.

This update fixes the following:

SBP was not properly detaching from CAM if a target is powered off or
disconnected from the firewire bus.  This would manifest itself by
causing things like "camcontrol rescan all" to hang indefinitely.

SBP would have panic'd in sbp_orb_pointer() if fw_asyreq() failed.

Add locking during BUS_RESET conditions to keep the bus_reset handler
from firing before the code was finished asserting the BUS_RESET
condition.

Remove FWOHCI_INTFILT as the bus reset handler really does WAY more than
it should.

Rework the generation indicator/flag to more properly adhere to the
specifications.  In this context, the generation flag should be 0 or 1
to indicate that the CROM has changed.  It should not be incremented
indefinitely without bounds.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: firewire.diff
Type: text/x-patch
Size: 9552 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-firewire/attachments/20090131/b562039e/firewire.bin
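
[Editorial sketch, not part of the original messages.]  Returning to the
fwdv.c patch at the top of this thread: the elapsed-time arithmetic there
hardcodes 30 frames per second (1800 frames per minute).  The sketch below
does the same conversion with the rate as a parameter; format_elapsed() is a
hypothetical helper, not part of fwcontrol, and it assumes frame_rate[system]
in fwdv.c holds frames per second, as the removed "every second" check
suggests.

```c
#include <stdio.h>

/*
 * Format frame count k as "h:mm:ss ff" for a given frame rate.
 * Returns the snprintf() result (length that would have been written).
 */
int
format_elapsed(char *buf, size_t len, int k, int fps)
{
	int secs = k / fps;		/* whole seconds elapsed */

	return (snprintf(buf, len, "%d:%02d:%02d %2d",
	    secs / 3600,		/* hours */
	    (secs / 60) % 60,		/* minutes */
	    secs % 60,			/* seconds */
	    k % fps));			/* frame within the current second */
}
```

With fps set to 30 this yields the same fields as the patch's
k / (1800 * 60), (k / 1800) % 60, (k / 30) % 60 and k % 30.  Passing 25
instead would cover a 25-frame system without changing the arithmetic.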