dmtpps driver on beaglebone [was: Is it a good idea to use a usb-serial adapter for PPS input? Yes, it is.]
Ian Lepore
ian at freebsd.org
Mon Aug 19 15:52:08 UTC 2019
On Mon, 2019-08-19 at 18:22 +0300, Konstantin Belousov wrote:
> On Mon, Aug 19, 2019 at 08:47:46AM -0600, Ian Lepore wrote:
> > I first ran into the problem on a beaglebone a few years ago, not that
> > long after the number was reduced to two. The number 4 is pretty much
> > just a vague-ish memory of the debugging I did back then. When you
> > don't have enough timehands, the symptom is that you lose most of the
> > pps events (but enough get captured that ntpd will more or less run
> > properly on the reduced data).
> >
> > The busier the system is, the more events get captured successfully,
> > because the loss is caused when the pps pulse happens while sleeping in
> > cpu_idle(); the path that unwinds out of there leads to calling
> > tc_windup 3 times, iirc, which means the value latched in the
> > timecounter hardware belongs to a set of timehands whose generation
> > count is not valid anymore because of all the tc_windup calls, and the
> > event gets discarded because of the generation mismatch.
>
> Is the problem specific to the platform, then?
I think it is specific to timecounters that implement pps capture using
the polling mechanism (tc_poll_pps pointer is non-NULL), and I remember
that there was something about the sequence of events that made me
think it would only be a problem on a single-core system. Or maybe
just way more likely on a single-core system (I've got similarly-structured
drivers on multicore systems that don't have this problem).
But I don't think it was so much platform-specific as specific to that
pair of conditions.
>
> The issue you see might come from tc_ticktock(), but it is strange.
> All calls to tc_windup() are serialized by spinlock. tc_ticktock(),
> potentially being called from interrupt context, uses try-lock to avoid
> spinning while other CPU does the update. But this is impossible on UP
> machine, because spinlock in top half disables interrupts, which means
> that event that triggers tc_ticktock() call and which trylock fails
> should be impossible, until the top half finishes.
>
> Having too many timehands is bad because readers get a higher chance of
> finding an obsolete th pointer. With the rework of the generation counts for
> timehands, a reader would not use stale data, but it might have to spin for a
> somewhat prolonged time.
>
> In fact, it might be worth trying to reduce the timehands count from two
> to one, since then there is no non-current th at all.
The driver involved is arm/ti/am335x/am335x_dmtpps.c. The basic
sequence that occurs is:
The dmtpps_poll() function is called (via the timecounter.tc_poll_pps
pointer), and it calls pps_capture(), then schedules a taskqueue_fast
task to do the rest of the processing. That task function calls
pps_event(), and most of the time pps_event() does nothing because it
hits this early out:
	/* If the timecounter was wound up underneath us, bail out. */
	if (pps->capgen == 0 || pps->capgen !=
	    atomic_load_acq_int(&pps->capth->th_generation))
		return;
I forget the exact sequence, but there was something about getting from
the point where the taskqueue task begins running to the point where it
calls pps_event() that almost always burned through 3 timehands
generations, but if the machine was busy enough, sometimes there would
only be one windup and the event would get processed properly.
I don't remember why I used a taskqueue task to separate the capture
and event processing. Maybe so I could use a mutex to protect the pps
event state, and it was written before I extended the internal pps api
stuff to let drivers use a spin mutex with pps_ioctl().
-- Ian