svn commit: r227778 - head/sys/net

Sat Dec 3 04:30:59 UTC 2011

On 03/12/2011, at 12:02 PM, Jung-uk Kim wrote:

[snip]

>>>>> 
>>>>> - The snippet of code at the beginning of catchpacket() that
>>>>> was manipulating the struct bintime derived from bpf_gettime()
>>>>> was gross and has been removed in favour of selecting the
>>>>> right {get}bin{up}time() function call in bpf_gettime().
>>>> 
>>>> I did that to reduce branching.  Since you are introducing more
>>>> branches, it warrants a function pointer now.
>> 
>> There's another reason, BTW.  If mbufs are tagged with timestamps
>> (where only monotonic timestamps are allowed), they must be
>> converted within the bpf.c.  I forgot all the details. :-(
>> 
>>> I see, thanks for the explanation. Could you elaborate a bit more
>>> about how you would implement the function pointer idea? I'm also
>>> curious in the !FFCLOCK case just how much overhead having the
>>> 2-layer nested if/else adds. I'm not an very optimisation savvy
>>> person, but I'm wondering if it's actually worth micro-optimising
>>> this code.
>>> 
>>> My initial thoughts about your function pointer idea lead to
>>> adding a function pointer in the bpf_d and setting it to the
>>> appropriate function to get the timestamp from at bpf_d creation
>>> or ioctl time. Whilst I like this idea, I can't see how it would
>>> work given that the various functions involved in time/ffcounter
>>> stamp generation all have different signatures.
>>> 
>>> We could have multiple variants of bpf_gettime() which each call
>>> the appropriate underlying functions to generate the appropriate
>>> stamp. Would add quite a lot of code but would reduce the
>>> overhead of calling bpf_gettime() to an indirect function call +
>>> the underlying stamp generation function call. This also solves
>>> the problem of multiple function signatures.
>> 
>> Please see my patch:
>> 
>> http://people.freebsd.org/~jkim/bpf_ffclock.diff
> 
> I booted up the kernel and found it just crashes because of stupid 
> typos.  Now a new patch is uploaded in place.  Sorry.
> 
> Anyway, I see no regression nor ABI breakage. :-)
> 
> Jung-uk Kim
> 
>> This is what I originally intended to propose (just
>> compile-tested). I did not use union *intentionally* to make the
>> patch simpler. Instead, I added BPF_FFCOUNT() macro.  It's kinda
>> ugly but I don't see any compelling reason to introduce the union
>> (yet).
>> 
>>> We would also need to add the function pointer to the bpf_d
>>> struct which I guess breaks the ABI, something we're trying to
>>> avoid with this patch as we want to MFC it.
>> 
>> It does not affect user applications as bpfdesc.h is only meant for
>> kernel uses.  Also, it is very hard for me to imagine any
>> third-party drivers that use struct bpf_d directly.
>> 
>> Jung-uk Kim

Jung-uk, thanks for the patch, I understand much better.

Overall I am definitely not against this patch. I see two minor issues that require some attention though.

Using the higher level ffclock_[get]binuptime() instead of ffclock_abstime() adds an extra function call. My performance metric of interest is timestamping latency, because all things being equal, the variablity of the timestamping delay gives a less accurate picture of the network and affects the synchronisation daemon and the final quality of the clock.
Although I agree that the patch we proposed with Lawrence can be improved on this front, adding one more level of abstraction to the timestamping function goes the opposite direction. The earlier the timestamp is created, the better the results.

I am definitely not very savvy on the code optimisation front, and I need to be educated there. I am wondering if removing the branching in the code is worth this extra variability in latency. 

Looking ahead to the next patch we want to propose, the BPF header will contain a timestamp in seconds from one of the two clocks and a feed-forward counter. Here is a very simplistic proof of concept based on Lawrence's patch. I just put it there to support the discussion:

http://www.cubinlab.ee.unimelb.edu.au/~jrid/patches/r228191/bpfffcounter_concept_v2.patch

Since it is essential to read the hardware counter only once to produce the timestamp in second and the ffcounter, the prototype of the timestamping function stored in the bpf_d would have to be changed (if the kernel is compiled with the FFCLOCK option, at least).
For example, if BPF_T_FFCLOCK is set, this change will make use of ffclock_abstime() and solves the performance issue I raised above.
If the clock flag is not set, then the [get]binuptime() will need a wrapper function, and they will bear the timestamping latency cost.

Is this acceptable? If yes, then I don't have any objections to the patch.

Julien