svn commit: r227778 - head/sys/net

Sun Dec 4 12:31:19 UTC 2011

On 12/03/11 12:02, Jung-uk Kim wrote:
> On Friday 02 December 2011 07:27 pm, Jung-uk Kim wrote:
>> On Thursday 01 December 2011 11:43 pm, Lawrence Stewart wrote:
>>> On 12/02/11 03:43, Jung-uk Kim wrote:
>>>> On Thursday 01 December 2011 10:11 am, Lawrence Stewart wrote:
>>>>> On 11/30/11 05:09, Jung-uk Kim wrote:
>>>>>> On Tuesday 29 November 2011 11:13 am, Lawrence Stewart wrote:
[snip]
>>>>>>> Here's the first of the patches:
>>>>>>>
>>>>>>> http://people.freebsd.org/~lstewart/patches/misc/ffclock_bpf
>>>>>>> _i nt act abi_10.x.r228130.patch
>>>>>>
>>>>>> I only glanced at it but it looks very close to what I wanted
>>>>>> to suggest.
>>>>>
>>>>> Final candidate patch is at:
>>>>>
>>>>> http://people.freebsd.org/~lstewart/patches/misc/ffclock_bpf_i
>>>>> nt act abi_10.x.r228180.patch
>>>>>
>>>>> Assuming it passes the "make tinderbox" build I'm currently
>>>>> running and no further input is received from interested
>>>>> parties, I plan to commit it in ~10 hours.
>>>>>
>>>>> Changes since the r228130 patch I sent previously:
>>>>>
>>>>> - The new flags in bpf.h are added unconditionally so that
>>>>> they can always be referenced at compile time and a decision
>>>>> made at runtime as to whether a flag will be set or not. Using
>>>>> one of the new flags when the kernel doesn't have FFCLOCK
>>>>> compiled in results in the flag being ignored. An app should
>>>>> check for the existence of the "ffclock" kernel feature or the
>>>>> "kern.sysclock" sysctl tree before attempting to use the
>>>>> flags.
>>>>>
>>>>> - This patch will hopefully be MFCed at some point, so I added
>>>>> a CTASSERT to bpf.c to ensure that the ABI of structs
>>>>> bpf_hdr32, bpf_hdr and bpf_xhdr remains intact when FFCLOCK is
>>>>> enabled and the union of a ffcounter and struct
>>>>> timeval32/timeval/bpf_ts is switched in.
>>>>>
>>>>> - bpf_gettime() more comprehensively covers all the possible
>>>>> cases of flag combination and does sensible things for each
>>>>> case (even though some cases are rather silly).
>>>>>
>>>>> - The snippet of code at the beginning of catchpacket() that
>>>>> was manipulating the struct bintime derived from bpf_gettime()
>>>>> was gross and has been removed in favour of selecting the
>>>>> right {get}bin{up}time() function call in bpf_gettime().
>>>>
>>>> I did that to reduce branching.  Since you are introducing more
>>>> branches, it warrants a function pointer now.
>>
>> There's another reason, BTW.  If mbufs are tagged with timestamps
>> (where only monotonic timestamps are allowed), they must be
>> converted within the bpf.c.  I forgot all the details. :-(

We should document this knowledge in some code comments.

>>> I see, thanks for the explanation. Could you elaborate a bit more
>>> about how you would implement the function pointer idea? I'm also
>>> curious in the !FFCLOCK case just how much overhead having the
>>> 2-layer nested if/else adds. I'm not an very optimisation savvy
>>> person, but I'm wondering if it's actually worth micro-optimising
>>> this code.
>>>
>>> My initial thoughts about your function pointer idea lead to
>>> adding a function pointer in the bpf_d and setting it to the
>>> appropriate function to get the timestamp from at bpf_d creation
>>> or ioctl time. Whilst I like this idea, I can't see how it would
>>> work given that the various functions involved in time/ffcounter
>>> stamp generation all have different signatures.
>>>
>>> We could have multiple variants of bpf_gettime() which each call
>>> the appropriate underlying functions to generate the appropriate
>>> stamp. Would add quite a lot of code but would reduce the
>>> overhead of calling bpf_gettime() to an indirect function call +
>>> the underlying stamp generation function call. This also solves
>>> the problem of multiple function signatures.
>>
>> Please see my patch:
>>
>> http://people.freebsd.org/~jkim/bpf_ffclock.diff
>
> I booted up the kernel and found it just crashes because of stupid
> typos.  Now a new patch is uploaded in place.  Sorry.
>
> Anyway, I see no regression nor ABI breakage. :-)

struct bpf_d being part of the ABI was the main thing I was concerned 
about, so given that it isn't I like your approach a lot.

As noted by Julien, this approach does introduce problems with respect 
to the follow up patch that adds a permanent bh_ffcounter member to the 
bpf header. I thought the fromclock() API would sufficiently meet the 
needs of consumers like BPF, but if we were to proceed with something 
like Jung-uk's proposed patch I don't think that's true anymore.

We decided to bite the bullet and devise an API that is more compact and 
can return all appropriate time information from all underlying 
sysclocks in an efficient manner - something like a more generic version 
of ffclock_abstime(). Julien and I spent quite some time today nutting 
out details, and Julien has done a proof of concept patch against my 
proposed BPF patch which looks good as a starting point for discussion:

http://www.cubinlab.ee.unimelb.edu.au/~jrid/patches/r228254/sysclock_snap.patch

It needs a bit more work and should be split into two patches (one 
introducing the API and the other being the BPF changes).

With something like this though, the BPF patch becomes radically 
simplified in the FFCLOCK case. We don't even need a function pointer 
cached in the bpf_d anymore, but could cache a struct sysclock_snap 
there instead if we really wanted to minimise overhead in bpf_gettime().

We'll have a go at refining the patch tomorrow hopefully, but wanted to 
put this out there for early consideration.

Cheers,
Lawrence