svn commit: r227778 - head/sys/net

Wed Nov 23 06:42:19 UTC 2011

On 23/11/2011, at 1:00 AM, Lawrence Stewart wrote:

> On 11/23/11 00:30, John Baldwin wrote:
>> On Monday, November 21, 2011 2:28:10 am Lawrence Stewart wrote:
>>> On 11/21/11 17:18, Julien Ridoux wrote:
>>>> 
>>>> On 21/11/2011, at 4:39 PM, Lawrence Stewart wrote:
>>>> 
>>>>> On 11/21/11 16:12, Ben Kaduk wrote:
>>>>>> On Sun, Nov 20, 2011 at 11:17 PM, Lawrence
>>>>>> Stewart<lstewart at freebsd.org>    wrote:
>>>>>>> Author: lstewart Date: Mon Nov 21 04:17:24 2011 New Revision:
>>>>>>> 227778 URL: http://svn.freebsd.org/changeset/base/227778
>>>>>>> 
>>>>>>> Log: - When feed-forward clock support is compiled in, change
>>>>>>> the BPF header to contain both a regular timestamp obtained
>>>>>>> from the system clock and the current feed-forward ffcounter
>>>>>>> value. This enables new possibilities including
>>>>>> 
>>>>>> Is it really necessary to make the ABI dependent on a kernel
>>>>>> configuration option?  This causes all sorts of headaches if
>>>>>> loadable modules ever want to use that ABI, something that we
>>>>>> just ran into with vm_page_t and friends and had a long thread on
>>>>>> -current about.
>>>>> 
>>>>> Fair question. Julien, if pcap and other consumers will happily
>>>>> ignore the new ffcount_stamp member in the bpf header, is there any
>>>>> reason to conditionally add the ffcounter into the header struct?
>>>> 
>>>> It is a valid question indeed. The feedback I have received so far
>>>> was to not have the feed-forward clock support be a default kernel
>>>> configuration option. What follows is based on this assumption.
>>>> 
>>>> The commit (r227747) introduces sysctl that are conditioned by the
>>>> same "FFCLOCK" kernel configuration option. If a loadable module
>>>> tests for the presence of this sysctl, it will know if the
>>>> ffcount_stamp member is available. Is it too much of a hack?
>>>> 
>>>> Alternatively, if the ffcounter is added to the bpf header
>>>> unconditionally, the ffcount_stamp member can be set to 0. Loadable
>>>> modules will then see a consistent ABI but will retrieve a
>>>> meaningless value.
>>>> 
>>>> I am not sure which option makes more sense, any preference?
>>> 
>>> If I understand the issues correctly, I think the appropriate path
>>> forward is to remove the conditional change to the bpf header and have
>>> ffcount_stamp become a permanent member of the struct. We'll just leave
>>> the member uninitialised in the !FFCLOCK case. This change will make the
>>> patch un-MFCable, but I think that's ok.
>>> 
>>> As to the issue of how a kernel module would detect if it's being loaded
>>> into a FFCLOCK enabled kernel, why wouldn't we expect modules to
>>> "#include opt_ffclock.h" and conditionally compile code based on FFCLOCK
>>> being defined? Is there a use case for run-time (as opposed to
>>> compile-time) module detection of feed-forward clock capabilities?
>> 
>> Think of standalone modules that are not built as part of a kernel (e.g.
>> 3rd party device drivers).  In general we should avoid having structures
>> change size for kernel options, especially common structures.  It just adds
>> lots of pain and suffering and complexity.  We are stuck with it for PAE on
>> i386 (which causes pain), and for LOCK_PROFILING (but that is sufficiently
>> rare and expensive it seems to be ok).  I think 8 bytes for bpf packet is
>> not sufficiently expensive to justify the extra headache.  Just always leave
>> the new field in.
> 
> hmm... Julien almost has a patch finished which accomplishes what my most recent email in this thread describes. Julien, I suggest we get it finished and follow up to this thread with a pointer to the patch for people to look at. If there's still a strong feeling that what it does is going to bring pain we can do away with the new BPF_FFCOUNTER config option and have the bpf header struct just grow by 8 bytes.
> 
> Stay tuned...

Thanks all for the feedback. With some delay, I have a patch against r227871 that implements what Lawrence proposed. You can find it here:
http://www.cubinlab.ee.unimelb.edu.au/~jrid/patches/ffclock-bpf-header-r227871.patch

I have tested this under a few typical scenario, it works as expected but already brings some headaches (hence the long delay mentioned above :-)).

I thought a bit more of user cases. I believe many of them call for having both feed-forward counter and its conversion in second be present in the BPF header. For example, this allows to have absolute packet departure/arrival times (as per usual), but also provides the opportunity to compute inter-arrival times accurately using the difference clock.
There are other examples I can think of, and if one believe the feed-forward clock approach becomes more popular, such usages will be more and more common.

Assuming the BPF header grows by 8 bytes independent of any kernel option, I admit that the current implementation is a bit ugly. The BPF structure is not nicely packed and looks clunky. Ideally, the feed-forward counter should be placed just below the bh_tstamp member, but this would require libpcap and all ports depending on it to be recompiled after this change.

What is your favourite option?

Julien