[RFC] BPF timestamping

Jung-uk Kim jkim at FreeBSD.org
Fri Jun 11 01:24:17 UTC 2010


On Thursday 10 June 2010 05:45 am, Bruce Evans wrote:
> On Wed, 9 Jun 2010, Jung-uk Kim wrote:
> > bpf(4) can only timestamp packets with microtime(9).  I want to
> > expand it to be able to use different format and resolution.  The
> > patch is here:
> >
> > http://people.freebsd.org/~jkim/bpf_tstamp.diff
> >
> > With this patch, we can select different format and resolution of
> > the timestamps.  It is done via ioctl(2) with BIOCSTSTAMP
> > command. Similarly, you can get the current format and resolution
> > with BIOCGTSTAMP command.  Currently, the following functions are
> > available:
> >
> > 	BPF_T_MICROTIME		microtime(9)
> > 	BPF_T_NANOTIME		nanotime(9)
> > 	BPF_T_BINTIME		bintime(9)
> > 	BPF_T_MICROTIME_FAST	getmicrotime(9)
> > 	BPF_T_NANOTIME_FAST	getnanotime(9)
> > 	BPF_T_BINTIME_FAST	getbintime(9)
> > 	BPF_T_NONE		ignore time stamps
>
> This has too many timestamp types, yet not one timestamp type which
> is any good except possibly BPF_T_NONE, and not one monotonic
> timestamp type.  Only external uses and compatibility require use
> of CLOCK_REALTIME.
>
> I recently tried looking at timeout resolution on FreeBSD cluster
> machines using ktrace, and found ktrace unusable for this.  At
> first I blamed the slowness of the default misconfigured
> timecounter ACPI-fast, but the main problem was that I forgot my
> home directory was on nfs, and nfs makes writing ktrace records
> take hundreds of times longer than on local file systems. 
> ACPI-fast seemed to be taking nearly 1000 uS, but it was nfs taking
> that long.
>
> Anyway, ACPI-fast takes nearly 1000 nS, which is many times too
> long to be good for timestamping individual syscalls or packets,
> and makes sub-microseconds resolution useless.  The above non-get
> *time() interfaces still use the primary timecounter, and this
> might be slow even if it is not misconfigured.  The above
> get*time() interfaces are fast only at the cost of being broken. 
> Among other bugs, their times only change at relatively large
> intervals which should become infinity with tickless kernels. 
> (BTW, icmp timestamps are still broken on systems with hz < 100. 
> Someone changed microtime() to getmicrotime(), but getmicrotime()
> cannot deliver the resolution of 1 mS supported by icmp timestamps
> unless these intervals are <= 1 mS.)

Please note that I am not trying to solve timecounter issues here.
The current BPF timestamping falls short for two main reasons:
1) it is too slow with some timecounter hardware, as you have noted,
and 2) we have no API *at all* to change the timestamp resolution,
accuracy, format, offset, or anything else.

The most common trick for the first problem is to use getmicrotime(9)
instead of microtime(9) if the users don't care much about accuracy.
For people who want to capture as many packets as possible without
spending a fortune, it works pretty well.  However, suppose you have
multiple interfaces.  You want good timestamps from a slower
controller (LAN side) and less accurate timestamps from a super fast
controller (WAN side), but today you can't have both.  My patch
solves this problem by assigning a timestamping function per
descriptor.  So, for example, you can use the same resolution but
different accuracies on different descriptors.
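
To illustrate the per-descriptor idea, here is a rough userland
sketch.  It is not code from the patch: model_desc, precise_stamp,
fast_stamp, tick_update and model_catch_packet are hypothetical names,
and gettimeofday(2) merely stands in for the kernel clock functions.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical model: a timestamp function that fills in a timeval. */
typedef void (*tstamp_fn)(struct timeval *);

/* Reads the clock on every call; stands in for microtime(9). */
static void precise_stamp(struct timeval *tv)
{
	gettimeofday(tv, NULL);
}

/* Cached time, refreshed only when tick_update() runs. */
static struct timeval cached_tick;

/* Would be called periodically, e.g. from a timer tick. */
static void tick_update(void)
{
	gettimeofday(&cached_tick, NULL);
}

/* Copies the cached value; cheap but coarse, like getmicrotime(9). */
static void fast_stamp(struct timeval *tv)
{
	*tv = cached_tick;
}

/* Each descriptor carries its own timestamp function. */
struct model_desc {
	tstamp_fn stamp;
};

/* Stamp a captured packet with whatever the descriptor selected. */
static void model_catch_packet(const struct model_desc *d,
    struct timeval *tv)
{
	d->stamp(tv);
}
```

A descriptor on the LAN side would be set up with precise_stamp and
one on the WAN side with fast_stamp, so both deliver struct timeval
but with different accuracy and cost.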

The second problem is a little bit harder to solve without breaking
libpcap and its consumers, as they expect struct timeval and nothing
else.  That's why I had to introduce a new header format with compat
shims.  In fact, struct bpf_hdr (and struct pcap_sf_pkthdr) are really
obsolete, and people have been talking for many years about pcap NG,
which can store timestamps in variable resolutions and offsets.
However, we can only use the default resolution even if libpcap gets
the new format, because we are stuck with struct bpf_hdr[1].
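
As a rough illustration of what a compat shim has to do, the sketch
below down-converts a bintime-style timestamp (seconds plus a 64-bit
binary fraction) to microsecond and nanosecond forms.  The model_*
types and functions are hypothetical stand-ins that I am assuming
mirror the layout of struct bintime; this is not the code in the
patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct bintime: frac counts 2^-64 s. */
struct model_bintime {
	int64_t  sec;
	uint64_t frac;
};

/* Hypothetical stand-ins for struct timeval and struct timespec. */
struct model_timeval {
	int64_t tv_sec;
	int32_t tv_usec;
};

struct model_timespec {
	int64_t tv_sec;
	int32_t tv_nsec;
};

/* Scale the top 32 bits of the fraction to microseconds. */
static void model_bt2tv(const struct model_bintime *bt,
    struct model_timeval *tv)
{
	tv->tv_sec = bt->sec;
	tv->tv_usec = (int32_t)(((uint64_t)1000000 *
	    (uint32_t)(bt->frac >> 32)) >> 32);
}

/* Scale the top 32 bits of the fraction to nanoseconds. */
static void model_bt2ts(const struct model_bintime *bt,
    struct model_timespec *ts)
{
	ts->tv_sec = bt->sec;
	ts->tv_nsec = (int32_t)(((uint64_t)1000000000 *
	    (uint32_t)(bt->frac >> 32)) >> 32);
}
```

A consumer that only understands struct timeval would get the output
of the first conversion; one that understands a finer format can keep
the extra resolution.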

BTW, I updated my patch, which includes monotonic clocks now.

	BPF_T_MICROTIME_MONOTONIC	microuptime(9)
	BPF_T_NANOTIME_MONOTONIC	nanouptime(9)
	BPF_T_BINTIME_MONOTONIC		binuptime(9)
	BPF_T_MICROTIME_MONOTONIC_FAST	getmicrouptime(9)
	BPF_T_NANOTIME_MONOTONIC_FAST	getnanouptime(9)
	BPF_T_BINTIME_MONOTONIC_FAST	getbinuptime(9)

http://people.freebsd.org/~jkim/bpf_tstamp2.diff

Thanks for the hint, Bruce, although you may say there are more bogus 
clock types now. ;-)

Enjoy,

Jung-uk Kim

[1] libpcap has had limited support for the pcap NG format since 1.1.0,
and my patch was written with that format in mind.  If my patch gets
committed, I am going to submit a libpcap patch upstream to introduce
the new struct bpf_xhdr.

