Implementation of Sampling for BPF

Peter Wood peter at alastria.net
Sun Jan 6 14:01:55 PST 2008


Evening,

> I don't think that modifying bpf.c is good solution, as userland is not 
> the only consumer of BPF, think, for example, about ng_bpf. Moreover, 
> what is the purpose of sampling, after all? BPF was never intended to be 
> reliable every-packet solution. 

Certainly other things use BPF; however, in my case I'm not using them. The 
1-in-X solution I have developed so far can be turned on and off, and if it's 
of huge concern it could be wrapped in #ifdefs so that a kernel config option 
is required to include it.

I'm not looking to transform BPF into a solution that reliably samples every 
packet. I am trying to define which packets it discards, so that there is an 
equal chance of sampling anything that happens, rather than an 
unknown/unpredictable chance.
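
To illustrate what I mean by an equal chance, here is a rough userland sketch 
of a 1-in-X decision. It is not the actual patch and does not touch bpf.c; the 
names (struct sampler, sampler_keep, and so on) are made up for this example, 
and the xorshift32 generator is just a cheap stand-in for whatever random 
source the kernel would really use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sampler state; none of these names exist in bpf.c. */
struct sampler {
	bool     enabled;	/* runtime on/off switch */
	uint32_t rate;		/* keep roughly 1 packet in `rate` */
	uint32_t state;		/* xorshift32 state, must be non-zero */
};

/* Cheap xorshift32 PRNG step (assumption: good enough for sampling). */
static uint32_t
sampler_next(struct sampler *s)
{
	uint32_t x = s->state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	s->state = x;
	return (x);
}

/*
 * Decide whether a packet is passed on. With sampling disabled every
 * packet is kept; otherwise each packet independently has about a
 * 1-in-rate chance (the modulo introduces a tiny bias for rates that
 * are not powers of two).
 */
static bool
sampler_keep(struct sampler *s)
{
	if (!s->enabled || s->rate <= 1)
		return (true);
	return (sampler_next(s) % s->rate == 0);
}
```

Each packet gets the same chance of being kept regardless of load, which is 
the property I am after. A plain counter (keep every Xth packet) would be 
cheaper, but it can line up with periodic traffic.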

I wanted to stop the packet being sent to BPF as high up the kernel chain as 
possible, so as to save as much CPU time as possible. There's no point in 
capturing everything we can and then having the userland program selectively 
chuck stuff when it could be done before all the various copying/switching/etc.

Additionally, it would be nice to limit the number of packets that are 
processed through sampling; running some of our servers at 100% load is not 
ideal (see point 2 below).

> If you are monitoring in userland, Snort 
> of course will not have enough time to process all of your data, so why 
> not simply put at least two machines in parallel, one for each mirrored 
> line?

1) This doesn't scale. In the next six to twelve months I'm going to be 
presented with a 10Gb uplink to our regional network. Now I know I'm going to 
have issues when that link reaches ~40% capacity anyway, but one thing at a time.

2) We don't have the machine room heat or power capacity spare to run more 
servers, and there are other projects that require capacity that are in the 
waiting list way ahead of mine.

3) Because of our constraints we are satisfied with sampled data. We don't need 
full streams, but we would like controlled sampled data.

I'd love to buy a commercial hardware solution; unfortunately my budget is short 
by about $750k. So here I am with my favourite OS instead. God knows I've 
benefited from using FreeBSD, as has the institute I work for. At least if I do 
it properly I can say "guys, it's yours if you want it".

So if anyone wouldn't mind having a quick look at my initial email that'd be great.

P.
-- 
Peter Wood <peter at alastria.net>

