Implementation of Sampling for BPF
Peter Wood
peter at alastria.net
Mon Jan 7 08:31:27 PST 2008
Good Afternoon,
> It's the question of doing things correctly(tm) so they are appropriate
> for inclusion into the main src tree of the FreeBSD Project - this must
> be universal enough to meet other people needs and to be supported. You
> of course are free to do any patches at your locals site for your
> individual needs - many people do that customization on their own.
Indeed, and the latter part of your statement is my primary goal. However, I'm
unfamiliar with this part of the kernel and could do with a few pointers on the
correct way to do this from a programming point of view.
> So what if a malicious packet will be skipped due sampling, packet which
> is by other means undistinguishable from others before detailed analysis?
If that happens it is unfortunate and the packet slips through the net. However,
the malicious traffic I look for more often takes the form of flows rather than
individual packets, and we drop most protocols at the border that could cause us
trouble with a single packet. With a flow there is a much better chance of
sampling at least one of its packets.
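To put a rough number on that: if I assume uniform random 1-in-N sampling with
each packet chosen independently (an assumption on my part; a strict every-Nth
counter behaves a little differently), the chance of seeing at least one packet
of an n-packet flow is 1 - (1 - 1/N)^n. A small Python sketch, where the
function name and the 1-in-100 rate are purely illustrative:

```python
def p_flow_seen(flow_packets: int, sample_rate: int) -> float:
    """Probability that at least one packet of a flow is sampled.

    Assumes independent random 1-in-`sample_rate` sampling per packet.
    """
    if flow_packets < 0 or sample_rate < 1:
        raise ValueError("flow_packets must be >= 0 and sample_rate >= 1")
    p_miss_one = 1.0 - 1.0 / sample_rate      # chance a given packet is skipped
    return 1.0 - p_miss_one ** flow_packets   # chance not every packet is skipped


if __name__ == "__main__":
    # Illustrative flow lengths only; not measurements from our network.
    for n in (1, 10, 100, 1000):
        print(f"{n:5d} packets at 1-in-100: {p_flow_seen(n, 100):.3f}")
```

A single packet is almost always missed at that rate, while a flow of a few
hundred packets is very likely to show up at least once.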
> Low in chain instead of high, you mean? That's of course no point to
> sort out things in userland, but that's properties of given BPF program
> to filter - how much the userland program wants to receive before
> detailed analysis.
Please forgive my use of low and high, it seems to depend on which end of the
stack you're looking from :). I meant as close to it coming into the kernel as
possible, yes.
> Putting as many servers as needed does scale well if you need only
> sampled data - just put an appropriate sampler/load balancer before
> them. And using FreeBSD on that servers will be cheaper than commercial
> hardware solution, too.
Again, no ability to buy a sampler/load balancer, nor any space/heat/power to
run one in. My available equipment consists of two core networking devices, some
fibre, two Intel gig optical cards and one powerful(ish) Dell server currently
running FreeBSD 6.X, which needs bumping to 7.0 when it's released. The kit at
the other end of these optical links is either busy or incapable of sampling.
> Why sample is enough to you? What exactly do you need? May be you'd
> rather write some simpler expressions for in-kernel filtering instead of
> heavy-weighted Snort?
I'm afraid I will not discuss our exact requirements in an open forum; that
seems unwise from a security point of view.
I would be happy to implement this as a BPF filter, but I'm unaware of how to
sample in the filter language and count with variables, rather than look at
fields in a packet.
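The counting itself is trivial outside the filter language. As far as I
understand it (corrections welcome), a classic BPF program's scratch memory does
not persist from one packet to the next, which is why I can't see how to express
this there. A sketch of the logic in Python rather than kernel C, with all class
and method names hypothetical:

```python
import random


class CountingSampler:
    """Deterministic 1-in-N: accept every Nth packet seen."""

    def __init__(self, n: int):
        if n < 1:
            raise ValueError("n must be >= 1")
        self.n = n
        self.count = 0  # state that must survive between packets

    def accept(self) -> bool:
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class RandomSampler:
    """Random 1-in-N: accept each packet with probability 1/N."""

    def __init__(self, n: int, seed=None):
        if n < 1:
            raise ValueError("n must be >= 1")
        self.n = n
        self.rng = random.Random(seed)

    def accept(self) -> bool:
        return self.rng.randrange(self.n) == 0


if __name__ == "__main__":
    sampler = CountingSampler(4)
    print([sampler.accept() for _ in range(8)])
```

The counter variant needs one integer of per-descriptor state; the random
variant needs none, but its sample rate is only correct on average.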
Additional uses I can foresee:
* NetFlow Generation - For which sampling is perfectly acceptable, although we
currently do this in hardware.
* Statistics Generation - What our users are using our network for, etc. Of
course a lot of this data can be obtained from NetFlow (as we do at present), but
some aspects can't, such as average packet sizes per protocol.
* Research - I'm regularly asked by researchers for sampled data from our
network (requests I currently turn down), and I assume they consider sampled
data quite suitable.
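As an illustration of the statistics case above, average packet size per
protocol falls straight out of a sampled stream. This is only a sketch: the
(protocol, length) record format and the function name are assumptions for the
example, and it presumes the sampling is uniform so that the sample mean is a
fair estimate.

```python
from collections import defaultdict


def average_sizes(samples):
    """Return {protocol: mean packet length} from (protocol, length) pairs."""
    totals = defaultdict(lambda: [0, 0])  # protocol -> [packet count, byte count]
    for proto, length in samples:
        totals[proto][0] += 1
        totals[proto][1] += length
    return {proto: nbytes / count for proto, (count, nbytes) in totals.items()}


if __name__ == "__main__":
    # Made-up records purely to show the call; not real traffic.
    sampled = [("tcp", 1500), ("tcp", 500), ("udp", 100)]
    for proto, mean in sorted(average_sizes(sampled).items()):
        print(f"{proto}: {mean:.1f} bytes")
```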
I can understand your hesitation about including something like this in the
project as a whole, but as I've said this is primarily for our purposes.
If others would find it useful, that's great and I'll maintain a patch on a
webserver; if the project as a whole would find it useful, that's great too.
It would be nice, at least from an academic point of view, for FreeBSD to support
other research too, for example the work being done to separate the congestion
control to permit easier testing of different methods.
P.
--
Peter Wood <peter at alastria.net>