Re: prison_flag() check in hot path of in_pcblookup()

From: James Gritton <jamie_at_freebsd.org>
Date: Tue, 13 Dec 2022 18:00:24 UTC

On 2022-12-13 09:18, Andrew Gallatin wrote:

> I was trying to improve the performance of in_pcblookup(), as it is a 
> very hot path for us (Netflix). One thing I noticed was the 
> prison_flag() check in in_pcblookup_hash_locked() can cause a cache 
> miss just by deref'ing the cred pointer, and it can also cause multiple 
> misses in tables with collisions by causing us to walk the entire chain 
> even after finding a perfect match.
> 
> I'm curious why this check is needed.  Can you explain it to me?  It 
> originated in this commit:
> 
> commit 413628a7e3d23a897cd959638d325395e4c9691b
> Author: Bjoern A. Zeeb <bz@FreeBSD.org>
> Date:   Sat Nov 29 14:32:14 2008 +0000
> 
> MFp4:
> Bring in updated jail support from bz_jail branch.
> 
> This enhances the current jail implementation to permit multiple
> addresses per jail. In addition to IPv4, IPv6 is supported as well.
> 
> My thinking is that a jail will either use the host IP, and share its 
> port space, or it will have its own IP entirely (but I know nothing 
> about jails).  In either case, a perfect 4-tuple match should be enough 
> to uniquely identify the connection.
> 
> Even if this somehow is not the case and we have multiple connections 
> somehow sharing the same 4-tuple, how does checking the prison flag 
> help us?  It would prefer the jailed connection over the non-jailed, 
> but that would shadow a host connection.  And if we had 2 jails sharing 
> the same 4-tuple, the first jail would win.
> 
> I can't see how this check is doing anything useful, so I'd very much 
> like to remove this check if possible.  Untested patch attached.

For a complete 4-tuple, it should indeed be the case that a match would 
only ever identify a single prison.  The later part of the function that 
examines wildcards definitely needs the check.  I don't get the XXX 
comment about both being bound with SO_REUSEPORT, because I would only 
expect that to apply to listening sockets, not to full connections. But I also 
expect Bjoern to know more than I do here...
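
As a rough illustration of the distinction above, here is a toy model in C, 
not the actual in_pcblookup_hash_locked() code. The struct, its fields 
(including jail_addr as a stand-in for a prison's address check) and 
toy_lookup() are all made up for this sketch. It assumes a full 4-tuple 
match can be returned at once, while the wildcard pass still has to ask 
which prison the socket belongs to:

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an inpcb; names are hypothetical, not the kernel's. */
struct toy_pcb {
	uint32_t laddr, faddr;	/* 0 means wildcard / not connected */
	uint16_t lport, fport;
	uint32_t jail_addr;	/* 0 = host; else the one address the jail owns */
	struct toy_pcb *next;	/* hash chain */
};

static struct toy_pcb *
toy_lookup(struct toy_pcb *chain, uint32_t faddr, uint16_t fport,
    uint32_t laddr, uint16_t lport)
{
	struct toy_pcb *p, *host_wild = NULL;

	/* Pass 1: full 4-tuple. First hit wins, no prison check. */
	for (p = chain; p != NULL; p = p->next)
		if (p->faddr == faddr && p->fport == fport &&
		    p->laddr == laddr && p->lport == lport)
			return (p);

	/* Pass 2: listeners, where the prison does matter. */
	for (p = chain; p != NULL; p = p->next) {
		if (p->faddr != 0 || p->lport != lport)
			continue;	/* connected, or wrong port */
		if (p->laddr != 0 && p->laddr != laddr)
			continue;	/* bound to a different address */
		if (p->jail_addr != 0) {
			if (p->jail_addr == laddr)
				return (p);	/* jail owns this address */
			continue;	/* jail must not see this address */
		}
		if (host_wild == NULL)
			host_wild = p;	/* remember first host listener */
	}
	return (host_wild);
}
```

In this model a jailed wildcard listener is preferred over a host wildcard 
listener on the same port, but only for the address the jail owns. That 
is the case where skipping the check would hand a connection to the wrong 
prison.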

- Jamie