Switch pfil(9) to rmlocks

Sun Nov 25 08:40:59 PST 2007

Max Laier wrote:
> On Sunday 25 November 2007, Darren Reed wrote:
>> Max Laier wrote:
>>> On Sunday 25 November 2007, Darren Reed wrote:
>>>> Max Laier wrote:
>>>>> On Friday 23 November 2007, Robert Watson wrote:
>>>>>> On Fri, 23 Nov 2007, Max Laier wrote:
>>>>>>> attached is a diff to switch the pfil(9) subsystem to
>>>>>>> rmlocks, which are more suited for the task.  I'd like some
>>>>>>> exposure before doing the switch, but I don't expect any
>>>>>>> fallout.  This email is going through the patched pfil
>>>>>>> already - twice.
>>>>>> Max,
>>>>>>
>>>>>> Have you done performance measurements that show rmlocks to be
>>>>>> a win in this scenario?  I did some patches for UNIX domain
>>>>>> sockets to replace the rwlock there but it appeared not to have
>>>>>> a measurable impact on SQL benchmarks, presumably because the
>>>>>> read/write blend wasn't right and/or that wasn't a significant
>>>>>> source of overhead in the benchmark.  I'd anticipate a much
>>>>>> more measurable improvement for pfil, but would be interested
>>>>>> in learning how much is seen?
>>>>> I had to roll an artificial benchmark in order to see a
>>>>> significant change (attached - it's a hack!).
>>>>>
>>>>> Using 3 threads on a 4 CPU machine I get the following results:
>>>>> null hook: ~13% +/- 2
>>>>> mtx hook: up to 40% [*]
>>>>> rw hook: ~5% +/- 1
>>>>> rm hook: ~35% +/- 5
>>>> Is that 13%/5%/35% faster or slower or improvement or degradation?
>>>> If "rw hook" (using rwlock like we have today?) is 5%, what is the
>>>> baseline?
>>>>
>>>> I'm expecting that at least one of these should be a 0%...
>>> Sorry for the sparse explanation.  All numbers above are gain with
>>> rmlocks i.e. rmlocks are faster in all scenarios.  The test cases are
>>> different hook functions.  Every hook has a DELAY(1) and a
>>> lock/unlock call around it of the respective lock type, using read lock
>>> acquisitions for rw and rm.  Please look at the code I posted a bit
>>> later for more details.
>> Thanks for the clarification.
>> That makes rmlocks very interesting.
>> And the kind of lock that both ipf and ipfw could benefit from,
>> especially since you're planning on changing the pfil locks to be
>> this way.  Are you (or is someone else?) intending on following
>> up moving ipfw to them, where appropriate?
> 
> It's not yet clear where they are appropriate.  As the name suggests (read 
> mostly locks) they are for cases where you acquire a write lock only once 
> in a blue moon.  As such the write lock acquisition is a very expensive 
> operation.  This might be appropriate for some cases of IPFW rulesets, 
> but certainly not the general case.  Things like address tables and 
> dynamic rules (states) take a lot of writes.  It's not yet clear where 
> the balance for rmlocks really is and my benchmark doesn't answer that.

It's certainly feasible for the main ipfw lock.
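
The trade-off quoted above (reads nearly free, writes very expensive) can
be illustrated with a rough userland sketch.  This is NOT how rmlock(9) is
implemented; it is the simpler "big-reader" pattern built from pthread
mutexes, and struct brlock, the br_*() functions, NSLOTS and CACHE_LINE
are all made-up names for illustration.  A reader only touches the slot
for its own CPU, while a writer has to sweep every slot.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NSLOTS     4	/* hypothetical number of CPUs / reader slots */
#define CACHE_LINE 64	/* assumed cache line size in bytes */

/* One mutex per CPU, each on its own cache line. */
struct brlock {
	struct {
		_Alignas(CACHE_LINE) pthread_mutex_t mtx;
	} slot[NSLOTS];
};

static void
br_init(struct brlock *l)
{
	for (int i = 0; i < NSLOTS; i++)
		pthread_mutex_init(&l->slot[i].mtx, NULL);
}

/* Read side: lock only the caller's own slot. */
static void
br_rlock(struct brlock *l, int cpu)
{
	assert(cpu >= 0 && cpu < NSLOTS);
	pthread_mutex_lock(&l->slot[cpu].mtx);
}

static void
br_runlock(struct brlock *l, int cpu)
{
	assert(cpu >= 0 && cpu < NSLOTS);
	pthread_mutex_unlock(&l->slot[cpu].mtx);
}

/* Write side: take every slot, in a fixed order to avoid deadlock. */
static void
br_wlock(struct brlock *l)
{
	for (int i = 0; i < NSLOTS; i++)
		pthread_mutex_lock(&l->slot[i].mtx);
}

/* Release all slots in reverse order. */
static void
br_wunlock(struct brlock *l)
{
	for (int i = NSLOTS - 1; i >= 0; i--)
		pthread_mutex_unlock(&l->slot[i].mtx);
}
```

In this sketch readers on different slots never contend with each other,
but two readers sharing a slot do serialize, and the cost of a write grows
with the number of slots.  That is why it only pays off when writes are
rare, as with changing the ruleset or the pfil hooks.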

As for stats, I think there are other ways to solve that problem.
I've been playing around with per-cpu stats.  It seems to work OK, but
one does have to wonder about what happens on a 64-CPU machine (*).
I guess that we expect 64-CPU machines to have a LOT of memory.

(*) Yes, I know we don't support > 32 right now.
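
To make the per-cpu stats idea and its memory cost concrete, here is a
minimal userland sketch using C11 atomics.  It is not the code I have been
playing with; struct rule_stats, the rule_stats_*() functions, NSLOTS and
CACHE_LINE are hypothetical names, and the 64-byte cache line is an
assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NSLOTS     64	/* hypothetical CPU count (the 64-way case) */
#define CACHE_LINE 64	/* assumed cache line size in bytes */

/* Counters for one rule on one CPU, padded to a full cache line. */
struct rule_stat_slot {
	_Alignas(CACHE_LINE) atomic_uint_fast64_t packets;
	atomic_uint_fast64_t bytes;
};

/* All per-CPU slots for a single rule. */
struct rule_stats {
	struct rule_stat_slot slot[NSLOTS];
};

static void
rule_stats_init(struct rule_stats *rs)
{
	for (unsigned i = 0; i < NSLOTS; i++) {
		atomic_init(&rs->slot[i].packets, 0);
		atomic_init(&rs->slot[i].bytes, 0);
	}
}

/* Hot path: update only the caller's own slot, no lock taken. */
static void
rule_stats_add(struct rule_stats *rs, unsigned cpu, uint64_t len)
{
	assert(cpu < NSLOTS);
	atomic_fetch_add_explicit(&rs->slot[cpu].packets, 1,
	    memory_order_relaxed);
	atomic_fetch_add_explicit(&rs->slot[cpu].bytes, len,
	    memory_order_relaxed);
}

/* Slow path: sum every slot; the total is not an exact snapshot. */
static uint64_t
rule_stats_packets(struct rule_stats *rs)
{
	uint64_t sum = 0;

	for (unsigned i = 0; i < NSLOTS; i++)
		sum += atomic_load_explicit(&rs->slot[i].packets,
		    memory_order_relaxed);
	return (sum);
}

static uint64_t
rule_stats_bytes(struct rule_stats *rs)
{
	uint64_t sum = 0;

	for (unsigned i = 0; i < NSLOTS; i++)
		sum += atomic_load_explicit(&rs->slot[i].bytes,
		    memory_order_relaxed);
	return (sum);
}
```

With these assumed constants each rule costs one cache line per CPU, i.e.
NSLOTS * CACHE_LINE bytes per rule, which is where the memory question on
big machines comes from.  Dropping the padding saves memory but brings
back cache line sharing between CPUs.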

Dynamic rules require more work, but not every rule is a keep-state rule,
whereas every rule has stats.

> 
> For pfil the case is obvious as you hardly ever change the hooks.
> 
>> I'm tempted to suggest them to other platforms...although I'd expect
>> some amount of nay-saying because of NIH, it would be good if others
>> also picked up on them, if the benefits are this clear cut...
> 
> Again ... only for a select set of cases.  But they are really great for 
> cases where you don't want to use proper synchronization because of the 
> slowdown and the relatively small chance of ever hitting the race.  Now we 
> have a tool to properly protect against these races without sacrificing 
> performance.
>