Locking netatm

Attilio Rao asmrookie at gmail.com
Mon May 29 06:26:03 PDT 2006


2006/5/29, Skip Ford <skip.ford at verizon.net>:

[snip]

> So my question is, were network interrupts disabled when mucking
> with the atm_timeq list because a generated interrupt can modify
> structures within the list?  This use is probably very
> netatm-specific.  I'm still studying the timeout code to
> understand what it's doing.

This is the whole point of locking primitives: atomic operations.
Interrupting code in the middle of acquiring a lock can result in a
race condition.

If you have to protect data with a TSL (Test and Set Lock)
operation, what happens is, hypothetically, something like:

if (lock == 0)
    lock = 1;
else
    thread_sleep();

Now imagine threadA evaluates the comparison as true, gets interrupted
by the timer interrupt, and threadB is scheduled to run. threadB
executes the whole sequence (setting lock = 1); then threadA is
scheduled again, and since its comparison was true it believes the
critical section is free, so it too sets lock = 1 and goes on. At this
point two threads are inside the critical section at once, and you
have incorrect behaviour in your OS (a race condition). This is why
interrupts are painful for atomic operations and why they are never
allowed in the middle of one.

> A second situation where network interrupts were disabled was for
> netatm memory allocation for devices:
>
> in atm_dev_alloc()
>
>        s = splimp();
>
>        FOREACH(atm_mem_head)
>                ...
>        malloc (...)
>
>        (void) splx(s);
>
> and in atm_dev_free()
>
>        s = splimp();
>
>        FOREACH(atm_mem_head)
>                ...
>        free (...);
>
>        (void) splx(s);
>
> I'm not sure how these should be protected.  Presumably, we don't
> want to receive interrupts until the netatm memory for the
> device is allocated.  Would a global subsystem lock protect these
> calls?  I can protect atm_mem_head, so maybe that'd be enough?

The best way to protect a list is an rwlock.
Even though their implementation is still incomplete at the moment,
they are the only primitive that provides two-level (shared/exclusive)
protection together with a form of priority propagation, so it would
be preferable to use one (taking it in read or write mode according to
the operation).

I think a lot of the queues we currently protect with an sx lock could
switch to rwlocks...

> Another use is to protect calls to other subsystems.  For
> example:
>
> within atm_nif_attach(struct atm_nif *nip)
>
>        ifp = nip->nif_ifp;
>
>        s = splimp();
>
>        if_attach(ifp);
>        bpfattach(ifp, DLT_ATM_CLIP, T_ATM_LLC_MAX_LEN);
>
>        (void) splx(s);
> }
>
> and within atm_nif_detach(struct atm_nif *nip)
>
>        ifp = nip->nif_ifp;
>
>        s = splimp();
>
>        bpfdetach(ifp);
>        if_detach(ifp);
>        if_free(ifp);
>
>        (void) splx(s);
>
> Holding a new netatm subsystem lock won't protect those calls so
> I'm not sure how to handle those.  Other non-netatm code in the
> tree seems to not do any locking at all around those calls.

Here a mutex(9) is enough.
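A sketch of what that could look like (kernel-side code, not
compilable on its own; the lock name atm_nif_mtx is hypothetical,
while mtx_init()/mtx_lock()/mtx_unlock() are the real mutex(9) KPI):

```c
#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>

static struct mtx atm_nif_mtx;

/* Once, at subsystem initialization: */
/* mtx_init(&atm_nif_mtx, "netatm nif", NULL, MTX_DEF); */

        ifp = nip->nif_ifp;

        mtx_lock(&atm_nif_mtx);         /* was: s = splimp(); */
        if_attach(ifp);
        bpfattach(ifp, DLT_ATM_CLIP, T_ATM_LLC_MAX_LEN);
        mtx_unlock(&atm_nif_mtx);       /* was: (void) splx(s); */
```

The detach path would take the same mutex around bpfdetach(),
if_detach(), and if_free().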

> These are really the only uses I've yet to convert so if someone
> can provide some pointers, I'd appreciate it.  I'm pretty new to
> FreeBSD locking, either the old way or the new way.  I'm still
> studying the code, including other network stacks and the netatm
> stack itself, but a pointer or two would be appreciated.  I feel
> like it's mostly converted, though I've done no testing at all
> yet.  Once I finish removing splimp(), I can test with the single
> subsystem lock, then move on to finer-grained locking where
> necessary.

There is not a lot of up-to-date documentation about it; in
particular, I can point you at:
http://www.freebsd.org/doc/en/books/arch-handbook/locking.html
http://www.lemis.com/grog/SMPng/Singapore/paper.pdf
http://people.freebsd.org/~fsmp/SMP/SMP.html

BTW, some of this no longer matches the current kernel, so look at
the code, as usual :P

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein

