SMP problem with uma_zalloc
brandt at fokus.fraunhofer.de
Mon Jul 21 00:03:05 PDT 2003
On Sat, 19 Jul 2003, Bosko Milekic wrote:
BM>On Sat, Jul 19, 2003 at 08:31:26PM +0200, Lara & Harti Brandt wrote:
BM>> Well, the problem is that nothing is starved. I have an idle machine and
BM>> a zone that I have limited to 60 or so items. When allocating the 2nd
BM>> item I get blocked on the zone limit. Usually I get unblocked whenever I
BM>> free an item. This will however not happen, because I have neither
BM>> reached the limit nor is there memory pressure in the system to which I
BM>> could react. I simply may be blocked forever.
BM> UMA_ZFLAG_FULL is set on the zone prior to the msleep(). This means
BM> that the next free will result in your wakeup, as the next free will
BM> be sent to the zone internally, and not the pcpu cache.
But there is no free to come. To explain where we have the problem:
the HARP ATM code uses a zone in the IP code to allocate control blocks
for VCCs. The zone is limited to 100 items which evaluates to 1 page.
When I start an interface, first the signalling vcc=5 is opened. This
allocates one item from the zone, all the other items go into the CPU
cache. Next I start ILMI. ILMI tries to open its vcc=16. While this works
on UP machines (the zone allocator will find a free item in the CPU
cache), on my 2-proc machine half of the time ILMI gets blocked on the
zone limit. And it blocks there forever because, of course, nobody is going
to free the one and only allocated item. On a four processor machine the
blocking probability will be 75%.
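
The scenario above can be reproduced with a small user-space toy model. This
is only a sketch, not the UMA code: the toy_* names are hypothetical, the
whole page of items is assumed to land in the allocating CPU's cache (as
described above), and an allocation that would sleep on the zone limit simply
returns false.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_CPUS 8

/* Toy model only: hypothetical names, not the UMA API. */
struct toy_zone {
    int limit;                   /* maximum number of items the zone may create */
    int items_per_slab;          /* items created per backing allocation (one page) */
    int created;                 /* items created so far, counted against the limit */
    int ncpus;
    int cpu_cache[TOY_MAX_CPUS]; /* free items parked in each CPU's cache */
};

static void toy_zone_init(struct toy_zone *z, int limit, int items_per_slab,
    int ncpus)
{
    assert(ncpus >= 1 && ncpus <= TOY_MAX_CPUS);
    z->limit = limit;
    z->items_per_slab = items_per_slab;
    z->created = 0;
    z->ncpus = ncpus;
    for (int i = 0; i < TOY_MAX_CPUS; i++)
        z->cpu_cache[i] = 0;
}

/*
 * Returns true on success and false where the real allocator would sleep
 * on the zone limit. Other CPUs' caches are never consulted.
 */
static bool toy_zone_alloc(struct toy_zone *z, int cpu)
{
    assert(cpu >= 0 && cpu < z->ncpus);
    if (z->cpu_cache[cpu] == 0) {
        /* Local cache is empty: refill it, if the limit allows. */
        if (z->created + z->items_per_slab > z->limit)
            return false;        /* would block on the zone limit */
        z->created += z->items_per_slab;
        z->cpu_cache[cpu] += z->items_per_slab;
    }
    z->cpu_cache[cpu]--;
    return true;
}

/*
 * First allocation runs on CPU 0. Count on how many CPUs a second
 * allocation would block. Returns -1 if even the first one fails.
 */
static int toy_blocking_cpus(int limit, int items_per_slab, int ncpus)
{
    int blocked = 0;

    for (int cpu = 0; cpu < ncpus; cpu++) {
        struct toy_zone z;

        toy_zone_init(&z, limit, items_per_slab, ncpus);
        if (!toy_zone_alloc(&z, 0))
            return -1;
        if (!toy_zone_alloc(&z, cpu))
            blocked++;
    }
    return blocked;
}
```

With the figures from above (limit 100, 100 items per page),
toy_blocking_cpus(100, 100, 2) gives 1, i.e. one of two CPUs, and
toy_blocking_cpus(100, 100, 4) gives 3, i.e. three of four. This matches the
50% and 75% if the second allocation is equally likely to run on any CPU.
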
So in order to be able to get out N items from a zone (given that there is
no shortage of memory) one has to set the limit to N + nproc *
items_per_allocation, which one cannot do because he doesn't know
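
To make the arithmetic of that rule concrete, here is a minimal sketch. The
helper name is hypothetical, and the value of 100 items per allocation in the
example is only an assumption taken from the "100 items = 1 page" figure in
this mail; the real value depends on the item size, the page size and UMA's
configuration.

```c
#include <assert.h>

/*
 * Hypothetical helper: the limit that would have to be set so that
 * n_items can be handed out even if every CPU cache holds a full
 * allocation's worth of free items. Implements
 * N + nproc * items_per_allocation from the text.
 */
static int toy_required_limit(int n_items, int nproc, int items_per_allocation)
{
    assert(n_items >= 0 && nproc >= 1 && items_per_allocation >= 1);
    return n_items + nproc * items_per_allocation;
}
```

For example, toy_required_limit(2, 2, 100) is 202: to be sure of getting just
two items on a 2-proc machine, the limit would have to be about a hundred
times larger than the number of items actually wanted.
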
BM>> That makes the limit feature for zones rather useless, because I cannot
BM>> predict how many of the items I can really allocate (this depends on the
BM>> number of processors, the page size and the configuration of UMA itself).
BM>> Perhaps we could make the behaviour dependent on the maximum number of
BM>> items. When it is rather low (a couple of pages worth) and I would block
BM>> on the zone limit and I have free items in another CPU's cache then
BM>> drain one of the caches.
BM>> Or I could simply remove the limits.
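
The drain idea quoted above could look roughly like this in a user-space toy
model. This is a sketch under assumptions, not UMA code: all sk_* names are
hypothetical, the "limit is rather low" check is left out, and where the real
allocator would sleep the function just returns false.

```c
#include <stdbool.h>

#define SK_MAX_CPUS 8

/* Toy model only: hypothetical names, not the UMA API. */
struct sk_zone {
    int limit;                  /* maximum number of items the zone may create */
    int items_per_slab;         /* items created per backing allocation */
    int created;                /* items created so far */
    int ncpus;
    int zone_free;              /* free items held by the zone itself */
    int cpu_cache[SK_MAX_CPUS]; /* free items parked in each CPU's cache */
};

/* Move the free items of the first non-empty foreign cache back to the zone. */
static bool sk_drain_one_cache(struct sk_zone *z, int skip_cpu)
{
    for (int i = 0; i < z->ncpus; i++) {
        if (i != skip_cpu && z->cpu_cache[i] > 0) {
            z->zone_free += z->cpu_cache[i];
            z->cpu_cache[i] = 0;
            return true;
        }
    }
    return false;
}

/* Returns false only if the limit is reached and no other cache has free items. */
static bool sk_alloc(struct sk_zone *z, int cpu)
{
    if (z->cpu_cache[cpu] > 0) {
        z->cpu_cache[cpu]--;
        return true;
    }
    if (z->zone_free == 0) {
        if (z->created + z->items_per_slab <= z->limit) {
            /* Below the limit: create another batch of items. */
            z->created += z->items_per_slab;
            z->zone_free += z->items_per_slab;
        } else if (!sk_drain_one_cache(z, cpu)) {
            return false;       /* would really have to block here */
        }
    }
    /* Hand one item to the caller, park the rest in this CPU's cache. */
    z->cpu_cache[cpu] = z->zone_free - 1;
    z->zone_free = 0;
    return true;
}
```

With limit 100 and 100 items per allocation on two CPUs, an allocation on
CPU 0 followed by one on CPU 1 now succeeds: the 99 free items move from
CPU 0's cache to CPU 1. The price is that the items can bounce between
caches, which is why restricting this to zones with a low limit seems
reasonable.
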
brandt at fokus.fraunhofer.de, harti at freebsd.org