panic: uma_zone_slab is looping
Peter Holm
peter at holm.cc
Thu Dec 23 02:30:06 PST 2004
On Wed, Dec 22, 2004 at 05:15:40PM -0500, Bosko Milekic wrote:
>
> On Wed, Dec 22, 2004 at 10:05:53PM +0100, Peter Holm wrote:
> > On Mon, Dec 20, 2004 at 06:41:04PM -0500, Bosko Milekic wrote:
> > >
> > > I realize it's been a while.
> > >
> > > Anyway, what I *think* is going on here is that slab_zalloc() is
> > > actually returning NULL even when called with M_WAITOK. Further
> > > inspection in slab_zalloc() reveals that this could come from several
> > > places. One of them is kmem_malloc() itself, which I doubt will ever
> > > return NULL if called with M_WAITOK. If this assumption is indeed
> > > correct, then the NULL must be coming from slab_zalloc() itself,
> > > or from a failed uma_zalloc_internal() call. It is also possible
> > > that slab_zalloc() returns NULL if the init that gets called for the
> > > zone fails. However, judging from the stack trace you provided, the
> > > init in question is mb_init_pack() (kern_mbuf.c). This particular
> > > init DOES perform an allocation and CAN in theory fail, but I believe
> > > it should be called with M_WAITOK as well, and so it should also never
> > > fail in theory.
> > >
> > > Have you gotten any further with the analysis of this particular
> > > trace? If not, I would suggest adding some more printf()s and
> > > analysis into slab_zalloc() itself, to see if that is indeed what is
> > > causing the infinite looping in uma_zone_slab() and, if so, attempt to
> > > figure out what part of slab_zalloc() is returning the NULL.
> >
> > OK, did that: http://www.holm.cc/stress/log/freeze03.html
>
> OK, well, I think I know what's happening. See if you can confirm
> this with me.
>
> I'll start with your trace and describe the analysis, please bear with
> me because it's long and painful.
>
> Your trace indicates that the NULL allocation failure, despite a call
> with M_WAITOK, is coming from slab_zalloc(). The particular thing
> that should also be mentioned about this trace, and your previous
> one, is that they both show a call path that goes through an init
> which performs an allocation, also with M_WAITOK. Currently, only the
> "packet zone" does this. It looks something like this:
>
> 1. UMA allocation is performed for a "packet." A "packet" is an mbuf
> with a pre-attached cluster.
>
> 2. UMA dips into the packet zone and finds it empty. Additionally, it
> determines that it is unable to get a bucket to fill up the zone
> (presumably there is a lot of memory request load). So it calls
> uma_zalloc_internal on the packet zone (frame 18).
>
> 3. Perhaps after some blocking, a slab is obtained from the packet
> zone's backing keg (which coincidentally is the same keg as the
> mbuf zone's backing keg -- let's call it the MBUF KEG). So now
> that an mbuf item is taken from the freshly allocated slab obtained
> from the MBUF KEG, uma_zalloc_internal() needs to init and ctor it,
> since it is about to return it to the top (calling) layer. It
> calls the initializer on it for the packet zone, mb_init_pack().
> This corresponds to frame 17.
>
> 4. The packet zone's initializer needs to call into UMA again to get
> and attach an mbuf cluster to the mbuf being allocated, because mbufs
> residing within the packet zone (or obtained from the packet zone)
> MUST have clusters attached to them. It attempts to perform this
> allocation with M_WAITOK, because that's what the initial caller
> was calling with. This is frame 16.
>
> 5. Now the cluster zone is also completely empty and we can't get a
> bucket (surprise, surprise, the system is under high memory-request
> load). UMA calls uma_zalloc_internal() on the cluster zone as well.
> This is frame 15.
>
> 6. uma_zalloc_internal() calls uma_zone_slab(). Its job is to find a
> slab from the cluster zone's backing keg (a separate CLUSTER KEG)
> and return it. Unfortunately, memory-request load is high, so it's
> going to have a difficult time. The uma_zone_slab() call is frame
> 14.
>
> 7. uma_zone_slab() can't find a locally cached slab (hardly
> surprising, due to load) and calls slab_zalloc() to actually go to
> VM and get one. Before calling, it increments a special "recurse"
> flag so that we do not recurse on calling into the VM. This is
> because the VM itself might call back into UMA when it attempts to
> allocate vm_map_entries which could cause it to recurse on
> allocating buckets. This recurse flag is PER zone, and really only
> exists to protect the bucket zone. Crazy, crazy shit indeed.
> Pardon the language. This is frame 13.
>
> 8. Now slab_zalloc(), called for the CLUSTER zone, determines that the
> cluster zone (for space efficiency reasons) is in fact an OFFPAGE
> zone, so it needs to grab a slab header structure from a separate
> UMA "slab header" zone. It calls uma_zalloc_internal() from
> slab_zalloc(), but it calls it on the SLAB HEADER zone. It passes
> M_WAITOK down to it, but for some reason IT returns NULL and the
> failure is propagated back up which causes the uma_zone_slab() to
> keep looping. Go back to step 7.
>
> This is the infinite loop 7 -> 8 -> 7 -> 8 -> ... which you seem to
> have caught.
>
> The question now is why does the uma_zalloc_internal() fail on the
> SLAB HEADER zone, even though it is called with M_WAITOK.
> Unfortunately, your stack trace does not provide enough depth to be
> able to continue an accurate deductive analysis from this point on
> (you would need to sprinkle MORE KASSERTs).
>
> However, here are some hypotheses.
>
> The uma_zalloc_internal() which ends up getting called also ends up
> calling uma_zone_slab(), but uma_zone_slab() eventually fails (this is
> a fact, this is the only reason that the uma_zalloc_internal() could
> in turn fail for the SLAB HEADER zone, which doesn't have an init or a
> ctor).
>
> So why does the uma_zone_slab() fail with M_WAITOK on the slab header
> zone? Possibilities:
>
> 1. The recurse flag is at some point determined non-zero FOR THE SLAB
> HEADER backing keg. If the VM ends up getting called from the
> subsequent slab_zalloc() and ends up calling back into UMA for
> whatever allocations, and "whatever allocations" are also
> potentially offpage, and a slab header is ALSO required, then we
> could also be recursing on the slab header zone from VM, so this
> could cause the failure.
>
> if (keg->uk_flags & UMA_ZFLAG_INTERNAL && keg->uk_recurse != 0) {
>         /* ADD PRINTF HERE */
>         printf("This zone: %s, forced fail due to recurse non-null\n",
>             zone->uz_name);
>         return NULL;
> }
>
The printf didn't really fly. It seems to be called early in
boot:
ck_flags(0,0,c07e8890,882) at _mtx_lock_flags+0x24
uma_zalloc_internal(c09284a0,c0c20c84,2) at uma_zalloc_internal+0x2d
uma_zcreate(c07e8b1e,40,0,0,0,0,3,2000) at uma_zcreate+0x57
uma_startup(c103d000,c103d000,28000,c0c20d78,ff00000) at uma_startup+0x2ae
vm_page_startup(c1065000,c0c20d98,c05ec857,0,c08525d0) at vm_page_startup+0x109
vm_mem_init(0,c08525d0,c1ec00,c1e000,c28000) at vm_mem_init+0x13
mi_startup() at mi_startup+0xb3
begin() at begin+0x2c
so I just sprinkled some more asserts. I'm trying to see if I can
provoke this problem more consistently, based on your analysis.
It usually takes me a day or two of testing to get there.
> If you get the print to trigger right before the panic (last one
> before the panic), see if it is on the SLAB HEADER zone. In
> theory, it should only happen for the BUCKET ZONE.
>
> 2. M_WAITOK really isn't set. Unlikely.
>
> If (1) is really happening, we'll need to think about it a little more
> before deciding how to fix it. As you can see, due to the recursive
> nature of UMA/VM, things can get really tough when resources are
> scarce.
>
> Regards,
> --
> Bosko Milekic
> bmilekic at technokratis.com
> bmilekic at FreeBSD.org
--
Peter Holm
More information about the freebsd-current mailing list