Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini()
Andriy Gapon
avg at FreeBSD.org
Wed Dec 4 09:48:32 UTC 2019
On 04/12/2019 00:28, Peter Eriksson wrote:
> … and X levels deeper it gives this (not all filesystems mounted - I rebooted fairly quickly):
>
> keg_drain: while (SLIST_FIRST/SLIST_REMOVE)-loop took 14 seconds [20021 loops]
> zone_drain_wait(): zone_foreach_keg(zone, &keg_drain) took 14 seconds
> zone_dtor(): zone_drain_wait(zone, M_WAITOK) took 14 seconds
> zone_free_item(zone=UMA Zones): zone->uz_dtor() took 14 seconds
> uma_zdestroy(zio_buf_12288) took 14 seconds
> kmem_cache_destroy: uma_zdestroy(0xfffff803467c8ac0) [zio_buf_12288] took 14 seconds
> kmem_cache_destroy(zio_buf_cache[20]) took 14 seconds
>
>
> Hmm… I wonder if it isn’t this code I /usr/src/sys/sys/queue.h:
>
>> #define SLIST_REMOVE(head, elm, type, field) do { \
>> QMD_SAVELINK(oldnext, (elm)->field.sle_next); \
>> if (SLIST_FIRST((head)) == (elm)) { \
>> SLIST_REMOVE_HEAD((head), field); \
>> } \
>> else { \
>> QUEUE_TYPEOF(type) *curelm = SLIST_FIRST(head); \
>> while (SLIST_NEXT(curelm, field) != (elm)) \
>> curelm = SLIST_NEXT(curelm, field); \
>> SLIST_REMOVE_AFTER(curelm, field); \
>> } \
>> TRASHIT(*oldnext); \
>> } while (0)
>
> Combined with this in /usr/src/sys/vm/uma_core.c: keg_drain():
>
>> finished:
>> KEG_UNLOCK(keg);
>>
>> while ((slab = SLIST_FIRST(&freeslabs)) != NULL) {
>> SLIST_REMOVE(&freeslabs, slab, uma_slab, us_hlink);
>> keg_free_slab(keg, slab, keg->uk_ipers);
>> }
>
> (The keg_drain print above is the time for the while() loop).
>
> If I’m reading this right it looks like it’s potentially doing a linear search through a (long) linked list, for every entry in the free slabs list. 20000 entries in 14 seconds is 1400 entries per second.
It's not a search. The loop is removing *all* items. There is no other way to
do it :-)
Expanding your calculations, it takes on the order of a millisecond to process a
single slab. I bet that the time is spent in keg_free_slab() and it's a lot of
time.
I wonder if the OR should actually be AND in this condition:
if (!uma_dbg_kskip(keg, slab->us_data + (keg->uk_rsize * i)) ||
keg->uk_fini != trash_fini)
Looks like there is a "ton" of memory reading done here.
--
Andriy Gapon
More information about the freebsd-fs
mailing list