Behavior of madvise(MADV_FREE)

Marcel Moolenaar marcel at xcllnt.net
Fri Oct 12 21:00:29 UTC 2012


On Oct 12, 2012, at 10:58 AM, Alan Cox <alc at rice.edu> wrote:

>> Now on to the questions:
>> 1.  madvise(MADV_FREE) marks the pages as clean and moves
>>     them to the inactive queue. Why isn't the reference
>>     state cleared on either the page or the TLB?
> 
> It is, at least 31 out of 32 times that vm_page_dontneed() is called.  From vm_page_dontneed(), which is called by madvise(MADV_FREE):
> 
>        /*
>         * Clear any references to the page.  Otherwise, the page daemon will
>         * immediately reactivate the page.
>         *
>         * Perform the pmap_clear_reference() first.  Otherwise, a concurrent
>         * pmap operation, such as pmap_remove(), could clear a reference in
>         * the pmap and set PGA_REFERENCED on the page before the
>         * pmap_clear_reference() had completed.  Consequently, the page would
>         * appear referenced based upon an old reference that occurred before
>         * this function ran.
>         */
>        pmap_clear_reference(m);
>        vm_page_aflag_clear(m, PGA_REFERENCED);

Ah... I missed this. I didn't look in vm_page_dontneed() for
this. I thought current FreeBSD behaved the same as 6.1-ish.

>> 2.  Why aren't the pages moved to the cache queue in the
>>     first place?
> 
> Because this would make madvise(MADV_FREE) considerably more expensive; for example, the pages would have to be unmapped.  Your situation may be different, but more often than not, people call madvise(MADV_FREE) when memory is plentiful, and there is no need to do anything.  In other words, the page daemon isn't going to need to run anytime soon.  For example, when madvise(MADV_FREE) is used in implementations of malloc() and free(), the vast majority of calls to madvise(MADV_FREE) are pointless.  They are pointless in that soon after the madvise(MADV_FREE) call by the free() implementation, either (1) the application turns around and allocates more memory, causing the MADV_FREE'd memory to be used once again, or (2) the process terminates before the page daemon runs.  Consequently, the implementation of madvise(MADV_FREE) does the minimal necessary work so that, if memory does become scarce and the page daemon has to run, the MADV_FREE'd pages are first in line for reclamation.

Understood. Thanks.
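
The malloc()/free() usage pattern described above can be sketched in userspace as follows. This is only an illustration under stated assumptions: region_alloc() and region_release_hint() are hypothetical helper names, not part of any real allocator, and the fallback to MADV_DONTNEED exists only so the sketch compiles on systems that lack MADV_FREE.

```c
#define _DEFAULT_SOURCE 1   /* exposes madvise()/MAP_ANON on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of private anonymous memory. */
static void *region_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: tell the kernel the contents are disposable.
 * The mapping stays in place, so the caller can reuse the range at once.
 * Returns the madvise() result (0 on success, -1 on error).
 */
static int region_release_hint(void *p, size_t len)
{
    assert(p != NULL);
#ifdef MADV_FREE
    return madvise(p, len, MADV_FREE);
#else
    return madvise(p, len, MADV_DONTNEED);  /* assumption: fallback only */
#endif
}
```

A caller would fill the region, call region_release_hint() when the data is no longer needed, and simply write to the same addresses again if the memory is wanted before the page daemon runs. After the hint, the old contents must be treated as undefined until the range is rewritten.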

>> Ad 2:
>> MADV_DONTNEED is there to signal that the pages contain
>> valid data, but that the page is not needed right now.
>> Using this, pages get moved to the inactive queue. That
>> makes sense. But MADV_FREE signals that there's no valid
>> data anymore and that the page may be demand zeroed on
>> next reference. The page is not inactive. It's free. If
>> the page was zeroed before calling MADV_FREE, the page
>> really caches contents that can be recreated later
>> (the demand zero).
> 
> There is also another way of looking at it.  By leaving the pages allocated and mapped, you are saving time, i.e., CPU cycles, for the all too common case that the MADV_FREE'd pages are used again in the near future.
> 
> It wouldn't be illogical to have two variants of MADV_FREE.  One for use by folks like yourself who can say definitively that the pages won't be accessed again and should really be freed, and the current implementation for more speculative uses like in the malloc() and free() implementation.  Better yet, the second case would be replaced by a notification from the kernel to the process when memory is actually becoming scarce so that we won't waste cycles on any pointless madvise() calls by the process.

This aligns with phk@'s suggestion of MADV_RECYCLE. We may
want to play with this.

-- 
Marcel Moolenaar
marcel at xcllnt.net




More information about the freebsd-arch mailing list