msync() differences between Linux and FreeBSD

Peter Jeremy peterjeremy at optushome.com.au
Fri Jul 4 23:54:20 UTC 2008


On 2008-Jul-02 05:00:02 -0700, Marcus Reid <marcus at blazingdot.com> wrote:
>  It seems that in FreeBSD, msync() waits for bits to be
>committed to disk even when MS_ASYNC is specified.

Your previous ktrace output suggests that, at least for the way
rrdtool is using mmap(2), msync(2) is performing physical I/O.
It's not clear whether FreeBSD is ignoring the MS_ASYNC flag (the code
suggests it isn't), blocking on previously queued I/O, or blocking
for some other reason.

>First off, I don't know how frequently msync() is used, and whether changing
>its behavior would impact the performance of many things.

The behaviour of msync(2) is defined in the Single UNIX Specification
(SUS), and FreeBSD adheres to SUS unless there is a very good reason
not to.

>    media... i.e. issue real I/O.  So msync() can't be a NOP if you go by
>    the OpenGroup specification.
>
>Is there a spec that FreeBSD is adhering to that prevents msync() with
>MS_ASYNC from being a NOP, seeing as munmap() does the job?

As per Matt's response that you quoted, yes.

>  And does this
>really matter for the real-world performance of some apps?

IMO, rrdtool is using mmap()/madvise()/msync()/munmap() in an unusual
fashion and it should be fixed, rather than changing FreeBSD to match
rrdtool.  I believe a more usual approach would be to mmap() a file
(or part thereof), optionally call madvise(), perform a series of
accesses, maybe msync() any updated regions, and then call munmap()
once before exiting.  Performing mmap()/msync()/munmap() (where
the msync() specifies the entire file) for each update maximises
system overhead for no obvious benefit.
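
As a rough illustration of the "more usual" pattern described above,
here is a minimal sketch in C.  It is not rrdtool code: the struct
mapped_db and the db_open()/db_update()/db_close() names are
hypothetical, and it assumes the whole file fits in one mapping.  The
file is mapped once, each update flushes only the pages it touched,
and munmap() is called once at the end.  An madvise() call could be
added after the mmap() if the access pattern is known.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle for a file that stays mapped across many updates. */
struct mapped_db {
    char  *base;  /* start of the mapping */
    size_t len;   /* length of the mapping in bytes */
};

/* Map the first len bytes of path once.  Returns 0 on success, -1 on error. */
int db_open(struct mapped_db *db, const char *path, size_t len)
{
    struct stat st;
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    /* Grow the file if it is shorter than the mapping; never shrink it. */
    if (fstat(fd, &st) != 0 ||
        (st.st_size < (off_t)len && ftruncate(fd, (off_t)len) != 0)) {
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;
    db->base = p;
    db->len = len;
    return 0;
}

/* Copy one record into the mapping and flush only the pages it touches. */
int db_update(struct mapped_db *db, size_t off, const void *rec, size_t n)
{
    assert(n > 0 && off <= db->len && n <= db->len - off);
    memcpy(db->base + off, rec, n);
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    size_t start = off & ~(pagesz - 1);  /* round down to a page boundary */
    return msync(db->base + start, (off - start) + n, MS_ASYNC);
}

/* Single munmap() when the program is finished with the file. */
int db_close(struct mapped_db *db)
{
    return munmap(db->base, db->len);
}
```

A caller would do one db_open(), any number of db_update() calls, and
one db_close().  Whether the per-update msync() is needed at all
depends on the durability the application wants.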

Also, you mentioned hitting a "brick wall" between 940MB and 1161MB.
That straddles 1GB.  You may also be running into system or process
boundary conditions: how much RAM do you have, and what tuning have you
done?  You might like to write a tool to simulate the rrdtool
behaviour with varying DB sizes and identify exactly what you are
hitting.
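
A simulation tool along those lines might start from something like
the sketch below.  It is only a guess at the relevant behaviour, based
on the mmap()/msync()/munmap()-per-update pattern described above, and
simulate_updates() is a made-up name.  It also uses a single sparse
scratch file, which a real set of RRD files would not be, so treat the
numbers as a starting point rather than a faithful reproduction.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Run n_updates cycles of mmap / one-byte write / msync(whole file,
 * MS_ASYNC) / munmap against a scratch file of db_size bytes.
 * Returns the elapsed wall-clock time in seconds, or -1.0 on error.
 */
double simulate_updates(const char *path, size_t db_size, int n_updates)
{
    assert(db_size > 0 && n_updates > 0);
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;
    if (ftruncate(fd, (off_t)db_size) != 0) {
        close(fd);
        return -1.0;
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < n_updates; i++) {
        char *p = mmap(NULL, db_size, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) {
            close(fd);
            return -1.0;
        }
        /* Dirty one byte at an offset that moves with each update. */
        p[((size_t)i * 4099u) % db_size] = (char)i;
        int rc = msync(p, db_size, MS_ASYNC);  /* whole file, as described */
        if (munmap(p, db_size) != 0 || rc != 0) {
            close(fd);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);
    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling this from a small main() with a range of sizes on either side
of 1GB, and comparing the elapsed times, should show whether the
slowdown appears at a particular size.  Running it under ktrace(1)
would show which call is actually blocking.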

-- 
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.