rwatson at FreeBSD.org
Fri Jan 4 03:38:24 PST 2008
On Fri, 4 Jan 2008, Igor Mozolevsky wrote:
> On 04/01/2008, Robert Watson <rwatson at freebsd.org> wrote:
>> On Fri, 4 Jan 2008, Igor Mozolevsky wrote:
>>>> Of course, if you're afraid of memory overcommit and you know in advance
>>>> how much memory you need, you can simply allocate a sufficient amount of
>>>> address space at startup and touch it all. This way, you will either be
>>>> killed right away, or be guaranteed to have sufficient memory for the
>>>> rest of your (process) lifetime. Alternatively, do what Varnish does:
>>>> create a large file, mmap it, and allocate everything you need from that
>>>> area, so you have your own private swap space. Just make sure to
>>>> actually allocate the disk space you need (by filling the file with
>>>> zeroes, or at the minimum writing a zero to the file every sb.st_blksize
>>>> bytes, preferably sequentially to avoid excessive fragmentation)
>>> Surely you can just fseek() on the file at the correct length?
>> That will create a sparse file without file system blocks to back it, and
>> is effectively also over-commit. When the file system runs out of room,
>> you will get SIGSEGV when the vnode pager discovers it can't write a page
>> to disk. If you zero-fill it, the blocks are pre-allocated.
> Surely you should not be allowed to overcommit on fseek() followed by
> write(,,1); zeroing out gigs of hdd space seems rather silly...
Sparse files are a feature. It just becomes inconvenient at that point
because you discover the lack of space asynchronously from a useful user
process event. When memory pressure gets high, the vnode pager decides it's
time to push a dirty page to disk, and then discovers that there are no free
blocks on the file system to write to. As I mentioned in my e-mail, it would
be nice if our file system supported a way to reserve blocks for files without
hooking them up to the file's visible address space (in order to avoid
zeroing them, which is required if you do want to hook them up for an
unprivileged process). However, that feature doesn't currently exist.
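For illustration, the zero-fill workaround quoted above might be sketched as follows. This is only a sketch under stated assumptions, not code from Varnish or from FreeBSD: the names reserve_blocks() and map_backed_region() are hypothetical, the file is assumed to be newly created and empty, and whether one byte per st_blksize is enough to allocate every block depends on the file system (filling the whole file with zeroes is the conservative variant).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the file system to allocate blocks for the
 * first `size` bytes of a new, empty file by writing one zero byte into
 * each st_blksize-sized chunk, in ascending order. Returns 0 on success
 * or -1 on error with errno set (for example ENOSPC, reported here
 * synchronously rather than later from the pager).
 */
int
reserve_blocks(int fd, off_t size)
{
	struct stat sb;
	const char zero = 0;
	off_t off, step;

	assert(size > 0);
	if (fstat(fd, &sb) == -1)
		return (-1);
	/* Fall back to 512 bytes if the file system reports no block size. */
	step = sb.st_blksize > 0 ? (off_t)sb.st_blksize : 512;
	for (off = 0; off < size; off += step)
		if (pwrite(fd, &zero, 1, off) != 1)
			return (-1);
	/* Write the final byte so the file is exactly `size` bytes long. */
	if (pwrite(fd, &zero, 1, size - 1) != 1)
		return (-1);
	return (0);
}

/*
 * Hypothetical helper: create `path`, reserve `size` bytes of disk space
 * for it, and map it shared and read/write. Returns the mapping, or
 * MAP_FAILED on error.
 */
void *
map_backed_region(const char *path, size_t size)
{
	void *p;
	int fd, saved;

	fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd == -1)
		return (MAP_FAILED);
	if (reserve_blocks(fd, (off_t)size) == -1) {
		saved = errno;
		close(fd);
		unlink(path);
		errno = saved;
		return (MAP_FAILED);
	}
	p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping keeps its own reference to the file */
	return (p);
}
```

A caller would invoke map_backed_region() once at startup and carve its allocations out of the returned region. The point of the design is that a full file system shows up as a failed pwrite() at startup instead of a signal at some later, unrelated moment.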
Many systems with sensitivity to on-demand allocation costs and without
security requirements allow files to be extended without zeroing. On systems
with security requirements, this becomes a privileged operation (such as on
Mac OS X) because exposing unzeroed pages from other files or processes not
explicitly shared is Not Allowed.
Robert N M Watson
University of Cambridge
More information about the freebsd-current