VM_BCACHE_SIZE_MAX on i386
    Konstantin Belousov 
    kostikbel at gmail.com
       
    Sat Mar 23 21:10:21 UTC 2013
    
    
  
The unmapped I/O work makes it possible to avoid mapping vnode pages
into kernel memory for UFS mounts, if the underlying GEOM classes and
disk drivers accept unmapped BIOs.  Converting all GEOM classes and
drivers, while not very hard, is quite a big task that requires a lot
of validation on unusual configurations and rare hardware.  I decided
to provide transient remapping for the classes that are not yet
converted, which allowed the work to be put into HEAD much earlier, if
at all.

When an unmapped BIO is passed down the GEOM stack and the next class
is not marked as accepting unmapped BIOs, KVA space in the so-called
transient map is allocated and the pages are mapped there.  On
architectures with ample KVA, creating the transient map is not an
issue, but it is very delicate on architectures with limited KVA,
i.e. mostly the 32bit architectures.

To avoid disturbing the KVA layout and the current balance, I split
the space previously allocated to the buffer map into 90%, which is
still used by the buffer map, and the remaining 10%, dedicated to the
transient mapping.  The rationale for the split is that a typical load
has a 9/1 ratio of user data to metadata buffers, and almost all user
data buffers are unmapped.

More precisely, the transient map is sized to 10% of the maximum
_theoretical_ buffer map size allowed on the architecture.  The real
buffer map is usually smaller, sized proportionally to the available
RAM.  The details of the allocation are in
vfs_bio.c:kern_vfs_bio_buffer_alloc().  The function uses the
maxbcache tunable, initialized from VM_BCACHE_SIZE_MAX by default.

But on i386 !PAE, VM_BCACHE_SIZE_MAX is bigger than the maximum
buffer cache size, even on a 4GB RAM machine.  The max buffer cache
map size is around 110MB, while VM_BCACHE_SIZE_MAX is 200MB.  This
oversizes the bio_transient_map, eating an additional 90MB of precious
KVA on i386.

By itself this +90MB of KVA use is not critical, but it starts
conflicting with other KVA hogs, like the nvidia blob, which seemingly
tries to remap the whole aperture (256+ MB) into the KVA.  The issue
was reported by dwh, and appeared to be quite mysterious, since his
machine has no useful way to report panics from a failed X.

The resolution I propose is to change VM_BCACHE_SIZE_MAX in the i386
!PAE case, making it equal to the exact maximum size of the buffer
cache.  Note that maxbcache can be tuned from the loader prompt, so
the change would only affect i386 machines that have not tuned the
buffer cache.

Also, the patch doubles the size of the transient map to 1/5 of the
max buffer cache.  This allows 180 parallel remapped I/Os in flight,
since I consider the recalculated 90 I/Os too few even for i386.

The patch was tested by dwh; please comment.  I intend to commit it
in several days.
http://people.freebsd.org/~kib/misc/i386_maxbcache.1.patch
    
    
More information about the freebsd-arch
mailing list