int/long confusion with maxbcache and maxswzone (fixes 6.0 on >12GB machines)
Max Laier
max at love2party.net
Wed Oct 12 03:36:27 PDT 2005
On Tuesday 11 October 2005 23:38, Kris Kennaway wrote:
> A few weeks ago I reported that bufinit() on sparc64 machines with >12GB
> of RAM goes into an infinite loop because of a 32-bit integer counter
> overflowing. On 5.x it was possible to work around this with
> the kern.maxbcache tunable, but this didn't work on 6.0 or above.
>
> It turns out the problem began here:
>
> ----
> Revision 1.67 / (download) - annotate - [select for diffs], Mon Nov 8
> 18:20:02 2004 UTC (11 months ago) by des Branch: MAIN
> Changes since 1.66: +17 -17 lines
> Diff to previous 1.66 (colored)
>
> #include <vm/vm_param.h> instead of <machine/vmparam.h> (the former
> includes the latter, but also declares variables which are defined
> in kern/subr_param.c).
>
> Change some VM parameters from quad_t to unsigned long. They refer to
> quantities (size limits for text, heap and stack segments) which must
> necessarily be smaller than the size of the address space, so long is
> adequate on all platforms.
>
> MFC after: 1 week
> ----
>
> which contained:
>
> -int maxswzone; /* max swmeta KVA storage */
> -int maxbcache; /* max buffer cache KVA storage */
> +long maxswzone; /* max swmeta KVA storage */
> +long maxbcache; /* max buffer cache KVA storage */
>
> However, des forgot to change the other definition of maxbcache in
> <sys/buf.h>:
>
> extern int maxbcache; /* Max KVA for buffer cache */
>
> In fact, it's a good thing he didn't. On sparc64 if you make that
> variable a long it causes 32-bit integer overflows elsewhere, which
> lead to severe filesystem damage on systems with >12GB RAM. With the
> above bug this is reduced to a hang at boot.
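For illustration, the effect of a definition/declaration width mismatch like the one quoted above can be sketched in user space. This is only a sketch under stated assumptions: read_as_int32() and host_is_big_endian() are hypothetical helpers, not kernel code, and the memcpy merely imitates what code compiled against an "extern int" declaration would read from the storage of a 64-bit "long" definition on an LP64 platform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not kernel code: return what a 32-bit "int"
 * declaration would read from the storage of a 64-bit "long" definition,
 * i.e. the first four bytes of the object in memory.
 */
static int32_t
read_as_int32(int64_t defined_as_long)
{
	int32_t seen_as_int;

	/* The point of the sketch: the reader is narrower than the object. */
	assert(sizeof(seen_as_int) < sizeof(defined_as_long));
	memcpy(&seen_as_int, &defined_as_long, sizeof(seen_as_int));
	return (seen_as_int);
}

/* Returns 1 on a big-endian host (such as sparc64), 0 otherwise. */
static int
host_is_big_endian(void)
{
	const uint16_t probe = 0x0102;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return (first == 0x01);
}
```

On a big-endian host the first four bytes of a 64-bit object are its high half, so a value such as 200MB would read back as 0 through the narrower declaration; on a little-endian host the low half is read and small values appear to survive. Whether this is exactly how the tunable broke on 6.0 is an assumption, not something stated in the report.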
Isn't it enough to introduce the maximum values below? I imagine that the
ultimate goal is to get rid of the constraints, which will be easier if we
already have enough bits.
> The hang is because maxbcache is not capped to a maximum value on
> sparc64, and a loop termination condition never occurs because of a
> 32-bit integer overflow. On amd64 it's capped to
>
> /*
> * Ceiling on size of buffer cache (really only effects write queueing,
> * the VM page cache is not effected), can be changed via
> * the kern.maxbcache /boot/loader.conf variable.
> */
> #ifndef VM_BCACHE_SIZE_MAX
> #define VM_BCACHE_SIZE_MAX (400 * 1024 * 1024)
> #endif
>
> so large-memory amd64 systems never see it. ia64 and ppc would also
> hang at boot with >12GB, I think.
>
> On 5.x, the same hang exists, but you can work around it with the
> tunable. This tunable was broken by the long/int mismatch on 6.0, so
> sparc64 systems with >12GB were unusable.
>
> This patch reverts the above int->long change, and adds definitions
> for VM_BCACHE_SIZE_MAX and VM_SWZONE_SIZE_MAX on sparc64 copied from
> amd64. Actually, they should probably be added on other architectures
> too (ia64, ppc).
>
> Can someone please review?
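To make the arithmetic behind the cap concrete, here is a minimal sketch. It is not the code in bufinit(): the SKETCH_* names and both helper functions are hypothetical, the BKVASIZE value of 16384 is an assumption, and only the 400MB ceiling is taken from the amd64 definition quoted above. It checks whether a buffer count times the per-buffer KVA size still fits in a 32-bit signed int, and clamps the count to a maxbcache-style ceiling.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the real BKVASIZE lives in <sys/param.h>. */
#define SKETCH_BKVASIZE		16384
/* Same ceiling as the amd64 definition quoted above. */
#define SKETCH_BCACHE_SIZE_MAX	(400 * 1024 * 1024)

/* Returns 1 if nbuf * bkvasize is representable in a 32-bit signed int. */
static int
bufspace_fits_int32(int64_t nbuf, int64_t bkvasize)
{
	assert(nbuf >= 0 && bkvasize > 0);
	/* Compare by division so the check itself cannot overflow. */
	return (nbuf <= INT32_MAX / bkvasize);
}

/* Clamp nbuf so nbuf * bkvasize never exceeds maxbcache (0 = no cap). */
static int64_t
cap_nbuf(int64_t nbuf, int64_t maxbcache, int64_t bkvasize)
{
	assert(bkvasize > 0);
	if (maxbcache > 0 && nbuf > maxbcache / bkvasize)
		nbuf = maxbcache / bkvasize;
	return (nbuf);
}
```

With these assumed numbers, an uncapped count of 200000 buffers needs about 3.2GB of KVA, which does not fit in a signed 32-bit int, while the 400MB ceiling clamps the count to 25600 buffers, which does.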
--
/"\ Best regards, | mlaier at freebsd.org
\ / Max Laier | ICQ #67774661
X http://pf4freebsd.love2party.net/ | mlaier at EFnet
/ \ ASCII Ribbon Campaign | Against HTML Mail and News