What parts of UMA are part of the stable ABI?
jhb at freebsd.org
Wed Mar 18 14:46:52 UTC 2015
On Friday, March 13, 2015 07:48:38 PM Ryan Stone wrote:
> In this freebsd-hackers thread, a user reported that 10.1-RELEASE
> crashes during boot on a system with 3TB of RAM. As it turns out, when you
> have that much RAM ZFS autotunes itself to allocate a 6GB hash table. This
> triggers a nasty 32-bit integer truncation bug in malloc(9). malloc()
> calls uma_large_malloc(), but uma_large_malloc() accepts an int instead of
> a size_t and all kinds of hilarity can ensure from there. The user has
> confirmed that the page in  fixed the kernel from instantly panicking
> once zfs.ko was loaded. I'm a bit concerned about whether the patch as
> written is an MFC candidate though.
> uma_large_malloc() calls page_alloc() to actuallly allocate the memory, and
> page_alloc() also accepts an int size parameter. This is where things get
> tricky. The signature for page_alloc() is governed by the uma_alloc()
> typedef, as uma also uses it internally for allocating memory for
> uma_zones. There is even a uma_zone_set_allocf() API for overriding the
> default allocation function. So there's definitely an argument to be made
> the the signature of page_alloc() being a part of the stable ABI.
> I have no hesitation in saying that uma_large_malloc() is not a stable API
> and changing it is fair game. If uma_alloc() is a part of the stable API,
> then it's simple enough to commit a 64-bit safe allocation function for
> uma_large_malloc() to call and changing page_alloc() to call it instead.
> That commit can be MFC'ed, and a follow-up commit could convert the UMA
> APIs to use size_t everywhere.
I think uma_zone_set_allocf() is most likely obscure enough to not be used
outside of the places in the kernel where it is used. From that, I think
you are fine to change uma_alloc()'s signature.
> While I am at this, I'd like to also change the uma init/fini/ctor/dtor to
> also use size_t. I'm a little torn on this because this will definitely
> cause a lot of churn, both in the tree and for downstream consumers, and
> there's not necessarily going to be a big benefit to it. However, I
> suppose that the existence of machines where 4GB is less than 1% of system
> memory may mean that allocating 4GB at a time may not that outlandish. I
> can definitely be talked out of this though.
I do think the normal zone callbacks passed to uma_zcreate() are too public
to change. Or at least, you would need to do some crazy ABI shim where you
have a uma_zcreate_new() that you map to uma_zcreate() via a #define for
the API, but include a legacy uma_zcreate() symbol that older modules can
call (and then somehow tag the old function pointers via an internal flag
in the zone and patch UMA to cast to the old function signatures for zones
with that flag).
Note that if you are really paranoid you could do the same thing with
uma_zone_set_allocf(). It would break the API, but would preserve the
ABI (i.e. leave an existing uma_zone_set_allocf() function that accepts the
old prototype but have a uma_zone_set_allocf_new() that gets mapped to
uma_zone_set_allocf via a #define).
More information about the freebsd-current