svn commit: r265367 - head/lib/libc/regex

Tue May 6 13:08:00 UTC 2014

On Tue, 6 May 2014, David Chisnall wrote:

> On 6 May 2014, at 05:46, Bruce Evans <brde at optusnet.com.au> wrote:
>
>> The standard behaviour is undefined.  It cannot be relied on.  From C99
>> (n869.txt):
>>
>> %        7.20.3.1  The calloc function
>> %
>> %        Synopsis
>> %
>> %        [#1]
>> %
>> %                #include <stdlib.h>
>> %                void *calloc(size_t nmemb, size_t size);
>> %
>> %        Description
>> %
>> %        [#2] The calloc function allocates space  for  an  array  of
>> %        nmemb  objects,  each  of  whose size is size.  The space is
>> %        initialized to all bits zero.238)
>>
>> Oops, there is no object to begin with, so perhaps the behaviour is
>> defined after all.  This is unclear.
>
> You're missing off the next line:
>
>> 	• 3  The calloc function returns either a null pointer or a pointer to the allocated space.

It takes more than that to give defined behaviour.

There is a similar example for snprintf().  It is specified to return
a count in an int variable, but it is possible for the correct count
to be unrepresentable as an int.  The behaviour is then implicitly
undefined.  The function parameters are just invalid, and undefined
behaviour occurs because snprintf() just doesn't support invalid
parameters.
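
A minimal sketch of what a caller can check, assuming nothing beyond the
standard snprintf() contract.  The helper name format_checked() is
hypothetical.  It catches a negative return and ordinary truncation; it
cannot do anything about a correct count that does not fit in an int,
which is the case above.

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns 0 if s was formatted into buf in full,
 * -1 on an snprintf() error or if the output was truncated.
 */
static int
format_checked(char *buf, size_t bufsize, const char *s)
{
	int n;

	n = snprintf(buf, bufsize, "%s", s);
	if (n < 0)
		return (-1);		/* error reported by snprintf() */
	if ((size_t)n >= bufsize)
		return (-1);		/* output did not fit, truncated */
	return (0);
}
```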

Here calloc() can sort of support invalid parameters by returning a
nondescript error for them.  The question is if it is required to do
this.
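
For concreteness, a sketch of the sort of check in question.  This is
only an illustration on top of malloc(), not the libc implementation,
and the name checked_calloc() is made up:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical wrapper: returns NULL instead of a too-small block when
 * nmemb * size cannot be represented in a size_t.
 */
static void *
checked_calloc(size_t nmemb, size_t size)
{
	void *p;

	/* nmemb * size wraps exactly when nmemb > SIZE_MAX / size. */
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return (NULL);
	}
	p = malloc(nmemb * size);
	if (p != NULL)
		memset(p, 0, nmemb * size);
	return (p);
}
```

With this, checked_calloc(SIZE_MAX, 2) returns NULL and never reaches
malloc().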

> Clarifications from WG14 have indicated that this means that calloc() *must* return either NULL or enough space for nmemb objects of size size.  The text of the standard was not changed in C11 because it seemed to be the consensus of library authors that this is obvious from the existing text.  See the CERT report from my previous email - in 2002 it was regarded as a security hole (and a lack of standards conformance) if your calloc did not do this and all known calloc implementations that did not were fixed.

It is not obvious.  C11 (n1570.pdf) also didn't change the wording for
snprintf().  It is not obvious that (because of 20+ year old design errors)
it has more undefined cases than might be expected.

> Now, you can argue that either:
>
> - In this case, we can statically prove that the multiplication won't overflow so we don't need a check, or
>
> - It is better to do the overflow check on the caller side and increase i-cache usage to save some memory zeroing.
>
> But please don't try to argue that it is permitted for calloc() to not correctly handle integer overflow.  It is both non-conformant and dangerous for it to fail to do so.

calloc() is not even required to do the multiplication...

>> It is also unclear if objects
>> can have size too large to represent as a size_t

... A  silly implementation of calloc() could do sbrk() 1 element at a time.
This is a slow way of doing the multiplication as well as the allocation.
It might work to allocate an object(?) larger than SIZE_MAX.  Is calloc()
allowed to do this?

> That is implementation defined, however if sizeof(ptrdiff_t) <= sizeof(size_t) then they can not because you must be able to represent the difference between any two pointers as a ptrdiff_t[1].  If you want to be pedantic, _Static_assert(sizeof(ptrdiff_t) <= sizeof(size_t), "Unsupported platform!") to make sure you catch it at compile time if this might change.

ptrdiff_t is much more broken as designed than size_t.   Apart from the
sign problem, it is not required to work usefully unless the difference between
the pointers is between -65535 and 65535 (the behaviour of pointer
subtraction is undefined unless the result is representable as a ptrdiff_t,
and ptrdiff_t is not required to be any larger than 1's complement with 17
bits even if size_t is much larger).
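
A sketch of how code can guard against this, using only <stddef.h> and
<stdint.h>.  The _Static_assert is the one suggested above; the helper
name size_fits_ptrdiff() is hypothetical.  Subtracting pointers to the
two ends of a char array of n bytes is only defined if n passes this
check.

```c
#include <stddef.h>
#include <stdint.h>

/* Compile-time check quoted from the suggestion above. */
_Static_assert(sizeof(ptrdiff_t) <= sizeof(size_t), "Unsupported platform!");

/*
 * Hypothetical helper: nonzero if an object size of n bytes can be
 * represented as a ptrdiff_t.  The comparison is done in uintmax_t so
 * that neither operand is truncated.
 */
static int
size_fits_ptrdiff(size_t n)
{
	return ((uintmax_t)n <= (uintmax_t)PTRDIFF_MAX);
}
```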

The unclear point is if the implementation-defined result of sizeof() can
be different to the size of the object due to the latter being too large
to represent in a size_t.

> [1] This also means, on our platforms, that the maximum size of an object must be one byte less than the total size of the address space, as C only defines pointer comparisons between valid pointers to the same object and allows pointers to be one element past the end of an array.

I used to run into this problem on 16-bit systems, where half of the
address space (32K) is a good buffer or heap size.  C90 didn't have
PTRDIFF_MIN/MAX and pre-C90 had even less, so most implementations had
ptrdiff_t = int and the undefined behaviour from pointer subtraction
occurred in practice.

Bruce