Re: UBSAN report for main [so: 14] /usr/bin/whatis: non-zero (48) and zero offsets from null pointer in qsort.c

From: Stefan Esser <se_at_FreeBSD.org>
Date: Wed, 12 Jan 2022 12:44:44 UTC
Am 12.01.22 um 08:50 schrieb Jan Kokemüller:
> On 11.01.22 22:08, Stefan Esser wrote:
>> diff --git a/lib/libc/stdlib/qsort.c b/lib/libc/stdlib/qsort.c
>> index 5016fff7895f..51c41e802330 100644
>> --- a/lib/libc/stdlib/qsort.c
>> +++ b/lib/libc/stdlib/qsort.c
>> @@ -108,6 +108,8 @@ local_qsort(void *a, size_t n, size_t es, cmp_t *cmp, void
>> *thunk)
>>  	int cmp_result;
>>  	int swap_cnt;
>>
>> +	if (__predict_false(a == NULL))
>> +		return;
>>  loop:
>>  	swap_cnt = 0;
>>  	if (n < 7) {
>>
>> This would also work to prevent the NULL pointer arithmetik for
>> ports that might also path a == NULL and n == 0 in certain cases.
> 
> The UB happens in this line, when "a == NULL" and "n == 0", right?
> 
>     for (pm = (char *)a + es; pm < (char *)a + n * es; pm += es)
> 
> This is arithmetic on a pointer (the NULL pointer) which is not part of an
> array, which is UB.

Yes.

> Then, wouldn't "if (__predict_false(n == 0))" be more appropriate than checking
> for "a == NULL" here? Testing for "a == NULL" might suppress UBSAN warnings of
> valid bugs, i.e. when "qsort" is called with "a == NULL" and "n != 0". In that
> case UBSAN _should_ trigger.

Yes, but not only UBSAN would trigger, the program would probably
crash due to an attempt to access an unmapped page.

> UBSAN should not trigger when n == 0, though. At least, when "a" does point to
> a valid array. But what about the case of "a == NULL && n == 0"? Is that deemed
> UB? It looks like at least FreeBSD's "qsort_s" implementation says it's legal.

This might be legal, but it leads to adding the element size to a
NULL pointer, in the current implementation. The addition happens
in the initialization part of the for loop, before n == 0 leads to
no actual iteration being performed (a + es < a + n * es is false
for es > 0).

There is no functional difference if the case of a == NULL and
n == 0 is silently ignored.

But your are correct: just returning early for a == NULL and n != 0
will prevent the program abort.

> a != NULL (pointing to valid array), n != 0  ->  "normal" case, no UB
> a != NULL (pointing to valid array), n == 0  ->  should not trigger UB, and
>                                                  doesn't in the current
>                                                  implementation

It does trigger UB in a way that does not cause issues (or else
the problem would have been detected before). a == NULL makes the
calculation of pm = (char *)a + es undefined, but the value of pm
will never be used if n == 0.

> a == NULL, n == 0                            ->  should not trigger UB?
>                                                  (debatable)
> 
> So if "a == NULL && n == 0" was deemed legal, then there would be no bug in
> "mansearch.c", right?

IMHO it is not the question of "legal" or not, but we should prevent
the undefined behavior that results from execution reaching the
initialization part of the for loop.

Any you are correct, the patch should probably be:

diff --git a/lib/libc/stdlib/qsort.c b/lib/libc/stdlib/qsort.c
index 5016fff7895f..eef51d2dd3b3 100644
--- a/lib/libc/stdlib/qsort.c
+++ b/lib/libc/stdlib/qsort.c
@@ -108,6 +108,8 @@
 	int cmp_result;
 	int swap_cnt;

+	if (__predict_false(a == NULL && n == 0))
+		return;
 loop:
 	swap_cnt = 0;
 	if (n < 7) {

This will be detected by UBSAN if called with a == NULL and n != 0,
but it will also cause the program to fail with typical parameters
for the elements to sort and the cmp function.

Regards, STefan

PS: I just saw Mark's reply regarding n == 0 and cmp == NULL. That
    case is already covered, since n == 0 will prevent cmp from
    being dereferenced (since that only happens in the loop body,
    which will not be entered for n == 0).