64-bit NULL: a followup

Fri Nov 28 19:19:01 PST 2003

On Fri, Nov 28, 2003 at 04:58:23PM -0800, Marcel Moolenaar wrote:
> Previously on -standards:
> 
> On Mon, Oct 27, 2003 at 02:49:51PM +0100, Harti Brandt wrote:
> > 
> > a question came up whether NULL should be defined as (0L) on sparc.
> > (Solaris does this). Currently we define NULL as 0.
> 
> On Mon, 27 Oct 2003, Erik Trulsson wrote:
> >
> > Both are perfectly good definitions of NULL.
> 
> On Mon, 27 Oct 2003, Tony Finch wrote:
> >
> > No, NULL is an implementation-defined null pointer constant, not a null
> > pointer. The difference is that a null pointer constant is an integer
> > constant expression that evaluates to zero (optionally cast to void*),
> > and a null pointer is a null pointer constant converted to a pointer type
> > (which might involve changes in representation). Therefore using a bare
> > NULL to terminate the execl argument list is not in general legal.
> 
> On ia64 I run into a problem where a caller of a function with a variable
> number of arguments leaves garbage in the high order 32 bits due to the
> fact that NULL is defined as 0, and thus has type int by default.
> 
> Take the following simple example:
> 
> extern int va(int, ...);
> int foo(void)
> {
> 	return (va(1, 2, 3, 4, 5, 6, 7, 8, NULL));
> }

The last argument needs to be cast to the correct pointer type for
the code to be correct.
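
As a minimal sketch of what the cast looks like in practice, assume a
hypothetical var-arg function count_args() (invented for this example,
not part of the code under discussion) that walks a list of strings
terminated by a null pointer, in the style of execl(3).  The callee
fetches each argument as a char *, so the caller has to pass a char *
as the terminator:

```c
#include <stdarg.h>
#include <stddef.h>

/*
 * Hypothetical helper: counts the strings in an argument list that is
 * terminated by a null pointer of type char *.
 */
int
count_args(const char *first, ...)
{
	va_list ap;
	const char *s;
	int n = 0;

	va_start(ap, first);
	/* Each trailing argument is fetched as a char *, terminator included. */
	for (s = first; s != NULL; s = va_arg(ap, char *))
		n++;
	va_end(ap);
	return (n);
}
```

With this, count_args("ls", "-l", (char *)NULL) returns 2.  A bare NULL
as the terminator would compile without complaint, but the behaviour is
undefined on any implementation where NULL expands to an integer
constant such as 0.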

> 
> The last argument has to be passed onto the stack, which gcc does
> as follows:
> 
> 		:
>   1c:   01 61 00 84                         adds r14=16,r12;;
>   20:   00 00 00 1c 90 11       [MII]       st4 [r14]=r0
>   26:   40 0a 00 00 48 a0                   mov r36=1
>   2c:   24 00 00 90                         mov r37=2
>   30:   00 30 0d 00 00 24       [MII]       mov r38=3
>   36:   70 22 00 00 48 00                   mov r39=4
>   3c:   55 00 00 90                         mov r40=5
>   40:   00 48 19 00 00 24       [MII]       mov r41=6
>   46:   a0 3a 00 00 48 60                   mov r42=7
>   4c:   85 00 00 90                         mov r43=8
>   50:   1c 00 00 00 01 00       [MFB]       nop.m 0x0
>   56:   00 00 00 02 00 00                   nop.f 0x0
>   5c:   08 00 00 50                         br.call.sptk.many b0=50 <foo+0x50>
> 		: 
> 
> Notice the "st4".  If NULL was defined as 0L, this would look like:
> 
> 		:
>   1c:   01 61 00 84                         adds r14=16,r12;;
>   20:   00 00 00 1c 98 11       [MII]       st8 [r14]=r0
>   26:   40 0a 00 00 48 a0                   mov r36=1
> 		:
> 
> Notice the "st8".  Since NULL is a pointer constant, programmers do
> (implicitly) expect it to have the same width as a pointer type and
> thus do not cast it to a pointer type or an integer type that has a
> width larger or equal to a pointer type.

Such an expectation is erroneous.  Programmers who have such
expectations obviously do not know the C language well enough.
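
The type of an argument comes from the expression the macro expands
to, not from what the programmer had in mind.  A small sketch of the
three widths involved (the function names are invented for this
example):

```c
#include <stddef.h>

/* Width of a plain 0, which has type int. */
size_t
width_of_plain_zero(void)
{
	return (sizeof(0));
}

/* Width of 0L, which has type long. */
size_t
width_of_long_zero(void)
{
	return (sizeof(0L));
}

/* Width of an object pointer. */
size_t
width_of_pointer(void)
{
	return (sizeof(void *));
}
```

On an LP64 platform these typically return 4, 8 and 8, which lines up
with the st4/st8 difference in the quoted disassembly.  Nothing in the
language ties the first two numbers to the third.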

When passing NULL as an argument to a function, and when there is no
prototype in scope telling the compiler what type that argument is
supposed to have, you must *always* cast it to the correct type.  This
is the case both for the trailing arguments of var-arg functions (the
prototype only covers the named parameters) and for functions with only
an old-style declaration/definition in scope.
For parameters covered by a prototype in scope, the compiler will
automatically convert NULL to the appropriate type.
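
To illustrate the difference, here is a small sketch with two
hypothetical functions (is_null() and tail_is_null() are made up for
this example).  The first has an ordinary prototype, so the compiler
converts NULL, or a plain 0, to the parameter type.  The second is a
var-arg function whose prototype says nothing about the trailing
arguments, so the caller must supply the cast:

```c
#include <stdarg.h>
#include <stddef.h>

/* Prototyped parameter: the argument is converted to const void *. */
int
is_null(const void *p)
{
	return (p == NULL);
}

/*
 * Var-arg function: skips nints int arguments, then fetches one
 * void * and reports whether it is a null pointer.
 */
int
tail_is_null(int nints, ...)
{
	va_list ap;
	void *p;
	int i;

	va_start(ap, nints);
	for (i = 0; i < nints; i++)
		(void)va_arg(ap, int);
	/* The caller must really have passed a pointer here. */
	p = va_arg(ap, void *);
	va_end(ap);
	return (p == NULL);
}
```

is_null(NULL) and is_null(0) are both correct as written.  For the
var-arg function the call has to be spelled
tail_is_null(2, 10, 20, (void *)NULL); without the cast, and with NULL
defined as 0, the callee would fetch a void * where only an int was
passed.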

> 
> So, the bottom line is that we currently do have third-party code that
> fails to run on ia64 (and possibly other 64-bit platforms) due to the
> fact that NULL is defined as 0.

Then that third-party code is buggy, and if the authors of it have made
such a basic mistake, then I wouldn't trust the rest of it much either.

In a correct C program it should be possible to just replace any
and all occurrences of NULL with a plain 0, and the program should
continue to work.  If it doesn't, the program is buggy.
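
A sketch of what that substitution looks like, using a hypothetical
helper first_nonempty() invented for this example.  Every place where
one would normally write NULL uses a plain 0 instead, and each of them
sits in a context (initialisation, comparison) where the compiler knows
a pointer is wanted:

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns the first string in the array that is
 * neither a null pointer nor empty, or a null pointer if there is none.
 */
const char *
first_nonempty(const char *const *strs, size_t n)
{
	const char *result = 0;		/* usually written NULL */
	size_t i;

	for (i = 0; i < n; i++) {
		/* The comparison with 0 is a comparison with a null pointer. */
		if (strs[i] != 0 && strs[i][0] != '\0') {
			result = strs[i];
			break;
		}
	}
	return (result);
}
```

Given const char *v[] = { 0, "", "abc" }, the call first_nonempty(v, 3)
returns v[2] and first_nonempty(v, 2) returns a null pointer, exactly
as the NULL-spelled version would.  In a var-arg call the same
substitution gives (char *)0 rather than a bare 0.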

> 
> Since Erik thinks 0 and 0L are both perfectly good definitions for
> NULL and Tony emphasizes that NULL is an integer expression, I think
> we should change the definition of NULL to 0L to improve portability
> to FreeBSD/LP64. It will definitely fix known breakages on ia64.
> 
> Thoughts?

You could of course do that, but that would mainly serve to hide bugs
in programs.  It would be better to get those programs fixed, so that
they don't contain such bugs. For that purpose it would be better to
continue to have NULL be defined as 0, so that such bugs can be
triggered, and thus found, rather than silently ignoring them.

Of course, since (0L) is indeed a valid definition of NULL, there is no
technical reason why you couldn't make that change.  Just be aware that
changing the definition of NULL in this way does not really fix
anything and would most likely trigger bugs in some other programs that
make different, equally invalid, assumptions.
(Granted, there are most likely fewer programs whose bugs are triggered
by NULL being defined as 0L than by NULL being defined as 0, but there
are bound to be some.)

-- 
<Insert your favourite quote here.>
Erik Trulsson
ertr1013 at student.uu.se