clang 3.8.0 based powerpc (32 bit) buildworld runs on a PowerMac! [problems found]

Mark Millard markmi at dsl-only.net
Sun Jan 31 01:59:25 UTC 2016


I have submitted a minor variation of this analysis text, covering the uninitialized pointer use in the libc/stdio "string output" routine implementations, as Bug 206770.

If anyone finds that I missed the initialization let me know and I'll change the status of the bug.

===
Mark Millard
markmi at dsl-only.net

On 2016-Jan-30, at 5:13 PM, Mark Millard <markmi at dsl-only.net> wrote:

So far I'm unable to reproduce the problem with simple code replacing the library code.

And I expect that I have a smoking gun for why.  Care to check the below and see if I missed something? As far as I can tell this is a FreeBSD libc/stdio defect, not a clang 3.8.0 one.

Unfortunately the reason is spread out in the code so it takes a bit to describe the context for the uninitialized pointer that I expect is involved.

To start the description I note the actual, low-level failure point:

> #0  0x419a89c8 in memcpy (dst0=0xffffd734, src0=<optimized out>, length=<optimized out>) at /usr/src/lib/libc/string/bcopy.c:124
> 124				TLOOP1(*--dst = *--src);

In the assembler code for this it is the *--src access that gets the segmentation violation. I do not justify that claim here but use that fact later.

So what leads up to that? Going the other way, starting from the use of snprintf. . .

snprintf(char * __restrict str, size_t n, char const * __restrict fmt, ...) sets up its __vfprintf(FILE *fp, locale_t locale, const char *fmt0, va_list ap) use via:

>        va_list ap;
>        FILE f = FAKE_FILE;
. . .
>        va_start(ap, fmt);
>        f._flags = __SWR | __SSTR;
>        f._bf._base = f._p = (unsigned char *)str;
>        f._bf._size = f._w = n;
>        ret = __vfprintf(&f, __get_locale(), fmt, ap);

so at the __vfprintf call f._p references the buffer that snprintf's str references. __vfprintf in turn does (in part):

>        struct io_state io;     /* I/O buffering state */
. . .
>        io_init(&io, fp);

where io is on-stack (not implicitly initialized). The io_init does:

> #define NIOV 8
> struct io_state {
>        FILE *fp;
>        struct __suio uio;      /* output information: summary */
>        struct __siov iov[NIOV];/* ... and individual io vectors */
> };
> 
> static inline void
> io_init(struct io_state *iop, FILE *fp)
> {
> 
>        iop->uio.uio_iov = iop->iov;
>        iop->uio.uio_resid = 0;
>        iop->uio.uio_iovcnt = 0;
>        iop->fp = fp;
> }

where (on stack as part of __vfprintf's io):

> struct __siov {
>        void    *iov_base;
>        size_t  iov_len;
> };
> struct __suio {
>        struct  __siov *uio_iov;
>        int     uio_iovcnt;
>        int     uio_resid;
> };

So via __vfprintf's io.fp->_p the str buffer is accessible for outputting to.

But in none of this or other code that I've looked at for this snprintf use case have I found code that initializes the involved io.uio.uio_iov->iov_base (i.e., io.iov[0].iov_base) to point to anything specific. (Nor is iov_base's matching iov_len initialized.)
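As a way to check the claim about io_init in isolation, here is a minimal sketch. It is a stand-alone model, not the libc code: the model_ types copy the field layout quoted above (with fp reduced to a void *), POISON and still_poisoned are hypothetical helpers, and the byte-filling trick is only an illustration of which fields the four assignments touch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NIOV 8
#define POISON 0xA5	/* hypothetical marker byte for "never written" */

/* Stand-ins for __siov, __suio, and io_state; not the libc definitions. */
struct model_siov {
	void	*iov_base;
	size_t	 iov_len;
};
struct model_suio {
	struct model_siov *uio_iov;
	int	uio_iovcnt;
	int	uio_resid;
};
struct model_io_state {
	void	*fp;			/* FILE * in the real code */
	struct model_suio uio;
	struct model_siov iov[NIOV];
};

/* Same four assignments as the quoted io_init. */
static void
model_io_init(struct model_io_state *iop, void *fp)
{
	iop->uio.uio_iov = iop->iov;
	iop->uio.uio_resid = 0;
	iop->uio.uio_iovcnt = 0;
	iop->fp = fp;
}

/* Return 1 if all n bytes at p still hold the poison value. */
static int
still_poisoned(const void *p, size_t n)
{
	const unsigned char *b = p;
	size_t i;

	for (i = 0; i < n; i++)
		if (b[i] != POISON)
			return (0);
	return (1);
}

/* Return 1 if model_io_init left every iov[] slot unwritten. */
int
iov_untouched_by_init(void)
{
	struct model_io_state io;

	memset(&io, POISON, sizeof(io));	/* mark every byte first */
	model_io_init(&io, NULL);
	assert(io.uio.uio_iov == io.iov);	/* the pointer is set ... */
	return (still_poisoned(io.iov, sizeof(io.iov)));	/* ... its target is not */
}
```

iov_untouched_by_init() returns 1 in this model, which matches the reading that io_init by itself gives iov[0].iov_base and iov[0].iov_len no value. It says nothing about what later code in __vfprintf stores there.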

Here is a stab at finding all the initializations of iov_base fields:

> # grep "iov_base.*=" /usr/src/lib/libc/stdio/*
> /usr/src/lib/libc/stdio/fputs.c:        iov.iov_base = (void *)s;
> /usr/src/lib/libc/stdio/fputws.c:       iov.iov_base = buf;
> /usr/src/lib/libc/stdio/fwrite.c:       iov.iov_base = (void *)buf;
> /usr/src/lib/libc/stdio/perror.c:               v->iov_base = (char *)s;
> /usr/src/lib/libc/stdio/perror.c:               v->iov_base = ": ";
> /usr/src/lib/libc/stdio/perror.c:       v->iov_base = msgbuf;
> /usr/src/lib/libc/stdio/perror.c:       v->iov_base = "\n";
> /usr/src/lib/libc/stdio/printfcommon.h: iop->iov[iop->uio.uio_iovcnt].iov_base = (char *)ptr;
> /usr/src/lib/libc/stdio/puts.c: iov[0].iov_base = (void *)s;
> /usr/src/lib/libc/stdio/puts.c: iov[1].iov_base = "\n";
> /usr/src/lib/libc/stdio/putw.c: iov.iov_base = &w;
> /usr/src/lib/libc/stdio/vfwprintf.c:    iov.iov_base = buf;
> /usr/src/lib/libc/stdio/xprintf.c:      io->iovp->iov_base = __DECONST(void *, ptr);

The only file in that list that is part of the snprintf code path turns out to be /usr/src/lib/libc/stdio/printfcommon.h, and the assignment in that file is in io_print(struct io_state *iop, const CHAR * __restrict ptr, int len, locale_t locale), which is not in use for this context. Here is the assignment anyway (for reference):

> static inline int
> io_print(struct io_state *iop, const CHAR * __restrict ptr, int len, locale_t locale)
> {
> 
>        iop->iov[iop->uio.uio_iovcnt].iov_base = (char *)ptr;
>        iop->iov[iop->uio.uio_iovcnt].iov_len = len;
>        iop->uio.uio_resid += len;
. . .

In other words: The segmentation violation is for use of __vfprintf's uninitialized io.uio.uio_iov->iov_base .

Returning to tracing the code actually used in this context, to support that claim some more. . .

The __vfprintf (FILE *fp, locale_t locale, const char *fmt0, va_list ap) eventually does:

       if (io_flush(&io, locale))

and io_flush(struct io_state *iop, locale_t locale) does:

       return (__sprint(iop->fp, &iop->uio, locale));

and __sprint(FILE *fp, struct __suio *uio, locale_t locale) does:

       err = __sfvwrite(fp, uio);

and __sfvwrite(FILE *fp, struct __suio *uio) does:

       p = iov->iov_base;
       len = iov->iov_len;

where  iov->iov_base is another name for __vfprintf's io.uio.uio_iov->iov_base . __sfvwrite then uses:

#define COPY(n)   (void)memcpy((void *)fp->_p, (void *)p, (size_t)(n))

which fails dereferencing p (i.e., __vfprintf's io.uio.uio_iov->iov_base ). 

In other words (again): The segmentation violation is for use of the uninitialized iop->uio.uio_iov->iov_base.
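
The flush path can also be modeled in a few lines, assuming a simplified queue rather than the real __suio/__sfvwrite code. In this sketch model_print stores the same three values as the io_print excerpt (the count increment is an assumption, since the excerpt is elided at that point), and model_flush reads only the slots that were filled. All model_ and flush_ names are hypothetical, and whether the real __sprint or __sfvwrite has an equivalent guard is not shown in the excerpts above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NIOV 8

/* Simplified stand-ins; not the libc definitions. */
struct model_siov {
	const void *iov_base;
	size_t	 iov_len;
};
struct model_io {
	struct model_siov iov[NIOV];
	int	iovcnt;		/* slots filled so far */
	int	resid;		/* total bytes pending */
};

/* Queue len bytes at ptr; returns -1 if all slots are in use. */
static int
model_print(struct model_io *io, const char *ptr, int len)
{
	if (io->iovcnt >= NIOV)
		return (-1);
	io->iov[io->iovcnt].iov_base = ptr;
	io->iov[io->iovcnt].iov_len = (size_t)len;
	io->resid += len;
	io->iovcnt++;		/* assumed; elided in the excerpt */
	return (0);
}

/*
 * Copy the queued bytes to dst (capacity cap) and return the count copied.
 * Only slots below iovcnt are read, so unset slots are never dereferenced.
 */
static size_t
model_flush(struct model_io *io, char *dst, size_t cap)
{
	size_t used = 0;
	int i;

	if (io->resid == 0)
		return (0);	/* nothing pending: do not touch iov[] */
	for (i = 0; i < io->iovcnt; i++) {
		size_t n = io->iov[i].iov_len;

		if (n > cap - used)
			n = cap - used;	/* truncate to fit dst */
		memcpy(dst + used, io->iov[i].iov_base, n);
		used += n;
	}
	io->iovcnt = 0;
	io->resid = 0;
	return (used);
}

/* Flush with nothing queued: returns 0 and leaves out[] as it was. */
size_t
flush_empty(char *out, size_t cap)
{
	struct model_io io;	/* iov[] deliberately left unset */

	io.iovcnt = 0;
	io.resid = 0;
	return (model_flush(&io, out, cap));
}

/* Queue "10" and flush it: returns 2 with out[] starting '1','0'. */
size_t
flush_ten(char *out, size_t cap)
{
	struct model_io io;
	int rc;

	io.iovcnt = 0;
	io.resid = 0;
	rc = model_print(&io, "10", 2);
	assert(rc == 0);
	(void)rc;
	return (model_flush(&io, out, cap));
}
```

In this model a flush is safe in exactly two cases: nothing is pending (flush_empty), or every slot it reads was filled by model_print first (flush_ten). A fault like the one in the backtrace would need a flush that reads a slot in neither state.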


===
Mark Millard
markmi at dsl-only.net

On 2016-Jan-30, at 5:58 AM, Mark Millard <markmi at dsl-only.net> wrote:

On 2016-Jan-30, at 3:29 AM, Roman Divacky <rdivacky at vlakno.cz> wrote:

> Can you file a bug in llvm bugzilla?

I could try for the example code. But I'd like to make the example more self contained first, avoiding snprintf from library code and hopefully with a much smaller, simpler implementation involved than the very-general library code.
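
One possible shape for such a self-contained test is sketched below. mini_format is a hypothetical, much reduced stand-in for snprintf (only "%d" and literal characters), so the only library calls left are signal and raise; run_signal_case and run_direct_case are likewise made-up names. Whether this is enough to trigger the fault is untested; it only keeps the two ingredients of the failing program, a variadic call and a handler entered via raise.

```c
#include <assert.h>
#include <signal.h>
#include <stdarg.h>
#include <stddef.h>

static volatile sig_atomic_t first_char = 0;

/*
 * Hypothetical stand-in for snprintf: handles only "%d" and literal
 * characters, NUL-terminates when n > 0, returns the bytes stored.
 */
static int
mini_format(char *buf, size_t n, const char *fmt, ...)
{
	va_list ap;
	size_t used = 0;

	assert(buf != NULL && fmt != NULL);
	if (n == 0)
		return (0);
	va_start(ap, fmt);
	for (; *fmt != '\0'; fmt++) {
		if (fmt[0] == '%' && fmt[1] == 'd') {
			char digits[24];	/* room for a 64-bit int plus sign */
			int len = 0;
			int v = va_arg(ap, int);
			unsigned int u;

			u = (v < 0) ? 0u - (unsigned int)v : (unsigned int)v;
			do {			/* digits come out in reverse */
				digits[len++] = (char)('0' + u % 10);
				u /= 10;
			} while (u != 0);
			if (v < 0)
				digits[len++] = '-';
			while (len > 0 && used + 1 < n)
				buf[used++] = digits[--len];
			fmt++;			/* skip the 'd' */
		} else if (used + 1 < n) {
			buf[used++] = *fmt;
		}
	}
	va_end(ap);
	buf[used] = '\0';
	return ((int)used);
}

static void
handler(int sig)
{
	char uidbuf[32];

	(void)sig;
	(void)mini_format(uidbuf, sizeof(uidbuf), "%d", 10);
	first_char = uidbuf[0];
}

/* Call the handler directly, without any signal delivery. */
int
run_direct_case(void)
{
	first_char = 0;
	handler(0);
	return ((int)first_char);
}

/* Install the handler, raise SIGINT, and report what the handler stored. */
int
run_signal_case(void)
{
	first_char = 0;
	if (signal(SIGINT, handler) == SIG_ERR)
		return (-1);
	if (raise(SIGINT) != 0)
		return (-1);
	return ((int)first_char);
}
```

On a working toolchain both run_direct_case() and run_signal_case() return '1'. A difference between the two would point at the signal delivery path or the variadic code generation rather than at libc/stdio.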



Separately: I'm not sure any llvm folks are going to have a way to test unless someone shows the problem outside a FreeBSD context. powerpc-clang (32-bit) based FreeBSD buildworlds are not exactly a normal context at this point.

My files with powerpc (32-bit) tied differences from svn for projects/clang380-import -r294962 are:

Index: /media/usr/src/sys/boot/powerpc/Makefile
===================================================================
--- /media/usr/src/sys/boot/powerpc/Makefile	(revision 294962)
+++ /media/usr/src/sys/boot/powerpc/Makefile	(working copy)
@@ -1,5 +1,9 @@
# $FreeBSD$

-SUBDIR=		boot1.chrp kboot ofw ps3 uboot
+SUBDIR=		boot1.chrp
+.if ${MACHINE_ARCH} == "powerpc64"
+SUBDIR+=		kboot
+.endif
+SUBDIR+=		ofw ps3 uboot

.include <bsd.subdir.mk>
Index: /media/usr/src/sys/conf/Makefile.powerpc
===================================================================
--- /media/usr/src/sys/conf/Makefile.powerpc	(revision 294962)
+++ /media/usr/src/sys/conf/Makefile.powerpc	(working copy)
@@ -35,7 +35,11 @@

INCLUDES+= -I$S/contrib/libfdt

+.if ${COMPILER_TYPE} == "gcc"
CFLAGS+= -msoft-float -Wa,-many
+.else
+CFLAGS+= -msoft-float
+.endif

# Build position-independent kernel
CFLAGS+= -fPIC
Index: /media/usr/src/sys/conf/kern.mk
===================================================================
--- /media/usr/src/sys/conf/kern.mk	(revision 294962)
+++ /media/usr/src/sys/conf/kern.mk	(working copy)
@@ -144,7 +144,11 @@
#
.if ${MACHINE_CPUARCH} == "powerpc"
CFLAGS+=	-mno-altivec
+.if ${COMPILER_TYPE} == "clang" && ${COMPILER_VERSION} < 30800
CFLAGS.clang+=	-mllvm -disable-ppc-float-in-variadic=true
+.else
+CFLAGS.clang+=	-msoft-float
+.endif
CFLAGS.gcc+=	-msoft-float
INLINE_LIMIT?=	15000
.endif
Index: /media/usr/src/sys/conf/kmod.mk
===================================================================
--- /media/usr/src/sys/conf/kmod.mk	(revision 294962)
+++ /media/usr/src/sys/conf/kmod.mk	(working copy)
@@ -137,8 +137,12 @@
.endif

.if ${MACHINE_CPUARCH} == powerpc
+.if ${COMPILER_TYPE} == "gcc"
CFLAGS+=	-mlongcall -fno-omit-frame-pointer
+.else
+CFLAGS+=	-fno-omit-frame-pointer
.endif
+.endif

.if ${MACHINE_CPUARCH} == mips
CFLAGS+=	-G0 -fno-pic -mno-abicalls -mlong-calls


(I can not actually buildkernel for powerpc via clang 3.8.0. Still some of the above is for the kernel context.)

src.conf content:

KERNCONF=GENERICvtsc-NODEBUG
TARGET=powerpc
TARGET_ARCH=powerpc
#
WITH_FAST_DEPEND=
WITH_LIBCPLUSPLUS=
WITH_BOOT=
WITH_BINUTILS_BOOTSTRAP=
WITH_CLANG_BOOTSTRAP=
WITH_CLANG=
WITH_CLANG_IS_CC=
WITH_CLANG_FULL=
WITH_CLANG_EXTRAS=
#
# lldb requires missing atomic 8-byte operations for powerpc (non-64)
WITHOUT_LLDB=
#
WITHOUT_LIB32=
WITHOUT_GCC_BOOTSTRAP=
WITHOUT_GCC=
WITHOUT_GCC_IS_CC=
WITHOUT_GNUCXX=
#
NO_WERROR=
MALLOC_PRODUCTION=
#
WITH_DEBUG_FILES=


On Sat, Jan 30, 2016 at 03:00:26AM -0800, Mark Millard wrote:
> I got around to trying some more use of the 3.8.0 clang based world on powerpc (32 bit) (now -r294962 based) and ran into:
> 
> A) Segmentation faults during signal handlers in syslogd, nfsd, mountd, and (for SIGINFO) make.
> 
> B) ls sometimes segmentation faulting
> 
> C) make -j 6 buildworld eventually segmentation faulting in make, although make buildworld works.
> 
> I have reduced (A) to a simple program that demonstrates the behavior:
> 
>> # more sig_snprintf_use_test.c 
>> #include <stdio.h>
>> #include <signal.h>
>> 
>> volatile sig_atomic_t sat = 0;
>> 
>> void
>> handler(int sig)
>> {
>>  char uidbuf[32];
>>  (void) snprintf(uidbuf, sizeof uidbuf, "%d", 10);
>>  sat = uidbuf[0];
>> }
>> 
>> int
>> main(void)
>> {
>>  if (signal(SIGINT, handler) != SIG_ERR) raise(SIGINT);
>>  return sat;
>> }
> 
>> # ./a.out
>> Segmentation fault (core dumped)
>> # /usr/local/bin/gdb a.out /var/crash/a.out.1510.core
>> GNU gdb (GDB) 7.10 [GDB v7.10 for FreeBSD]
> . . .
>> warning: Unexpected size of section `.reg2/100167' in core file.
>> #0  0x419a89c8 in memcpy (dst0=0xffffd734, src0=<optimized out>, length=<optimized out>) at /usr/src/lib/libc/string/bcopy.c:124
>> 124				TLOOP1(*--dst = *--src);
>> (gdb) bt
>> #0  0x419a89c8 in memcpy (dst0=0xffffd734, src0=<optimized out>, length=<optimized out>) at /usr/src/lib/libc/string/bcopy.c:124
>> #1  0x419a3984 in __sfvwrite (fp=<optimized out>, uio=<optimized out>) at /usr/src/lib/libc/stdio/fvwrite.c:128
>> #2  0x41934468 in __sprint (fp=<optimized out>, uio=<optimized out>, locale=<optimized out>) at /usr/src/lib/libc/stdio/vfprintf.c:164
>> #3  io_flush (iop=<optimized out>, locale=<optimized out>) at /usr/src/lib/libc/stdio/printfcommon.h:155
>> #4  __vfprintf (fp=<optimized out>, locale=<optimized out>, fmt0=<optimized out>, ap=<optimized out>) at /usr/src/lib/libc/stdio/vfprintf.c:1020
>> #5  0x4199c644 in snprintf (str=0xffffd734 "", n=<optimized out>, fmt=0x1800850 "%d") at /usr/src/lib/libc/stdio/snprintf.c:72
>> #6  0x01800708 in handler ()
>> Backtrace stopped: Cannot access memory at address 0xffffd760
> 
> (The "Unexpected size . . ." is a known problem in powerpc land at this point, not tied to clang 3.8.0 .)
> 
> The syslogd, nfsd, mountd, and SIGINFO-related make backtraces are similar. I got the program above from simplifying the mountd failure context.
> 
> A direct call, handler(0), does not get the segmentation fault.
> 
> I'll note that in C the handler calling snprintf or other such is a no-no for the general case: only abort(), _Exit(), or signal() as of C99 as I understand. But the restriction is not true of use of raise so the small program is still valid C99 code. Of course it appears FreeBSD allows more than C99 does in this area.
> 
> I've not yet investigated what the original signals are in syslogd, nfsd, or mountd. They may well indicate another problem.
> 
> 
> I've not gotten as far in classifying (B) or (C).
> 
> (B) is an xo_emit context each time so far (so C ellipsis use again, like (A)) but no signal handler seems to be active. It stops in xo_format_string_direct. My attempts at simpler code have not produced the problem so far.
> 
> (C) is such that GDB 7.10 reports "previous frame identical to this frame (corrupt stack?)" or otherwise gives up. It shows Var_Value called by Make_Update before reporting that. gdb 6.1.1 shows more after that (JobFinish, JobReapChild, Job_CatchChildren, Job_CatchOutput, Make_Run, main). SIGCHLD or other such use may well be involved here.
> 
> 
> ===
> Mark Millard
> markmi at dsl-only.net
> 
> On 2016-Jan-19, at 2:35 AM, Mark Millard <markmi at dsl-only.net> wrote:
> 
> I now have an SSD that contains:
> 
> 0) installkernel material from a gcc 4.2.1 based buildkernel
> 
> 1) installworld material from a clang 3.8.0 based buildworld
> (clang 3.8.0, libc++, etc.)
> 
> It boots and seems to be operating fine after booting, in both a G5 and a G4 PowerMac.
> 
> Apparently the clang code generation has been updated to not require an explicit -mlongcall. I had to remove those since clang rejects them on command lines. It linked without complaint (and later seems to be running fine). (I've seen llvm review notes mentioning the "medium model" or some phrase like that for powerpc.)
> 
> (I've not been able to buildkernel yet for powerpc (non-64) from my amd64 environment: rejected command lines for other issues. Thus the current limitation to buildworld.)
> 
> 
> 
> To get to (1) I did the following sort of sequence:
> (The first few steps deal with other issues in order to have sufficient context.)
> 
> 
> A) Started by installing the latest powerpc (non-64) snapshot.
> ( http://ftp1.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/11.0/FreeBSD-11.0-CURRENT-powerpc-20160113-r293801-disc1.iso )
> 
> (I had to use a PowerMac with video hardware that vt would handle.)
> (Basic display, no X-windows involvement here.)
> 
> 
> B) Rebuild, including using my usual kernel configuration that has
> both vt and sc. I did this based on projects/clang380-import
> -r294201 /usr/src but still using gcc 4.2.1 (native on the
> PowerMac). The configuration turns off kernel debugging extras too.
> 
> 
> C) installkernel, installworld, etc., set to use sc instead of vt, and rebooted.
> 
> (As of this I could use the SSD in more PowerMacs by using sc instead of vt via a /boot/loader.conf assignment.)
> 
> 
> D) dump/restore the file systems to another SSD (after partitioning it).
> Adjust the host name and the like on the copy.
> 
> (This copy later ends up having new installworld materials overlaid.)
> 
> 
> E) In a projects/clang380-import -r294201 amd64 environment, buildworld for
> TARGET_ARCH=powerpc . WITH_LIBCPLUSPLUS= and clang related material built,
> gcc 4.2.1 related material not built. WITH_BOOT= as well. I chose
> WITHOUT_DEBUG= and WITHOUT_DEBUG_FILES= . (I've not tried enabling them yet.)
> binutils is not from ports.
> 
> 
> F) Use DESTDIR= with installworld to an initially empty directory tree. tar the tree.
> 
> 
> G) Transfer the tar file to the PowerMac. Mount the to-be-updated SSD to
> /mnt and /mnt/var. After chflags -R noschg on /mnt and /mnt/var use
> tar xpf to replace things from the buildworld on /mnt and /mnt/var.
> 
> (This does leave older gcc 4.2.1 related materials in place.)
> 
> H) Dismounts, shutdown, and then boot from the updated SSD.
> 
> 
> 
> Note: I've never managed to get powerpc64-xtoolchain-gcc/powerpc64-gcc to produce working 32-bit code. So I've never gotten this far via that path.
> 
> 
> ===
> Mark Millard
> markmi at dsl-only.net
> 
> 
> _______________________________________________
> freebsd-toolchain at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
> To unsubscribe, send any mail to "freebsd-toolchain-unsubscribe at freebsd.org"

===
Mark Millard
markmi at dsl-only.net




