Help needed to identify golang fork / memory corruption issue on FreeBSD

Wed Dec 7 14:31:02 UTC 2016

On Wed, Dec 07, 2016 at 02:14:49PM +0200, Konstantin Belousov wrote:
> On Tue, Dec 06, 2016 at 08:35:04PM +0000, Steven Hartland wrote:
> > On 06/12/2016 14:35, Konstantin Belousov wrote:
> > > On Tue, Dec 06, 2016 at 01:53:52PM +0000, Steven Hartland wrote:
> > >> On 06/12/2016 12:59, Konstantin Belousov wrote:
> > >>> On Tue, Dec 06, 2016 at 12:31:47PM +0000, Steven Hartland wrote:
> > >>>> Hi guys I'm trying to help identify / fix an issue with golang where by
> > >>>> fork results in memory corruption.
> > >>>>
> > >>>> Details of the issue can be found here:
> > >>>> https://github.com/golang/go/issues/15658
> > >>>>
> > >>>> In summary when a fork is done in golang is has a chance of causing
> > >>>> memory corruption in the parent resulting in a process crash once detected.
> > >>>>
> > >>>> Its believed that this only effects FreeBSD.
> > >>>>
> > >>>> This has similarities to other reported issues such as this one which
> > >>>> impacted perl during 10.x:
> > >>>> https://rt.perl.org/Public/Bug/Display.html?id=122199
> > >>> I cannot judge about any similarilities when all the description provided
> > >>> is 'memory corruption'. BTW, the perl issue described, where child segfaults
> > >>> after the fork, is more likely to be caused by the set of problems referenced
> > >>> in the FreeBSD-EN-16:17.vm.
> > >>>
> > >>>> And more recently the issue with nginx on 11.x:
> > >>>> https://lists.freebsd.org/pipermail/freebsd-stable/2016-September/085540.html
> > >>> Which does not affect anything unless aio is used on Sandy/Ivy.
> > >>>
> > >>>> Its possible, some believe likely, that this is a kernel bug around fork
> > >>>> / vm that golang stresses, but I've not been able to confirm.
> > >>>>
> > >>>> I can reproduce the issue at will, takes between 5mins and 1hour using
> > >>>> 16 threads, and it definitely seems like an interaction between fork and
> > >>>> other memory operations.
> > >>> Which arch is the kernel and the process which demonstrates the behaviour  ?
> > >>> I mean i386/amd64.
> > >> amd64
> > > How large is the machine, how many cores, what is the physical memory size ?
> I was able to reproduce that as well, reliably, on two desktop-size
> machines. One is SandyBridge, same core microarchitecture as your
> crashbox, another is Haswell. I see the error both with PCID enabled
> and disabled on both machines (Haswell does implement INVPCID, so the
> original aio/PCID bug did never affected this microarchitecture).
> 
> I believe this clears the PCID changes from the accusations.
> 
> > >
> > >>>> I've tried reproducing the issue in C but also no joy (captured in the bug).
> > >>>>
> > >>>> For reference I'm currently testing on 11.0-RELEASE-p3 + kibs PCID fix
> > >>>> (#306350).
> > >>> Switch to HEAD kernel, for start.
> > >>> Show the memory map of the failed process.
> > No sign of zeroed memory that I can tell.
> > 
> > This error was caused by hitting the following validation in gc:
> > func (list *mSpanList) remove(span *mspan) {
> >          if span.prev == nil || span.list != list {
> >                  println("runtime: failed MSpanList_Remove", span, 
> > span.prev, span.list, list)
> >                  throw("MSpanList_Remove")
> >          }
> > 
> > runtime: failed MSpanList_Remove 0x80052e580 0x80052e300 0x53e9c0 0x53e9b0
> > fatal error: MSpanList_Remove
> > 
> > (gdb) print list
> > $4 = (runtime.mSpanList *) 0x53e9b0 <runtime.mheap_+4944>
> > (gdb) print span.list
> > $5 = (runtime.mSpanList *) 0x53e9c0 <runtime.mheap_+4960>
> The difference, which triggered the exception, is quite curious:
> list is 0x53e9b0, and span.list == list + 0x10.  More, this is not
> a single-bit error: bit patter is 1011 for 0xb and 1100 for 0xc.
> 
> It is highly unlikely that the cause is a memory corruption due to
> OS mis-managing pages or TLB.  Typically, you get either page or cache
> line of complete garbage, instead of the almost identical but slightly
> modified data.
> 
> > (gdb) print span.prev
> > $6 = (struct runtime.mspan **) 0x80052e300
> > (gdb) print *list
> > $7 = {first = 0x80052e580, last = 0x8008aa180}
> > (gdb) print *span.list
> > $8 = {first = 0x8007ea7e0, last = 0x80052e580}
> > 
> > procstat -v test.core.1481054183
> >    PID              START                END PRT  RES PRES REF SHD FLAG 
> > TP PATH
> >   1178           0x400000           0x49b000 r-x  115  223 3   1 CN-- vn 
> > /root/test
> >   1178           0x49b000           0x528000 r--   97  223 3   1 CN-- vn 
> > /root/test
> >   1178           0x528000           0x539000 rw-   10    0 1   0 C--- vn 
> > /root/test
> >   1178           0x539000           0x55a000 rw-   16   16 1   0 C--- df
> >   1178        0x800528000        0x800a28000 rw-  118  118 1   0 C--- df
> >   1178        0x800a28000        0x800a68000 rw-    1    1 1   0 CN-- df
> >   1178        0x800a68000        0x800aa8000 rw-    2    2 1   0 CN-- df
> >   1178        0x800aa8000        0x800c08000 rw-   50   50 1   0 CN-- df
> >   1178        0x800c08000        0x800c48000 rw-    2    2 1   0 CN-- df
> >   1178        0x800c48000        0x800c88000 rw-    1    1 1   0 CN-- df
> >   1178        0x800c88000        0x800cc8000 rw-    1    1 1   0 CN-- df
> >   1178       0xc000000000       0xc000001000 rw-    1    1 1   0 CN-- df
> >   1178       0xc41ffe0000       0xc41ffe8000 rw-    8    8 1   0 CN-- df
> >   1178       0xc41ffe8000       0xc41fff0000 rw-    8    8 1   0 CN-- df
> >   1178       0xc41fff0000       0xc41fff8000 rw-    8    8 1   0 C--- df
> >   1178       0xc41fff8000       0xc420300000 rw-  553  553 1   0 C--- df
> >   1178       0xc420300000       0xc420400000 rw-  234  234 1   0 C--- df
> >   1178     0x7ffffffdf000     0x7ffffffff000 rwx    2    2 1   0 C--D df
> >   1178     0x7ffffffff000     0x800000000000 r-x    1    1 33   0 ---- ph
> > 
> > This is from FreeBSD 12.0-CURRENT #36 r309618M
> > 
> > ktrace on 11.0-RELEASE is still running 6 hours so far.

One thing that I noted.  In my later attempt to reproduce the issue, I
got the following output:

sandy% GOGC=2 ./1.go                                                     /mnt/1
Starting 16 forking goroutines...
GOMAXPROCS: 8
runtime: failed MSpanList_Remove 0x8006a9d60 0x53f2e0 0x53f2f0 0x53f2e0
fatal error: MSpanList_Remove

runtime stack:
runtime.throw(0x4cca4d, 0x10)
        /usr/local/go/src/runtime/panic.go:566 +0x95
runtime.(*mSpanList).remove(0x53f2e0, 0x8006a9d60)
        /usr/local/go/src/runtime/mheap.go:1001 +0x19d
runtime.(*mcentral).cacheSpan(0x53f2d0, 0x44abfb)
        /usr/local/go/src/runtime/mcentral.go:55 +0x3d0
runtime.(*mcache).refill(0x80052a0d0, 0xc400000016, 0xc420200d38)
        /usr/local/go/src/runtime/mcache.go:121 +0xae
runtime.(*mcache).nextFree.func1()
        /usr/local/go/src/runtime/malloc.go:505 +0x33
runtime.systemstack(0xc420018000)
        /usr/local/go/src/runtime/asm_amd64.s:298 +0x79
runtime.mstart()
        /usr/local/go/src/runtime/proc.go:1079

goroutine 8797810 [running]:
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_amd64.s:252 fp=0xc420200be0 sp=0xc420200bd8
runtime.(*mcache).nextFree(0x80052a0d0, 0xc420200c16, 0x40ec95, 0xc42011c050, 0x10)
        /usr/local/go/src/runtime/malloc.go:506 +0xb2 fp=0xc420200c38 sp=0xc420200be0
runtime.mallocgc(0x1a0, 0x4aaf60, 0xc420200d01, 0x44ab40)
        /usr/local/go/src/runtime/malloc.go:658 +0x809 fp=0xc420200cd8 sp=0xc420200c38
runtime.makeslice(0x4aaf60, 0x0, 0x19, 0xc4200b8688, 0x8, 0x8)
        /usr/local/go/src/runtime/slice.go:57 +0x7b fp=0xc420200d30 sp=0xc420200cd8
syscall.Environ(0x0, 0x0, 0x0)
        /usr/local/go/src/syscall/env_unix.go:142 +0xd0 fp=0xc420200dc0 sp=0xc420200d30
os.Environ(0x0, 0x0, 0x0)
        /usr/local/go/src/os/env.go:116 +0x22 fp=0xc420200de8 sp=0xc420200dc0
os/exec.(*Cmd).envv(0xc4201422c0, 0xc4200b8680, 0x0, 0x1)
        /usr/local/go/src/os/exec/exec.go:171 +0x38 fp=0xc420200e10 sp=0xc420200de8
os/exec.(*Cmd).Start(0xc4201422c0, 0x6, 0x0)

There are two traces for goroutines, and note that both first and second
are in malloc.  Is go malloc fine-grain locked ?  E.g. I know that JVM
uses per-thread arenas.

In fact, are there stress-tests for the go mutual exclusion primitives ?
The runtime seems to try to use thr and umtx syscalls directly, which
could be the source of bugs.