Diablo 1.5 SIGBUS - fixed

Kurt Miller lists at intricatesoftware.com
Mon Apr 10 17:15:14 UTC 2006


On Sunday 09 April 2006 9:45 pm, odela01 wrote:
> > From: Kurt Miller <kurt at intricatesoftware.com>
> > I was able to catch the SIGBUS in gdb once so far on a remote
> > multiprocessor system. There was some evidence that the use of
> > the JVM argument -XX:+UseMembar will help correct the problem.
> > I wasn't readily able to reproduce the problem so I'm not sure
> > yet if this is the proper solution. Can those of you who are
> > getting the SIGBUS try this and see if it improves things?
> 
> I think you nailed it! Previously it would always sigbus a few seconds after
> launching TestNG, but after adding that argument, it completed the whole
> test suite, which takes about 20 minutes.
> 
> Google didn't have much to say about UseMembar, can you tell me what effect
> it has?

The SIGBUS occurred in a thread-related optimization for
multiprocessor systems that Sun introduced after the initial
release of 1.5.0. From what I can gather from a brief
inspection of the code, they were removing unnecessary
memory barriers, but added the UseMembar option as a
way to enable them again. The SIGBUS was happening in the
new optimized code path that doesn't use a membar.

The best description I've found so far on the use of
membar is in this old bug report:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4629468

The following excerpt was the most relevant: "For MP systems,
a membar may still be required so that the load or store is seen
in the order desired across a machine's memory system.  It depends
on the memory system."
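
To illustrate the kind of ordering a membar provides, here is a
minimal Java sketch. It is not the JVM-internal code path where the
SIGBUS occurred; the class and method names are made up for this
example. Two threads each store to one variable and then load the
other. Because both variables are volatile, the JVM has to order each
store before the following load, so the two threads should never both
read 0.

```java
public class StoreLoadDemo {
    // volatile forces the JVM to order each store before the load that follows it
    static volatile int x, y;
    // results are read only after join(), so plain fields are sufficient
    static int r1, r2;

    // Runs one trial; returns true if both threads read 0 (the forbidden outcome).
    static boolean bothSawZero() {
        x = 0;
        y = 0;
        Thread a = new Thread(() -> { x = 1; r1 = y; });
        Thread b = new Thread(() -> { y = 1; r2 = x; });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r1 == 0 && r2 == 0;
    }

    // Repeats the trial and counts how often the forbidden outcome was seen.
    static int countViolations(int trials) {
        int violations = 0;
        for (int i = 0; i < trials; i++) {
            if (bothSawZero()) {
                violations++;
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println("violations: " + countViolations(1000));
    }
}
```

Without volatile on x and y, a multiprocessor machine would be allowed
to let both threads read 0, which is the sort of reordering the
excerpt is talking about.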

In other bug reports I see that on Windows, -XX:+UseMembar is the
default for the 1.5.0_0x releases because of problems with these
changes. So Diablo on multiprocessor systems will need to use
-XX:+UseMembar until the next release, where it can be made the
default.
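
If you want to confirm that the option actually reached the JVM (for
example when it is set in a wrapper script), a small check along these
lines may help. This is only a sketch: the class and helper names are
hypothetical, and it only looks for the literal argument on the
command line, so it will not detect a build where the option is
already the default.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class UseMembarCheck {
    // Returns true if the literal option -XX:+UseMembar is in the argument list.
    static boolean hasUseMembar(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+UseMembar");
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the main class
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (hasUseMembar(jvmArgs)) {
            System.out.println("-XX:+UseMembar is set");
        } else {
            System.out.println("-XX:+UseMembar is NOT set");
        }
    }
}
```

Run it the same way you launch your application, e.g.
java -XX:+UseMembar UseMembarCheck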

-Kurt

