misc/89103: gcc segmentation fault errors
kris at obsecurity.org
Mon Nov 21 21:47:02 GMT 2005
On Fri, Nov 18, 2005 at 06:00:32AM +0000, Walter Roberts wrote:
> The following reply was made to PR misc/89103; it has been noted by GNATS.
> From: "Walter Roberts" <wroberts at securenym.net>
> To: <bug-followup at FreeBSD.org>, <wroberts at securenym.net>
> Subject: Re: misc/89103: gcc segmentation fault errors
> Date: Fri, 18 Nov 2005 00:55:21 -0500
> Ruled out hardware issue:
> 1. Ran memtest 86 -- 7 full cycles (18 hours +/-).
> 2. Reduced memory from 512MB to 256MB, repeated with different memory
> 3. Ran full burncpu, passed.
> Power supplies operating at nominal voltages.
> System is apparently not using swap space for this process.
> Replaced AMD K6 200 with old K6 slow processor.
> Same failure. CPU temps are <33C in all cases. I don't know the exact
> numbers, but it's typically around 28C.
> This simply does not smell like a hardware problem.
[Snip historical anecdotes]
> I'm willing to believe you,
> but I'd like to know why you're so convinced this is a hardware issue.
Because I've been answering these questions for years, and I've seen
dozens of people start out saying "I'm convinced it's not a hardware
problem" and then working their way around to "it was a hardware
problem, sorry for wasting your time".
> The factors pointing against a hardware issue are: 1. The machine runs
> everything else without a problem. 2. The machine ran non-stop
> (non-reboot) on a UPS for over half a year without a glitch (take
> that, NT), and it seems to run f90 OK, and most cc's OK. 3. The system
> runs very compute/memory-intensive Monte Carlo high-energy physics code
> that stores lots and lots of numbers to be written to files at the end
> of the day, and works consistently. I would expect that if it weren't
> working properly, something would be amiss elsewhere, and would expect a
> panic at some point, or the system to just plain stop working. 4. From
> the archives it appears that more than one of us is having a similar
Not that I've seen. Where are these other reports?
> 5. This exact system ran for years without a glitch running
> FreeBSD 2.2 and FreeBSD 3.2.
This kind of problem can be *very* workload-specific, i.e. everything
will work fine except one task that tickles the machine in exactly the
right way to trigger the hardware failure.
Yes, I've seen exactly this scenario happen many times.
> Is it safe to upgrade to GCC 4? Would that solve the problem? I'd be
> happy to get it from gnu and try it, if it won't break anything. I
> don't have the time I used to have to go messing in operating system
> innards, much as I'd like to.
It won't fix a hardware problem, naturally. You can't use a
non-system compiler to compile FreeBSD, although you could compile
your own code with it.
> It is certainly possible that a pointer is misprogrammed (or perhaps the
> fixed point register in the AMD chip doesn't work right??) and picks up
> something funny that causes the compiler to have the "segmentation
> fault 11". That fault is consistent!
I'm sure it's consistent on this machine, but you're really reaching
by suggesting that it's a CPU bug affecting thousands of users :-)
P.S. Did you say in a previous email that the machine worked fine when
it was running at a site at high altitude, but stopped working when
you moved it and then upgraded it? That's a big clue that says
something broke at that point (or before, but was masked by lower
ambient temperatures, or something).