Random Lockup with FreeBSD 10.2 on SuperMicro Boards

Julien Cigar jcigar at ulb.ac.be
Tue Nov 17 17:58:48 UTC 2015

On Tue, Nov 17, 2015 at 06:01:55PM +0100, Gerhard Schmidt wrote:
> On 17.11.2015 17:23, Adam Vande More wrote:
> > On Mon, Nov 16, 2015 at 10:45 AM, <kpneal at pobox.com> wrote:
> > 
> >     When in doubt use 'fsck -f' to force a check despite the filesystem
> >     being marked clean.
> > 
> >  
> > Yes, but a full fsck should be run on a regular basis regardless of
> > suspicion.
> > 
> >     Personally, I got bit by SU (plain) a long time ago and I've never
> >     really trusted it since. I strongly advise you to 'fsck -f' on your
> >     /var just to rule out _any_ corruption there.
> > 
> > 
> > A lower-level fs error isn't going to be detected by a background
> > fsck (which only does preening) or an SUJ fsck (which trusts the
> > journal). Such errors can occur on *any* journaled fs. Periodically
> > doing a full fsck on filesystems is actually something Linux does better.
> > 
> > https://lists.freebsd.org/pipermail/freebsd-current/2013-July/042951.html
> > 
> > Many think SU or SUJ obviates the need for a periodic full fsck. It does
> > not. The SU and SUJ devs have repeated this since their respective
> > inceptions. [1] Hardware still lies, bitrot still occurs, so do a full
> > fsck. Vague reports of "I don't trust this" aren't helpful. If you
> > know of a bug, please report it so it can be addressed.
> > 
> > [1]
> > https://lists.freebsd.org/pipermail/freebsd-arch/2010-January/009872.html --
> > Well, initially it claims to "eliminate fsck after an unclean shutdown",
> > but it later explains that an fsck using the journal isn't a full fsck.
>
> Let's get back to the topic. There is no corruption. And even if there
> were, that would be a software bug that has to be fixed. This is not
> biology, where something happens spontaneously. This is computer science.
> If something is wrong, there are only three explanations. The user did
> something wrong: not likely here. There is a hardware error: on three
> different servers after roughly the same amount of time, not very likely
> either. So it's cause number three: a bug in the software.
>
> As I said, I have 76 servers running FreeBSD (various versions from 8.4
> to 10.2). Only 3 of them run 10.2 (5 since yesterday), and of the three
> that have run 10.2 for longer than a month, 100% have had this problem at
> least once. Of the 73 other servers, 0% have had this problem; 45 of them
> are the exact same hardware, and all of them have been running
> considerably longer than one month.
>
> And as for the fscks: the last time I had to run fsck on any partition,
> apart from hardware failures, was about two and a half years ago, when
> our UPS died and killed the power. And even then, apart from some log
> files, there was no corruption. I have filesystems on production servers
> that have gone 8 years without an fsck. I have never had problems with
> UFS SU or UFS SU-J.
>
> Sorry guys, there is no problem with UFS on FreeBSD.

Couldn't you disable SU+J on just one of them? It would be worth trying
at least. I never had any problem with SU, but I'm sorry to say that
SU+J almost never worked for me (see PR 203588 for the latest problem that I

I'm repeating myself, but I had random lockups on some HP ProLiant servers
here too (without any corruption) with SU+J. Since I disabled journaling,
the lockups have "automagically" disappeared.

> I agree that after an unclean shutdown you might want to do a complete
> fsck. But in the case discussed here, the unclean shutdown was a result
> of the lockup, not the other way round.
> Regards
>   Estartu
> -- 
> -------------------------------------------------
> Gerhard Schmidt       | E-Mail and JabberID:
> TU-München            | schmidt at ze.tum.de
> WWW & Online Services | PGP-Publickey on Request

Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11  6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.