5.2 SMP data corruption problems...
julian at elischer.org
Wed Jan 21 10:41:59 PST 2004
On Tue, 20 Jan 2004, Jaye Mathisen wrote:
> 5.2-current as of 1/15. mobo is Tyan HESL-T, bios rev is 1.04, dual
> P3 1G's. 2 3ware controllers, latest bios, 16 drives.
> Was seeing data corruption on large copies to the 3ware drives, via
> FTP/samba or even just tar from disk to disk. Small files never
> seemed to get corrupted (md5 checksum'd everything regularly), but
> files over 4G seemed to always get corrupted somewhere, although not
> at the same spots.
We see this with old AMD-based systems and 3ware cards. I think the
3ware cards have very suspect PCI bus interfaces (this is under 4.x).
We found that it was the writes to disk that were bad: as long as the
data was still in cache it was OK, but once it was flushed, the data
actually on disk was bad.
> Eventually the box panicked with a lock order reversal, and would not
> let me fsck the large partition (900GB); it would keep panicking in
> pass 2 with another lock-order reversal.
> I supped to current as of 1/19, tried again, same thing, file
> corruption, lots of panics.
> Finally, in the midst of just messing with stuff, I built a new
> kernel without the smp/apic stuff, and it's working fine.
> Disk-to-disk copies are fine, no corruption, nothing during uploads,
> no panics. And I can fsck the partition that I couldn't before, and
> it works fine.
how very interesting...
> I do not have the kernel dump info, the debugging was being done
> remotely over the phone, no way I was going to transcribe it that
> Anyway, just a heads up for those with potentially serverworks
> chipsets and 5.2, there's possibly something wrong. The corruption
> is silent, if I hadn't checked, there'd be no way to know.
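
Since the corruption is silent and only shows up when checksums are compared, here is a minimal sketch of that kind of check. It is not the script used in the original report; the function names `file_md5` and `copies_match` and the 1 MiB chunk size are assumptions made for illustration. It reads each file in chunks so that files over 4G do not need to fit in memory. Note that a re-read shortly after a copy may be served from the buffer cache, so a mismatch caused by bad writes to disk may only appear once the cached data has been flushed (for example after a remount or reboot).

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"" (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(src, dst):
    """Return True if the source and destination files have the same MD5 digest."""
    return file_md5(src) == file_md5(dst)
```

Calling `copies_match("/path/to/source", "/path/to/copy")` after each large copy, and again later once the cache has been flushed, would flag files whose on-disk contents differ from the original.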