FreeBSD 4.x - SATA problems ... ?
Marc G. Fournier
scrappy at hub.org
Tue Jul 5 23:24:38 GMT 2005
Recently, I added a new server to our network, using the 3Ware RAID
controller (the 9500S-4LP card) and 3x140G SATA drives ... overall, the
system works, but I'm getting a very odd behaviour that I've never seen
before ...
I have a process that runs an rsync from another server to 'duplicate' the
VPSs ... a 'live backup' sort of thing ... this is running on all our
servers, without incident, *except*, it appears, the SATA server ...
I had disabled it for a time, and just re-enabled it this morning, and
somehow or another, it seems to be causing file system corruption ...
As most 'old timers' here know, we use UNIONFS on all our servers ... when
the corruption occurs, it looks like the "directory structures" are being
changed ... this one is hard to explain :( For example,
/usr/local/cyrus/bin has a bunch of binaries in it ... the binaries are
kept on the "lower layer", so the upper layer only has a
/usr/local/cyrus/bin directory created/ghosted, but no copies of the
binaries ... so, when you are in the VPS, and do an ls of that directory,
you see:
# ls /usr/local/cyrus/bin
arbitron      cyr_expire  lmtpd       notifyd      smmapd
chk_cyrus     cyrdump     masssievec  pop3d        squatter
ctl_cyrusdb   deliver     master      pop3proxyd   timsieved
ctl_deliver   fud         mbexamine   quota        tls_prune
ctl_mboxlist  imapd       mbpath      reconstruct
cvt_cyrusdb   ipurge      mkimap      sievec
When the 'corruption' happens, those all disappear, almost as if someone
did a 'rm -rf' of the directory within the VPS, and then a 'mkdir' ...
except that, from what I've been able to tell, it happens randomly, on
any of the VPSs, *and* only around the time that the rsync process is
running ...
As if, somehow, the rsync is taxing the system and causing bad writes ...
but I can't find anything anywhere to indicate a problem ...
To "fix" things, I umount the UNIONFS layer, and then do a 'find | cpio'
to copy the "top layer" back over to fix the directory structure itself
...
The thing is, I don't even know *where* to begin debugging this issue,
since there aren't any errors being reported anywhere ... but maybe
someone out there has an idea?
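One possible place to start, offered only as a rough sketch and not as anything already running on these servers: record the directory tree as seen from inside a VPS just before the rsync run, record it again afterwards, and list whatever vanished in between. The helper names snapshot_listing and missing_paths, and the paths in the usage comments, are hypothetical and would need adapting.

```shell
#!/bin/sh
# Hypothetical helpers for spotting entries that disappear during an rsync run.

# Write a sorted list of every path under directory $1 to file $2.
snapshot_listing() {
    (cd "$1" && find . -print | LC_ALL=C sort) > "$2"
}

# Print the paths that appear in listing $1 but not in listing $2.
# comm -23 keeps only the lines unique to the first (sorted) file.
missing_paths() {
    LC_ALL=C comm -23 "$1" "$2"
}

# Example usage (paths are assumptions, not taken from the real layout):
#   snapshot_listing /path/to/vps/usr/local /var/tmp/before.lst
#   ... rsync run happens here ...
#   snapshot_listing /path/to/vps/usr/local /var/tmp/after.lst
#   missing_paths /var/tmp/before.lst /var/tmp/after.lst
```

Timestamping each pair of listings would at least narrow down which rsync run, and which VPS, the disappearance lines up with.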
thanks ...
----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy at hub.org Yahoo!: yscrappy ICQ: 7615664