Slow disk access while rsync - what should I tune?

Daan Vreeken Daan at vehosting.nl
Tue Oct 26 21:43:18 UTC 2010


Hi Cronfy,

On Sunday 24 October 2010 15:15:53 cronfy wrote:
> Hello,
>
> I have a web-server (nginx + apache + mysql, FreeBSD 7.3) with many
> sites. Every night it creates a backup of /home on a separate disk.
> /home is a RAID1 mirror on Adaptec 3405 (128M write cache) with SAS
> drives; /backup is a single SATA drive on the same controller.
>
> Rsync creates backups using hardlinks; it stores 7 daily and 4 weekly
> copies. Total amount of data is ~300G in 11M files. The server is
> under heavy web load all the time (approx. 100 queries/sec).
>
> Every time the backup starts, the server slows down significantly and
> disk operations become very slow. It may take up to 10 seconds to stat() a
> file that is not in filesystem cache. At the same time, rsync on
> remote server does not affect disk load much, server works without
> slowdown.
>
> I think the problem could have one of two causes:
>  * either the bulk of reads on the SATA /backup drive fills the OS
> filesystem cache, so that many file accesses require a real disk
> read,
>  * or the bulk of writes on /backup fills the controller write cache
> and the geom disk operations queue grows, causing all disk operations
> to wait.
>
> This is only my assumption of course, I may be wrong.

Try "gstat -a" to see which one it is. I guess you'll see bulk reads on /home 
and bulk reads on /backup mostly.
When rsync starts, it will index the source and the destination directory 
structures using readdir() and stat() calls to see what files have changed 
(and need to be copied later on).
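
To get a feel for how much metadata traffic that indexing phase generates, 
here is a minimal sketch in C of such a walk. It is not rsync's code; 
count_lstat_calls() is a hypothetical helper that only mirrors the 
readdir() + lstat() pattern described above and counts the lstat() calls 
it makes. Each call for a file that is not in the filesystem cache can 
turn into a real disk read.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper, not part of rsync: walk `path` recursively and
 * return the number of lstat() calls made, or -1 if `path` itself
 * cannot be opened as a directory.
 */
long count_lstat_calls(const char *path)
{
	DIR *dir;
	struct dirent *de;
	long calls = 0;

	assert(path != NULL);
	dir = opendir(path);
	if (dir == NULL)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		char child[4096];
		struct stat st;
		int n;

		if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
			continue;

		n = snprintf(child, sizeof(child), "%s/%s", path, de->d_name);
		if (n < 0 || (size_t)n >= sizeof(child))
			continue;	/* path too long for this sketch; skip it */

		calls++;
		if (lstat(child, &st) != 0)
			continue;	/* entry vanished or is unreadable */

		/* lstat() does not follow symlinks, so the walk cannot loop. */
		if (S_ISDIR(st.st_mode)) {
			long sub = count_lstat_calls(child);
			if (sub > 0)
				calls += sub;
		}
	}
	closedir(dir);
	return calls;
}
```

Running this over /home and over one of the /backup copies should give a 
rough idea of how many stat-type calls a single rsync run issues before it 
copies anything.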

rsync offers the "--bwlimit" option to lower the network bandwidth between an 
rsync server and a client, but this won't change the stress the stat() calls 
generate when rsync indexes the directories.

> How can I find the real reason for these slowdowns, to either conclude
> that it is not possible to solve this because of hardware/software
> limits, or tune my software/hardware system to make this all work at
> an acceptable speed?

You could try the patch below to rsync's "syscall.c" file, which will pause 
rsync for part of every second to reduce the IO pressure it creates. With 
the value "500", rsync only issues stat-type calls during the first 500 ms 
of each second.
Changing "500" to a lower value should scale the 'busy' percentage that 
"gstat -a" shows down almost linearly.


--- syscall.c.org       2010-10-26 22:47:20.000000000 +0200
+++ syscall.c   2010-10-26 22:47:33.000000000 +0200
@@ -215,8 +215,19 @@
 #endif
 }
 
+void tiny_pause(void)
+{
+       struct timeval          tv;
+
+       // only work in the first half of every second.
+       gettimeofday(&tv, NULL);
+       if (tv.tv_usec > 500 * 1000)
+               usleep(1000 * 1000 - tv.tv_usec);
+}
+
 int do_stat(const char *fname, STRUCT_STAT *st)
 {
+       tiny_pause();
 #ifdef USE_STAT64_FUNCS
        return stat64(fname, st);
 #else
@@ -226,6 +237,7 @@
 
 int do_lstat(const char *fname, STRUCT_STAT *st)
 {
+       tiny_pause();
 #ifdef SUPPORT_LINKS
 # ifdef USE_STAT64_FUNCS
        return lstat64(fname, st);
@@ -239,6 +251,7 @@
 
 int do_fstat(int fd, STRUCT_STAT *st)
 {
+       tiny_pause();
 #ifdef USE_STAT64_FUNCS
        return fstat64(fd, st);
 #else
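
If you want to experiment with different values without editing the 
constant each time, the pause calculation can be split out as in the 
sketch below. This is an untested variation on the patch above, not 
something that exists in rsync; pause_usec() and tiny_pause_for() are 
names I made up for illustration, and work_usec is the length of the 
work window in microseconds (500000 corresponds to the "500" above).

```c
#define _DEFAULT_SOURCE	/* glibc: declare usleep() under strict -std= modes */
#include <assert.h>
#include <sys/time.h>
#include <unistd.h>

#define USEC_PER_SEC	1000000L

/*
 * Hypothetical helper, not part of rsync: given the microsecond offset
 * within the current second and the length of the work window, return
 * how long to sleep. Zero means "inside the window, keep working".
 */
long pause_usec(long now_usec, long work_usec)
{
	assert(work_usec >= 0 && work_usec <= USEC_PER_SEC);
	assert(now_usec >= 0 && now_usec < USEC_PER_SEC);

	if (now_usec > work_usec)
		return USEC_PER_SEC - now_usec;	/* sleep until the next second */
	return 0;
}

/* Behaves like tiny_pause() in the patch when work_usec is 500000. */
void tiny_pause_for(long work_usec)
{
	struct timeval tv;
	long wait;

	gettimeofday(&tv, NULL);
	wait = pause_usec((long)tv.tv_usec, work_usec);
	if (wait > 0)
		usleep((useconds_t)wait);
}
```

Calling tiny_pause_for(250000) from do_stat(), do_lstat() and do_fstat() 
would let rsync start stat-type calls only during the first quarter of 
each second. A call that is already in flight is not interrupted, so the 
resulting duty cycle is approximate.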


Regards,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380


More information about the freebsd-hackers mailing list