stat() takes 54 msec in a directory with 94k files (even with a
big dirhash)
Paul B. Mahol
onemda at gmail.com
Tue May 12 10:26:50 UTC 2009
On 5/12/09, Attila Nagy <bra at fsn.hu> wrote:
> Hello,
>
> I have a strange error on FreeBSD 7-STABLE (compiled on 7th May, just
> few commits after the release, but an earlier kernel did the same).
>
> I'm doing several parallel rsyncs from a machine to another (let's call
> them source and destination). The source contains maildirs, so there are
> some directories with a (relatively) lot of files.
> The source runs an earlier (around 6.2) FreeBSD and plain softupdates
> mounted UFS2 file systems.
> The destination has a bigger (UFS2) filesystem, on top of gjournal,
> mounted as async.
>
> I've noticed that rsync sometimes stops moving data and the destination
> machine gets sluggish. After some testing, I could catch the effect in
> action (was not that hard, because it persists even for hours sometimes).
>
> top shows around 20% system activity (there are two quad core CPUs) and
> 0% user. The WCPU field at rsync shows 100%.
>
> ktrace-ing the rsync process I can see this:
> 31639 rsync 0.000004 CALL lstat(0x7fffffffab70,0x7fffffffaf70)
> 31639 rsync 0.000004 NAMI
> "hm33/00/16/uid/Maildir/new/1212536121.54673,S=3128"
> 31639 rsync 0.054226 STRU struct stat {dev=100, ino=136943662,
> mode=-rw------- , nlink=1, uid=999, gid=999, rdev=546942760,
> atime=1241807071, stime=1212536121, ctime=1241807071,
> birthtime=1212536121, size=3128, blksize=4096, blocks=8, flags=0x0 }
> 31639 rsync 0.000013 RET lstat 0
> 31639 rsync 0.000018 CALL lstat(0x7fffffffab70,0x7fffffffaf70)
> 31639 rsync 0.000004 NAMI
> "hm33/00/16/uid/Maildir/new/1212537276.69702,S=4634"
> 31639 rsync 0.054409 STRU struct stat {dev=100, ino=136943663,
> mode=-rw------- , nlink=1, uid=999, gid=999, rdev=546942762,
> atime=1241807071, stime=1212537276, ctime=1241807071,
> birthtime=1212537276, size=4634, blksize=4096, blocks=12, flags=0x0 }
> 31639 rsync 0.000013 RET lstat 0
> 31639 rsync 0.000020 CALL lstat(0x7fffffffab70,0x7fffffffaf70)
> 31639 rsync 0.000005 NAMI
> "hm33/00/16/uid/Maildir/new/1212537689.74390,S=3172"
> 31639 rsync 0.054230 STRU struct stat {dev=100, ino=136943664,
> mode=-rw------- , nlink=1, uid=999, gid=999, rdev=546942765,
> atime=1241807071, stime=1212537689, ctime=1241807071,
> birthtime=1212537689, size=3172, blksize=4096, blocks=8, flags=0x0 }
> 31639 rsync 0.000013 RET lstat 0
>
> So according to ktrace, the stat call takes 54 milliseconds to return
> for each of the files.
> I have tried with the default and a pretty much raised dirhash maxmem
> value, but I can still get these.
> Currently I have:
> vfs.ufs.dirhash_docheck: 0
> vfs.ufs.dirhash_mem: 18589428
> vfs.ufs.dirhash_maxmem: 209715200
> vfs.ufs.dirhash_minsize: 2560
> So dirhash has space to expand.
>
> The directory in question contains 94493 files.
>
> The source machine doesn't show this behaviour.
>
> top's output on the destination machine:
> CPU: 0.0% user, 0.0% nice, 22.7% system, 0.0% interrupt, 77.3% idle
> Mem: 159M Active, 3032M Inact, 599M Wired, 47M Cache, 399M Buf, 102M Free
> Swap: 4096M Total, 4096M Free
>
> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
> 31639 root 1 118 0 50648K 10512K CPU0 0 2:01 100.00% rsync
> 634 root 1 -4 0 2536K 628K vlruwk 1 0:20 0.00% supervise
> 26760 root 1 44 0 25940K 3316K select 1 0:10 0.00% sshd
> 31640 root 1 75 0 87512K 8324K suspfs 4 0:10 0.00% rsync
> 31641 root 1 75 0 18904K 7124K suspfs 6 0:10 0.00% rsync
> 31637 root 1 75 0 40408K 7744K suspfs 4 0:09 0.00% rsync
> 31636 root 1 44 0 20952K 6288K select 2 0:09 0.00% rsync
> 31638 root 1 44 0 104M 8912K select 3 0:09 0.00% rsync
> 31635 root 1 75 0 80344K 7812K suspfs 4 0:09 0.00% rsync
> 31642 root 1 44 0 17940K 7624K select 1 0:04 0.00% ssh
> 31646 root 1 45 0 17940K 7656K select 1 0:03 0.00% ssh
>
> All of the rsyncs use the same file system, but with different top level
> directories. During this, neither of the other rsyncs can run.
>
> Any ideas about what could be done to work around this?
Big guess, maybe it updates atime? Try with noatime mount option.
--
Paul
More information about the freebsd-stable
mailing list