stat() takes 54 msec in a directory with 94k files (even with a big dirhash)

Paul B. Mahol onemda at gmail.com
Tue May 12 10:26:50 UTC 2009


On 5/12/09, Attila Nagy <bra at fsn.hu> wrote:
> Hello,
>
> I have a strange error on FreeBSD 7-STABLE (compiled on 7th May, just
> few commits after the release, but an earlier kernel did the same).
>
> I'm doing several parallel rsyncs from a machine to another (let's call
> them source and destination). The source contains maildirs, so there are
> some directories with a (relatively) lot of files.
> The source runs an earlier (around 6.2) FreeBSD and plain softupdates
> mounted UFS2 file systems.
> The destination has a bigger (UFS2) filesystem, on top of gjournal,
> mounted as async.
>
> I've noticed that rsync sometimes stops moving data and the destination
> machine gets sluggish. After some testing, I could catch the effect in
> action (was not that hard, because it persists even for hours sometimes).
>
> top shows around 20% system activity (there are two quad core CPUs) and
> 0% user. The WCPU field at rsync shows 100%.
>
> ktrace-ing the rsync process I can see this:
>    31639 rsync    0.000004 CALL  lstat(0x7fffffffab70,0x7fffffffaf70)
>    31639 rsync    0.000004 NAMI
> "hm33/00/16/uid/Maildir/new/1212536121.54673,S=3128"
>    31639 rsync    0.054226 STRU  struct stat {dev=100, ino=136943662,
> mode=-rw------- , nlink=1, uid=999, gid=999, rdev=546942760,
> atime=1241807071, stime=1212536121, ctime=1241807071,
> birthtime=1212536121, size=3128, blksize=4096, blocks=8, flags=0x0 }
>    31639 rsync    0.000013 RET   lstat 0
>    31639 rsync    0.000018 CALL  lstat(0x7fffffffab70,0x7fffffffaf70)
>    31639 rsync    0.000004 NAMI
> "hm33/00/16/uid/Maildir/new/1212537276.69702,S=4634"
>    31639 rsync    0.054409 STRU  struct stat {dev=100, ino=136943663,
> mode=-rw------- , nlink=1, uid=999, gid=999, rdev=546942762,
> atime=1241807071, stime=1212537276, ctime=1241807071,
> birthtime=1212537276, size=4634, blksize=4096, blocks=12, flags=0x0 }
>    31639 rsync    0.000013 RET   lstat 0
>    31639 rsync    0.000020 CALL  lstat(0x7fffffffab70,0x7fffffffaf70)
>    31639 rsync    0.000005 NAMI
> "hm33/00/16/uid/Maildir/new/1212537689.74390,S=3172"
>    31639 rsync    0.054230 STRU  struct stat {dev=100, ino=136943664,
> mode=-rw------- , nlink=1, uid=999, gid=999, rdev=546942765,
> atime=1241807071, stime=1212537689, ctime=1241807071,
> birthtime=1212537689, size=3172, blksize=4096, blocks=8, flags=0x0 }
>    31639 rsync    0.000013 RET   lstat 0
>
> So according to ktrace, the stat call takes 54 milliseconds to return
> for each of the files.
> I have tried with the default and a pretty much raised dirhash maxmem
> value, but I can still get these.
> Currently I have:
> vfs.ufs.dirhash_docheck: 0
> vfs.ufs.dirhash_mem: 18589428
> vfs.ufs.dirhash_maxmem: 209715200
> vfs.ufs.dirhash_minsize: 2560
> So dirhash has space to expand.
>
> The directory in question contains 94493 files.
>
> The source machine doesn't show this behaviour.
>
> top's output on the destination machine:
> CPU:  0.0% user,  0.0% nice, 22.7% system,  0.0% interrupt, 77.3% idle
> Mem: 159M Active, 3032M Inact, 599M Wired, 47M Cache, 399M Buf, 102M Free
> Swap: 4096M Total, 4096M Free
>
>   PID USERNAME   THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> 31639 root         1 118    0 50648K 10512K CPU0   0   2:01 100.00% rsync
>   634 root         1  -4    0  2536K   628K vlruwk 1   0:20  0.00% supervise
> 26760 root         1  44    0 25940K  3316K select 1   0:10  0.00% sshd
> 31640 root         1  75    0 87512K  8324K suspfs 4   0:10  0.00% rsync
> 31641 root         1  75    0 18904K  7124K suspfs 6   0:10  0.00% rsync
> 31637 root         1  75    0 40408K  7744K suspfs 4   0:09  0.00% rsync
> 31636 root         1  44    0 20952K  6288K select 2   0:09  0.00% rsync
> 31638 root         1  44    0   104M  8912K select 3   0:09  0.00% rsync
> 31635 root         1  75    0 80344K  7812K suspfs 4   0:09  0.00% rsync
> 31642 root         1  44    0 17940K  7624K select 1   0:04  0.00% ssh
> 31646 root         1  45    0 17940K  7656K select 1   0:03  0.00% ssh
>
> All of the rsyncs use the same file system, but with different top level
> directories. During this, neither of the other rsyncs can run.
>
> Any ideas about what could be done to work around this?

Big guess, maybe it updates atime? Try with noatime mount option.
-- 
Paul


More information about the freebsd-stable mailing list