Suddenly slow lstat syscalls on CURRENT from July

Beat Gätzi beat at chruetertee.ch
Sat Jan 1 16:00:54 UTC 2011


On 01.01.2011 16:45, Kostik Belousov wrote:
> On Sat, Jan 01, 2011 at 04:37:44PM +0100, Beat Gätzi wrote:
>> On 01.01.2011 16:10, Alexander Best wrote:
>>> On Sat Jan  1 11, Beat Gätzi wrote:
>>>> Hi,
>>>>
>>>> For the past couple of days, commands like ls(1) have been very slow
>>>> on one of my tinderboxes. Checking with ktrace/kdump, I see that each
>>>> lstat syscall takes about one second:
>>>>
>>>>  70559 ls       0.004644 CALL  lstat(0x284472f8,0x28447298)
>>>>  70559 ls       0.004651 NAMI  "Mk"
>>>>  70559 ls       0.004664 STRU  struct stat {... }
>>>>  70559 ls       0.004672 RET   lstat 0
>>>>  70559 ls       0.004688 CALL  lstat(0x2844a1b8,0x2844a158)
>>>>  70559 ls       0.004696 NAMI  "README,v"
>>>>  70559 ls       1.004245 STRU  struct stat {...}
>>>>  70559 ls       1.004263 RET   lstat 0
>>>>  70559 ls       1.004277 CALL  lstat(0x2844a2b8,0x2844a258)
>>>>  70559 ls       1.004286 NAMI  ".cvsignore,v"
>>>>  70559 ls       2.004282 STRU  struct stat {...}
>>>>  70559 ls       2.004302 RET   lstat 0
>>>>  70559 ls       2.004316 CALL  lstat(0x2844a3b8,0x2844a358)
>>>>  70559 ls       2.004325 NAMI  "CHANGES,v"
>>>>  70559 ls       3.004275 STRU  struct stat {...}
>>>>  70559 ls       3.004296 RET   lstat 0
>>>>  70559 ls       3.004310 CALL  lstat(0x2844a4b8,0x2844a458)
>>>>  70559 ls       3.004318 NAMI  "COPYRIGHT,v"
>>>>  70559 ls       4.004300 STRU  struct stat {...}
>>>>  70559 ls       4.004318 RET   lstat 0
>>>>
>>>> When I immediately re-execute the command in the same directory, it
>>>> looks normal:
>>>>
>>>>  78007 ls       0.004665 CALL  lstat(0x284472f8,0x28447298)
>>>>  78007 ls       0.004673 NAMI  "Mk"
>>>>  78007 ls       0.004686 STRU  struct stat {...}
>>>>  78007 ls       0.004693 RET   lstat 0
>>>>  78007 ls       0.004708 CALL  lstat(0x2844a1b8,0x2844a158)
>>>>  78007 ls       0.004715 NAMI  "README,v"
>>>>  78007 ls       0.004728 STRU  struct stat {...}
>>>>  78007 ls       0.004735 RET   lstat 0
>>>>  78007 ls       0.004742 CALL  lstat(0x2844a2b8,0x2844a258)
>>>>  78007 ls       0.004749 NAMI  ".cvsignore,v"
>>>>  78007 ls       0.004761 STRU  struct stat {...}
>>>>  78007 ls       0.004768 RET   lstat 0
>>>>  78007 ls       0.004775 CALL  lstat(0x2844a3b8,0x2844a358)
>>>>  78007 ls       0.004782 NAMI  "CHANGES,v"
>>>>  78007 ls       0.004795 STRU  struct stat {...}
>>>>  78007 ls       0.004802 RET   lstat 0
>>>>  78007 ls       0.004809 CALL  lstat(0x2844a4b8,0x2844a458)
>>>>  78007 ls       0.004817 NAMI  "COPYRIGHT,v"
>>>>  78007 ls       0.004828 STRU  struct stat {...}
>>>>  78007 ls       0.004835 RET   lstat 0
>>>>
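>>>> (For reference, I captured the traces above roughly like this, if I
>>>> remember the kdump timestamp flag correctly:
>>>>
>>>>  # ktrace ls
>>>>  # kdump -E
>>>>
>>>> ktrace writes ktrace.out into the current directory; kdump decodes
>>>> it, and -E should print timestamps relative to the start of the
>>>> trace.)
>>>>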
>>>> The system is an HP DL360 G3 with a Compaq Smart Array 5i controller.
>>>> There is no evidence of a RAID controller or hard disk problem in the
>>>> system logs. Also, writing to and reading from a file look normal.
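>>>> (The read/write check was just a quick dd against a scratch file,
>>>> roughly like this; the path and sizes here are arbitrary:
>>>>
>>>>  # dd if=/dev/zero of=/tmp/ddtest bs=1m count=1024
>>>>  # dd if=/tmp/ddtest of=/dev/null bs=1m
>>>> )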
>>>> The running kernel is a GENERIC kernel from July without debugging
>>>> options:
>>>>
>>>> # uname -a
>>>> FreeBSD tinderbox.chruetertee.ch 9.0-CURRENT FreeBSD 9.0-CURRENT #5
>>>> r209980: Tue Jul 13 11:25:50 CEST 2010
>>>> root at tinderbox.chruetertee.ch:/usr/obj/usr/home/beat/dev/src/head/sys/BEASTIE
>>>>  i386
>>>
>>> no problems here.
>>
>> I didn't have this problem during the first ~166 days since I last
>> rebooted the box. Then the problem suddenly occurred in all directories
>> on all partitions. The box has been under load pretty much all the
>> time, as it builds packages almost 24x7. I forgot to mention that the
>> filesystem in use is UFS.
>>
>> # diskinfo -ct da0
>> da0
>>         512             # sectorsize
>>         299992412160    # mediasize in bytes (279G)
>>         585922680       # mediasize in sectors
>>         0               # stripesize
>>         0               # stripeoffset
>>         36472           # Cylinders according to firmware.
>>         255             # Heads according to firmware.
>>         63              # Sectors according to firmware.
>>                         # Disk ident.
>>
>> I/O command overhead:
>>         time to read 10MB block     0.333161 sec  =    0.016 msec/sector
>>         time to read 20480 sectors  3.392359 sec  =    0.166 msec/sector
>>         calculated command overhead               =    0.149 msec/sector
>>
>> Seek times:
>>         Full stroke:      250 iter in   1.202862 sec =    4.811 msec
>>         Half stroke:      250 iter in   1.120656 sec =    4.483 msec
>>         Quarter stroke:   500 iter in   2.109077 sec =    4.218 msec
>>         Short forward:    400 iter in   1.892342 sec =    4.731 msec
>>         Short backward:   400 iter in   1.399378 sec =    3.498 msec
>>         Seq outer:       2048 iter in   0.399352 sec =    0.195 msec
>>         Seq inner:       2048 iter in   0.385460 sec =    0.188 msec
>> Transfer rates:
>>     outside:       102400 kbytes in   2.325967 sec =    44025 kbytes/sec
>>     middle:        102400 kbytes in   2.157681 sec =    47458 kbytes/sec
>>     inside:        102400 kbytes in   2.717089 sec =    37687 kbytes/sec
>>
>> I also noticed that the CPU columns in the vmstat output are a little
>> confusing on this system (note the negative values):
>>
>> # vmstat
>>  procs  memory     page                   disk   faults         cpu
>>  r b w  avm  fre   flt  re  pi  po    fr  sr da0   in   sy   cs us sy id
>>  0 0 0  1663M 1120M 117 0   1   0   120  21   0  172  27 245 -29 -10 139
>>
> Check the output of sysctl kern.maxvnodes and vfs.numvnodes. I suspect
> they are quite close or equal. If so, consider increasing maxvnodes.
> Another workaround, if you have a huge nested directory hierarchy, is
> to set vfs.vlru_allow_cache_src to 1.

Thanks for the hint. vfs.numvnodes had indeed reached kern.maxvnodes:
# sysctl kern.maxvnodes vfs.numvnodes
kern.maxvnodes: 100000
vfs.numvnodes: 100765

I've increased kern.maxvnodes and the problem went away until
vfs.numvnodes reached the value of kern.maxvnodes again:
# sysctl kern.maxvnodes vfs.numvnodes
kern.maxvnodes: 150000
vfs.numvnodes: 150109
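
A simple loop should show when vfs.numvnodes hits the limit again (the
60-second interval is arbitrary):

# while :; do sysctl vfs.numvnodes kern.maxvnodes; sleep 60; done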

As the directory structure on this server is quite large, I've now set
vfs.vlru_allow_cache_src to 1.
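
For the record, setting it at runtime and making it stick across reboots
should look something like this:

# sysctl vfs.vlru_allow_cache_src=1
# echo 'vfs.vlru_allow_cache_src=1' >> /etc/sysctl.conf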

> You did not specify how much memory your machine has, but I assume it
> is > 1GB. Anyway, increasing maxvnodes on i386 should be done very
> cautiously, since it is easy to exhaust KVA.

The server has 4GB of RAM. Is it possible to check how much I could
increase kern.maxvnodes without exhausting KVA?
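
I guess I could raise the limit in steps and watch the vnode-related UMA
zones and the free kernel VA while doing so, something like this
(assuming the i386-only vm.kvm_* sysctls are available on this kernel):

# vmstat -z | egrep 'VNODE|NAMEI'
# sysctl vm.kvm_size vm.kvm_free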

Thanks a lot!
Beat

