[Bug 218622] libc/gen/telldir [hack-n-PATCH] performance limited to O(n) vs file count, O(n^2) against samba ls workload

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Thu Apr 13 02:59:27 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218622

            Bug ID: 218622
           Summary: libc/gen/telldir [hack-n-PATCH]  performance limited
                    to O(n) vs file count, O(n^2) against samba ls
                    workload
           Product: Base System
           Version: 11.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: standards
          Assignee: freebsd-standards at FreeBSD.org
          Reporter: ash at ixsystems.com

We have been tracking a performance regression in which an smbd built on
FreeBSD 11-STABLE pegs a CPU and takes much longer to list huge directories
than its FreeBSD 9.x-based counterpart. Profiling showed that the listing time
grows faster than linearly with the number of files in the directory.

Samba calls telldir for every entry it reads with readdir. In 9.x the whole
listing seemed to run in linear time and things were fine; however, the
following change appears to expand the cost of the list scan to O(n^2) in the
file count:
https://svnweb.freebsd.org/base/stable/11/lib/libc/gen/telldir.c?r1=235647&r2=269204
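
For anyone who wants to reproduce this without Samba, the access pattern described above can be sketched in a few lines of C. This is only an approximation of what smbd does, not its actual code; the function name list_with_telldir is made up for illustration.

```c
#define _XOPEN_SOURCE 700	/* telldir() is an XSI interface */
#include <dirent.h>
#include <stddef.h>

/*
 * Walk a directory with one telldir() call per entry returned by
 * readdir(), which is the pattern the report attributes to Samba.
 * Returns the number of entries seen, or -1 if opendir() fails.
 */
long
list_with_telldir(const char *path)
{
	DIR *dirp = opendir(path);
	struct dirent *de;
	long entries = 0;

	if (dirp == NULL)
		return (-1);
	while ((de = readdir(dirp)) != NULL) {
		/* The returned position is discarded; only the call cost matters. */
		(void)telldir(dirp);
		entries++;
	}
	closedir(dirp);
	return (entries);
}
```

Timing this function against a directory with a large number of files on 9.x and on 11-STABLE should show whether the slowdown is in telldir itself, independent of smbd.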

For a directory with 64k files, the per-call telldir latency is as follows when
driven by Samba listing files. Identical hardware and ZFS datasets are used:

Using this dtrace script:
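/* $1 = target pid, $2 = name of the function to time (here: telldir) */
BEGIN { printf("thinking, hit control-c when you are tired of it"); }

/* Record the entry time in a thread-local variable. */
pid$1::$2:entry
{
        self->st = timestamp;
}

/* Skip returns whose entry was never seen, then bucket the elapsed ns. */
pid$1::$2:return
/self->st/
{
        @[execname, "delta(ns)"] = quantize(timestamp - self->st);
        self->st = 0;
}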

9.3: 
#./dt_time_in_pid_func.dt `top -b | head -10 | tail -1 | chomp | cut -w -f1` telldir
dtrace: script './dt_time_in_pid_func.dt' matched 3 probes
CPU     ID                    FUNCTION:NAME
  7      1                           :BEGIN thinking, hit control-c when you are tired of it
^C

  smbd                                                delta(ns)                 
           value  ------------- Distribution ------------- count    
            2048 |                                         0        
            4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  53883    
            8192 |@                                        730      
           16384 |                                         502     

and under 11-stable(ish):
 #dtrace -s dtprofile.is_in_path.dt `top -b | head -10 | tail -1 | chomp | cut -w -f1` telldir
dtrace: script 'dtprofile.is_in_path.dt' matched 2 probes
^C

  deltans                                           
           value  ------------- Distribution ------------- count    
            2048 |                                         0        
            4096 |                                         90       
            8192 |                                         8        
           16384 |                                         0        
           32768 |                                         0        
           65536 |@                                        1583     
          131072 |@@@@@@@@@@                               12270    
          262144 |@@@@@@@@@@@@@@@@@@@@@@                   28159    <<- libc telldir takes how long now?!
          524288 |@@@@@@@                                  9299     
         1048576 |      
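
To put the two histograms side by side, a rough floor on the total time spent in telldir can be computed by multiplying each row's count by the lower edge of its bucket (quantize rows cover the range from the printed value up to the next power of two). The sketch below does that arithmetic; the struct and function names are illustrative, and the cut-off 1048576 row is left out because its count is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* One quantize() row: the bucket's lower edge in ns and its sample count. */
struct bucket {
	uint64_t floor_ns;
	uint64_t count;
};

/* Rows copied from the 9.3 histogram. */
static const struct bucket hist_93[] = {
	{ 4096, 53883 }, { 8192, 730 }, { 16384, 502 },
};

/* Rows copied from the 11-stable histogram (truncated last row omitted). */
static const struct bucket hist_11[] = {
	{ 4096, 90 }, { 8192, 8 }, { 65536, 1583 },
	{ 131072, 12270 }, { 262144, 28159 }, { 524288, 9299 },
};

/* Sum count * lower edge: a floor on the time spent in the probed function. */
uint64_t
floor_total_ns(const struct bucket *rows, size_t nrows)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < nrows; i++)
		total += rows[i].floor_ns * rows[i].count;
	return (total);
}
```

With these rows the floor is 234909696 ns (about 0.23 s) over 55115 calls on 9.3, and 13969498112 ns (about 14 s) over 51409 calls on 11-stable. The two runs captured different numbers of calls, so this is an order-of-magnitude comparison only.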



After shamelessly reverting the telldir change:
https://github.com/freenas/os/commit/92873f3190c830302143d759411b23bd719b0ba2

the per-call cost of telldir returned to constant time.
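
A toy model of why the per-call cost would go from constant to linear: the sketch below assumes that the newer telldir searches its list of previously saved positions before adding a new one, while the older code simply prepends. That is an assumption based on the linked diff, not a copy of the libc code, and struct saved_pos and record_positions are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a saved directory position; not the libc struct. */
struct saved_pos {
	struct saved_pos *next;
	long seek;
};

/*
 * Record n distinct positions on a singly linked list. With dedup != 0,
 * each call first walks the list looking for an equal position, as the
 * newer telldir is assumed to do. Returns the number of comparisons made.
 */
uint64_t
record_positions(long n, int dedup)
{
	struct saved_pos *head = NULL, *p;
	uint64_t compares = 0;
	long i;

	for (i = 0; i < n; i++) {
		p = NULL;
		if (dedup) {
			for (p = head; p != NULL; p = p->next) {
				compares++;
				if (p->seek == i)
					break;
			}
		}
		if (p == NULL) {
			/* Not found (or not searched): prepend a new entry. */
			p = malloc(sizeof(*p));
			if (p == NULL)
				abort();
			p->seek = i;
			p->next = head;
			head = p;
		}
	}
	while (head != NULL) {
		p = head->next;
		free(head);
		head = p;
	}
	return (compares);
}
```

Since every position in a single forward listing is new, the search never finds a match, and the dedup variant performs n(n-1)/2 comparisons in total. For n = 65536 that is 2147450880 comparisons, against zero for the prepend-only variant.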

The change appears to be important to @standards; however, the impact is tough
to explain to Samba users. As a conjecture, I wonder whether a run-time tunable
could select the 'conforming' or 'fast' behaviour for telldir, in the manner of
LD_PRELOAD... style directives.
