Strange performance issue with grep -r -i as non-root user

Jeremy Chadwick freebsd at jdc.parodius.com
Sun Mar 6 03:07:29 UTC 2011


On Sat, Mar 05, 2011 at 09:46:04PM -0500, Gary Palmer wrote:
> On Sat, Mar 05, 2011 at 03:45:14PM -0800, Jeremy Chadwick wrote:
> > This is a strange one, and the more I started debugging it (starting
> > with truss, comparing fast vs. slow results, where all that appears
> > different is read() operations are taking a lot longer -- I haven't had
> > time to check with ktrace yet), the more strange it got: that's when I
> > found out the behaviour changes depending on if you're a user or root.
> > 
> > Easy to reproduce:
> > 
> > - grep -r string /usr/src, as non-root, is fast
> > - grep -r -i string /usr/src, as non-root, is 8x slower than without -i
> > - grep -r string /usr/src, as root, is fast
> > - grep -r -i string /usr/src, as root, is fast
> 
> This is a stab in the dark, but are there any differences in your
> shell environment variables between root and non-root?  Specifically
> LANG or LC_ style variables.  I ran into issues in the past with grep
> being horrendously slow and traced it to LANG or LC_* in the environment
> causing a much longer code path than without the settings.

Bingo -- you found it, Gary.  Thank you very much.  I had suspected
dotfile or shell differences but hadn't tested them thoroughly, and the
LANG/LC_* variables hadn't occurred to me at all.

My dotfiles do make use of LANG/LC_CTYPE/LC_COLLATE:

export LANG="en_GB.UTF-8"
export LC_CTYPE="en_GB.UTF-8"
export LC_COLLATE="C"
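
For anyone wanting to compare their own root and non-root environments,
a rough sketch along these lines lists the locale-related variables that
are actually set.  The function name show_locale_vars is made up for
illustration; it is not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print every LANG / LC_* variable currently
# exported in the environment, sorted so two runs are easy to diff.
show_locale_vars() {
    env | grep -E '^(LANG|LC_[A-Z_]+)=' | sort
}

show_locale_vars
```

Running it once as each user and diffing the two outputs should show
whether the locale settings differ between the accounts.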

Testing on System #1:

$ unset LANG LC_CTYPE LC_COLLATE
$ for i in {0..9}; do /usr/bin/time -h grep -r PAE /usr/src/sys/dev > /dev/null ; done
        0.18s real              0.11s user              0.06s sys
        0.15s real              0.09s user              0.05s sys
        0.12s real              0.06s user              0.05s sys
        0.12s real              0.06s user              0.05s sys
        0.12s real              0.07s user              0.04s sys
        0.12s real              0.08s user              0.03s sys
        0.12s real              0.08s user              0.03s sys
        0.12s real              0.07s user              0.04s sys
        0.12s real              0.08s user              0.03s sys
        0.12s real              0.07s user              0.04s sys
$ for i in {0..9}; do /usr/bin/time -h grep -r -i PAE /usr/src/sys/dev > /dev/null ; done
        0.13s real              0.11s user              0.02s sys
        0.13s real              0.10s user              0.03s sys
        0.13s real              0.08s user              0.05s sys
        0.13s real              0.09s user              0.04s sys
        0.13s real              0.08s user              0.05s sys
        0.13s real              0.11s user              0.02s sys
        0.13s real              0.10s user              0.03s sys
        0.13s real              0.11s user              0.02s sys
        0.13s real              0.09s user              0.03s sys
        0.13s real              0.08s user              0.05s sys

I wanted to track it down to a specific variable or combo:

$ unset LANG
  - Result: still 80x slower with -i
$ unset LANG LC_COLLATE
  - Result: still 80x slower with -i
$ unset LANG LC_CTYPE
  - Result: normal/fast.
$ unset LC_CTYPE
  - Result: still 80x slower with -i
$ unset LC_CTYPE LC_COLLATE
  - Result: still 80x slower with -i
$ unset LC_COLLATE
  - Result: still 80x slower with -i
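
The combinations above can be run in one go with something like the
following sketch.  run_without and run_matrix are hypothetical names, and
TARGET/PATTERN simply default to the path and string from the timing
runs; each test happens in a subshell, so the three exports stay in
place between runs.

```shell
#!/bin/sh
# run_without "VAR1 VAR2" command [args...]
# Hypothetical helper: run a command in a subshell with the listed
# variables unset, leaving the calling shell's environment untouched.
run_without() {
    vars=$1
    shift
    (
        # $vars is intentionally unquoted so it splits into names.
        unset $vars
        exec "$@"
    )
}

# Time grep -r -i once for each of the six combinations tested above.
# /usr/bin/time -h is the FreeBSD form used in the timing runs.
run_matrix() {
    target=${TARGET:-/usr/src/sys/dev}
    pattern=${PATTERN:-PAE}
    for combo in "LANG" "LANG LC_COLLATE" "LANG LC_CTYPE" \
                 "LC_CTYPE" "LC_CTYPE LC_COLLATE" "LC_COLLATE"; do
        echo "== unset $combo"
        run_without "$combo" /usr/bin/time -h \
            grep -r -i "$pattern" "$target" > /dev/null
    done
}
```

Calling run_matrix from a shell that has all three variables exported
should print one timing line per combination on stderr.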

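So having either LANG or LC_CTYPE set (to en_GB.UTF-8 here) is enough to
cause this; the slowdown only goes away when both are unset.  LC_COLLATE
(set to C here) makes no difference.  I'm not sure what's going on with
locale, but given the nasty side-effects it should probably be
documented somewhere; maybe in setlocale(3)?  Unsure.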
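
In the meantime, one possible workaround (a sketch, not something tested
in the runs above) is to force the C locale for grep alone.  LC_ALL takes
precedence over LANG and every LC_* variable, so the rest of the session
keeps its settings.  The wrapper name cgrep is made up, and this assumes
the pattern and the files being searched are plain ASCII, since -i in the
C locale only folds ASCII letters.

```shell
#!/bin/sh
# Hypothetical wrapper: run grep in the C locale for this one command.
# The LC_ALL assignment applies only to the grep process, not to the
# calling shell.
cgrep() {
    LC_ALL=C grep "$@"
}
```

Usage would look like: cgrep -r -i PAE /usr/src/sys/dev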

-- 
| Jeremy Chadwick                                   jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



More information about the freebsd-stable mailing list