[Bug 223532] GNU egrep -i is terrible slow if utf-8 locale is enabled

From: <bugzilla-noreply_at_freebsd.org>
Date: Wed, 02 Jun 2021 17:14:15 +0000
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=223532

Helge Oldach <freebsd_at_oldach.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |freebsd_at_oldach.net

--- Comment #3 from Helge Oldach <freebsd_at_oldach.net> ---
(In reply to Wolfram Schneider from comment #2)
Hmm. I have noticed that as well. I suspect it's fallout from bug #253209, as I
noticed it became much slower after that fix.

However, I'm seeing the vast majority of the slowdown with locales other than
UTF-8 as well:

root_at_nuc ~ # grep -V
grep (BSD grep, GNU compatible) 2.6.0-FreeBSD
root_at_nuc ~ # time fgrep zpipe /usr/ports/INDEX-13
        0.28 real         0.15 user         0.06 sys
root_at_nuc ~ # time fgrep -i zpipe /usr/ports/INDEX-13
       13.87 real        13.86 user         0.00 sys
root_at_nuc ~ # LANG=en_US.UTF-8 time fgrep -i zpipe /usr/ports/INDEX-13
       17.67 real        17.65 user         0.01 sys
root_at_nuc ~ # LANG=C.UTF-8 time fgrep -i zpipe /usr/ports/INDEX-13
       17.63 real        17.59 user         0.02 sys
root_at_nuc ~ # LANG=en_US.iso8859-1 time fgrep -i zpipe /usr/ports/INDEX-13
       13.97 real        13.95 user         0.02 sys
root_at_nuc ~ # LANG=C time fgrep -i zpipe /usr/ports/INDEX-13
       14.00 real        13.97 user         0.03 sys
root_at_nuc ~ #

To summarize, "-i" makes the search about 50 times slower (almost two (!)
orders of magnitude), and switching to a multibyte character set adds roughly
another 27% on top.
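
The ratios can be rechecked from the "real" timings in the transcript above.
This is only a minimal sketch: the dictionary keys and the helper name
slowdown() are made up for illustration and are not part of grep or of this
report. With these numbers, "-i" comes out at about 50x and UTF-8 at about 27%
on top of that.

```python
# Wall-clock ("real") seconds copied from the transcript above.
# The labels are illustrative only.
real = {
    "fgrep": 0.28,
    "fgrep -i": 13.87,
    "fgrep -i en_US.UTF-8": 17.67,
    "fgrep -i C.UTF-8": 17.63,
    "fgrep -i en_US.iso8859-1": 13.97,
    "fgrep -i C": 14.00,
}


def slowdown(slow: float, fast: float) -> float:
    """Return how many times longer the slow run took than the fast run."""
    return slow / fast


# Cost of case-insensitive matching relative to the plain search.
case_factor = slowdown(real["fgrep -i"], real["fgrep"])

# Extra cost of a UTF-8 locale on top of "-i", as a fraction.
utf8_extra = slowdown(real["fgrep -i en_US.UTF-8"], real["fgrep -i"]) - 1.0

print(f"-i slowdown factor: {case_factor:.1f}x")
print(f"UTF-8 on top of -i: {utf8_extra * 100:.0f}%")
```

The single-byte locales (en_US.iso8859-1 and C) stay within about 1% of the
default "-i" run, so the multibyte penalty is small compared with the cost of
"-i" itself.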

-- 
You are receiving this mail because:
You are the assignee for the bug.
Received on Wed Jun 02 2021 - 17:14:15 UTC
