[Bug 223532] egrep -i is terrible slow if utf-8 locale is enabled
bugzilla-noreply at freebsd.org
Wed Nov 8 12:59:43 UTC 2017
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=223532
Bug ID: 223532
Summary: egrep -i is terrible slow if utf-8 locale is enabled
Product: Base System
Version: CURRENT
Hardware: Any
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: bin
Assignee: freebsd-bugs at FreeBSD.org
Reporter: wosch at FreeBSD.org
egrep -i is terribly slow if the locale is set to UTF-8. In fact, it is about
77 times slower than a case-sensitive search.
How to repeat:
First, we create a text file of roughly 100 MB:
for i in $(seq 1 20); do man tcsh; done > /tmp/tcsh20
for i in $(seq 1 20); do cat /tmp/tcsh20; done > /tmp/tcsh400
$ du -hs /tmp/tcsh400
99M /tmp/tcsh400
# case-sensitive search with UTF-8
LANG=en_CA.UTF-8 time egrep -c foobar /tmp/tcsh400
0
0.11 real 0.06 user 0.04 sys
# case-insensitive search with UTF-8, terribly slow
LANG=en_CA.UTF-8 time egrep -ic foobar /tmp/tcsh400
0
8.47 real 8.42 user 0.04 sys
# case-sensitive search with ASCII
LANG=C time egrep -c foobar /tmp/tcsh400
0
0.10 real 0.06 user 0.03 sys
# case-insensitive search with ASCII
LANG=C time egrep -ic foobar /tmp/tcsh400
0
0.10 real 0.07 user 0.03 sys
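The four runs above can be wrapped in a small script to make the comparison repeatable. The following is only a sketch: the helper names slowdown and run_matrix are hypothetical, and it assumes the en_CA.UTF-8 locale is installed, LC_ALL is unset, and awk and time(1) are available.

```shell
#!/bin/sh
# Hypothetical helpers for repeating the measurements; not part of the
# original report.

# slowdown SLOW FAST: print how many times larger SLOW is than FAST,
# rounded to one decimal place.
slowdown() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

# run_matrix FILE: time the four locale/flag combinations from the report.
# Assumes en_CA.UTF-8 is installed and LC_ALL is unset, so LANG takes effect.
run_matrix() {
    for locale in en_CA.UTF-8 C; do
        for flags in -c -ic; do
            echo "# LANG=$locale egrep $flags foobar $1"
            LANG=$locale time egrep "$flags" foobar "$1"
        done
    done
}

# Ratio of the "real" times reported above: UTF-8 with -i versus without.
slowdown 8.47 0.11
```

Calling run_matrix /tmp/tcsh400 would repeat the measurements, and slowdown can then be applied to whichever pair of timings is of interest.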