[Bug 254763] grep very slow with 13.0-RC4

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 18 Jan 2022 23:07:15 UTC

Mark.Martinec@ijs.si changed:

           What    |Removed                     |Added
                 CC|                            |Mark.Martinec@ijs.si

--- Comment #11 from Mark.Martinec@ijs.si ---
I was about to open a new PR, but I realized the problem
has already been reported, so I'm commenting here.

My particular issue: fgrep -i is terribly slow on 13.0.

In more detail:

Searching for a string through one day's worth of
mail log (100 MB, postfix) takes an unreasonably long time
with BSD grep when option -i (case-insensitive search)
is specified:

$ time /usr/bin/fgrep -i fwzhkqwfoherbfqo mail.log
real 59.664s, user 58.907s, sys 0.222s

$ time /usr/bin/fgrep fwzhkqwfoherbfqo mail.log
real 0.276s, user 0.097s, sys 0.173s

With option -i it takes about 200 times longer than without -i.
Compared to pcregrep or GNU grep, it takes about 500 times longer.
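
The factor of roughly 200 can be checked directly from the two
wall-clock ("real") times quoted above. A minimal sketch using awk;
the helper name ratio is hypothetical and only illustrates the
arithmetic:

```shell
# ratio A B: print A / B rounded to the nearest integer.
# (Hypothetical helper, not part of grep or the bug report.)
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", a / b }'
}

# real time with -i divided by real time without -i
ratio 59.664 0.276   # prints 216
```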

fgrep, grep, and egrep all behave the same in this respect.

An interesting observation: as the length of a search
string increases, so does the run time.
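
One way to reproduce the length dependence is to time the same
case-insensitive fixed-string search for successively longer
prefixes of a pattern. The sketch below is an assumption about how
such a test could be scripted, not the procedure used for the
numbers in this report. The function name bench_prefixes and the
GREP variable are hypothetical; grep -F is used as the equivalent
of fgrep, and date +%s gives only whole-second resolution, which is
coarse but adequate for runs lasting several seconds.

```shell
# bench_prefixes FILE PATTERN
# Time "grep -F -i" for prefixes of PATTERN of length 4, 8, 16, ...
# Set GREP to pick the binary, e.g. GREP=/usr/bin/grep (BSD grep)
# or GREP=/usr/local/bin/grep (GNU grep from ports).
bench_prefixes() {
    file=$1
    pattern=$2
    total=${#pattern}
    len=4
    while [ "$len" -le "$total" ]; do
        prefix=$(printf '%s' "$pattern" | cut -c "1-$len")
        start=$(date +%s)
        # Exit status 1 only means "no match", so it is ignored.
        "${GREP:-grep}" -F -i -c -- "$prefix" "$file" >/dev/null || true
        end=$(date +%s)
        echo "len=$len seconds=$((end - start))"
        len=$((len * 2))
    done
}

# Example invocation (assumes mail.log is in the current directory):
#   bench_prefixes mail.log fwzhkqwfoherbfqojhkqnsazmzlwknhg
```

Running the function once per grep binary, with the file already
cached, keeps disk I/O out of the comparison.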

It would be worthwhile investing some time in finding
and fixing this particular hotspot in BSD grep.

(The machine above was a slightly older 13.0-RELEASE-p6 amd64
with SSD disks and ZFS; the file was cached, with the cache warmed up.)


The rest below is some more benchmarking, this time
on a faster machine with NVMe disks and ZFS, using the same log file.
Absolute times are shorter, but the ratio is about the same.

$ uname -a
FreeBSD xxx 13.0-RELEASE-p6 FreeBSD ... amd64/sys/GENERIC  amd64

$ ls -l mail.log
xxx  99999998 Jan 18 22:49 mail.log

BSD grep:

$ time /usr/bin/fgrep -i fwzh mail.log
real 0m8.733s, user 0m8.664s, sys 0m0.061s

$ time /usr/bin/fgrep -i fwzhkqwf mail.log
real 0m12.759s, user 0m12.666s, sys 0m0.057s

$ time /usr/bin/fgrep -i fwzhkqwfoherbfqo mail.log
real 0m18.922s, user 0m18.813s, sys 0m0.064s

$ time /usr/bin/fgrep -i fwzhkqwfoherbfqojhkqnsazmzlwknhg mail.log
real 0m32.593s, user 0m32.438s, sys 0m0.056s
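
To put a number on the growth, the four timings above can be turned
into extra seconds per additional pattern character. This is only a
rough illustration based on four data points; the function name
per_char is hypothetical.

```shell
# per_char: read "pattern_length real_seconds" pairs and print the
# extra wall-clock time per added character between successive rows.
per_char() {
    awk 'NR > 1 { printf "%d->%d: %.2f s/char\n", plen, $1, ($2 - t) / ($1 - plen) }
         { plen = $1; t = $2 }'
}

# Pattern lengths and "real" times from the BSD grep -i runs above.
printf '%s\n' '4 8.733' '8 12.759' '16 18.922' '32 32.593' | per_char
# 4->8: 1.01 s/char
# 8->16: 0.77 s/char
# 16->32: 0.85 s/char
```

The per-character cost stays in the same range across the three
steps, which is consistent with run time growing roughly linearly
with pattern length.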

BSD grep without -i:

$ time /usr/bin/fgrep fwzhkqwfoherbfqojhkqnsazmzlwknhg mail.log
real 0m0.112s, user 0m0.033s, sys 0m0.079s

GNU grep (textutils/gnugrep):

$ time /usr/local/bin/fgrep -i fwzh mail.log
real 0m0.204s, user 0m0.073s, sys 0m0.022s

$ time /usr/local/bin/fgrep -i fwzhkqwf mail.log
real 0m0.203s, user 0m0.060s, sys 0m0.015s

$ time /usr/local/bin/fgrep -i fwzhkqwfoherbfqo mail.log
real 0m0.175s, user 0m0.036s, sys 0m0.017s

$ time /usr/local/bin/fgrep -i fwzhkqwfoherbfqojhkqnsazmzlwknhg mail.log
real 0m0.217s, user 0m0.083s, sys 0m0.006s


pcregrep:

$ time pcregrep -i fwzh mail.log
real 0m0.249s, user 0m0.114s, sys 0m0.026s

$ time pcregrep -i fwzhkqwf mail.log
real 0m0.227s, user 0m0.094s, sys 0m0.013s

$ time pcregrep -i fwzhkqwfoherbfqo mail.log
real 0m0.128s, user 0m0.073s, sys 0m0.055s

$ time pcregrep -i fwzhkqwfoherbfqojhkqnsazmzlwknhg mail.log
real 0m0.126s, user 0m0.079s, sys 0m0.047s
