Grepping a list of words

Thu Aug 12 17:09:52 UTC 2010

Oliver Fromme <olli at lurza.secnetix.de> writes:

> John Levine <johnl at iecc.com> wrote:
>  > > > % egrep 'word1|word2|word3|...|wordn' filename.txt
>  > 
>  > > Thanks for the replies. This suggestion won't do the job as the list of
>  > > words is very long, maybe 50-60. This is why I asked how to place them all
>  > > in a file. One reply dealt with using a file with egrep. I'll try that.
>  > 
>  > Gee, 50 words, that's about a 300 character pattern, that's not a problem
>  > for any shell or version of grep I know.
>  > 
>  > But reading the words from a file is equivalent and as you note most
>  > likely easier to do.
>
> The question is what is more efficient.  This might be
> important if that kind of grep command is run very often
> by a script, or if it's run on very large files.
>
> My guess is that one large regular expression is more
> efficient than many small ones.  But I haven't done real
> benchmarks to prove this.

BTW, if the words are plain strings, not using regular expressions
at all is even more efficient, e.g.

  $ fgrep -f /usr/share/dict/words /etc/group

The same command run with egrep(1) takes considerably more time and memory.
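
For anyone who wants to try the variants discussed in this thread, here
is a rough sketch. The file names and sample data are made up for
illustration; it assumes a grep(1) that supports -F, -E and -f, and a
paste(1) that supports -s and -d. It runs the same word list three ways:
as fixed strings, as one regular expression per line, and as a single
alternation built from the file.

```shell
# Hypothetical scratch files; any word list with one entry per line works.
workdir=$(mktemp -d)
words="$workdir/words.txt"
input="$workdir/input.txt"

printf '%s\n' wheel daemon operator > "$words"
printf '%s\n' 'wheel:*:0:root' 'daemon:*:1:' 'nobody:*:65534:' > "$input"

# Fixed strings: each line of the word file is matched literally
# (grep -F is the same as fgrep).
fixed=$(grep -F -f "$words" "$input")

# Extended regular expressions: each line of the word file is a pattern
# (grep -E is the same as egrep).
regex=$(grep -E -f "$words" "$input")

# One big alternation: join the lines of the word file with '|'.
alt=$(paste -s -d '|' "$words")
alternation=$(grep -E "$alt" "$input")

printf '%s\n' "$fixed"

rm -rf "$workdir"
```

All three variants should select the same lines as long as the words
contain no regular expression metacharacters. To compare cost, prefix
each grep command with time(1) and point it at a large word list and a
large input file; the numbers will depend on the grep implementation.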