[RFC] Replacing our regex implementation

Bakul Shah bakul at bitblocks.com
Mon May 9 05:21:45 UTC 2011


On Mon, 09 May 2011 08:30:57 +0400 Lev Serebryakov <lev at FreeBSD.org>  wrote:
> Hello, Bakul.
> You wrote on 9 May 2011, 5:17:09:
> 
> > As per the following URLs re2 is much faster than TRE (on the
> > benchmarks they ran):
> 
> > http://lh3lh3.users.sourceforge.net/reb.shtml
> > http://sljit.sourceforge.net/regex_perf.html
>   re2 is much faster, at the price of memory. I don't remember the
> details now, but I've found (simple) situations where re2 consumes a
> HUGE amount of memory (read: hundreds of megabytes). It works faster
> than TRE, yes, if you have that much memory for the RE engine alone.

As per http://swtch.com/~rsc/regexp/regexp3.html RE2 requires
about 10 KB per regexp, in contrast to PCRE's half a KB.  This
is not excessive in this day and age. But 100s of megabytes
sounds very strange....  I'd appreciate a reference to an
actual example (and I am sure so would the author of re2).

But I do not want to defend re2 here. My intent was just to
make sure re2 was at least considered, mainly because it was
quite surprising to see that TRE is 10 to 45 times slower
than re2!


