[RFC] Replacing our regex implementation

Bakul Shah bakul at bitblocks.com
Tue May 10 00:15:19 UTC 2011


On Mon, 09 May 2011 17:51:46 EDT David Schultz <das at FreeBSD.ORG>  wrote:
> On Sun, May 08, 2011, Bakul Shah wrote:
> > On Sun, 08 May 2011 21:35:04 CDT Zhihao Yuan <lichray at gmail.com>  wrote:
> > > 1. This lib accepts many popular grammars (PCRE, POSIX, vim, etc.),
> > > but it does not allow you to change the mode.
> > > http://code.google.com/p/re2/source/browse/re2/re2.h
> > 
> > The mode is decided when an RE2 object is instantiated so this
> > is ok. You can certainly instantiate multiple objects with
> > different options if so desired.
> > 
> > > 2. It focuses on speed and features, not stability and standardization.
> > 
> > Look at the open issues. Seems stable enough to me. re2 has a
> > posix only mode. It also does unicode.

s/posix only mode/posix only mode as well/

> > 
> > > 3. It uses C++. We seldom accept C++ code in the base system, and do
> > > not accept it in libc.
> > 
> > This is the show stopper.
> 
> Use of C++ is a clear show-stopper if it introduces new runtime
> requirements, e.g., dependencies on STL or exceptions.  Aside from
> that, however, I can't think of any fundamental, technical reasons
> why a component of libc couldn't be written in C++.  (Perhaps the
> toolchain maintainers could name some, and they'd be the best
> authority on the matter.)  You can expect some resistance
> regardless, however, so make sure the technical merits of RE2 are
> worth the trouble.

Ok, I just verified there are no additional runtime
requirements by running a simple test, where I added a C
wrapper around an RE2 C++ call, compiled it with c++, then
compiled the client C code with cc, and linked everything with
cc. This works (tested on x86_64, under FreeBSD 8.1).
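
For illustration, the wrapper side of such a test might look roughly
like the sketch below. This is not the code used in the test above:
the names rew_handle, rew_new, rew_match and rew_free are hypothetical,
and a stand-in Matcher class (plain substring search) takes the place
of RE2 so the sketch builds without the library. It uses malloc and
placement new rather than operator new, with the intent of not pulling
in extra C++ runtime symbols.

```cpp
// wrapper.cc: built with the C++ compiler, callable from C.
#include <cstdlib>
#include <cstring>
#include <new>

namespace {
// Hypothetical stand-in for the C++ matcher (RE2 in the real test).
// It only does a literal substring search.
class Matcher {
public:
    explicit Matcher(const char *pattern) : pattern_(pattern) {}
    bool Match(const char *text) const {
        return std::strstr(text, pattern_) != nullptr;
    }
private:
    const char *pattern_;  // borrowed; the caller keeps it alive
};
}  // namespace

extern "C" {

// Opaque handle: C callers only ever see a pointer to it.
typedef struct rew_handle rew_handle;

// Create a matcher; returns NULL on bad input or allocation failure.
rew_handle *rew_new(const char *pattern) {
    if (pattern == nullptr) return nullptr;
    void *mem = std::malloc(sizeof(Matcher));
    if (mem == nullptr) return nullptr;
    return reinterpret_cast<rew_handle *>(new (mem) Matcher(pattern));
}

// Returns 1 if text matches, 0 otherwise (including NULL arguments).
int rew_match(const rew_handle *h, const char *text) {
    if (h == nullptr || text == nullptr) return 0;
    return reinterpret_cast<const Matcher *>(h)->Match(text) ? 1 : 0;
}

// Destroy a matcher created by rew_new; NULL is accepted.
void rew_free(rew_handle *h) {
    if (h == nullptr) return;
    Matcher *m = reinterpret_cast<Matcher *>(h);
    m->~Matcher();
    std::free(m);
}

}  // extern "C"
```

The C client would then need only the three prototypes and the opaque
pointer type, e.g. rew_new("foo"), rew_match(h, "a foo b"), rew_free(h).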

I do think RE2 is very well done (see the articles at
swtch.com/~rsc/regexp); it is actively maintained and has a battery
of pretty exhaustive tests.  Seems TRE's author also likes re2:
http://hackerboss.com/is-your-regex-matcher-up-to-snuff/

So if we want to consider this, it is a real possibility.

> IIRC, some of the prior discussions on using more C++ in the base
> system got derailed by tangents on multiple inheritance, operator
> overloading, misfeatures of STL, and what subset of C++ ought to
> be considered kosher in FreeBSD.  You don't have to get involved
> in any of that because you'd only be proposing to import a
> self-contained third-party library.

Indeed; we would just use it via a C wrapper API.  But I can
see someone thinking this is the camel's nose in the tent :-)

