bin/74020: regexec() hangs with UTF-8 locales

Jean-Yves Lefort jylefort at
Tue Nov 16 15:40:33 PST 2004

>Number:         74020
>Category:       bin
>Synopsis:       regexec() hangs with UTF-8 locales
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Nov 16 23:40:23 GMT 2004
>Originator:     Jean-Yves Lefort
>Release:        FreeBSD 5.3-RELEASE i386
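>Environment:
System: FreeBSD 5.3-RELEASE FreeBSD 5.3-RELEASE #0: Fri Nov 12 15:27:39 CET 2004 jylefort at i386
>Description:
regexec() never returns when the locale encoding is UTF-8 and the subject string contains a multibyte sequence (here \302\251, the UTF-8 encoding of U+00A9). This has been observed with the test program below; the same call returns normally under a non-UTF-8 locale.
>How-To-Repeat:
Compile this: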

--- cut ---
#include <locale.h>
#include <sys/types.h>
#include <regex.h>
#include <assert.h>

int
main(void)
{
  int status;
  regex_t test_re;
  regmatch_t pmatch[3];

  /* pick up the locale (and its encoding) from the environment */
  setlocale(LC_ALL, "");

  status = regcomp(&test_re, "foo=(.*) bar=(.*)", REG_EXTENDED);
  assert(status == 0);

  /* "\302\251" is U+00A9 encoded in UTF-8 */
  /* if the locale encoding is UTF-8, this call hangs */
  regexec(&test_re, "foo=one bar=two\302\251", test_re.re_nsub + 1, pmatch, 0);

  regfree(&test_re);
  return 0;
}
--- cut ---

Works fine when executed with a non-UTF-8 locale:

	$ LANG=en_US.ISO8859-1 ./test

Hangs when executed with a UTF-8 locale:

	$ LANG=en_US.UTF-8 ./test
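
Because the test program above never exits on an affected system, it is awkward to run from a script. The following is a minimal sketch of one way to detect the hang instead of waiting on it: the same regcomp()/regexec() calls run in a child process with an alarm() armed. The helper name probe_regexec(), its return codes, and the timeout parameter are assumptions made for illustration and are not part of the original report.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <locale.h>
#include <regex.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original report).
 * Runs the regexec() call from the report in a child process under
 * the given locale.  Returns:
 *    0  regexec() returned and the pattern matched
 *    1  regexec() returned without a match, or regcomp() failed
 *    2  the child was killed by SIGALRM, i.e. regexec() did not return
 *   -1  fork(), waitpid() or setlocale() failed
 */
int
probe_regexec(const char *locale, unsigned int timeout)
{
  pid_t pid;
  int wstatus;

  pid = fork();
  if (pid < 0)
    return -1;

  if (pid == 0) {
    regex_t re;
    regmatch_t pmatch[3];

    /* make sure SIGALRM terminates the child, then arm the timer */
    signal(SIGALRM, SIG_DFL);
    alarm(timeout);

    if (setlocale(LC_ALL, locale) == NULL)
      _exit(3);                 /* locale not available */
    if (regcomp(&re, "foo=(.*) bar=(.*)", REG_EXTENDED) != 0)
      _exit(1);

    /* same subject string as in the report */
    _exit(regexec(&re, "foo=one bar=two\302\251", 3, pmatch, 0) == 0 ? 0 : 1);
  }

  if (waitpid(pid, &wstatus, 0) < 0)
    return -1;
  if (WIFSIGNALED(wstatus) && WTERMSIG(wstatus) == SIGALRM)
    return 2;                   /* timed out: regexec() hung */
  if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) != 3)
    return WEXITSTATUS(wstatus);
  return -1;
}
```

Under these assumptions, probe_regexec("en_US.ISO8859-1", 5) would be expected to return 0, while probe_regexec("en_US.UTF-8", 5) would be expected to return 2 on a system that shows the problem described here.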
