Re: FreeBSD awk behavior change proposal

From: Baptiste Daroussin <bapt_at_FreeBSD.org>
Date: Fri, 09 Jul 2021 14:28:30 UTC
On Fri, Jul 09, 2021 at 06:21:29AM -0700, Rodney W. Grimes wrote:
> > Greetings,
> > 
> > I've posted https://reviews.freebsd.org/D31114 which eliminates the last
> > delta we have from upstream one-true-awk. This delta has basically been
> > rejected by upstream as being a really bad idea. Let me give some
> > background.
> > 
> > In 2005, FreeBSD changed one-true-awk to honor the locale's collating order.
> > https://svnweb.freebsd.org/base/head/usr.bin/awk/b.c.diff?annotate=146322&pathrev=201988
> > This was billed as a temporary patch. It was also compatible with
> > the then-current behavior of gawk. That temporary patch has lasted 16
> > years now.
> > 
> > However, IEEE Std 1003.1-2008 changed the behavior of ranges in regular
> > expressions outside of the "C" and "POSIX" locales to be undefined.
> > 
> > Starting in 2011, gawk 4.0 stopped using the locale for the range
> > regular expressions and used the traditional behavior only. The
> > maintainer had grown weary of answering why '[A-Z]' would sometimes
> > match lower-case expressions. The details are explained here:
> > https://www.gnu.org/software/gawk/manual/html_node/Ranges-and-Locales.html
> > 
> > To restore compatibility with other implementations of awk, revert this
> > patch. FreeBSD is the odd system out. It also has the nice side effect
> > of eliminating the last of our differences with upstream one-true-awk.
> > 
> > I'd like to commit the change at least to -current. Ideally, I'd like to MFC
> > the change. I believe better compatibility with gawk and other awk
> > implementations justifies this change in behavior because the current
> > behavior is outside the mainstream enough to be considered a bug.
> > 
> > I'd like to solicit input before I do this, however.
> 
> My only concern on this is whether anything in the ports system gets
> tickled by this change. I know it's a pita, but maybe have an exp
> run done?  I reviewed and accepted the differential, and by examination
> I do not see how this could cause an issue now, so Meh, give it a long
> bake in -current and things should be ok.
> 
It would require an exp-run, but I really doubt anything uses it: we have
not had real collation support for that long, and bringing that collation
support in actually did break scripts expecting the behaviour Warner is
bringing in.
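
For anyone following along, here is a rough sketch of why a collation-based
range surprises scripts. It is not awk's actual code. It assumes a
hypothetical dictionary-style collation order (aAbB...zZ), which is only an
approximation of what a locale such as en_US might do, and compares it with
the traditional code point ordering for the range [A-Z]. The names COLLATION,
RANK, in_range_codepoint and in_range_collated are made up for illustration.

```python
import string

# Hypothetical collation order, assumed for illustration only: aAbBcC...zZ.
COLLATION = "".join(
    lo + up for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase)
)
RANK = {ch: i for i, ch in enumerate(COLLATION)}


def in_range_codepoint(ch, lo, hi):
    """Traditional behaviour: compare raw code points."""
    return ord(lo) <= ord(ch) <= ord(hi)


def in_range_collated(ch, lo, hi):
    """Locale-style behaviour: compare positions in the collation order."""
    return RANK[lo] <= RANK[ch] <= RANK[hi]


def matched(pred):
    """Return the ASCII letters that fall inside the range A-Z under pred."""
    return "".join(c for c in string.ascii_letters if pred(c, "A", "Z"))


codepoint_matches = matched(in_range_codepoint)
collated_matches = matched(in_range_collated)

print("code point [A-Z]:", codepoint_matches)
print("collated   [A-Z]:", collated_matches)
```

Under the assumed ordering, the collated range picks up every lower-case
letter except 'a', which is the kind of surprise described in the gawk
manual page linked above.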

Best regards,
Bapt