git: f7f76c200a8c - main - awk: Document deprecated behavior of hex constants and locales.

Warner Losh imp at FreeBSD.org
Sat Jul 31 05:43:19 UTC 2021


The branch main has been updated by imp:

URL: https://cgit.FreeBSD.org/src/commit/?id=f7f76c200a8c33822a25ae36e4399c9896efa951

commit f7f76c200a8c33822a25ae36e4399c9896efa951
Author:     Warner Losh <imp at FreeBSD.org>
AuthorDate: 2021-07-31 05:31:00 +0000
Commit:     Warner Losh <imp at FreeBSD.org>
CommitDate: 2021-07-31 05:41:39 +0000

    awk: Document deprecated behavior of hex constants and locales.
    
    FreeBSD will convert "0x12" from hex and print it as 18. Other awks will
    convert it to 0. This extension has been removed upstream, and will be
    removed in FreeBSD 14.0.
    
    FreeBSD used to set the locale on startup, and make the ranges use that
    locale. This led to weird results like "[A-Z]" matching lower case
    characters in some locales. This bug has been fixed.
    
    MFC After:              3 days
    Sponsored by:           Netflix
---
 usr.bin/awk/awk.1 | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/usr.bin/awk/awk.1 b/usr.bin/awk/awk.1
index 20bb510a1516..35d5e8d9d29b 100644
--- a/usr.bin/awk/awk.1
+++ b/usr.bin/awk/awk.1
@@ -814,9 +814,44 @@ The scope rules for variables in functions are a botch;
 the syntax is worse.
 .Sh DEPRECATED BEHAVIOR
 One True Awk has accpeted
-.Fl Ft
+.Fl F Ar t
 to mean the same as
-.Fl F\t
+.Fl F Ar <TAB>
 to make it easier to specify tabs as the separator character.
 Upstream One True Awk has deprecated this wart in the name of better
 compatibility with other awk implementations like gawk and mawk.
+.Pp
+Historically,
+.Nm
+did not accept
+.Dq 0x
+as a hex string.
+However, since One True Awk used strtod to convert strings to floats, and since
+.Dq 0x12
+is a valid hexadecimal representation of a floating point number,
+on
+.Fx ,
+.Nm
+has accepted this notation as an extension since One True Awk was imported in
+.Fx 5.0 .
+Upstream One True Awk has restored the historical behavior for better
+compatibility between the different awk implementations.
+Both gawk and mawk already behave similarly.
+Starting with
+.Fx 14.0 ,
+.Nm
+will no longer accept this extension.
+.Pp
+The
+.Fx
+.Nm
+set the locale for many years to match the environment it was running in.
+This led to pattern ranges, like
+.Dq "[A-Z]" ,
+sometimes matching lower case characters in some locales.
+This misbehavior was never in upstream One True Awk and has been removed as a
+bug in
+.Fx 12.3 ,
+.Fx 13.1 ,
+and
+.Fx 14.0 .
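
To illustrate the hex-constant change described above, here is a minimal
sketch in Python of the two string-to-number behaviors. It is not the awk
source: the function names decimal_only and hex_aware are hypothetical, the
regular expressions only approximate what a real parser accepts, and hex
fractions and exponents are deliberately left out.

```python
import re

# Leading decimal floating-point prefix, roughly what a decimal-only parser accepts.
_DECIMAL = re.compile(r"\s*[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")
# Leading hexadecimal integer prefix such as "0x12" (fractions and exponents omitted).
_HEX = re.compile(r"\s*([+-]?)0[xX]([0-9a-fA-F]+)")


def decimal_only(text):
    """Convert the leading decimal prefix of text; anything else yields 0."""
    match = _DECIMAL.match(text)
    return float(match.group(0)) if match else 0.0


def hex_aware(text):
    """Like decimal_only, but also accept a leading 0x prefix (assumed strtod-like)."""
    match = _HEX.match(text)
    if match:
        value = float(int(match.group(2), 16))
        return -value if match.group(1) == "-" else value
    return decimal_only(text)


if __name__ == "__main__":
    for sample in ("0x12", "12", "0x", "3.5e2abc"):
        print(sample, decimal_only(sample), hex_aware(sample))
```

With the input "0x12", decimal_only stops after the leading "0" and yields
0, while hex_aware yields 18, which mirrors the difference between the
restored historical behavior and the extension being removed.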

