git: f32a6403d346 - main - Merge one true awk from 2024-01-22 for the Awk Second Edition support
Date: Thu, 29 Feb 2024 17:46:20 UTC
The branch main has been updated by imp:
URL: https://cgit.FreeBSD.org/src/commit/?id=f32a6403d34654ac6e61182d09abb5e85850e1ee
commit f32a6403d34654ac6e61182d09abb5e85850e1ee
Merge: 73157ce4982e e8a605e129c6
Author: Warner Losh <imp@FreeBSD.org>
AuthorDate: 2024-02-28 15:16:16 +0000
Commit: Warner Losh <imp@FreeBSD.org>
CommitDate: 2024-02-29 17:42:06 +0000
Merge one true awk from 2024-01-22 for the Awk Second Edition support
This brings in Unicode support, CSV support and a number of bug fixes.
They are described in _The AWK Programming Language_, Second Edition, by
Al Aho, Brian Kernighan, and Peter Weinberger (Addison-Wesley, 2024,
ISBN-13 978-0138269722, ISBN-10 0138269726).
Sponsored by: Netflix
contrib/one-true-awk/FIXES | 1429 ++------------------
contrib/one-true-awk/FIXES.1e | 1429 ++++++++++++++++++++
contrib/one-true-awk/README.md | 80 +-
contrib/one-true-awk/awk.1 | 34 +-
contrib/one-true-awk/awk.h | 23 +-
contrib/one-true-awk/awkgram.y | 49 +-
contrib/one-true-awk/b.c | 409 ++++--
contrib/one-true-awk/bugs-fixed/REGRESS | 8 +-
.../one-true-awk/bugs-fixed/getline-corruption.awk | 5 +
.../one-true-awk/bugs-fixed/getline-corruption.in | 1 +
.../one-true-awk/bugs-fixed/getline-corruption.ok | 1 +
contrib/one-true-awk/bugs-fixed/matchop-deref.awk | 11 +
contrib/one-true-awk/bugs-fixed/matchop-deref.bad | 2 +
contrib/one-true-awk/bugs-fixed/matchop-deref.in | 1 +
contrib/one-true-awk/bugs-fixed/matchop-deref.ok | 2 +
.../one-true-awk/bugs-fixed/missing-precision.ok | 2 +
contrib/one-true-awk/bugs-fixed/negative-nf.ok | 2 +
contrib/one-true-awk/bugs-fixed/pfile-overflow.ok | 4 +
contrib/one-true-awk/bugs-fixed/rstart-rlength.awk | 10 +
contrib/one-true-awk/bugs-fixed/rstart-rlength.ok | 4 +
contrib/one-true-awk/bugs-fixed/system-status.awk | 19 +
contrib/one-true-awk/bugs-fixed/system-status.bad | 3 +
contrib/one-true-awk/bugs-fixed/system-status.ok | 3 +
contrib/one-true-awk/bugs-fixed/system-status.ok2 | 3 +
.../one-true-awk/bugs-fixed/unicode-fs-rs-1.awk | 6 +
contrib/one-true-awk/bugs-fixed/unicode-fs-rs-1.in | 2 +
contrib/one-true-awk/bugs-fixed/unicode-fs-rs-1.ok | 5 +
.../one-true-awk/bugs-fixed/unicode-fs-rs-2.awk | 7 +
contrib/one-true-awk/bugs-fixed/unicode-fs-rs-2.in | 2 +
contrib/one-true-awk/bugs-fixed/unicode-fs-rs-2.ok | 4 +
.../one-true-awk/bugs-fixed/unicode-null-match.awk | 6 +
.../one-true-awk/bugs-fixed/unicode-null-match.bad | 1 +
.../one-true-awk/bugs-fixed/unicode-null-match.ok | 1 +
contrib/one-true-awk/lex.c | 60 +-
contrib/one-true-awk/lib.c | 158 ++-
contrib/one-true-awk/main.c | 23 +-
contrib/one-true-awk/makefile | 8 +-
contrib/one-true-awk/maketab.c | 4 +-
contrib/one-true-awk/parse.c | 2 +-
contrib/one-true-awk/proto.h | 10 +-
contrib/one-true-awk/run.c | 942 +++++++++----
contrib/one-true-awk/testdir/Compare.tt | 2 +-
contrib/one-true-awk/testdir/REGRESS | 2 +-
contrib/one-true-awk/testdir/T.argv | 6 +
contrib/one-true-awk/testdir/T.csv | 80 ++
contrib/one-true-awk/testdir/T.flags | 5 +-
contrib/one-true-awk/testdir/T.misc | 20 +
contrib/one-true-awk/testdir/T.overflow | 2 +
contrib/one-true-awk/testdir/T.split | 1 +
contrib/one-true-awk/testdir/T.utf | 194 +++
contrib/one-true-awk/testdir/T.utfre | 234 ++++
contrib/one-true-awk/testdir/tt.15 | 2 +-
contrib/one-true-awk/tran.c | 26 +-
53 files changed, 3525 insertions(+), 1824 deletions(-)
diff --cc contrib/one-true-awk/FIXES.1e
index 000000000000,8cbd6ac1a097..8cbd6ac1a097
mode 000000,100644..100644
--- a/contrib/one-true-awk/FIXES.1e
+++ b/contrib/one-true-awk/FIXES.1e
diff --cc contrib/one-true-awk/README.md
index 76ae3d48c983,000000000000..a41fb3c3b128
mode 100644,000000..100644
--- a/contrib/one-true-awk/README.md
+++ b/contrib/one-true-awk/README.md
@@@ -1,123 -1,0 +1,149 @@@
+# The One True Awk
+
+This is the version of `awk` described in _The AWK Programming Language_,
- by Al Aho, Brian Kernighan, and Peter Weinberger
- (Addison-Wesley, 1988, ISBN 0-201-07981-X).
++Second Edition, by Al Aho, Brian Kernighan, and Peter Weinberger
++(Addison-Wesley, 2024, ISBN-13 978-0138269722, ISBN-10 0138269726).
++
++## What's New? ##
++
++This version of Awk handles UTF-8 and comma-separated values (CSV) input.
++
++### Strings ###
++
++Functions that process strings now count Unicode code points, not bytes;
++this affects `length`, `substr`, `index`, `match`, `split`,
++`sub`, `gsub`, and others. Note that code
++points are not necessarily characters.
++
++UTF-8 sequences may appear in literal strings and regular expressions.
++Arbitrary characters may be included with `\u` followed by 1 to 8 hexadecimal digits.
++
++### Regular expressions ###
++
++Regular expressions may include UTF-8 code points, including `\u`.
++
++### CSV ###
++
++The option `--csv` turns on CSV processing of input:
++fields are separated by commas, fields may be quoted with
++double-quote (`"`) characters, quoted fields may contain embedded newlines.
++Double-quotes in fields have to be doubled and enclosed in quoted fields.
++In CSV mode, `FS` is ignored.
++
++If no explicit separator argument is provided,
++field-splitting in `split` is determined by CSV mode.
+
+## Copyright
+
+Copyright (C) Lucent Technologies 1997<br/>
+All Rights Reserved
+
+Permission to use, copy, modify, and distribute this software and
+its documentation for any purpose and without fee is hereby
+granted, provided that the above copyright notice appear in all
+copies and that both that the copyright notice and this
+permission notice and warranty disclaimer appear in supporting
+documentation, and that the name Lucent Technologies or any of
+its entities not be used in advertising or publicity pertaining
+to distribution of the software without specific, written prior
+permission.
+
+LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
+INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
+IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
+SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
+IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
+ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
+THIS SOFTWARE.
+
+## Distribution and Reporting Problems
+
+Changes, mostly bug fixes and occasional enhancements, are listed
+in `FIXES`. If you distribute this code further, please please please
+distribute `FIXES` with it.
+
+If you find errors, please report them
- to bwk@cs.princeton.edu.
++to the current maintainer, ozan.yigit@gmail.com.
+Please _also_ open an issue in the GitHub issue tracker, to make
+it easy to track issues.
+Thanks.
+
+## Submitting Pull Requests
+
+Pull requests are welcome. Some guidelines:
+
+* Please do not use functions or facilities that are not standard (e.g.,
+`strlcpy()`, `fpurge()`).
+
+* Please run the test suite and make sure that your changes pass before
+posting the pull request. To do so:
+
+ 1. Save the previous version of `awk` somewhere in your path. Call it `nawk` (for example).
+ 1. Run `oldawk=nawk make check > check.out 2>&1`.
+ 1. Search for `BAD` or `error` in the result. In general, look over it manually to make sure there are no errors.
+
+* Please create the pull request with a request
+to merge into the `staging` branch instead of into the `master` branch.
+This allows us to do testing, and to make any additional edits or changes
+after the merge but before merging to `master`.
+
+## Building
+
+The program itself is created by
+
+ make
+
+which should produce a sequence of messages roughly like this:
+
- yacc -d awkgram.y
- conflicts: 43 shift/reduce, 85 reduce/reduce
- mv y.tab.c ytab.c
- mv y.tab.h ytab.h
- cc -c ytab.c
- cc -c b.c
- cc -c main.c
- cc -c parse.c
- cc maketab.c -o maketab
- ./maketab >proctab.c
- cc -c proctab.c
- cc -c tran.c
- cc -c lib.c
- cc -c run.c
- cc -c lex.c
- cc ytab.o b.o main.o parse.o proctab.o tran.o lib.o run.o lex.o -lm
++ bison -d awkgram.y
++ awkgram.y: warning: 44 shift/reduce conflicts [-Wconflicts-sr]
++ awkgram.y: warning: 85 reduce/reduce conflicts [-Wconflicts-rr]
++ awkgram.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
++ gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o awkgram.tab.o awkgram.tab.c
++ gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o b.o b.c
++ gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o main.o main.c
++ gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o parse.o parse.c
++ gcc -g -Wall -pedantic -Wcast-qual -O2 maketab.c -o maketab
++ ./maketab awkgram.tab.h >proctab.c
++ gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o proctab.o proctab.c
++ gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o tran.o tran.c
++ gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o lib.o lib.c
++ gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o run.o run.c
++ gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o lex.o lex.c
++ gcc -g -Wall -pedantic -Wcast-qual -O2 awkgram.tab.o b.o main.o parse.o proctab.o tran.o lib.o run.o lex.o -lm
+
+This produces an executable `a.out`; you will eventually want to
+move this to some place like `/usr/bin/awk`.
+
+If your system does not have `yacc` or `bison` (the GNU
+equivalent), you need to install one of them first.
++The default in the `makefile` is `bison`; you will have
++to edit the `makefile` to use `yacc`.
+
- NOTE: This version uses ANSI C (C 99), as you should also. We have
++NOTE: This version uses ISO/IEC C99, as you should also. We have
+compiled this without any changes using `gcc -Wall` and/or local C
+compilers on a variety of systems, but new systems or compilers
+may raise some new complaint; reports of difficulties are
+welcome.
+
+This compiles without change on Macintosh OS X using `gcc` and
+the standard developer tools.
+
+You can also use `make CC=g++` to build with the GNU C++ compiler,
+should you choose to do so.
+
- The version of `malloc` that comes with some systems is sometimes
- astonishly slow. If `awk` seems slow, you might try fixing that.
- More generally, turning on optimization can significantly improve
- `awk`'s speed, perhaps by 1/3 for highest levels.
-
+## A Note About Releases
+
- We don't do releases.
++We don't usually do releases.
+
+## A Note About Maintenance
+
+NOTICE! Maintenance of this program is on a ''best effort''
+basis. We try to get to issues and pull requests as quickly
+as we can. Unfortunately, however, keeping this program going
+is not at the top of our priority list.
+
+#### Last Updated
+
- Sat Jul 25 14:00:07 EDT 2021
++Mon 05 Feb 2024 08:46:55 IST
diff --cc contrib/one-true-awk/bugs-fixed/getline-corruption.awk
index 000000000000,461e551cfff5..461e551cfff5
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/getline-corruption.awk
+++ b/contrib/one-true-awk/bugs-fixed/getline-corruption.awk
diff --cc contrib/one-true-awk/bugs-fixed/getline-corruption.in
index 000000000000,78981922613b..78981922613b
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/getline-corruption.in
+++ b/contrib/one-true-awk/bugs-fixed/getline-corruption.in
diff --cc contrib/one-true-awk/bugs-fixed/getline-corruption.ok
index 000000000000,3efb54597c6d..3efb54597c6d
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/getline-corruption.ok
+++ b/contrib/one-true-awk/bugs-fixed/getline-corruption.ok
diff --cc contrib/one-true-awk/bugs-fixed/matchop-deref.awk
index 000000000000,000000000000..6c066aad911d
new file mode 100644
--- /dev/null
+++ b/contrib/one-true-awk/bugs-fixed/matchop-deref.awk
@@@ -1,0 -1,0 +1,11 @@@
++function foo() {
++ return "aaaaaab"
++}
++
++BEGIN {
++ print match(foo(), "b")
++}
++
++{
++ print match(substr($0, 1), "b")
++}
diff --cc contrib/one-true-awk/bugs-fixed/matchop-deref.bad
index 000000000000,000000000000..343ee5c2f6cb
new file mode 100644
--- /dev/null
+++ b/contrib/one-true-awk/bugs-fixed/matchop-deref.bad
@@@ -1,0 -1,0 +1,2 @@@
++-1
++-1
diff --cc contrib/one-true-awk/bugs-fixed/matchop-deref.in
index 000000000000,000000000000..0d197e1b6a30
new file mode 100644
--- /dev/null
+++ b/contrib/one-true-awk/bugs-fixed/matchop-deref.in
@@@ -1,0 -1,0 +1,1 @@@
++aaaaaab
diff --cc contrib/one-true-awk/bugs-fixed/matchop-deref.ok
index 000000000000,000000000000..49019db80789
new file mode 100644
--- /dev/null
+++ b/contrib/one-true-awk/bugs-fixed/matchop-deref.ok
@@@ -1,0 -1,0 +1,2 @@@
++7
++7
diff --cc contrib/one-true-awk/bugs-fixed/missing-precision.ok
index 000000000000,75e1e3d00446..75e1e3d00446
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/missing-precision.ok
+++ b/contrib/one-true-awk/bugs-fixed/missing-precision.ok
diff --cc contrib/one-true-awk/bugs-fixed/negative-nf.ok
index 000000000000,de97f8b27def..de97f8b27def
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/negative-nf.ok
+++ b/contrib/one-true-awk/bugs-fixed/negative-nf.ok
diff --cc contrib/one-true-awk/bugs-fixed/pfile-overflow.ok
index 000000000000,a0de50f9007f..a0de50f9007f
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/pfile-overflow.ok
+++ b/contrib/one-true-awk/bugs-fixed/pfile-overflow.ok
diff --cc contrib/one-true-awk/bugs-fixed/rstart-rlength.awk
index 000000000000,f423f0168be3..f423f0168be3
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/rstart-rlength.awk
+++ b/contrib/one-true-awk/bugs-fixed/rstart-rlength.awk
diff --cc contrib/one-true-awk/bugs-fixed/rstart-rlength.ok
index 000000000000,961cb895b51b..961cb895b51b
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/rstart-rlength.ok
+++ b/contrib/one-true-awk/bugs-fixed/rstart-rlength.ok
diff --cc contrib/one-true-awk/bugs-fixed/system-status.awk
index 000000000000,8daf563e6f4f..8daf563e6f4f
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/system-status.awk
+++ b/contrib/one-true-awk/bugs-fixed/system-status.awk
diff --cc contrib/one-true-awk/bugs-fixed/system-status.bad
index 000000000000,a1317dba54a8..a1317dba54a8
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/system-status.bad
+++ b/contrib/one-true-awk/bugs-fixed/system-status.bad
diff --cc contrib/one-true-awk/bugs-fixed/system-status.ok
index 000000000000,737828f5ed7a..737828f5ed7a
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/system-status.ok
+++ b/contrib/one-true-awk/bugs-fixed/system-status.ok
diff --cc contrib/one-true-awk/bugs-fixed/system-status.ok2
index 000000000000,000000000000..f1f631e1cb33
new file mode 100644
--- /dev/null
+++ b/contrib/one-true-awk/bugs-fixed/system-status.ok2
@@@ -1,0 -1,0 +1,3 @@@
++normal status 42
++death by signal status 257
++death by signal with core dump status 262
diff --cc contrib/one-true-awk/bugs-fixed/unicode-fs-rs-1.awk
index 000000000000,67366ec75070..67366ec75070
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/unicode-fs-rs-1.awk
+++ b/contrib/one-true-awk/bugs-fixed/unicode-fs-rs-1.awk
diff --cc contrib/one-true-awk/bugs-fixed/unicode-fs-rs-1.in
index 000000000000,2e882af62a2c..2e882af62a2c
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/unicode-fs-rs-1.in
+++ b/contrib/one-true-awk/bugs-fixed/unicode-fs-rs-1.in
diff --cc contrib/one-true-awk/bugs-fixed/unicode-fs-rs-1.ok
index 000000000000,f337302be903..f337302be903
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/unicode-fs-rs-1.ok
+++ b/contrib/one-true-awk/bugs-fixed/unicode-fs-rs-1.ok
diff --cc contrib/one-true-awk/bugs-fixed/unicode-fs-rs-2.awk
index 000000000000,34d77bf2c95f..34d77bf2c95f
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/unicode-fs-rs-2.awk
+++ b/contrib/one-true-awk/bugs-fixed/unicode-fs-rs-2.awk
diff --cc contrib/one-true-awk/bugs-fixed/unicode-fs-rs-2.in
index 000000000000,2de6e718fd3b..2de6e718fd3b
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/unicode-fs-rs-2.in
+++ b/contrib/one-true-awk/bugs-fixed/unicode-fs-rs-2.in
diff --cc contrib/one-true-awk/bugs-fixed/unicode-fs-rs-2.ok
index 000000000000,2387001bc1b2..2387001bc1b2
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/unicode-fs-rs-2.ok
+++ b/contrib/one-true-awk/bugs-fixed/unicode-fs-rs-2.ok
diff --cc contrib/one-true-awk/bugs-fixed/unicode-null-match.awk
index 000000000000,0c056126922b..0c056126922b
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/unicode-null-match.awk
+++ b/contrib/one-true-awk/bugs-fixed/unicode-null-match.awk
diff --cc contrib/one-true-awk/bugs-fixed/unicode-null-match.bad
index 000000000000,7cd35ff2d932..7cd35ff2d932
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/unicode-null-match.bad
+++ b/contrib/one-true-awk/bugs-fixed/unicode-null-match.bad
diff --cc contrib/one-true-awk/bugs-fixed/unicode-null-match.ok
index 000000000000,1ac142f8a895..1ac142f8a895
mode 000000,100644..100644
--- a/contrib/one-true-awk/bugs-fixed/unicode-null-match.ok
+++ b/contrib/one-true-awk/bugs-fixed/unicode-null-match.ok
diff --cc contrib/one-true-awk/testdir/T.csv
index 000000000000,e0f3d708edaf..e0f3d708edaf
mode 000000,100755..100755
--- a/contrib/one-true-awk/testdir/T.csv
+++ b/contrib/one-true-awk/testdir/T.csv
diff --cc contrib/one-true-awk/testdir/T.utf
index 000000000000,18f2b9c355cf..18f2b9c355cf
mode 000000,100755..100755
--- a/contrib/one-true-awk/testdir/T.utf
+++ b/contrib/one-true-awk/testdir/T.utf
diff --cc contrib/one-true-awk/testdir/T.utfre
index 000000000000,20e66cbde9a5..20e66cbde9a5
mode 000000,100755..100755
--- a/contrib/one-true-awk/testdir/T.utfre
+++ b/contrib/one-true-awk/testdir/T.utfre