git: e7a04a110724 - main - awk: Merge upstream manpage updates
Date: Thu, 04 Sep 2025 06:00:00 UTC
The branch main has been updated by imp:
URL: https://cgit.FreeBSD.org/src/commit/?id=e7a04a110724183c72e25c5c8461f89f50b4d08a
commit e7a04a110724183c72e25c5c8461f89f50b4d08a
Author:     Warner Losh <imp@FreeBSD.org>
AuthorDate: 2025-09-04 05:44:33 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2025-09-04 05:59:48 +0000
    awk: Merge upstream manpage updates
    
    Merge the upstream manpage updates into awk.1. This goes through upstream
    hash 9acc510. Upstream man page is written in raw nroff with "an"
    macros, rather than in mandoc, so convert to mandoc as well. The man
    page isn't updated on imports automatically, plus our man page has
    diverged somewhat from upstream's so it's not a mechanical change...
    
    PR: 230730
    
    Sponsored by:           Netflix
---
 usr.bin/awk/awk.1 | 136 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 127 insertions(+), 9 deletions(-)
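
A few of the behaviors this man page update documents can be checked from
the shell. What follows is a minimal sketch, assuming a POSIX-style awk is
on PATH; the sample input and the shell variable names (range_out,
begin_out, para_out) are invented for illustration and are not part of the
commit.

```shell
#!/bin/sh
# Sketch only: exercises three behaviors described in the updated awk.1 text.
# Input data and variable names are hypothetical.

# 1. A range pattern "p1, p2" selects lines from a match of p1 through a
#    match of p2, inclusive of both end lines.
range_out=$(printf 'a\nstart\nb\nstop\nc\n' | awk '/start/, /stop/')

# 2. Multiple BEGIN blocks are allowed and run in the order they are read.
begin_out=$(awk 'BEGIN { printf "first " } BEGIN { print "second" }' </dev/null)

# 3. With an empty RS, blank lines separate records, so this input
#    ("one\ntwo", blank line, "three") is counted as two records.
para_out=$(printf 'one\ntwo\n\nthree\n' | awk 'BEGIN { RS = "" } END { print NR }')

printf '%s\n' "$range_out" "$begin_out" "$para_out"
```

The newer additions (--csv, systime(), strftime(), and the system() return
values) are left out of the sketch on purpose, since other awk
implementations may not provide them or may behave differently.
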
diff --git a/usr.bin/awk/awk.1 b/usr.bin/awk/awk.1
index 65c91738966b..612669629a02 100644
--- a/usr.bin/awk/awk.1
+++ b/usr.bin/awk/awk.1
@@ -21,7 +21,7 @@
 .\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
 .\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
 .\" THIS SOFTWARE.
-.Dd July 30, 2021
+.Dd September 3, 2025
 .Dt AWK 1
 .Os
 .Sh NAME
@@ -32,7 +32,7 @@
 .Op Fl safe
 .Op Fl version
 .Op Fl d Ns Op Ar n
-.Op Fl F Ar fs
+.Op Fl F Ar fs | Fl -csv
 .Op Fl v Ar var Ns = Ns Ar value
 .Op Ar prog | Fl f Ar progfile
 .Ar
@@ -42,9 +42,11 @@ scans each input
 .Ar file
 for lines that match any of a set of patterns specified literally in
 .Ar prog
-or in one or more files specified as
+or in one or more files
+specified as
 .Fl f Ar progfile .
-With each pattern there can be an associated action that will be performed
+With each pattern
+there can be an associated action that will be performed
 when a line of a
 .Ar file
 matches the pattern.
@@ -76,6 +78,11 @@ to dump core on fatal errors.
 .It Fl F Ar fs
 Define the input field separator to be the regular expression
 .Ar fs .
+.It Fl -csv
+Causes
+.Nm
+to process records using (more or less) standard comma-separated values
+(CSV) format.
 .It Fl f Ar progfile
 Read program code from the specified file
 .Ar progfile
@@ -178,7 +185,7 @@ as the field separator, use the
 option with a value of
 .Sq [t] .
 .Pp
-A pattern-action statement has the form
+A pattern-action statement has the form:
 .Pp
 .D1 Ar pattern Ic \&{ Ar action Ic \&}
 .Pp
@@ -347,7 +354,7 @@ in a pattern.
 A pattern may consist of two patterns separated by a comma;
 in this case, the action is performed for all lines
 from an occurrence of the first pattern
-through an occurrence of the second.
+through an occurrence of the second, inclusive.
 .Pp
 A relational expression is one of the following:
 .Pp
@@ -363,7 +370,8 @@ A relational expression is one of the following:
 .Pp
 where a
 .Ar relop
-is any of the six relational operators in C, and a
+is any of the six relational operators in C,
+and a
 .Ar matchop
 is either
 .Ic ~
@@ -386,6 +394,9 @@ and after the last.
 and
 .Ic END
 do not combine with other patterns.
+They may appear multiple times in a program and execute
+in the order they are read by
+.Nm .
 .Pp
 Variable names with special meanings:
 .Pp
@@ -428,6 +439,11 @@ The length of the string matched by the
 function.
 .It Va RS
 Input record separator (default newline).
+If empty, blank lines separate records.
+If more than one character long,
+.Va RS
+is treated as a regular expression, and records are
+separated by text matching the expression.
 .It Va RSTART
 The starting position of the string matched by the
 .Fn match
@@ -515,7 +531,8 @@ occurs, or 0 if it does not.
 The length of
 .Fa s
 taken as a string,
-or of
+number of elements in an array for an array argument,
+or length of
 .Va $0
 if no argument is given.
 .It Fn match s r
@@ -696,10 +713,44 @@ records from
 .Ar file
 remains open until explicitly closed with a call to
 .Fn close .
+.It Fn systime
+Returns the current date and time as a standard
+.Dq seconds since the epoch
+value.
+.It Fn strftime fmt timestamp
+Formats
+.Fa timestamp
+(a value in seconds since the epoch)
+according to
+.Fa fmt ,
+which is a format string as supported by
+.Xr strftime 3 .
+Both
+.Fa timestamp
+and
+.Fa fmt
+may be omitted; if no
+.Fa timestamp ,
+the current time of day is used, and if no
+.Fa fmt ,
+a default format of
+.Dq %a %b %e %H:%M:%S %Z %Y
+is used.
 .It Fn system cmd
 Executes
 .Fa cmd
 and returns its exit status.
+This will be -1 upon error,
+.Fa cmd 's
+exit status upon a normal exit,
+256 +
+.Va sig
+upon death-by-signal, where
+.Va sig
+is the number of the murdering signal,
+or 512 +
+.Va sig
+if there was a core dump.
 .El
 .Ss Bit-Operation Functions
 .Bl -tag -width "lshift(a, b)"
@@ -725,6 +776,16 @@ Returns integer argument x shifted by n bits to the right.
 But note that the
 .Ic exit
 expression can modify the exit status.
+.Sh ENVIRONMENT VARIABLES
+If
+.Va POSIXLY_CORRECT
+is set in the environment, then
+.Nm
+follows the POSIX rules for
+.Fn sub
+and
+.Fn gsub
+with respect to consecutive backslashes and ampersands.
 .Sh EXAMPLES
 Print lines longer than 72 characters:
 .Pp
@@ -734,7 +795,7 @@ Print first two fields in opposite order:
 .Pp
 .Dl { print $2, $1 }
 .Pp
-Same, with input fields separated by comma and/or blanks and tabs:
+Same, with input fields separated by comma and/or spaces and tabs:
 .Bd -literal -offset indent
 BEGIN { FS = ",[ \et]*|[ \et]+" }
       { print $2, $1 }
@@ -810,6 +871,63 @@ to it.
 .Pp
 The scope rules for variables in functions are a botch;
 the syntax is worse.
+.Pp
+Input is expected to be UTF-8 encoded.
+Other multibyte character sets are not handled.
+However, in eight-bit locales,
+.Nm
+treats each input byte as a separate character.
+.Sh UNUSUAL FLOATING-POINT VALUES
+.Nm
+was designed before IEEE 754 arithmetic defined Not-A-Number (NaN)
+and Infinity values, which are supported by all modern floating-point
+hardware.
+.Pp
+Because
+.Nm
+uses
+.Xr strtod 3
+and
+.Xr atof 3
+to convert string values to double-precision floating-point values,
+modern C libraries also convert strings starting with
+.Va inf
+and
+.Va nan
+into infinity and NaN values respectively.
+This led to strange results,
+with something like this:
+.Bd -literal -offset indent
+echo nancy | awk '{ print $1 + 0 }'
+.Ed
+.Pp
+printing
+.Dq nan
+instead of zero.
+.Pp
+.Nm
+now follows GNU AWK, and prefilters string values before attempting
+to convert them to numbers, as follows:
+.Bl -tag -width "Hexadecimal values"
+.It Hexadecimal values
+Hexadecimal values (allowed since C99) convert to zero, as they did
+prior to C99.
+.It NaN values
+The two strings
+.Dq +nan
+and
+.Dq -nan
+(case independent) convert to NaN.
+No others do.
+(NaNs can have signs.)
+.It Infinity values
+The two strings
+.Dq +inf
+and
+.Dq -inf
+(case independent) convert to positive and negative infinity, respectively.
+No others do.
+.El
 .Sh DEPRECATED BEHAVIOR
 One True Awk has accepted
 .Fl F Ar t