[Bug 241441] inconsistency between allowed empty regex for `awk -F` and split()
bugzilla-noreply at freebsd.org
Wed Oct 23 19:21:00 UTC 2019
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241441
Bug ID: 241441
Summary: inconsistency between allowed empty regex for `awk -F` and split()
Product: Base System
Version: 12.0-STABLE
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: bin
Assignee: bugs at FreeBSD.org
Reporter: freebsd at tim.thechases.com
I get an error when I try to use an empty regex for the field separator:
$ echo hello | awk -F '' '{print $2}'
awk: field separator FS is empty
but the same awk has no trouble splitting a string on an empty regex via split():
$ awk 'BEGIN{s="hello"; split(s, a, ""); print a[1]}'
h
With gawk, I get the expected behavior:
$ echo hello | awk -F '' '{print $1}'
h
This is somewhat similar to bug #226112:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226112
I get that awk uses EREs and `man re_format` says that "A (modern [Extended])
RE is one or more non-empty branches, separated by '|'", but
1) that's not what split() does
2) it's not what gawk's -F parameter does
3) permitting an empty regex for splitting already seems supported in awk code
(as the split example shows) and shouldn't break any existing usage
4) the obvious workaround does not work either: `man re_format` says that the
atom "()" matches the null string, but
$ echo hello | awk -F '()' '{print $1}'
doesn't split the record on the null regular expression (FWIW, gawk gives the
same result when using "()" as the split pattern).
Ideally, `awk -F ''` would behave like gawk and like the split() function,
splitting the record into its individual characters.
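Until then, here is a minimal sketch of one possible workaround. It does not
come from the awk sources and does not touch FS at all; it assumes only the
POSIX substr() and length() functions. The helper name split_chars is made up
for illustration.

```shell
# Hypothetical helper: print every character of each input line, separated
# by single spaces, without relying on an empty FS or an empty split() regex.
split_chars() {
    awk '{
        n = length($0)
        out = ""
        # Walk the record one character at a time using substr().
        for (i = 1; i <= n; i++)
            out = out (i > 1 ? " " : "") substr($0, i, 1)
        print out
    }'
}

echo hello | split_chars
```

Since the output is space separated, it can be piped into a second awk that
uses the default field splitting, so `$2` there would be the second character.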