standards/54410 (awk command)

Jens Schweikhardt schweikh at schweikhardt.net
Mon Oct 11 12:09:16 PDT 2004


Kamal,

On Mon, Oct 11, 2004 at 04:51:07PM +0530, Kamal R. Prasad wrote:
# --------------------------------------------------
# 
# *How-To-Repeat*
# 
# 	echo e | /usr/bin/awk '/e{1}/'          # should print e, but prints nothing
# 
#    
# 
# *Fix*
#    It's probably a POLA violation to change the default RE style from
#    BRE to ERE, but we should add a POSIX mode that uses BRE (e.g.
#    gawk needs --posix to be compliant).
# --------------------------------------------------
# 
# I can fix this, but that would change the traditional behaviour. Your 
# idea of adding a --posix flag may not be appropriate, because POSIX 
# requires the specified behaviour as the default behaviour.
# That is, a POSIX-compliant awk script would break, because it expects
# the code to be fully portable across Unixes. Let me know how it goes.

I just looked at our awk(1) man page, and it explicitly says that

   Regular expressions are as in egrep; see grep(1).

And in fact, patterns like /(a|b)/ do work as expected. It appears only
the quantifiers {n}, {n,}, {,m} and {n,m} are not implemented. This is
where I noticed the POSIX deviation.
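
For anyone who wants to check their own awk, here is a minimal probe,
offered as a sketch rather than a definitive test. The AWK variable and
the probe_intervals name are my own inventions for illustration. The
probe assumes only that an awk without interval support will fail to
match "ee" against /^e{2}$/, while alternation works either way.

```sh
#!/bin/sh
# AWK is a hypothetical override knob; it defaults to whatever "awk" is in PATH.
AWK=${AWK:-awk}

# Report whether $AWK treats {n} as an ERE interval expression.
probe_intervals() {
    # "ee" matches /^e{2}$/ only if {2} is read as a repetition count.
    if [ "$(echo ee | "$AWK" '/^e{2}$/' 2>/dev/null)" = "ee" ]; then
        echo "intervals supported"
    else
        echo "intervals not supported"
    fi
}

# Grouping and alternation are the ERE features reported as working.
alternation=$(printf 'a\nb\nc\n' | "$AWK" '/^(a|b)$/')

probe_intervals
echo "alternation matched: $(echo $alternation)"
```

Running it as AWK=/usr/bin/awk sh probe.sh (probe.sh being whatever
name the script is saved under) checks one specific binary.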

I want to ask a wider audience how to fix this, thus cc to standards at .
Some of the options:

1. Do nothing and document the missing {} quantifiers in awk(1)'s BUGS.
2. Use some environment variable (POSIXLY_CORRECT?) to decide whether {}
   is handled like a proper ERE, and remain bug-compatible with the old
   behavior if it is not set.
3. Add {} unconditionally at the risk of breaking awk scripts and point
   users to awk(1) where it says this should always have been like this.
   Place a prominent note in UPGRADING. Hah, as if anyone reads that :-)
4. Your opinion here.
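
To make option 2 concrete, the decision could look roughly like the
sketch below. It is written in sh purely for illustration; the real
check would sit inside awk's regex code, and interval_mode is a
hypothetical name. It assumes the mere presence of POSIXLY_CORRECT in
the environment is what counts, not its value, and that the old
behavior amounts to treating braces as literal characters.

```sh
#!/bin/sh
# Hypothetical helper: which brace handling would option 2 select?
interval_mode() {
    # ${VAR+set} expands to "set" when VAR exists, even if it is empty.
    if [ -n "${POSIXLY_CORRECT+set}" ]; then
        echo "ere"            # treat {n,m} as an interval expression
    else
        echo "traditional"    # assumed old behavior: braces stay literal
    fi
}

interval_mode
```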

Regards,

	Jens
-- 
Jens Schweikhardt http://www.schweikhardt.net/
SIGSIG -- signature too long (core dumped)

