[Bug 172862] sed(1) improperly deals with escape chars

bugzilla-noreply at freebsd.org
Mon Dec 4 05:41:38 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=172862

Yuri Pankov <yuripv at gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |yuripv at gmx.com

--- Comment #3 from Yuri Pankov <yuripv at gmx.com> ---
Despite what comment #1 says, sed does NOT treat "\t" as a tab character
outside [] either. Quoting
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html:

9.3.2 BRE Ordinary Characters

The interpretation of an ordinary character preceded by an unescaped
<backslash> ('\\') is undefined.

9.4.2 ERE Ordinary Characters

The interpretation of an ordinary character preceded by an unescaped
<backslash> ( '\\' ) is undefined.

One exception is important here: <backslash> loses its special meaning inside
a bracket expression, so in your examples the bracket expression "[\t ]"
correctly matches a literal backslash, the 't' character, and a space.
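
The bracket-expression rule can be checked directly against regex(3), which
is what our sed uses. The following is only a minimal sketch: the helper name
bre_matches() is made up for illustration, and it only exercises the
bracket case, since the behaviour of "\t" outside brackets is undefined by
the standard and may differ between implementations.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of sed or libc): returns 1 if the POSIX
 * BRE `pattern` matches anywhere in `text`, 0 if it does not, and -1 if
 * the pattern fails to compile.
 */
int
bre_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && text != NULL);

	/* cflags without REG_EXTENDED selects BRE, as in plain s/BRE/.../ */
	if (regcomp(&re, pattern, REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return (rc == 0 ? 1 : 0);
}
```

With this helper, bre_matches("[\\t ]", "t") and bre_matches("[\\t ]", " ")
should both report a match, while bre_matches("[\\t ]", "\t") (a real tab in
the text) should not, because the set contains backslash, 't' and space
only. Putting a literal tab in the pattern, as in bre_matches("[\t ]", "\t"),
does match.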

Given the above, GNU sed actually violates the standard, which defines the 's'
command as follows:

[2addr]s/BRE/replacement/flags

...so all BRE (ERE with -E) rules apply here.

I would agree that *outside* of a bracket expression we could make "\t" match
the tab character (apparently that is what libtre does), as it is more readable
than inserting literal tab characters in the RE. Inside a bracket expression,
however, GNU sed is clearly wrong, and our sed (through regex(3)) is doing the
right thing.

(I just hope I'm understanding everything correctly here, of course)
