strange tr behaviour
Jon Noack
noackjr at alumni.rice.edu
Fri Mar 26 02:44:17 PST 2004
On 3/26/2004 4:09 AM, Michael Reifenberger wrote:
> On Fri, 26 Mar 2004, Jon Noack wrote:
>> Short version:
>> tr(1) was modified to be POSIX compliant for 5.x. You are seeing
>> correct behavior. See the solution below.
>
> Thanks all for the hints.
>
> Only that tr(1) states:
> ...
> COMPATIBILITY
> System V has historically implemented character ranges using the syntax
> ``[c-c]'' instead of the ``c-c'' used by historic BSD implementations and
> standardized by POSIX. System V shell scripts should work under this
> implementation as long as the range is intended to map in another range,
> i.e. the command ``tr [a-z] [A-Z]'' will work as it will map the ``[''
> character in string1 to the ``['' character in string2. However, if the
> ...
>
> So I just expected the historic behaviour so that [a-z] map to [A-Z]
> as before :-(
From tr(1):
c-c     For non-octal range endpoints represents the range of characters
        between the range endpoints, inclusive, in ascending order, as
        defined by the collation sequence.
It's translating _ranges_. To help understand this:
$ echo abcdef | tr a-z A-D
ABCDDD
The first range (a-z) is larger than the second (A-D), so it does a
one-to-one mapping until it hits the end of the second range. From that
point on, every remaining character in the first range maps to the final
character of the second range.
In your locale, the range a-z is smaller than the range A-Z. Thus, the
one-to-one mappings won't result in proper case conversion.
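A minimal sketch of two ways to get case conversion that does not depend
on how the locale orders a-z and A-Z. The function names to_upper and
to_upper_c are made up for illustration and do not come from this thread.
The first uses the POSIX character classes, which pair each lowercase
letter with its uppercase form; the second assumes it is acceptable to
force the C locale, where both ranges are the 26 ASCII letters.

```shell
# Hypothetical helper names, used only for this illustration.

# Character classes map each lowercase letter to its uppercase
# counterpart, independent of the locale's collation order.
to_upper() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# Alternative: run tr in the C locale, where a-z and A-Z are both
# contiguous 26-character ASCII ranges and map one-to-one.
to_upper_c() {
    printf '%s\n' "$1" | LC_ALL=C tr 'a-z' 'A-Z'
}

to_upper abcdef
to_upper_c abcdef
```

The character class form is probably the safer choice in scripts, since
it keeps working for non-ASCII letters when the locale defines them.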
Perhaps tr(1) should be updated to say something about this.
Jon