tr(1) buggy with de_DE.ISO8859-1(5) locale?
Martin Krzysiak
cinek at gmx.de
Mon Feb 6 17:53:52 PST 2006
Oliver Fromme wrote:
> It's not a bug. It's perfectly POSIX-compatible.
I think this behavior is "undefined" in POSIX, according
to some documents I found. That is a difference.
> To convert lower case to upper case, use the command
> "tr '[:lower:]' '[:upper:]'" (or enumerate all letters
> explicitly, like "tr abcdef ABCDEF"). Scripts that
> use things like "tr a-z A-Z" are broken and need to be
> fixed.
It's not only upper-lowercase conversion that is weird.
Try "echo wxyz | tr w-z a-d". Ranges are broken generally
in ISO-locales, in my opinion.
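
A minimal sketch for comparing the behavior, assuming a POSIX-style
shell. The helper name show_range is made up for illustration. Only
the C locale result is predictable; what another locale prints depends
on the tr implementation and on that locale's collation order.

```shell
# Hypothetical helper: run the range example from above under a given locale.
# $1 is a locale name such as C or de_DE.ISO8859-1.
show_range() {
    printf 'wxyz\n' | LC_ALL="$1" tr w-z a-d
}

# In the C locale the range w-z maps onto a-d one to one.
show_range C
```

Calling it again with de_DE.ISO8859-1 (if that locale is installed)
shows whether the local tr orders the range differently.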
> By the way: Do not set LANG or LC_ALL, especially for
> the root user, and especially when compiling things.
One thing I like about FreeBSD is that I have my German
environment. But you are right. The only locale that is
expected to work correctly is "C".
> Not only will tr behave in unexpected ways when used
> like above, but also other things might break. For
> example, German month names appear in "ls -l", which
> will break scripts that try to parse them.
Don't tell me about localization problems. I've seen
lots of stupid things. The latest one was a localized
"Date:" header produced by a commercial application.
> Some tools
> use decimal commas instead of decimal points, which
> can lead to further confusion, etc. Yes, scripts
> which try to do that are broken, but they do exist.
Yes. You are right.
How many times did you use tr(1) to convert your texts
to upper/lower case? Do you expect that it works correctly?
I would prefer to use it like: "tr a-zäöü A-ZÄÖÜ",
_if_ I ever need to do it.
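
For comparison, here is a sketch of the two range-free forms suggested
in the quoted text: spelling out every letter, and using the character
classes. The function names are hypothetical, and the input is kept to
ASCII because the handling of umlauts depends on the locale and the
encoding in use.

```shell
# Hypothetical helper: upper-case ASCII letters by listing them explicitly,
# so no range expression (and no collation order) is involved.
to_upper_ascii() {
    tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
}

# Hypothetical helper: upper-case using character classes, which follow
# the LC_CTYPE setting of the current locale.
to_upper_class() {
    tr '[:lower:]' '[:upper:]'
}

printf 'hello world\n' | to_upper_ascii
printf 'hello world\n' | to_upper_class
```

The first form only touches the 26 listed letters. The second one is
the form that can pick up locale-specific letters when LC_CTYPE is set.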
> If you only need support for German umlauts, then only
> set LC_CTYPE. That shouldn't break anything.
I really appreciate that FreeBSD supports
German locales.
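
The LC_CTYPE-only suggestion quoted above might look like this in a
Bourne-style shell profile. This is only a sketch: the choice of
LANG=C as the fallback is an assumption, not something stated in the
thread.

```shell
# Make sure LC_ALL is not set, since it would override everything below.
unset LC_ALL

# Assumed fallback: keep every other category (collation, dates, numbers) in C.
LANG=C

# Only character classification and case mapping use the German locale.
LC_CTYPE=de_DE.ISO8859-1

export LANG LC_CTYPE
```

LC_COLLATE is left unset here, so it falls back to LANG, which keeps
range expressions on the C collation order.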
Let's stop arguing. I just wanted to ask about the behavior.
Now I know that something might be fishy with tr(1) and I
understand how to avoid this problem. That's all I need to
know.
For people who are interested in a simple workaround:
don't use de_DE.ISO8859-1(5). Use de_DE.UTF-8 instead;
tr(1)'s ranges work as expected there.
Martin