[Bug 257972] collating sequence not sensible in some locales


From: <bugzilla-noreply_at_freebsd.org>
Date: Fri, 20 Aug 2021 14:13:54 UTC

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257972

            Bug ID: 257972
           Summary: collating sequence not sensible in some locales
           Product: Base System
           Version: 13.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: standards
          Assignee: standards@FreeBSD.org
          Reporter: freebsd@oldach.net

As discussed in
https://lists.freebsd.org/archives/freebsd-stable/2021-August/000193.html

> > # uname -a
> > FreeBSD 13STABLE 13.0-STABLE FreeBSD 13.0-STABLE #49 stable/13-n246779-64085efb677-dirty: Mon Aug 16 08:42:53 CEST 2021     root@XXX amd64
> > # export LANG=en_US.ISO8859-1
> > # (echo bla; echo Bla) | grep '[A-Z]'
> > bla
> > Bla
> 
> This one is unexpected, the upper case should be a range of its own
> and should not include any lower case letters.

> > # export LANG=en_US.UTF-8
> > # (echo bla; echo Bla) | grep '[A-Z]'
> > Bla
> 
> Here I had expected the result you got with en_US.ISO8859-1 ...

> > For comparison, a Linux RHEL box delivers the expected results:
> >
> > # uname -a
> > Linux rhel.local 3.10.0-1062.9.1.el7.x86_64 #1 SMP Mon Dec 2 08:31:54 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
> > # export LANG=en_US.ISO8859-1
> > # (echo bla; echo Bla) | grep '[A-Z]'
> > Bla
> > # export LANG=en_US.UTF-8
> > # (echo bla; echo Bla) | grep '[A-Z]'
> > Bla
>
> Seems that this version uses a POSIX style collating sequence for UTF-8.

> Definitely a bug in the definition of the collating sequences.
>
> And I have just verified that de_DE.ISO8859-1 wrongly considers "ö"
> to be within [a-z], while de_DE.UTF-8 does not (but should).
>
> Seems that the correct collating sequences for ISO8859-1 and UTF-8 are
> each assigned to the other one.

Can some knowledgeable person please validate?
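
For anyone reproducing this, the checks above can be wrapped in a small
helper. This is only a sketch: match_in_locale is a hypothetical name, not
something from the original thread, and it assumes a POSIX sh and grep. It
runs the same two-line input through grep under a chosen locale, first as a
baseline in the C locale (where ranges follow byte order), then with the
[[:upper:]] character class, which is defined by LC_CTYPE rather than by
the collating sequence.

```shell
# Hypothetical helper (not part of the original report): print the lines
# that grep selects for a bracket expression under a given locale.
match_in_locale() {
    # $1 = locale name, $2 = bracket expression
    printf 'bla\nBla\n' | LC_ALL="$1" grep "$2"
}

# Baseline: in the C locale the range [A-Z] follows byte order,
# so it covers the upper case letters only.
match_in_locale C '[A-Z]'

# Character class instead of a range: does not depend on collation order.
match_in_locale C '[[:upper:]]'
```

Calling match_in_locale with en_US.ISO8859-1, en_US.UTF-8 or one of the
de_DE locales in place of C repeats the comparison from the thread; what
those calls print depends on the locale definitions installed on the system.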

-- 
You are receiving this mail because:
You are the assignee for the bug.