[Bug 266001] uniq says it's affected by LC_COLLATE, must not be according to POSIX
Date: Mon, 24 Oct 2022 21:31:46 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266001
наб <nabijaczleweli@nabijaczleweli.xyz> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|Works As Intended           |---
             Status|Closed                      |Open
--- Comment #2 from наб <nabijaczleweli@nabijaczleweli.xyz> ---
POSIX.1 Issue 8 Draft 2.1 says (ll. 111035-111036):
    Issue 8
    Austin Group Defect 1070 is applied, changing the APPLICATION USAGE
    section.
I.e.: https://www.austingroupbugs.net/view.php?id=1070
Please grep down to "On Page: 3310 Line: 111099 Section: uniq" there.
Notably, this updates APPLICATION USAGE, which is non-normative, to align with
the normative description. The section now reads:
    The sort utility can be used to cause repeated lines to be adjacent in the
    input file.
    If the collating sequence of the current locale does not have a total
    ordering of all characters, the behavior of sort | uniq differs from
    sort -u, as uniq treats lines as duplicates only if they are
    identical, whereas sort -u treats lines as duplicates if they collate
    equally.
The actual normative text is the same as it is in
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/uniq.html
That text makes /no mention/ of "collation" or "equivalence", and LC_COLLATE
is /not/ listed in the ENVIRONMENT VARIABLES: lines are simply "compared",
i.e. as with strcmp().
This is the precise difference between sort -u and sort | uniq.
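
To illustrate the distinction, here is a minimal Python sketch; it is not taken
from FreeBSD's uniq or sort, nor from any real implementation. The function
names are made up for this example, and toy_key (plain case folding) is only a
hypothetical stand-in for a locale whose collating sequence lets distinct
strings collate equally.

```python
from itertools import groupby


def uniq_identical(lines):
    """Collapse adjacent lines only when they are identical,
    which is how the POSIX text quoted above describes uniq."""
    return [line for line, _ in groupby(lines)]


def sort_then_uniq(lines, collation_key):
    """Emulate `sort | uniq`: order by collation, compare by identity."""
    return uniq_identical(sorted(lines, key=collation_key))


def sort_unique(lines, collation_key):
    """Emulate `sort -u`: lines that collate equally count as duplicates."""
    ordered = sorted(lines, key=collation_key)
    return [next(group) for _, group in groupby(ordered, key=collation_key)]


# ASSUMPTION: case folding stands in for a collation without a total
# ordering, i.e. one where "A" and "a" collate equally but are not identical.
toy_key = str.casefold

lines = ["b", "A", "a", "B", "a"]

print(sort_then_uniq(lines, toy_key))  # ['A', 'a', 'b', 'B']
print(sort_unique(lines, toy_key))     # ['A', 'b']
```

Under this toy collation, the identity-based comparison keeps "A" and "a" as
separate lines, while the collation-based one merges them.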