[Bug 256473] FreeBSD shells are case insensitive for character ranges
Date: Tue, 08 Jun 2021 18:48:23 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256473

--- Comment #7 from Stefan Eßer <se@FreeBSD.org> ---
(In reply to Jason W. Bacon from comment #6)
> I see the pattern now, but your range expansion above is incorrect and doesn't agree with the ls output I provided.
>
> The lower case letters actually come first, which is not what I expected either. That's why the output seemed inexplicable at first.
>
> [A-Z] == [AbB..zZ] == all letters except 'a'
> [a-z] == [aAbB..z] == all letters except 'Z'
>
> [A-Z]* selects for all but those that start with 'a', not 'z'. This explains why zip is listed and aardvark is not.

It seems your collating sequence places lower case letters before upper case letters, which is in fact very common (I got that reversed).

But Unicode collation sequences are much more complex than that. For example, many languages sort by character without regard to upper/lower case, and case comes into play only if the case-ignorant comparison does not define an ordering.

E.g., in /usr/ports:

$ /bin/ls -1d [cC]*
cad
CHANGES
chinese
comms
CONTRIBUTING.md
converters
COPYRIGHT

Case is ignored whenever the case-ignorant comparison gives a result, which makes "cad" come before "CHANGES", followed by "chinese".

This shows that the order is not primarily determined by the case of the initial character ("c" vs. "C"), but by comparing the full name and using upper/lower case only as a less significant criterion.

And that makes "[C]*" behave differently from looking at the sorted list and starting at the first entry that has "C" as its initial letter.

Anyway, this is all specified by the Unicode Collation Algorithm (UCA). Each locale definition specifies parameters of that algorithm, and the order you observe complies with that specification (you did not specify your locale, e.g. the LANG value that is in effect).
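The difference between the two orderings can be reproduced without any UTF-8 locale installed. The following is only a rough sketch: it uses "sort -f" in the C locale as a stand-in for the case-ignorant first pass and is not the real UCA. The variable names are made up for the illustration; the file names are the ones from the /usr/ports listing above.

```shell
# Names taken from the /usr/ports listing quoted above.
names='cad
CHANGES
chinese
comms
CONTRIBUTING.md
converters
COPYRIGHT'

# Plain byte order (C locale): every upper case name sorts before
# any lower case name.
bytewise=$(printf '%s\n' "$names" | LC_ALL=C sort | paste -s -d ' ' -)

# Case folded for the comparison; this only approximates the
# multi-level comparison described above, it is not the UCA itself.
folded=$(printf '%s\n' "$names" | LC_ALL=C sort -f | paste -s -d ' ' -)

echo "bytewise: $bytewise"
echo "folded:   $folded"
```

The "folded" line reproduces the order of the ls output above, while the "bytewise" line groups CHANGES, CONTRIBUTING.md and COPYRIGHT ahead of all lower case names.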

There is nothing wrong with the FreeBSD shells, but if the default does not work for you, you may have to set an environment variable (LC_COLLATE) to the specific value that results in the sort order you expect.
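
As a hedged illustration of the LC_COLLATE suggestion: the value "C" used below is an assumption (it selects plain byte order, which is probably what the reporter expected), and the names Apple and Zebra are invented for the example; aardvark and zip are the names from the report.

```shell
# Scratch directory so the globs only see the demo files.
dir=$(mktemp -d "${TMPDIR:-/tmp}/globdemo.XXXXXX")
cd "$dir" || exit 1
touch aardvark zip Apple Zebra

# LC_ALL overrides LC_COLLATE, so clear it first.
unset LC_ALL
# Assumption: byte order ("C") is the collation that is wanted.
LC_COLLATE=C
export LC_COLLATE

# With byte-order collation, [A-Z] covers only upper case letters
# and [a-z] only lower case letters.
upper=$(echo [A-Z]*)
lower=$(echo [a-z]*)
echo "upper: $upper"
echo "lower: $lower"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

To make the setting permanent it would go into the shell startup file or the login class; which one applies depends on the shell and local setup.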