Uppercase RE matching problems in FreeBSD 11
Charles Swiger
cswiger at mac.com
Mon Nov 7 21:13:48 UTC 2016
On Nov 6, 2016, at 1:49 PM, Stefan Bethke <stb at lassitu.de> wrote:
> On 06.11.2016 at 22:27, Baptiste Daroussin <bapt at FreeBSD.org> wrote:
>> That works for POSIX locale aka C aka ASCII only world
>
> So what do I set my LANG and LC variables to? I do want UTF-8, but I do also want my scripts to continue to work. Clearly, en_US.UTF-8 is not what I want. Is it C.UTF-8? Or do I set LANG=en_US.UTF-8 and LC_COLLATE=C?
If you want to use a UTF-8 locale, then you must start using character classes like '[:upper:]' and '[:lower:]' (written inside a bracket expression, e.g. '[[:upper:]]') rather than ranges like '[A-Z]'. Those will (or at least should, modulo bugs) properly handle the collation issues, including for languages which do not have a 1-1 mapping between uppercase and lowercase letters.
Someone with a German email address is presumably familiar with ß / Eszett...? :-)
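As a rough illustration of the character-class approach, here is a minimal sketch. The helper name upper_lines is made up for this example, and it assumes a grep that supports -E and POSIX bracket expressions. It prints only the input lines that consist entirely of uppercase letters, as classified by whichever locale is passed in:

```sh
#!/bin/sh
# Hypothetical helper (not from the original thread): print the input lines
# made up solely of uppercase letters, as classified by the locale named in
# the first argument. [[:upper:]] asks the locale which characters are
# uppercase instead of relying on the collation order of a range like [A-Z].
upper_lines() {
    LC_ALL="$1" grep -E '^[[:upper:]]+$'
}

# In the C locale only the all-uppercase ASCII line is printed.
printf 'FOO\nfoo\nBar\n' | upper_lines C
```

Passing a UTF-8 locale name that is installed on the system (en_US.UTF-8, for instance) should let the same pattern classify non-ASCII uppercase letters as well, without the script having to enumerate them.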
Regards,
--
-Chuck