Uppercase RE matching problems in FreeBSD 11

Charles Swiger cswiger at mac.com
Mon Nov 7 21:13:48 UTC 2016


On Nov 6, 2016, at 1:49 PM, Stefan Bethke <stb at lassitu.de> wrote:
> On 06.11.2016 at 22:27, Baptiste Daroussin <bapt at FreeBSD.org> wrote:
>> That works for POSIX locale aka C aka ASCII only world
> 
> So what do I set my LANG and LC variables to?  I do want UTF-8, but I do also want my scripts to continue to work.  Clearly, en_US.UTF-8 is not what I want.  Is it C.UTF-8?  Or do I set LANG=en_US.UTF-8 and LC_COLLATE=C?

If you want to use a UTF-8 locale, then you must start using character classes like '[:upper:]' and '[:lower:]', because those will (or at least "should", modulo bugs) properly handle the collation issues, including for languages which do not possess a 1-1 mapping between upper and lower case letters.
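
As a rough sketch of what that looks like in a script: inside a bracket expression the class is written '[[:upper:]]'. The helper name upper_lines and the sample input below are made up for illustration, and the locale is passed in explicitly only so the result is predictable; in a real script you would normally let grep inherit whatever locale the user has set.

```sh
#!/bin/sh
# Hypothetical helper (the name is an assumption, not from the thread):
# print only the input lines that contain at least one uppercase letter.
# "$1" is the locale to run grep under, e.g. C or en_US.UTF-8.
upper_lines() {
    # The character class asks the locale what counts as uppercase,
    # rather than relying on where letters fall in the collation order.
    LC_ALL="$1" grep '[[:upper:]]'
}

# Sample input; under the C locale only the lines with ASCII capitals match.
result=$(printf 'abc\nAbc\nxyz\nXYZ\n' | upper_lines C)
printf '%s\n' "$result"
```

Run as written, this prints the lines Abc and XYZ. Passing a UTF-8 locale name instead of C should give the same answer for this ASCII input, assuming that locale is installed on the system.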

Someone with a German email address is presumably familiar with ß / Eszett...?  :-)

Regards,
-- 
-Chuck



More information about the freebsd-stable mailing list