[Bug 289370] wcsxfrm() fails with EINVAL for some characters
- In reply to: bugzilla-noreply_a_freebsd.org: "[Bug 289370] wcsxfrm() fails with EINVAL for some characters"
Date: Mon, 08 Sep 2025 20:32:53 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=289370
--- Comment #6 from Mark Millard <marklmi26-fbsd@yahoo.com> ---
(In reply to Serhiy Storchaka from comment #3)
UTF-8 has:
Code point to/from UTF-8 conversion
First code point  Last code point  Byte 1    Byte 2    Byte 3    Byte 4
U+0000            U+007F           0yyyzzzz
U+0080            U+07FF           110xxxyy  10yyzzzz
U+0800            U+FFFF           1110wwww  10xxxxyy  10yyzzzz
U+010000          U+10FFFF         11110uvv  10vvwwww  10xxxxyy  10yyzzzz
L'\u00C5' (a.k.a. U+00C5) is in the range:
U+0080 U+07FF
That range uses 2 bytes for the UTF-8 encoding, not one:
110xxxyy 10yyzzzz
As far as I can tell: U+00C5 is not an example of:
single-byte LC_CTYPE
It looks like the BUGS section that I referenced does apply.
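
For anyone who wants to check the byte count by hand, here is a minimal sketch of the table above as C code. The function name utf8_encode is made up for this illustration (it is not a libc interface), and the sketch only covers the encoding arithmetic; it says nothing about how wcsxfrm() itself is implemented.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libc): encode one Unicode code point
 * as UTF-8, following the four ranges in the table above.
 * Returns the number of bytes written (1 to 4), or 0 if cp is a
 * surrogate (U+D800..U+DFFF) or lies above U+10FFFF.
 */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp <= 0x7F) {                 /* U+0000..U+007F: 0yyyzzzz */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp <= 0x7FF) {                /* U+0080..U+07FF: 110xxxyy 10yyzzzz */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF) {               /* U+0800..U+FFFF: three bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                 /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {             /* U+010000..U+10FFFF: four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                         /* beyond the Unicode range */
}
```

Calling utf8_encode(0x00C5, buf) returns 2 and leaves 0xC3 0x85 in buf, which matches the 110xxxyy 10yyzzzz pattern for the U+0080..U+07FF range.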