[Bug 243229] awk in base system does not work with UTF-8 strings correctly
bugzilla-noreply at freebsd.org
Thu Jan 9 21:20:50 UTC 2020
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243229
Bug ID: 243229
Summary: awk in base system does not work with UTF-8 strings correctly
Product: Base System
Version: 12.1-RELEASE
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: misc
Assignee: bugs at FreeBSD.org
Reporter: sv at ulstu.ru
I tried using the length() function with UTF-8 strings, and it produces an
incorrect result. The function counts bytes rather than characters. Each
Cyrillic letter takes two bytes in UTF-8, so the reported length of the string
below is twice its number of characters.
Steps to reproduce (for LANG=ru_RU.UTF-8):
echo 'Привет' | awk '{print length($1)}'
If I use length() with gawk from lang/gawk, the length of a UTF-8 string is
calculated correctly.
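
The byte-versus-character difference described above can be illustrated
outside of awk. The following is a minimal sketch in Python, not part of the
original report; the variable names are chosen for illustration only. It
counts the same string once as Unicode characters and once as UTF-8 bytes.

```python
# Compare character count and UTF-8 byte count for the string from the report.
# Variable names here are illustrative and do not come from the bug report.
text = "Привет"

char_count = len(text)                  # number of Unicode code points
byte_count = len(text.encode("utf-8"))  # number of bytes after UTF-8 encoding

print(char_count)  # 6
print(byte_count)  # 12
```

A length() that works on bytes would be expected to report the second value
for this input, and a character-aware one the first.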
Are you planning to update awk in the base system to support UTF-8 strings in
the near future?
--
You are receiving this mail because:
You are the assignee for the bug.