UTF-8 by default?
Don Lewis
truckman at FreeBSD.org
Wed Jul 20 20:23:56 UTC 2016
On 20 Jul, Tim Čas wrote:
> On 20 July 2016 at 20:33, Don Lewis <truckman at freebsd.org> wrote:
>> wc(1) has problems with its multibyte support pointed out by Coverity
>> as I recall.
>
> Not sure how critical that issue is (e.g. byte counts [`-c`], line
> counts [`-l`], and such should still work as intended; whether word
> counts work or not depends on whether we should count Unicode
> whitespace as, well, whitespace). I do wonder if everyone agrees that
> an effort should be made towards UTF-8 default, though?
wc(1) passes a fixed-length, non-NUL-terminated buffer (filled by read(2))
to mbrtowc(). In addition to the lack of termination, the buffer could
also contain a partial character at its beginning or end if the contents
are UTF-8.
The Coverity ID is 978825.
More information about the freebsd-current
mailing list