[Bug 232374] /bin/sh can not handle ja_JP.eucJP character code
bugzilla-noreply at freebsd.org
Thu Nov 8 13:08:44 UTC 2018
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=232374
Yuichiro NAITO <naito.yuichiro at gmail.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |naito.yuichiro at gmail.com
--- Comment #2 from Yuichiro NAITO <naito.yuichiro at gmail.com> ---
My investigation shows that the main cause of this problem is that the
read_char() function doesn't retry read(2) on STDIN when mbrtowc(3)
returns (size_t)-2, which indicates an incomplete multibyte sequence.
In lib/libedit/read.c, we can see the following code, which retries only
when the CHARSET_IS_UTF8 flag is set.
```
	switch (ct_mbrtowc(cp, cbuf, cbp)) {
	<snip>
	case (size_t)-2:
		/*
		 * We don't support other multibyte charsets.
		 * The second condition shouldn't happen
		 * and is here merely for additional safety.
		 */
		if ((el->el_flags & CHARSET_IS_UTF8) == 0 ||
		    cbp >= MB_LEN_MAX) {
			errno = EILSEQ;
			*cp = L'\0';
			return -1;
		}
		/* Incomplete sequence, read another byte. */
		goto again;
```
Of course, the CHARSET_IS_UTF8 flag is not set in an eucJP environment.
If I remove the CHARSET_IS_UTF8 flag check, /bin/sh can read eucJP input.
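To illustrate the idea only: retrying on (size_t)-2 without looking at the
charset could be sketched as below. decode_one() is a hypothetical helper,
not libedit code and not the actual patch. It assumes the bytes come from a
memory buffer instead of read(2), and it restarts mbrtowc(3) from the
initial state on every attempt.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not libedit code): decode one character from src,
 * handing mbrtowc(3) one more byte each time it reports an incomplete
 * sequence, whatever charset the current locale uses.
 * Returns the number of bytes consumed, 0 if src is empty, -1 on error.
 */
static int
decode_one(const char *src, size_t srclen, wchar_t *out)
{
	char cbuf[MB_LEN_MAX];
	size_t cbp = 0;
	mbstate_t st;

	assert(out != NULL);
	while (cbp < srclen && cbp < MB_LEN_MAX) {
		cbuf[cbp] = src[cbp];	/* stands in for read(2) of one byte */
		cbp++;
		memset(&st, 0, sizeof(st));	/* start from the initial state */
		switch (mbrtowc(out, cbuf, cbp, &st)) {
		case (size_t)-1:
			return -1;	/* invalid sequence */
		case (size_t)-2:
			continue;	/* incomplete sequence, take another byte */
		default:
			return (int)cbp;	/* complete character */
		}
	}
	/* Empty input, or the bytes ran out in the middle of a character. */
	return cbp == 0 ? 0 : -1;
}
```

A caller would normally run setlocale(LC_CTYPE, "") first so that mbrtowc(3)
follows the user's charset, for example ja_JP.eucJP.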
I found another problem after removing the CHARSET_IS_UTF8 flag check:
the command history miscalculates the length of eucJP characters,
because the ct_enc_width() function in chartype.c doesn't understand any
charset other than UTF-8.
I rewrote ct_enc_width() to use wctomb(3), which fixes the command history
problem.
With these two changes, we no longer need the CHARSET_IS_UTF8 flag.
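As a rough sketch of what a wctomb(3)-based width calculation could look
like (enc_width() is a hypothetical stand-in written for this note, not the
code from the patch, and returning 0 for an unrepresentable character is my
assumption here):

```c
#include <limits.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical stand-in for ct_enc_width(): ask the current locale how
 * many bytes the wide character c takes when encoded, instead of
 * hard-coding the UTF-8 ranges.
 * Returns 0 if c cannot be represented in the current charset.
 */
static size_t
enc_width(wchar_t c)
{
	char buf[MB_LEN_MAX];	/* large enough for any supported locale */
	int len;

	(void)wctomb(NULL, L'\0');	/* reset any shift state */
	len = wctomb(buf, c);
	return len > 0 ? (size_t)len : 0;
}
```

Because the result comes from the locale's own conversion, the same function
covers UTF-8, eucJP and single-byte charsets.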
The CHARSET_IS_UTF8 flag controls the NARROW_HISTORY flag, and the
NARROW_HISTORY flag is used only in the HIST_FUN definition.
```
#ifdef WIDECHAR
#define HIST_FUN(el, fn, arg) \
	(((el)->el_flags & NARROW_HISTORY) ? hist_convert(el, fn, arg) : \
	HIST_FUN_INTERNAL(el, fn, arg))
#else
#define HIST_FUN(el, fn, arg) HIST_FUN_INTERNAL(el, fn, arg)
#endif
```
In a WIDECHAR environment, hist_convert() should always be called,
because hist_convert() is a multibyte-aware function.
I opened a new differential on Phabricator that contains all of my fixes.
https://reviews.freebsd.org/D17903
I believe my fix solves this problem and doesn't affect charsets other
than eucJP.
Please review my code.
Hirabayashi-san:
Could you please try my patch from Phabricator and check whether this
problem is fixed?
I don't think /bin/sh itself is wrong.
--
You are receiving this mail because:
You are the assignee for the bug.