citrus/BSD iconv doesn't respect ICONV_SET_DISCARD_ILSEQ flag
Lev Serebryakov
lev at FreeBSD.org
Sun Apr 9 11:06:47 UTC 2017
Hello Freebsd-i18n,
I understand that iconvctl(3) is a GNU extension, but since the citrus
iconv used by FreeBSD libc formally supports this API and the
ICONV_SET_DISCARD_ILSEQ flag, they should work, IMHO. But they don't. If I
try to convert a simple UTF-8 string containing an illegal sequence to
ASCII (all legal characters in this string are ASCII), it stops at the
illegal sequence and returns an error. GNU iconv from ports works
correctly. I didn't try UTF-16 and UTF-32/UCS-4, but from looking at the
code, I'm afraid they have the same problem.
Here is a simple program that reproduces the problem:
===============
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <iconv.h>
int main(int argc, char *argv[]) {
    /* 'X', one illegal UTF-8 byte (0x80), 'Y' */
    const char *src = "X\x80Y";
    char dst[64] = {0, 0, 0, 0, 0, 0, 0};
    char *s = (char*)src;
    char *d = &dst[0];
    size_t ss = strlen(src) + 1;    /* convert the terminating NUL too */
    size_t ds = sizeof(dst);
    int flag;
    iconv_t ic = iconv_open("ascii", "utf-8");
    /* Ask iconv to silently drop illegal input sequences */
    flag = 1;
    iconvctl(ic, ICONV_SET_DISCARD_ILSEQ, &flag);
    /* iconv() returns size_t; cast so that (size_t)-1 prints as -1 */
    printf("Result: %ld\n", (long)iconv(ic, &s, &ss, &d, &ds));
    printf("Converted: from %zu to %zu bytes\n", strlen(src) + 1 - ss, sizeof(dst) - ds);
    printf("Out: \"%s\"\n", &dst[0]);
    iconv_close(ic);
    return 0;
}
===============
% cc ic.c
% ./a.out
Result: -1
Converted: from 1 to 1 bytes
Out: "X"
% cc -L/usr/local/lib -I/usr/local/include ic.c -liconv
% ./a.out
Result: 0
Converted: from 4 to 3 bytes
Out: "XY"
%
% uname -a
FreeBSD blob.home.serebryakov.spb.ru 11.0-STABLE FreeBSD 11.0-STABLE #13 r315153M: Sun Mar 12 20:11:36 MSK 2017 root at blob.home.serebryakov.spb.ru:/usr/obj/usr/src/sys/BLOB amd64
%
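For anyone hitting the same thing: the behaviour I expect from the flag
can be approximated in the caller by retrying after each EILSEQ and
skipping one input byte. This is only a minimal sketch, not a fix for
citrus iconv. The helper name convert_discarding_ilseq() is made up for
this example, and it assumes stateless encodings such as UTF-8 and ASCII
(no shift state to reset after a skipped byte).

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of any iconv API: convert srclen bytes
 * of src into dst and drop every input byte that iconv() rejects with
 * EILSEQ. Returns the number of bytes written to dst, or (size_t)-1 on
 * any other error (iconv_open failure, E2BIG, EINVAL, ...).
 */
size_t
convert_discarding_ilseq(const char *to, const char *from,
    const char *src, size_t srclen, char *dst, size_t dstlen)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *in = (char *)src;   /* iconv() does not modify the input data */
    char *out = dst;
    size_t inleft = srclen;
    size_t outleft = dstlen;

    while (inleft > 0) {
        if (iconv(cd, &in, &inleft, &out, &outleft) != (size_t)-1)
            break;            /* the rest of the input was converted */
        if (errno != EILSEQ) {
            iconv_close(cd);  /* output full or truncated input: give up */
            return (size_t)-1;
        }
        /* 'in' points at the offending byte: skip it and try again */
        in++;
        inleft--;
    }

    iconv_close(cd);
    return dstlen - outleft;
}
```

Called with the same input as the program above ("X\x80Y" plus the
terminating NUL, i.e. 4 bytes, converting from "UTF-8" to "ASCII"), this
should leave "XY" in the output buffer no matter which iconv
implementation is linked in. Note that it also drops, byte by byte,
valid characters that the implementation reports as EILSEQ because they
cannot be represented in the target encoding.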
--
Best regards,
Lev mailto:lev at FreeBSD.org