Differences between iconv from ports and iconv in base (transliteration)

Michael Gmelin freebsd at grem.de
Fri Dec 6 10:17:43 UTC 2013


On Fri, 06 Dec 2013 11:36:57 +0900 (JST)
Hiroki Sato <hrs at FreeBSD.org> wrote:

> Michael Gmelin <freebsd at grem.de> wrote
>   in <20131206001554.0d9d3e23 at bsd64.grem.de>:
> 
> fr> I'm in the process of changing ports from ports iconv to iconv in
> fr> base. I noticed that transliteration doesn't work in base as it
> fr> does with iconv from ports. Examples:
> fr>
> fr> "T\xc5\xbdst"
> fr> ports: "TZst"
> fr> base: "Tst"
> fr>
> fr> "T\xe2\x82\xacst"
> fr> ports: "TEURst"
> fr> base: "Tst"
> fr>
> fr> Conversion done using:
> fr> iconv_open("ISO8859-1//IGNORE//TRANSLIT", "UTF-8");
> fr>
> fr> Any ideas?
> 
>  //TRANSLIT is a GNU iconv specific extension and iconv in base (and
>  other POSIX systems) does not support it.  Does your software depend
>  on it?
> 
> -- Hiroki

The Porter's Handbook implies that USES= iconv and the macros it
provides (ICONV_LIB etc.) are a drop-in replacement for GNU iconv. This
is obviously not the case, so I think two things are required:

1. The documentation should point out this fact, something like: "A port
   might depend on GNU iconv-specific features. One way to figure
   out whether your port might be affected is
   egrep -Rl '//(TRANSLIT|IGNORE)' work"
2. There should be some way to specify that GNU iconv is required by
   the port (assuming that it is possible to use both at the same time
   on one system).

Otherwise people will get bitten by this change, especially if such
features are used at the top of the stack (e.g. PHP5's iconv function is
officially documented to support "//TRANSLIT", but completely depends on
the underlying iconv library to provide it). Imagine you're updating
your application server to FreeBSD 10 and all of a sudden your
third-party web application written in PHP breaks in ways that are
really hard to track down. I just verified this; I chose PHP since it's
quite popular and has a user base that's probably not aware of the
underlying libraries. An example I found in production code is
converting a person's name to ASCII to transmit it to an old credit card
system, transliterating central European characters (e.g. ä => ae,
á => a etc.).
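
To make the silent failure visible, here is a minimal sketch of a runtime
probe in C. It is not taken from any port; the helper names convert_utf8
and iconv_transliterates are made up for illustration. It assumes the
POSIX prototype of iconv(), where the input pointer is "char **" (some
older GNU libiconv builds declare it "const char **"). It treats "EUR" for
the euro sign as the GNU behaviour, based only on the "ports" result
quoted above.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: convert the NUL-terminated UTF-8 string `in` to
 * the encoding `tocode` and store the NUL-terminated result in `out`.
 * Returns the number of bytes written, or -1 on any failure.
 */
static long
convert_utf8(const char *tocode, const char *in, char *out, size_t outsz)
{
	assert(tocode != NULL && in != NULL && out != NULL && outsz > 0);

	iconv_t cd = iconv_open(tocode, "UTF-8");
	if (cd == (iconv_t)-1)
		return -1;	/* charset or suffix not recognised */

	char *src = (char *)in;		/* iconv() does not modify the input */
	size_t srcleft = strlen(in);
	char *dst = out;
	size_t dstleft = outsz - 1;	/* reserve room for the NUL */

	size_t rc = iconv(cd, &src, &srcleft, &dst, &dstleft);
	if (rc != (size_t)-1)
		rc = iconv(cd, NULL, NULL, &dst, &dstleft);	/* flush state */
	iconv_close(cd);
	if (rc == (size_t)-1)
		return -1;	/* EILSEQ, EINVAL or E2BIG */

	*dst = '\0';
	return (long)(dst - out);
}

/*
 * Hypothetical probe: returns 1 if the euro sign comes out as "EUR"
 * (the result reported for iconv from ports in this thread), 0 for
 * anything else, including a dropped character or an error.
 */
static int
iconv_transliterates(void)
{
	char buf[16];
	long n = convert_utf8("ISO8859-1//TRANSLIT", "\xe2\x82\xac",
	    buf, sizeof(buf));

	return n == 3 && memcmp(buf, "EUR", 3) == 0;
}
```

An application or a configure check could call iconv_transliterates()
once at startup and refuse to continue, or log a warning, when it
returns 0, instead of quietly sending a mangled name downstream.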

All of this is independent of the question of whether GNU iconv is POSIX
compliant and whether it should behave like this. For practical
reasons the ports system must provide a way to select GNU iconv for
software that was written with GNU iconv in mind (or mark it as
BROKEN instead). Well, IMHO at least.

Cheers,
Michael


-- 
Michael Gmelin

