Re: domain names and internationalization?

From: Stefan Esser <se_at_FreeBSD.org>
Date: Tue, 20 Sep 2022 07:31:08 UTC
On 19.09.22 at 22:27, Rick Macklem wrote:
> Hi,
> 
> Recently there has been discussion on the NFSv4 IETF working
> group email list w.r.t. internationalization for the domain name
> it uses for users/groups.

Hi Rick,

I do assume that you know about RFC 3492 (Punycode):

	https://datatracker.ietf.org/doc/html/rfc3492

> Right now, I am pretty sure the FreeBSD nfsuserd(8) only works
> for ascii domain names, but...

You can manually translate domain names into their Punycode
representation. The NFS code could work with them and only
translate them back to UTF-8 (or whatever) for display purposes.

For pure ASCII names this is an identity transformation. For
names that actually contain non-ASCII characters, the value to
send to DNS servers (and to store locally in the daemon) could
be the Punycode representation.
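
To illustrate the round trip, here is a minimal sketch in Python (not
what nfsuserd(8) does today). It relies on Python's built-in "idna"
codec, which implements the older IDNA 2003 rules; the helper names
to_ascii() and to_unicode() are made up for this example.

```python
# Sketch only: to_ascii()/to_unicode() are hypothetical helper names.
# The built-in "idna" codec follows IDNA 2003 (RFC 3490), not IDNA 2008.

def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII-compatible form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII-compatible domain name back for display."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    for name in ("example.org", "bücher.example"):
        wire = to_ascii(name)  # form to store and to send to DNS
        print(name, "->", wire, "->", to_unicode(wire))
```

A pure ASCII name comes back unchanged, so a daemon could apply the
conversion to every name without special-casing.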

> I am hoping someone knows what DNS does in this area (the
> working group list uses terms like umlaut, which I have never
> even heard of;-).

That's the contraction of "ae", "oe", "ue" that was introduced
into the German writing system long ago, with the "e"
abbreviated to two dots above the vowel, e.g. "ae" --> "ä".
Just a convenience rule to speed up manually copying the bible
in monasteries in medieval times ;-)

But there are many other accented letters in other languages
that can be used in internationalized domain names, and the
whole set of Unicode characters can be represented using
Punycode.
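
As a rough sketch of how this works per label (Python again, using its
built-in "punycode" codec): the helpers encode_label() and
decode_label() are hypothetical names, and the sketch deliberately
skips the case folding and normalization that a real IDNA
implementation performs before encoding.

```python
# Sketch only: encode_label()/decode_label() are hypothetical helpers.
# No Nameprep/normalization is done here, unlike a real IDNA library.

ACE_PREFIX = "xn--"  # marks a Punycode-encoded label in DNS


def encode_label(label: str) -> str:
    """Encode one domain label; ASCII labels pass through unchanged."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def decode_label(label: str) -> str:
    """Decode one label back to Unicode if it carries the ACE prefix."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


if __name__ == "__main__":
    # German, Greek and Japanese labels all end up as plain ASCII.
    for label in ("bücher", "ελληνικά", "日本語", "freebsd"):
        encoded = encode_label(label)
        print(label, "->", encoded, "->", decode_label(encoded))
```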

> I know essentially nothing about internationalization, so any hints
> will be appreciated.

For a start:

	https://en.wikipedia.org/wiki/Internationalized_domain_name
	https://en.wikipedia.org/wiki/Punycode

There are C implementations of the transformations, e.g. in the
dns/libidn2 port.

We do not seem to have equivalent library functions in the
FreeBSD base system yet, but probably should provide them.

Best regards, STefan