cvs commit: src/sys/sys iconv.h

R. Imura imura at FreeBSD.org
Sun Jul 3 01:27:44 GMT 2005


imura       2005-07-03 01:12:37 UTC

  FreeBSD src repository

  Modified files:
    sys/sys              iconv.h 
  Log:
  Switch Unicode charset name from "ISO-10646-UCS-2" to "UTF-16BE".
  Using ISO-10646-UCS-2 will cause problems if we switch to our own
  iconv functions in the future, or port an iconv implementation
  other than GNU libiconv.
  
  Each vendor treats "UCS-2" as follows; the byte order is
  vendor specific:
  
   - Solaris 8 iconv
    Little Endian with BOM
  
   - HP-UX iconv
    Big Endian
  
   - NetBSD/i386 1.6 iconv
    Little Endian
  
   - GNU libiconv
    Big Endian
  
   - glibc(RedHat AS 2.1 x86) iconv
    Little Endian
  
   - IANA
    Name: ISO-10646-UCS-2
    MIBenum: 1000
    Source: the 2-octet Basic Multilingual Plane, aka Unicode
            this needs to specify network byte order: the standard
            does not specify (it is a 16-bit integer space)
    Alias: csUnicode
  
   - MSDN
    Little Endian
    http://msdn.microsoft.com/library/en-us/cpref/html/frlrfsystemtextencodingclassgetencodingtopic2.asp
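
To illustrate why the vendor differences above matter, here is a minimal Python sketch (not part of the commit). It uses Python's built-in "utf-16-be" and "utf-16-le" codecs as stand-ins for the big-endian and little-endian interpretations of "UCS-2". The sample character is an arbitrary choice.

```python
import codecs

# An arbitrary BMP character used only for illustration (U+3042).
ch = "\u3042"

# The same code point serialized under the two byte orders.
big_endian = ch.encode("utf-16-be")       # bytes 0x30 0x42
little_endian = ch.encode("utf-16-le")    # bytes 0x42 0x30

# "Little Endian with BOM": a byte order mark precedes the data.
le_with_bom = codecs.BOM_UTF16_LE + little_endian

# Reading little-endian data as big-endian silently yields another character.
misread = little_endian.decode("utf-16-be")

print(big_endian.hex(), little_endian.hex(), le_with_bom.hex(), hex(ord(misread)))
```

An ambiguous name such as "UCS-2" leaves the choice between these byte sequences to the iconv implementation, whereas "UTF-16BE" names exactly one of them.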
  
  Using UTF-16BE now is harmless, because:
  - It is identical to UCS-2 within the 2-byte range (U+0000 - U+FFFF).
  - The kernel code of each file system (cd9660, msdosfs, ntfs) assumes
    that a Unicode character is 2 bytes at this time.
  - UDF supports only the 2-byte range of Unicode in filenames.
  - It is defined in RFC 2781.
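
The first point can be checked with a short Python sketch (an illustration only, not the kernel code). The helper `ucs2_be` is hypothetical: it models big-endian UCS-2 as the raw 16-bit code point value, and the sample characters are arbitrary.

```python
import struct

def ucs2_be(ch):
    """Hypothetical helper: one BMP character as a 16-bit big-endian integer."""
    cp = ord(ch)
    if cp > 0xFFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not representable as a single 2-byte unit")
    return struct.pack(">H", cp)

# Arbitrary characters from the 2-byte range (U+0000 - U+FFFF).
samples = ["A", "\u00e9", "\u3042", "\uffe5"]

# Within that range, UTF-16BE produces the same bytes as the raw 16-bit value.
bmp_matches = all(ucs2_be(c) == c.encode("utf-16-be") for c in samples)

# Outside that range, UTF-16BE uses a surrogate pair (4 bytes), which a
# consumer that assumes 2-byte characters would not handle.
supplementary = "\U0001F600".encode("utf-16-be")

print(bmp_matches, len(supplementary))
```

The two encodings diverge only for code points above U+FFFF, which the file systems listed above do not handle at this time.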
  
  So I believe it is time to make this change, before the new RELENG_6
  branch starts. :)
  
  Approved by:    re (scottl)
  
  Revision  Changes    Path
  1.11      +1 -1      src/sys/sys/iconv.h

