misc/112636: Command line arguments are byteswapped before being passed to a program running in a custom locale.

Sergiy Vyshnevetskiy serg at vostok.net
Sun May 13 17:40:05 UTC 2007


>Number:         112636
>Category:       misc
>Synopsis:       Command line arguments are byteswapped before being passed to a program running in a custom locale.
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun May 13 17:40:04 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator:     Sergiy Vyshnevetskiy
>Release:        6.2
>Organization:
>Environment:
FreeBSD serg2.vostok.net 6.2-STABLE FreeBSD 6.2-STABLE #0: Sun Apr 29 18:08:21 EEST 2007     serg at serg2.vostok.net:/o0/obj/usr/6/src/sys/SERG  i386
>Description:
GNU libiconv's conversion to the UCS-2 encoding returns bytes in network (big-endian) order, whereas gcj expects them in host order. gcj works around this with a hack that swaps bytes when necessary. However, command line arguments bypass the swapper (or pass through it an even number of times), so they reach the program byteswapped.
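
The mismatch can be illustrated without iconv at all. The following is a minimal sketch, not code from gcj: all function names are hypothetical, and it assumes a 16-bit code unit stored as two bytes. It shows that the letter 'a' (0x0061) delivered in network order reads as 0x6100 when copied directly into a 16-bit value on a little-endian host, that one swap repairs it, and that two swaps leave it broken again.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a 16-bit code unit from two bytes in network (big-endian) order. */
uint16_t ucs2_from_network(const unsigned char b[2])
{
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Reinterpret the same two bytes as a host-order value, which is what
   happens when converter output is copied straight into a 16-bit char. */
uint16_t ucs2_from_host(const unsigned char b[2])
{
    uint16_t v;
    memcpy(&v, b, sizeof v);
    return v;
}

/* Swap the two bytes of a 16-bit code unit. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Return 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical self-check tying the pieces together. */
void check_byte_order_demo(void)
{
    const unsigned char a_network[2] = { 0x00, 0x61 }; /* 'a', network order */
    uint16_t raw = ucs2_from_host(a_network);

    /* Exactly one swap is needed on little-endian hosts, none otherwise. */
    uint16_t fixed = host_is_little_endian() ? swap16(raw) : raw;
    assert(fixed == 0x0061);

    /* An even number of swaps is the same as no swap at all. */
    assert(swap16(swap16(raw)) == raw);
}
```

Calling check_byte_order_demo() from a main function exercises both cases. Data that skips the swap, or passes through it twice, keeps the wrong value on a little-endian host such as i386.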

>How-To-Repeat:
cat >Example.java <<EOF
class Example
{
  public static void main(String[] args)
  {
    // Echo the first command line argument unchanged.
    System.out.println(args[0]);
  }
}
EOF
gcj41 -o Example --main=Example Example.java
LANG=ru_RU.KOI8-R ./Example abc

produces

???

instead of 

abc
>Fix:
To avoid swapping altogether, gcj can use the UCS-2-INTERNAL encoding, which yields code units in host byte order, instead of UCS-2 for all internal character data. This "fixes" the command line bug as well.

This hack works on systems that use GNU libiconv, such as FreeBSD.
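
Whether a given encoding name produces host-order code units can be checked at run time. The sketch below is an illustration only, not part of the submitted patch. The function names reads_in_host_order and check_probe are hypothetical; it uses only the standard iconv_open, iconv and iconv_close calls. It assumes the program is linked against the system's iconv (on FreeBSD that usually means adding -liconv), and some libiconv versions declare the input pointer of iconv() as const char **, which may need a cast.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert the letter 'a' from UTF-8 to `to_encoding` and report whether
   the resulting 16-bit code unit reads as 0x0061 in host order.
   Returns 1 if it does, 0 if it does not, and -1 if the encoding name
   is not supported or the conversion fails. */
int reads_in_host_order(const char *to_encoding)
{
    iconv_t cd = iconv_open(to_encoding, "UTF-8");
    if (cd == (iconv_t) -1)
        return -1;

    char in[] = "a";
    char out[8];
    char *inp = in;
    char *outp = out;
    size_t inleft = 1;
    size_t outleft = sizeof out;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (size_t) -1 || sizeof out - outleft < 2)
        return -1;

    /* Use the last two bytes written, so that a leading byte-order
       mark, if the converter emits one, is skipped. */
    uint16_t unit;
    memcpy(&unit, outp - 2, sizeof unit);
    return unit == 0x0061;
}

/* Sanity check: of the two explicit-endian encodings, exactly one
   matches the host's byte order. */
void check_probe(void)
{
    assert(reads_in_host_order("UTF-16LE") +
           reads_in_host_order("UTF-16BE") == 1);
}
```

With GNU libiconv, reads_in_host_order("UCS-2-INTERNAL") would be expected to return 1 on any host, while the result for "UCS-2" depends on the host's byte order. On iconv implementations that do not know the UCS-2-INTERNAL name, the function returns -1.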

Patch attached with submission follows:

--- gcc/java/lex.c.orig	Tue Aug 16 21:46:18 2005
+++ gcc/java/lex.c	Sun May 13 03:35:35 2007
@@ -184,7 +184,7 @@
 #endif
 
 #ifdef HAVE_ICONV
-  lex->handle = iconv_open ("UCS-2", encoding);
+  lex->handle = iconv_open ("UCS-2-INTERNAL", encoding);
   if (lex->handle != (iconv_t) -1)
     {
       lex->first = -1;
@@ -204,7 +204,7 @@
 
 	  byteswap_init = 1;
 
-	  handle = iconv_open ("UCS-2", "UTF-8");
+	  handle = iconv_open ("UCS-2-INTERNAL", "UTF-8");
 	  if (handle != (iconv_t) -1)
 	    {
 	      unicode_t result;
--- libjava/gnu/gcj/convert/natIconv.cc.orig	Fri Nov 14 03:48:30 2003
+++ libjava/gnu/gcj/convert/natIconv.cc	Sun May 13 03:41:25 2007
@@ -45,7 +45,7 @@
   _Jv_GetStringUTFRegion (encoding, 0, encoding->length(), buffer);
   buffer[len] = '\0';
 
-  iconv_t h = iconv_open ("UCS-2", buffer);
+  iconv_t h = iconv_open ("UCS-2-INTERNAL", buffer);
   if (h == (iconv_t) -1)
     throw new java::io::UnsupportedEncodingException (encoding);
 
@@ -145,7 +145,7 @@
   _Jv_GetStringUTFRegion (encoding, 0, encoding->length(), buffer);
   buffer[len] = '\0';
 
-  iconv_t h = iconv_open (buffer, "UCS-2");
+  iconv_t h = iconv_open (buffer, "UCS-2-INTERNAL");
   if (h == (iconv_t) -1)
     throw new java::io::UnsupportedEncodingException (encoding);
 
@@ -260,7 +260,7 @@
   // heuristic, but is is all we've got.)
   jboolean result = false;
 #ifdef HAVE_ICONV
-  iconv_t handle = iconv_open ("UCS-2", "UTF-8");
+  iconv_t handle = iconv_open ("UCS-2-INTERNAL", "UTF-8");
   if (handle != (iconv_t) -1)
     {
       jchar c;

>Release-Note:
>Audit-Trail:
>Unformatted:

