Odd String behavior

Milo Hyson milo at cyberlifelabs.com
Tue Nov 4 14:37:49 PST 2003


For some reason, the following code fragment behaves differently on two 
different machines:

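// Allocate a buffer the size of the file and log that size.
int    length = (int) htmlFile.length();
byte[] bytes  = new byte[length];
logger.info("length = " + length);

InputStream in = new FileInputStream(htmlFile);

// read() may return fewer bytes than requested, so loop until the
// buffer is full or end-of-stream (-1) is reached.
for (int offset = 0; length > 0; )
{
    int numRead = in.read(bytes, offset, length);
    logger.info("numRead = " + numRead);

    if (numRead == -1)
        break;

    offset += numRead;
    length -= numRead;
}

in.close();

// Decode the raw bytes as UTF-8 and log the resulting character count.
String rawHTML = new String(bytes, "UTF-8");
logger.info("rawHTML = " + rawHTML.length() + " characters");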


On FreeBSD 4.9-RELEASE with native JDK 1.4.1-p3, I get the following correct output:

30799 [main] INFO com.internetdentalalliance.doorway.builder.TemplateFactory  - length = 18936
30799 [main] INFO com.internetdentalalliance.doorway.builder.TemplateFactory  - numRead = 18936
30806 [main] INFO com.internetdentalalliance.doorway.builder.TemplateFactory  - rawHTML = 18936 characters


However, on FreeBSD 4.5-RELEASE with native JDK 1.3.1-p7, I get the following 
erroneous output:

3046 [main] INFO com.internetdentalalliance.doorway.builder.TemplateFactory  - length = 18936
3046 [main] INFO com.internetdentalalliance.doorway.builder.TemplateFactory  - numRead = 18936
3052 [main] INFO com.internetdentalalliance.doorway.builder.TemplateFactory  - rawHTML = 11419 characters


Both machines read the same 18936 bytes, but the second one loses about 
40% of the characters (7517 of 18936) when converting the raw bytes to a 
String. Any ideas?
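
One way to narrow this down might be to check whether the file contains 
any bytes outside the ASCII range, and to compare the decoded length 
under UTF-8 with the length under a one-byte-per-character encoding such 
as ISO-8859-1. The following is only a diagnostic sketch, not part of the 
original program: the class DecodeCheck and its methods countNonAscii and 
decodedLength are hypothetical names, and the sample bytes in main stand 
in for the real file contents.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical diagnostic helper; not part of the original program.
class DecodeCheck {

    // Counts bytes with the high bit set. Pure ASCII input has none,
    // and every ASCII byte decodes to exactly one char under UTF-8.
    static int countNonAscii(byte[] bytes) {
        int count = 0;
        for (int i = 0; i < bytes.length; i++) {
            if ((bytes[i] & 0x80) != 0) {
                count++;
            }
        }
        return count;
    }

    // Decodes the bytes with the named encoding and returns the number
    // of chars in the resulting String.
    static int decodedLength(byte[] bytes, String charsetName) {
        try {
            return new String(bytes, charsetName).length();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + charsetName);
        }
    }

    public static void main(String[] args) {
        // Sample data (an assumption for illustration): "ab", then the
        // two-byte UTF-8 sequence C3 A9, then "c".
        byte[] sample = { 0x61, 0x62, (byte) 0xC3, (byte) 0xA9, 0x63 };

        System.out.println("bytes            = " + sample.length);
        System.out.println("non-ASCII bytes  = " + countNonAscii(sample));
        System.out.println("UTF-8 chars      = " + decodedLength(sample, "UTF-8"));
        System.out.println("ISO-8859-1 chars = " + decodedLength(sample, "ISO-8859-1"));
    }
}
```

Running the same two checks on the real byte array on both machines 
would show whether the input itself contains multi-byte sequences or 
whether the two JDKs decode identical bytes differently.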

-- 
Milo Hyson
Chief "Mad" Scientist and Director of Asian Operations
CyberLife Labs, LLC




More information about the freebsd-java mailing list