
  • From: "James Tauber" <jtauber@j...>
  • To: <xml-dev@i...>
  • Date: Fri, 30 Jul 1999 09:16:34 +0800

----- Original Message -----
From: Chris Maden <crism@o...>
> [James Tauber]
> > I stepped through the code and it appears XP is treating it as
> > big-endian UTF-16. By the time XP is reading off its buffer of
> > bytes, they are 0x00 0xE2 0x20 0xAC 0x00 0xA2.
>
> What is XP using to read the file?  You mentioned you were using
> Microsoft's Java implementation; I suspect that the problem is there.
> The conversion of 0x80 to 0x20AC makes me very suspicious, because the
> use of 0x80 for Euro is a relatively recent Windows codepage change,
> so I'm inclined to suspect the Microsoft Java implementation.

I did too until I tried it with Sun's JDK and it behaved identically.

In com.jclark.xml.sax.Driver, there is a method openInputSource (returning an
OpenEntity) that begins:

private OpenEntity openInputSource(org.xml.sax.InputSource inputSource)
throws IOException {
    Reader reader = inputSource.getCharacterStream();
    String encoding;
    InputStream in;
    if (reader != null) {
      in = new ReaderInputStream(reader);
      encoding = "UTF-16";
    }
    else {
      in = inputSource.getByteStream();
      encoding = inputSource.getEncoding();
    }
    ...

So whenever the InputSource supplies a character stream, the encoding is set
to "UTF-16" there and never changes afterwards.
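To illustrate how a character stream can end up as the big-endian UTF-16 bytes quoted above, here is a minimal sketch. It is not XP's actual ReaderInputStream, whose implementation is not shown here. The class name Utf16ReaderBytes and its helper methods are hypothetical, and the sketch assumes the adapter emits each char as two bytes, high byte first. It also assumes the Reader has already produced the characters U+00E2, U+20AC and U+00A2.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class Utf16ReaderBytes {
    // Hypothetical stand-in for a Reader-to-InputStream adapter:
    // each UTF-16 code unit becomes two bytes, high byte first (big-endian).
    static byte[] toUtf16BeBytes(Reader reader) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int c;
        while ((c = reader.read()) != -1) {
            out.write((c >> 8) & 0xFF); // high byte
            out.write(c & 0xFF);        // low byte
        }
        return out.toByteArray();
    }

    // Format bytes the way they are quoted in this thread, e.g. "0x00 0xE2".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed input: the three characters the Reader has already decoded.
        Reader reader = new StringReader("\u00E2\u20AC\u00A2");
        System.out.println(hex(toUtf16BeBytes(reader)));
    }
}
```

If the adapter works roughly like this, the byte buffer only reflects whatever characters the Reader delivered, so any wrong decoding would have happened before the parser saw the data.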

James Tauber


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)


