<snip/>

> Note that Java uses UTF-16, which isn't quite fixed-width, though no
> one really notices.

Err... David, I thought Java used UTF-8, actually a version slightly different from the "typical" one, which encodes:

Characters in the range \u0001 to \u007F in one byte:
    0[bits 0-6]

Characters in the range \u0080 to \u07FF, and \u0000, in two bytes:
    110[bits 6-10] 10[bits 0-5]

Characters in the range \u0800 to \uFFFF in three bytes:
    1110[bits 12-15] 10[bits 6-11] 10[bits 0-5]

(What's different from the typical version is that NULL takes two bytes, so there are no embedded nulls in Java VM strings.)

....

However, it has been quite a while since the last time I looked... Has this changed in the latest versions?

Best,
Fabio

--
Fabio Arciniegas A.
Viaduct Technologies, Inc.
fabio@v...
Software Engineer
Interests: XML, Wittgenstein and just about everything in between.
Oblique Strategy of the day: "Abandon normal instruments"

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
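For anyone who wants to check the byte layout described in the message, here is a minimal sketch. It relies on java.io.DataOutputStream.writeUTF, which is documented to write this modified UTF-8 form after a two-byte length prefix. The class name ModifiedUtf8Demo and the helpers encode and hex are made up for illustration. The last line of main only shows that String.length() counts 16-bit char units, so the serialized form and the in-memory form need not be the same encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class, not part of any library.
public class ModifiedUtf8Demo {

    // Encode s with DataOutputStream.writeUTF and drop the 2-byte length prefix,
    // leaving only the modified UTF-8 bytes.
    static byte[] encode(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Render bytes as space-separated upper-case hex, e.g. "C0 80".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(encode("A")));       // one byte:    41
        System.out.println(hex(encode("\0")));      // two bytes:   C0 80 (no 00 byte)
        System.out.println(hex(encode("\u00E9")));  // two bytes:   C3 A9
        System.out.println(hex(encode("\u20AC")));  // three bytes: E2 82 AC
        System.out.println("A\u20AC".length());     // 2 char units in memory
    }
}
```

Standard UTF-8 would encode the null character as the single byte 00. The C0 80 result is the one place where the two forms differ for characters up to \uFFFF.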