
  • From: Miles Sabin <msabin@i...>
  • To: xml-dev@l...
  • Date: Wed, 25 Jul 2001 15:50:08 +0100

Elliotte Rusty Harold wrote,
> The Java way to handle this is to stop thinking of a Java char as 
> representing a Unicode character. It doesn't. A Java char represents 
> a UTF-16 code point, which may be a surrogate. The public API to 
> java.lang.String is essentially a UTF-16 API. For example, the 
> length() method of a string does not return the number of Unicode 
> characters in the string. Rather it returns the number of UTF-16 
> code points.

This is correct, but not yet officially documented in the Java
Language Specification. It got hammered out during the development
of the java.nio spec.
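
A minimal sketch of the distinction described above, offered as an illustration rather than as part of the original exchange. It assumes a later Java release: String.codePointCount and Character.isHighSurrogate were only added in Java 5, after this message was written. The class and method names are hypothetical. Note that current Unicode terminology calls the 16-bit values counted by length() "code units".

```java
public class Utf16LengthDemo {
    // U+1D11E MUSICAL SYMBOL G CLEF lies outside the Basic Multilingual
    // Plane, so UTF-16 encodes it as a surrogate pair (two Java chars).
    public static final String CLEF = "\uD834\uDD1E";

    // Number of 16-bit UTF-16 values; this is what String.length() reports.
    public static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode characters (code points); requires Java 5 or later.
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        System.out.println(codeUnits(CLEF));   // 2
        System.out.println(codePoints(CLEF));  // 1
        // The first char is only half of the character: a high surrogate.
        System.out.println(Character.isHighSurrogate(CLEF.charAt(0))); // true
    }
}
```

For strings made up entirely of BMP characters the two counts agree, which is why the difference is easy to overlook.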

Cheers,


Miles

-- 
Miles Sabin                                     InterX
Internet Systems Architect                      27 Great West Road
+44 (0)20 8817 4030                             Middx, TW8 9AS, UK
msabin@i...                               http://www.interx.com/

