
  • From: James Clark <jjc@j...>
  • To: xml-dev@l...
  • Date: Thu, 9 Dec 2010 13:05:29 +0700


It's also a common misconception that Unicode is a 16-bit character set;
it defines more than 65536 characters, and "surrogate pairs" in
languages like Java make UTF-16 as complex as UTF-8; processing characters
in either UTF-8 or UTF-32 is the most common choice outside the Java
world as far as I can tell.

UTF-16 is very common as an internal representation: not just in Java, but also in .NET, JavaScript, Windows, OS X, Symbian, IE, Mozilla, Opera, OpenOffice.org and Qt.
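
To make the surrogate-pair point concrete, here is a minimal Python sketch (not from the original thread) that compares how many code units one character needs in each encoding. The helper names `utf16_code_units` and `encoded_sizes` are made up for illustration; only the standard `str.encode` codecs are assumed.

```python
def utf16_code_units(text):
    """Return the 16-bit code units of text in UTF-16 (big-endian, no BOM)."""
    data = text.encode("utf-16-be")
    # Every UTF-16 code unit is exactly two bytes.
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def encoded_sizes(ch):
    """Count the code units needed for one character in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),           # 8-bit units (bytes)
        "utf-16": len(utf16_code_units(ch)),        # 16-bit units
        "utf-32": len(ch.encode("utf-32-be")) // 4, # 32-bit units
    }


if __name__ == "__main__":
    # 'A' and the euro sign are in the Basic Multilingual Plane;
    # U+1D11E (MUSICAL SYMBOL G CLEF) lies above U+FFFF.
    for ch in ("A", "\u20ac", "\U0001D11E"):
        units = " ".join(f"{u:04X}" for u in utf16_code_units(ch))
        print(f"U+{ord(ch):04X}", encoded_sizes(ch), "UTF-16 units:", units)
```

A character above U+FFFF takes two UTF-16 code units (a high surrogate in D800..DBFF followed by a low surrogate in DC00..DFFF), so code that indexes a UTF-16 string by code unit has the same variable-width problem as code that indexes UTF-8 by byte. Only UTF-32 gives one unit per code point.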

James

  • References:
    • nextml
      • From: Amelia A Lewis <amyzing@t...>
    • Re: nextml
      • From: Michael Sokolov <sokolov@i...>
    • Re: nextml
      • From: Liam R E Quin <liam@w...>
