- From: James Clark <jjc@j...>
- To: xml-dev@l...
- Date: Thu, 9 Dec 2010 13:05:29 +0700
It's also a common misconception that Unicode is a 16-bit character set;
it defines more than 65536 characters, and "surrogate pairs" in
languages like Java make UTF-16 as complex as UTF-8; processing characters
in either UTF-8 or UTF-32 is the most common choice outside the Java
world as far as I can tell.
UTF-16 is very common as an internal representation. Not just Java, also .NET, JavaScript, Windows, OS X, Symbian, IE, Mozilla, Opera, OpenOffice.org, Qt.
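
To make the surrogate-pair point concrete, here is a minimal Python sketch showing how one character outside the 16-bit range is represented in each encoding form. The sample character (U+1D11E, MUSICAL SYMBOL G CLEF) and the helper names `to_surrogates` and `utf16_code_units` are chosen for illustration only and do not come from the thread.

```python
# Illustrative sketch: one code point above U+FFFF in three encoding forms.
ch = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the 16-bit range


def to_surrogates(cp):
    """Split a supplementary code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    v = cp - 0x10000                      # 20-bit value
    high = 0xD800 + (v >> 10)             # top 10 bits
    low = 0xDC00 + (v & 0x3FF)            # bottom 10 bits
    return high, low


def utf16_code_units(s):
    """Return the 16-bit code units of s as encoded by Python's codec."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


utf8_bytes = ch.encode("utf-8")       # 4 bytes (variable width: 1 to 4)
utf16_units = utf16_code_units(ch)    # 2 code units (a surrogate pair)
utf32_bytes = ch.encode("utf-32-be")  # always 4 bytes per code point

print(len(utf8_bytes), [hex(u) for u in utf16_units], len(utf32_bytes))
```

Code that indexes a UTF-16 string by 16-bit unit sees two units for this one character, which is the same kind of variable-width handling that UTF-8 requires.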
James
- References:
- nextml
- From: Amelia A Lewis <amyzing@t...>
- Re: nextml
- From: Michael Sokolov <sokolov@i...>
- Re: nextml
- From: Liam R E Quin <liam@w...>