  • From: David Brownell <david-b@p...>
  • To: Elliotte Rusty Harold <elharo@m...>, xml-dev@l...
  • Date: Thu, 26 Jul 2001 08:45:49 -0700

> Furthermore, I think Java is broken enough here that Java needs to change.
> I don't think XML should be limited by this brain damage in Java.

I think "broken" is seriously overstating things.  It's not a real issue "now",
and in fact if you accept that UTF-16 is native, the issue is just a missing
library feature that has workarounds.

Anyone who really needs such support can code it themselves; it's clear
things could be better, but there's no fatal problem.  Just a need for an
overdue (!) update to the standard Java library.
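
To make "code it themselves" concrete, here is a minimal sketch of handling
surrogate pairs by hand over UTF-16 chars.  The class and method names
(SurrogateSketch, toCodePoint, countCodePoints) are made up for illustration
and are not part of any Java library; only String.length() and
String.charAt() are assumed.

```java
public class SurrogateSketch {
    // Hypothetical helpers; the ranges are the UTF-16 surrogate blocks.
    static boolean isHigh(char c) { return c >= 0xD800 && c <= 0xDBFF; }
    static boolean isLow(char c)  { return c >= 0xDC00 && c <= 0xDFFF; }

    // Combine a high/low surrogate pair into one code point (U+10000 and up).
    static int toCodePoint(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    // Count code points, treating each well-formed pair as one.
    // An unpaired surrogate is counted as one unit on its own.
    static int countCodePoints(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (isHigh(s.charAt(i))
                    && i + 1 < s.length()
                    && isLow(s.charAt(i + 1))) {
                i++; // skip the low half of the pair
            }
            n++;
        }
        return n;
    }
}
```

For instance, a string holding 'a', one surrogate pair, and 'b' has a
length() of 4 but this sketch counts 3 code points.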


>     One silver
> lining to the Blueberry cloud might be that it could convince Sun to use a
> four-byte char like they should have back in 1995. 

Nah, people complain enough about wasted space ... admittedly
there's a religious war on whether (in C terms) "wchar_t" should
be 16 bits or 32.

But I did expect Sun would have addressed the issue of variable
length characters in Java by now.  The paper David Jackson
pointed to (http://www.unicode.org/iuc/iuc16/b17/paper.pdf) is
from last year, but the issues weren't new then.  Variable length
characters show up in the case of combining marks there, not just
with surrogate pairs, and a 32-bit wchar_t won't help with the
case of combining marks:  fat wchar_t isn't sufficient.
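
A rough sketch of the combining-mark point, assuming only
Character.getType() and its category constants from java.lang.  The class
and method names (CombiningSketch, isCombiningMark, countBasePlusMarks) are
hypothetical, and this is a crude approximation, not real text-boundary
analysis.  It looks only at BMP chars, so it does not address surrogates.

```java
public class CombiningSketch {
    // True if c is in one of the Unicode "mark" general categories.
    static boolean isCombiningMark(char c) {
        int t = Character.getType(c);
        return t == Character.NON_SPACING_MARK
            || t == Character.COMBINING_SPACING_MARK
            || t == Character.ENCLOSING_MARK;
    }

    // Count "base character plus following marks" as a single unit.
    // A mark at the very start of the string is counted as its own unit.
    static int countBasePlusMarks(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (i == 0 || !isCombiningMark(s.charAt(i))) {
                n++;
            }
        }
        return n;
    }
}
```

With "e" followed by U+0301 (combining acute accent), length() is 2 while
this sketch counts 1 unit.  Both values fit in 16 bits, so a wider char type
would change nothing about that mismatch.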

- Dave



