  • To: xml-dev@l...
  • Subject: Re: Why would MS want to make XML break on UNIX, Perl, Python, etc ?
  • From: Richard Tobin <richard@c...>
  • Date: Sat, 22 Dec 2001 00:03:28 GMT
  • Cc:
  • In-reply-to: <E16HTSe-0000Jd-00@s...>
  • Organization: HCRC, University of Edinburgh


>> I cannot see why they would cause any problems that UTF-16 doesn't.
>
>Sure, but then you talk about NUL being bad. UTF-16 includes a lot of zero 
>bytes (as you know) so the point is moot.

No!  There is a crucial difference here:

The supposed problem with control characters in general is that the
presence of those bytes in files causes problems.  I am suggesting
that UTF-16 already gives us those problems.

But the specific problem with nul is in XML APIs.  Any C API that uses
nul-terminated strings will not be able to handle nuls in those
strings.  If the strings are UTF-16 strings, they are terminated by a
UTF-16 nul character, not by a single zero byte.  UTF-16 characters
with zero bytes are not a problem.
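
To make the distinction concrete, here is a minimal C sketch.  It is
illustrative only: utf16_len and the sample arrays are hypothetical
names, not part of any XML API, and a UTF-16 string is assumed to be
held as an array of 16-bit code units.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Length in 16-bit code units, stopping at the first 0x0000 unit.
   Hypothetical helper: the UTF-16 analogue of strlen(). */
size_t utf16_len(const uint16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* "Hi" in UTF-16.  Each code unit contains a zero byte, but no unit
   is 0x0000 until the terminator. */
static const uint16_t hi_utf16[] = { 0x0048, 0x0069, 0x0000 };

/* 'a', an actual U+0000 character, 'b', then the terminator. */
static const uint16_t nul_utf16[] = { 0x0061, 0x0000, 0x0062, 0x0000 };

/* The same content as an 8-bit string: 'a', NUL, 'b', terminator. */
static const char nul_bytes[] = { 'a', '\0', 'b', '\0' };
```

With these definitions, utf16_len(hi_utf16) is 2: the zero bytes
inside the code units do not end the string.  An actual nul character
does, in either encoding: utf16_len(nul_utf16) and strlen(nul_bytes)
are both 1, so the 'b' is lost.  Only a byte-oriented function applied
to UTF-16 data, such as strlen((const char *)hi_utf16), trips over the
zero bytes, and that is a misuse of the API rather than a limit on
what the string can hold.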

-- Richard
