  • From: Tim Bray <tbray@t...>
  • To: xml-dev@l...
  • Date: Mon, 10 Sep 2001 13:13:08 -0700

A bit of correspondence off-list reveals that I'm not the only
person who regularly does the following: J. Random Hacker sends
me a chunk of something alleged to be XML.  First thing I usually
do is open it up in IE.  If it isn't XML, IE says so, and says
what the problem is, in a good and effective way.  If it is, I get 
that nice prettyprinted display so I can get a feel for the data.

For this app, this IE6 character-handling bug is particularly
horrible.  It isn't a corner case.  For a programmer generating
XML in C or Java or whatever, one of the easiest and most 
common mistakes to make [of course *I've* never done this :)]
is to screw up and get bogus character data in the output 
stream.  One of the nice side-effects of XML's intolerance of 
control characters is that this kind of screw-up very often 
leads to characters with values like 0 or 5 making their way 
into the XML, which bust well-formedness.  It is guaranteed
that expat or xerces or in fact a reasonably modern MSXML
will properly toss such data - and a good thing too, one
shudders at the thought of character data with null bytes
in the bowels of much of the C-family code out there.
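
For anyone who wants to see the rejection first-hand, here is a minimal
sketch using Python's standard expat binding.  The helper name
check_well_formed and the sample documents are made up for illustration;
the only claim is that expat refuses control characters such as 0x05 or
a null byte in character data.

```python
import xml.parsers.expat


def check_well_formed(data: bytes):
    """Return None if expat accepts the data, else a short error description.

    check_well_formed is a hypothetical helper name, not part of any library.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # Second argument True marks this as the final chunk of input.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        reason = xml.parsers.expat.ErrorString(err.code)
        return f"line {err.lineno}, column {err.offset}: {reason}"
    return None


# Sample inputs, invented for this sketch.
good = b"<doc>hello</doc>"
stray_control = b"<doc>hel\x05lo</doc>"   # a stray 0x05 from a buggy generator
stray_null = b"<doc>hel\x00lo</doc>"      # a null byte in character data

for sample in (good, stray_control, stray_null):
    problem = check_well_formed(sample)
    print(repr(sample), "->", "well-formed" if problem is None else problem)
```

The first sample should pass and the other two should be reported as
not well-formed, which is the behaviour one wants from the browser too.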

Anyhow, Microsoft REALLY SHOULD FIX this one PDQ.  It's
bad. -Tim

