

Ronald Bourret wrote:
 > This points out something that should be a requirement for binary XML:
 > lossless roundtripping. In other words, you should be able to go from
 > the text serialization to the binary serialization and back losslessly
 > (within the confines of canonical XML). Same is true for binary <=>
 > text, binary <=> binary, and (of course) text <=> text.

Of course text <=> text? This doesn't work today. I don't keep a list, 
but off the top of my head: information in the text such as character 
references and internal general entity references in attribute values 
is expanded by parsers (e.g., SAX) and is not available to write back 
out again. This is a perennial source of XSLT questions. Until SAX2 
Extensions 1.1, SAX didn't report the XML declaration, so the 
application didn't know the original encoding. Nor could the application 
tell which attribute values were specified in the document and which 
came from the DTD as defaults. As ERH points out, canonicalization loses 
the DOCTYPE declaration. And so on.
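
To make the attribute-value point concrete, here is a minimal sketch using 
Python's standard xml.sax module (not the Java SAX API discussed above, 
but it behaves the same way for this purpose). The document, the entity 
name "co", and the helper names are made up for illustration.

```python
import xml.sax

# Hypothetical sample: a character reference, an internal general entity
# reference, and an attribute that exists only as a DTD default.
SOURCE = (
    b'<!DOCTYPE doc [\n'
    b'  <!ENTITY co "Example Corp">\n'
    b'  <!ATTLIST doc c CDATA "dflt">\n'
    b']>\n'
    b'<doc a="&#65;BC" b="&co;"/>'
)


class AttrCollector(xml.sax.ContentHandler):
    """Records the attributes exactly as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def startElement(self, name, attrs):
        self.attrs.update(dict(attrs.items()))


def parsed_attributes(data):
    handler = AttrCollector()
    xml.sax.parseString(data, handler)
    return handler.attrs


if __name__ == "__main__":
    # The handler sees only expanded values; nothing records that "a" was
    # written with &#65;, that "b" came from &co;, or that "c" was absent
    # from the start tag.
    print(parsed_attributes(SOURCE))
```

An application that writes these attributes back out has no way to 
reproduce the original references from what the handler was given.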

It has taken many years and several iterations to get XML parsers to the 
point where they are even close to supporting roundtripping. Imagine if 
this had been a "requirement" for XML 1.0.

Bob Foster


