  • From: "Simon St.Laurent" <simonstl@s...>
  • To: "xml-dev@l..." <xml-dev@l...>
  • Date: Mon, 16 Dec 2013 18:37:26 -0500

On 12/16/13 5:29 PM, Hans-Juergen Rennau wrote:
The expression-oriented thinking practised in XML technology stops
abruptly at the border provided by XML syntax. Differences of encoding,
quote character, use of entities, etc. are abstracted away and defined
to be irrelevant to the information content - as long as the text in
question is XML. But HTML is "something else", not XML. The standards
will not allow to parse an HTML document into a node tree. The prevalent
thinking seems to be that text resources defined to encode node trees
must be XML text. Is there a good reason, apart from inertia of habit?

I'm not sure that border is particularly sharp, even with inertia of habit. There was XHTML, and even with the (not my favorite) parsing specified by HTML5, there's definitely a clear path to a node tree.

HTML folks rarely use XPath and XQuery, generally preferring CSS selectors and JavaScript, but they certainly do work with node trees when convenient. XML folks have been applying XPath and XQuery to HTML for a long while now, even to HTML where such things are difficult, as John Cowan's TagSoup demonstrates.
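
As a rough illustration of that path from non-XML HTML to a queryable node tree, here is a minimal sketch using only Python's standard library. It is neither the HTML5 parsing algorithm nor TagSoup: the recovery rules (implicitly closing an open p or li, ignoring stray end tags) are deliberately simplified assumptions, and the names SoupTreeBuilder and parse_html are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never have content or an end tag in HTML (partial list).
VOID = {"br", "hr", "img", "input", "link", "meta"}


class SoupTreeBuilder(HTMLParser):
    """Hypothetical helper: builds an ElementTree from loosely written HTML."""

    def __init__(self):
        super().__init__()
        # Synthetic root so that input without a single top element still works.
        self.root = ET.Element("document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # Simplified recovery: an open <p> or <li> is closed by the next one.
        if tag in ("p", "li") and self.stack[-1].tag == tag:
            self.stack.pop()
        attrib = {name: value or "" for name, value in attrs}
        elem = ET.SubElement(self.stack[-1], tag, attrib)
        if tag not in VOID:
            self.stack.append(elem)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        current = self.stack[-1]
        if len(current):
            # Text following a child element is stored as that child's tail.
            last = current[-1]
            last.tail = (last.tail or "") + data
        else:
            current.text = (current.text or "") + data


def parse_html(text):
    builder = SoupTreeBuilder()
    builder.feed(text)
    builder.close()
    return builder.root


# Not well-formed XML: unclosed <p> elements and a bare <br>.
soup = '<html><body><p class="a">one<p class="b">two<br>more</body></html>'
tree = parse_html(soup)

# ElementTree supports a limited XPath subset, enough for this query.
texts = [p.text for p in tree.findall(".//p[@class='b']")]
```

Once the tree exists, the query neither knows nor cares that the source text was not XML. A real application would use a parser that implements the HTML5 rules rather than these ad hoc ones.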

There is also a growing, though I think unfortunate, school of web application design that thinks of HTML as merely a serialization for the underlying DOM created and manipulated through JavaScript.

Thanks,
--
Simon St.Laurent
http://simonstl.com/

