Roger, I think that to a certain extent you're applying Ted Codd's "12 rules of database design" to XML - in essence, what you've done here is restate a number of key normalization rules in XML terms. The challenge that I find when attempting to do this is that such normalization can only occur in situations where the schema has what I term low complexity. (I address this in a distinctly un-user-friendly paper called <a href="http://www.understandingxml.com/archives/2005/01/information_los.html">Information Loss and Schema Complexity</a>, which represents some thoughts I've had on getting a handle on the potentially complex nature of XML. It's pretty heavy reading, though I admit that I backed away from a more formal mathematical notation when I realized just how intimidating it looked.) Keep in mind that in areas where you have complex documents (ones in which there are a large number of potential states for a given sequence of nodes in the document tree), normalization becomes much more difficult to maintain. Similarly, the more you normalize content, the more hierarchical restructuring becomes necessary when transforming that content into other representations (especially in terms of filters where you have multiple distinct flat structures that may have deep referential relationships). This is not to say that the rules you're expressing aren't important - in general, you are exactly correct in saying that the best form a schema can take (what I refer to as a canonical form in the same paper) is one where fields are decoupled as much as possible and where relationships are explicit.
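As a rough illustration of that restructuring cost, here is a minimal sketch in Python. It assumes a hypothetical normalized document in which orders point at customers through an id reference; the element names, attribute names, and the `nest_orders` helper are invented for the example and do not come from any schema discussed in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical normalized document: two flat lists, with the
# relationship between them carried by an explicit id reference.
NORMALIZED = """
<data>
  <customers>
    <customer id="c1"><name>Ann</name></customer>
    <customer id="c2"><name>Bob</name></customer>
  </customers>
  <orders>
    <order id="o1" customer="c1"><item>Widget</item></order>
    <order id="o2" customer="c2"><item>Gadget</item></order>
    <order id="o3" customer="c1"><item>Sprocket</item></order>
  </orders>
</data>
"""


def nest_orders(xml_text):
    """Rebuild a container/contained view: each customer holds its orders."""
    root = ET.fromstring(xml_text)
    result = ET.Element("customers")

    # First pass: copy each customer and index the copy by id.
    by_id = {}
    for cust in root.iter("customer"):
        nested = ET.SubElement(result, "customer", id=cust.get("id"))
        ET.SubElement(nested, "name").text = cust.findtext("name")
        by_id[cust.get("id")] = nested

    # Second pass: resolve each order's reference and attach it
    # under the customer it points to.
    for order in root.iter("order"):
        parent = by_id[order.get("customer")]
        nested_order = ET.SubElement(parent, "order", id=order.get("id"))
        ET.SubElement(nested_order, "item").text = order.findtext("item")

    return result


nested = nest_orders(NORMALIZED)
print(ET.tostring(nested, encoding="unicode"))
```

Even with a single reference between two flat lists, the transform needs an index and a second pass; each further layer of references would presumably add another lookup of the same kind.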
In a simple schema such as the one you propose, this canonicalization is obvious, but in a high-complexity schema (such as a DocBook document) such canonicalization would be both expensive and largely irrelevant, as it is precisely the relationships themselves - the container/contained relationships in particular - that ARE the critical parts of the document.
-- Kurt Cagle, UnderstandingXML, http://www.understandingxml.com