Paul Prescod wrote:

> XML DTDs are in the business of constraining people to the data models and
> data that the software is expecting/can deal with. I don't see any big
> difference between saying: "This content must be restricted to this set of
> characters" and "this content must be a NMTOKEN or base-64 encoded."

Put that way, I suppose you are right. As I said before, this could and should be handled as a special case of "The character data of this element must conform to the following regular expression."

> Nevertheless, this is clearly a schema problem and CDATA sections seem to
> me to be a really bad tool for enforcing this distinction.

Particularly because it would mean that the charset of an XML document would become part of its schema: a document in US-ASCII can have only ASCII in its CDATA sections, but if it were transcoded to ShiftJIS, then it could have any JIS X 208 character in the CDATA section.

So this means that transcoding arbitrary XML documents *requires* parsing them, because if you are reducing the repertoire, you may need to break up CDATA sections, and you cannot (?) recognize a CDATA section reliably without parsing. (In particular, what looks like a CDATA section start/end could appear as an attribute value, PI data, or comment.)

An interesting side effect!

--
John Cowan    http://www.ccil.org/~cowan    cowan@c...
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
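
The side effect described in the message can be illustrated with a small Python sketch. It is not part of the original post; the function names `reduce_cdata_to_ascii` and `parsed_cdata_sections` are hypothetical, and the sketch assumes the target encoding is US-ASCII and that non-ASCII characters are acceptable as numeric character references outside CDATA. The first function shows why a CDATA section may need to be broken up when the repertoire is reduced; the second contrasts a naive textual search with a real parser (the standard library's `xml.dom.minidom`) when something that merely looks like a CDATA section sits inside a comment.

```python
import re
from xml.dom import minidom, Node


def reduce_cdata_to_ascii(content):
    """Re-express the text of one CDATA section using ASCII only.

    `content` is the text found between '<![CDATA[' and ']]>'; this
    assumes a parser has already located the section. ASCII runs stay
    in CDATA sections; every other character is written as a numeric
    character reference, which is only recognized outside CDATA.
    """
    pieces = []
    # Split the text into alternating ASCII and non-ASCII runs.
    for run in re.findall(r'[\x00-\x7f]+|[^\x00-\x7f]+', content):
        if run.isascii():
            pieces.append('<![CDATA[' + run + ']]>')
        else:
            pieces.append(''.join('&#x%X;' % ord(ch) for ch in run))
    return ''.join(pieces)


def parsed_cdata_sections(xml_text):
    """Return the text of every real CDATA section, found by parsing."""
    found = []

    def walk(node):
        for child in node.childNodes:
            if child.nodeType == Node.CDATA_SECTION_NODE:
                found.append(child.data)
            walk(child)

    walk(minidom.parseString(xml_text))
    return found


# A comment that only *looks* like it contains a CDATA section.
doc = '<!-- <![CDATA[ not a section ]]> --><a><![CDATA[real]]></a>'

# Naive textual search: also matches the look-alike inside the comment.
naive = re.findall(r'<!\[CDATA\[(.*?)\]\]>', doc, flags=re.S)

# Parser-based search: reports only the genuine section.
parsed = parsed_cdata_sections(doc)

reduced = reduce_cdata_to_ascii('a < b \u65e5\u672c c')
```

Here `naive` picks up two candidates while `parsed` holds only the genuine section, and `reduced` splits one CDATA section into two with character references in between. A transcoder working purely on text would have no reliable way to tell the two cases in `doc` apart.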



