  • From: Richard Tobin <richard@c...>
  • To: xml-dev@l...
  • Date: Wed, 28 Mar 2001 12:37:55 +0100 (BST)

> Could someone explain to me why CDATA section start/end markers were
> taken out of the W3C Infoset?

Two main reasons:

(a) They are not robust in the face of character-set translation.  The
    only characters that can appear in them are the ones your encoding
    can represent directly, since character and entity references are
    not recognized inside a CDATA section.  So if you, say, include a
    Cyrillic letter in a CDATA section in your UTF-8 document, and then
    translate it to Latin-1, you will have to break the CDATA section
    and use a character reference for that letter.

(b) They are purely syntactic sugar.  Nothing (except an editor or similar)
    should treat the following differently:

    AT&#38;T is a large corporation
    <![CDATA[AT&T]]> is a large corporation
    <![CDATA[AT&T is a large corporation]]>
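
Both points can be checked with an ordinary parser.  What follows is
only a minimal sketch, assuming Python's standard
xml.etree.ElementTree module (not something this message relies on);
the wrapper element <s> and the sample strings are made up for
illustration.

```python
import xml.etree.ElementTree as ET

# (b) Three spellings of the same content; <s> is a hypothetical wrapper.
variants = [
    "<s>AT&#38;T is a large corporation</s>",
    "<s><![CDATA[AT&T]]> is a large corporation</s>",
    "<s><![CDATA[AT&T is a large corporation]]></s>",
]
texts = [ET.fromstring(v).text for v in variants]
all_same = len(set(texts)) == 1

# (a) A Cyrillic letter (U+0416) inside a CDATA section.
source = "<s><![CDATA[x < \u0416]]></s>"
wanted = ET.fromstring(source).text

# Naive transcoding to Latin-1 puts "&#1046;" inside the CDATA section,
# where it is literal text rather than a character reference.
naive = source.encode("latin-1", errors="xmlcharrefreplace")
naive_text = ET.fromstring(naive).text

# Ending the section before the letter lets the reference be recognized.
split = "<s><![CDATA[x < ]]>&#1046;</s>".encode("latin-1")
split_text = ET.fromstring(split).text

print(all_same, naive_text == wanted, split_text == wanted)
```

The parser reports the same character data for all three variants in
(b), and in (a) only the version with the CDATA section broken around
the letter round-trips to the original text.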

> I was hoping to use the Infoset in writing the spec for XML Script,

Did you intend to give variants like those in (b) different meanings
in XML Script?

-- Richard
