

Uche Ogbuji wrote:
>>>No.  It's illegal to have ]]> anywhere in character content except at
>>
>>Dang, you're right.  oops.
>>
>>
>>>True, although I think it's simpler to just pick one quotation style
>>>and escape a single character rather than trying to be clever about
>>>using the "correct" quotation mark in each case.
>>
>>Prob'ly so.  That puts us up to < & > " as special.  Still reasonable,
>>I think, even if it doubled my original claim.
> 
> 
> Perhaps, but I think this little exchange also demonstrates my point that it's 
> never as simple as one thinks it is.
> 
> If this sort of thing tripped you up, Rich, imagine the potential for failure 
> by the average programmer.

It's worse than this.  If your infoset contains a carriage return, you 
have to output it as a numeric character reference, otherwise line-end 
normalization will turn it into a line-feed. Similarly, if attribute 
values in the infoset contain line-feeds or tabs, they need to be output 
as numeric character references, otherwise attribute value normalization 
will turn them into spaces.
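
To make the rules above concrete, here is a minimal sketch in Python of the escaping a serializer would need for character data and for double-quoted attribute values. The helper names (escape_text, escape_attr, round_trip) are made up for illustration, and the round-trip check assumes the parser behind xml.etree.ElementTree applies line-end and attribute value normalization as the spec describes.

```python
import xml.etree.ElementTree as ET

# Hypothetical tables: characters that must not be written literally.
# "\r" is written as a character reference so that line-end
# normalization cannot turn it into a line-feed.
TEXT_ESCAPES = {"&": "&amp;", "<": "&lt;", ">": "&gt;", "\r": "&#13;"}

# Attribute values (always written with double quotes here) also need
# the quote escaped, plus tab and line-feed, which attribute value
# normalization would otherwise turn into spaces.
ATTR_ESCAPES = {**TEXT_ESCAPES, '"': "&quot;", "\t": "&#9;", "\n": "&#10;"}


def escape_text(value):
    """Escape a string for use as element character content."""
    return "".join(TEXT_ESCAPES.get(ch, ch) for ch in value)


def escape_attr(value):
    """Escape a string for use inside a double-quoted attribute value."""
    return "".join(ATTR_ESCAPES.get(ch, ch) for ch in value)


def round_trip(text, attr):
    """Serialize one element, parse it back, and return (text, attribute)."""
    xml = '<doc a="%s">%s</doc>' % (escape_attr(attr), escape_text(text))
    element = ET.fromstring(xml)
    return element.text, element.get("a")
```

Calling round_trip("a\rb", "x\ty\nz") should hand back the same two strings; dropping the "\r", "\t" or "\n" entries from the tables is a quick way to see the normalization at work. This covers only the cases mentioned so far, not everything a real serializer has to handle.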

If you still think it's easy, try serializing the infoset you get from this:

<!DOCTYPE doc [
<!ENTITY e "<?x y&#13;?>">
]>
<doc>&e;</doc>
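
For anyone who wants to poke at that document, here is a small sketch using Python's xml.parsers.expat to see what a parser reports for the processing instruction. As I read the XML 1.0 rules, the &#13; in the entity literal is expanded when the entity is declared, so the replacement text carries a real carriage return inside the processing instruction, and character references are not recognized inside a processing instruction, so it cannot simply be written back out as &#13;. The function name is made up for illustration, and what a particular parser actually reports for the data may differ.

```python
import xml.parsers.expat

# The document from the message above, unchanged.
DOC = """<!DOCTYPE doc [
<!ENTITY e "<?x y&#13;?>">
]>
<doc>&e;</doc>"""


def processing_instructions(document):
    """Return the (target, data) pairs the parser reports for a document."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    # Called once per processing instruction, including those that come
    # from the replacement text of an internal entity.
    parser.ProcessingInstructionHandler = (
        lambda target, data: seen.append((target, data))
    )
    parser.Parse(document, True)
    return seen


if __name__ == "__main__":
    for target, data in processing_instructions(DOC):
        # repr() makes a trailing carriage return or line-feed visible.
        print(target, repr(data))
```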

James

