

> Or we find an interoperable way to transport/encode the control
> characters (agree on entities or char references or PIs).

I would very much prefer to do this than to allow those naked codes to appear 
in text. I support the idea of finding a way to encode such data, rather than 
include it per se (as per Derek's suggested change in focus).

Numeric character references (of the form &#nnnn;) are essentially the same as 
the literal data (once parsed, the distinction is lost), so I would not support 
their use.
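
To illustrate the point about the distinction being lost, here is a minimal sketch using Python's standard xml.etree.ElementTree parser. It uses a printable character, since the control characters under discussion are not allowed in XML 1.0 at all; the element name "a" is just a placeholder.

```python
import xml.etree.ElementTree as ET

# One document carries the literal character, the other a numeric
# character reference to the same code point (decimal 65 is 'A').
literal = ET.fromstring("<a>A</a>")
referenced = ET.fromstring("<a>&#65;</a>")

# After parsing, the application sees identical text either way;
# nothing records which form the author used.
same = literal.text == referenced.text
print(same)
```

So an application sitting behind the parser cannot tell a referenced character from a naked one, which is why the reference buys us nothing here.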

PIs, while being one mechanism, are application-specific, so they are probably 
not ideal.

That leaves us with entities. Perhaps something along the lines of creating a 
"virtual" entity set in the &Unnnn; space? This was suggested in the ERCS 
days... 

  "an XML 1.1 processor may interpret entity references beginning with the
   letter 'U', followed by 4 hexadecimal characters as representing an
   entity holding the representation of the Unicode Scalar equivalent of 
   the number."

This would provide a standard naming scheme for entities representing code 
points, but leave the exact resolved value undefined. No value is necessary 
anyway, as the entity reference provides all the needed information.
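
As a rough sketch of how an application might work with such a naming scheme (the function names here are hypothetical, not part of any spec or library, and the choice of which characters to encode is my assumption):

```python
import re

# Entity names of the proposed form: 'U' followed by exactly 4 hex digits.
_U_NAME = re.compile(r"U([0-9A-Fa-f]{4})")

def code_point_for_entity(name):
    """Return the code point named by a Unnnn entity name, else None."""
    m = _U_NAME.fullmatch(name)
    if m is None:
        return None
    cp = int(m.group(1), 16)
    # Surrogates (D800-DFFF) are not Unicode scalar values.
    if 0xD800 <= cp <= 0xDFFF:
        return None
    return cp

def encode_controls(text):
    """Replace C0 controls (except tab, LF, CR) with &Unnnn; references."""
    return "".join(
        f"&U{ord(c):04X};" if ord(c) < 0x20 and c not in "\t\n\r" else c
        for c in text
    )
```

For example, encode_controls("a\x01b") gives "a&U0001;b", and code_point_for_entity("U0001") gives 1, while a name like "amp" is left alone (None). The reference is never resolved to the raw character; the name alone tells the application which code point is meant.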



