  • From: Liam R E Quin <liam@w...>
  • To: "Costello, Roger L." <costello@m...>
  • Date: Tue, 17 May 2011 17:07:25 +0200

On Tue, 2011-05-17 at 10:49 -0400, Costello, Roger L. wrote:
[...]
> The following statements apply to "data" not to "markup" (i.e.,
> element names, attribute names).
> 
> 1. Except for unpaired surrogate codepoints and a few control
> characters, you can use any character you want in XML documents.

In particular, codepoint 0 is not allowed.
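
For reference, here is a minimal Python sketch of the rule as I read the
Char production in XML 1.0 (section 2.2). The function names are my own,
invented for illustration. Note that the production leaves out the whole
surrogate block as well as U+FFFE and U+FFFF, not only the control
characters.

```python
# Code point ranges taken from the Char production of XML 1.0, section 2.2.
XML10_CHAR_RANGES = (
    (0x9, 0xA),            # tab and line feed
    (0xD, 0xD),            # carriage return
    (0x20, 0xD7FF),        # everything up to the surrogate block
    (0xE000, 0xFFFD),      # private use through U+FFFD (no U+FFFE, U+FFFF)
    (0x10000, 0x10FFFF),   # the supplementary planes
)


def is_xml10_char(codepoint: int) -> bool:
    """Return True if the code point matches the XML 1.0 Char production."""
    return any(lo <= codepoint <= hi for lo, hi in XML10_CHAR_RANGES)


def first_illegal_char(text: str):
    """Return (index, code point) of the first disallowed character, or None."""
    for index, ch in enumerate(text):
        if not is_xml10_char(ord(ch)):
            return index, ord(ch)
    return None
```

For example, first_illegal_char("ab\x00c") reports position 2, the
codepoint 0 mentioned above.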

> 2. The characters don't have to be defined in the Unicode
> specification.

The codepoints do not have to have Unicode characters associated with
them.
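
A small Python sketch of this, assuming U+0378 is still unassigned in the
Unicode database your Python ships with (it was, as far as I know, at the
time of writing). The helper name is hypothetical. The parser accepts the
codepoint even though no character is associated with it.

```python
import unicodedata
import xml.etree.ElementTree as ET


def parse_codepoint(codepoint: int) -> str:
    """Parse a tiny document whose only content is a reference to codepoint."""
    return ET.fromstring(f"<a>&#x{codepoint:X};</a>").text


# U+0378 is an assumption for illustration: any unassigned but XML-legal
# code point would do.
ch = parse_codepoint(0x378)

# "Cn" is the general category for unassigned code points; the answer
# depends on the Unicode version bundled with the running interpreter.
category = unicodedata.category(ch)
print(f"U+{ord(ch):04X} parsed fine, general category {category}")
```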

> 
> 3. For characters that don't have a visual representation or aren't in
> the Unicode character set, you can use them via XML's character
> entity mechanism, e.g., &#xffed;

You can do that with any allowed character, and you can also include the
character directly.
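
A quick way to check both halves of that, sketched in Python with the
standard library's ElementTree (helper names are mine): the reference and
the literal character produce the same text, and a reference to a
disallowed codepoint such as 0 is rejected.

```python
import xml.etree.ElementTree as ET


def text_of(document: str) -> str:
    """Return the text content of the root element of a small document."""
    return ET.fromstring(document).text


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


by_reference = text_of("<a>&#xffed;</a>")   # numeric character reference
direct = text_of("<a>\uffed</a>")           # the character itself
print(by_reference == direct)               # same content either way
print(is_well_formed("<a>&#0;</a>"))        # a reference does not make 0 legal
```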

> 
> 4. Implementers of XML applications are free to choose which version
> of Unicode they will support. Thus, one implementer of an XML Schema
> validator may choose to support Unicode 2.0, while another implementer
> of an XML Schema validator may choose to support Unicode 2.1. One
> implementer of an XSLT processor may choose to support Unicode 2.0,
> while another implementer of an XSLT processor may choose to support
> Unicode 2.1.

Or the version of Unicode understood may depend on the operating
environment, e.g. on the Java VM in use.

>
> 5. In XML applications that use regular expressions (e.g. XML Schema,
> XSLT), be careful about using regexes that contain regex categories
> such as Nd. The characters in those regex categories may vary
> depending on which version of Unicode an implementer supports. Thus,
> your application may execute without errors with one vendor's tool and
> fail on another.

That may turn out to be exactly what you want: "When our system is
upgraded, our schema is ready for it"...
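
To make the trade-off concrete, here is a rough Python sketch (function
names are hypothetical). Python stands in for "the operating environment"
here: its unicodedata module reports which Unicode version it was built
with, and membership in Nd follows that version. A fixed range such as
0-9 does not move when the environment is upgraded.

```python
import sys
import unicodedata


def unicode_db_version() -> str:
    """Unicode version of the character database in this interpreter."""
    return unicodedata.unidata_version


def is_nd(ch: str) -> bool:
    """True if ch is in general category Nd for this Unicode version."""
    return unicodedata.category(ch) == "Nd"


def is_ascii_digit(ch: str) -> bool:
    """True only for 0-9, whatever Unicode version is in use."""
    return "0" <= ch <= "9"


def count_nd() -> int:
    """Count the Nd characters known to this interpreter (varies by version)."""
    return sum(1 for cp in range(sys.maxunicode + 1) if is_nd(chr(cp)))


print(unicode_db_version(), count_nd())
```

Running this under two different Python releases is an easy way to see
the count change while the 0-9 test stays the same.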

> 6. CREPDL is a technology that allows you to precisely define the
> universe of characters that you want to allow in your XML documents.

You can also do this with an XSD facet (a pattern restriction on a
string type).
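
The same idea, sketched in Python rather than in XSD: spell the allowed
repertoire out as explicit ranges and require the whole value to match,
much as a pattern has to match the whole value in a schema. The
repertoire below (whitespace, printable ASCII, and U+00A0 to U+00FF) is
an assumption chosen purely for illustration.

```python
import re

# Hypothetical repertoire: tab, LF, CR, printable ASCII, U+00A0..U+00FF.
# Explicit ranges do not change meaning across Unicode versions.
ALLOWED = re.compile(r"[\t\n\r\x20-\x7E\u00A0-\u00FF]*")


def in_repertoire(text: str) -> bool:
    """True if every character of text is in the allowed repertoire."""
    return ALLOWED.fullmatch(text) is not None


def offending_chars(text: str) -> list:
    """Sorted list of the distinct characters that fall outside it."""
    return sorted({ch for ch in text if not ALLOWED.fullmatch(ch)})


print(in_repertoire("caf\u00e9"))        # accented Latin-1 letter is allowed
print(offending_chars("a\uffedb"))       # U+FFED is outside the repertoire
```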

Liam


-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://www.fromoldbooks.org/
Occasional blog: http://www.barefootliam.org/
The barefoot typographer




