At 03:18 PM 10/17/98 -0700, Richard Emberson wrote:
>Now in production rule #2 titled Character Range
>surrogate blocks are explicitly excluded (along
>with FFFF and FFFE).

There are no Unicode characters whose numeric values are those which appear in the surrogate blocks; the blocks exist only to ensure the possibility of encoding non-BMP characters unambiguously. The productions in the spec describe the characters themselves, not any particular encoding of them.

>There are the extra, beyond 16-bit, characters specified
>by the spec in production rule #2 as "[#x10000-#x10FFFF]".
>Is this how Unicode characters that use the surrogate
>blocks get represented in an XML document?

Yes. For example, it's legal to have 𐀁

>Short of getting a copy of the Unicode 2.0 spec, is there
>anywhere where the conversion algorithm is documented?

I strongly recommend getting a copy of the spec. It's fairly priced and a very fine piece of work.

-Tim

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
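For readers who want to see the conversion asked about above, here is a minimal sketch in Python. It is not taken from the message or quoted from the Unicode book; it follows the standard UTF-16 surrogate arithmetic, and the function names (is_xml_char, to_surrogates, from_surrogates) are made up for illustration. U+10001 is used only as a sample code point from the "[#x10000-#x10FFFF]" range. The is_xml_char check mirrors the Char production of XML 1.0, which describes characters and therefore leaves out the surrogate values and FFFE/FFFF.

```python
def is_xml_char(cp: int) -> bool:
    """True if cp is allowed by the XML 1.0 Char production (rule #2)."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF        # stops just before the surrogate blocks
        or 0xE000 <= cp <= 0xFFFD      # resumes after them; excludes FFFE and FFFF
        or 0x10000 <= cp <= 0x10FFFF   # characters beyond 16 bits
    )


def to_surrogates(cp: int) -> tuple[int, int]:
    """Split a code point above FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points in 10000..10FFFF use surrogates")
    v = cp - 0x10000                   # 20-bit offset
    high = 0xD800 + (v >> 10)          # top 10 bits
    low = 0xDC00 + (v & 0x3FF)         # bottom 10 bits
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    print([hex(u) for u in to_surrogates(0x10001)])   # ['0xd800', '0xdc01']
    print(hex(from_surrogates(0xD800, 0xDC01)))       # 0x10001
```

The pair D800 DC01 only shows up when the document is stored in UTF-16. At the level of the XML productions the character is simply U+10001, which is why the surrogate values themselves fail is_xml_char.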