Table of contentsAppendices |
2.2 CharactersCharactersA parsed entity contains text, a sequence of Character, which may represent markup or character data. A character is an atomic unit of text as specified by ISO/IEC 10646 [ISO10646]. Legal characters are tab, carriage return, line feed, and the legal characters of Unicode and ISO/IEC 10646. The versions of these standards cited in [Normative References] were current at the time this document was prepared. New characters may be added to these standards by amendments or new editions. Consequently, XML processors MUST accept any character in the range specified for Char. Character RangeThe mechanism for encoding character code points into bit patterns MAY vary from entity to entity. All XML processors MUST accept the UTF-8 and UTF-16 encodings of Unicode [Unicode]; the mechanisms for signaling which of the two is in use, or for bringing other encodings into play, are discussed later, in [Character Encoding in Entities]. NOTE: |