  • From: "Costello, Roger L." <costello@m...>
  • To: "xml-dev@l..." <xml-dev@l...>
  • Date: Wed, 26 Dec 2012 20:02:19 +0000

Hi Folks,

The characters in this XML document are encoded using UTF-8:

<?xml version="1.0"?>
<Name>López</Name>

Its encoding can be changed to another encoding using this simple XSLT program:
---------------------------------------------------
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
                           version="1.0">
    
    <xsl:output method="xml"
                         encoding="Shift_JIS"/>
    
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    
</xsl:stylesheet>
---------------------------------------------------

The encoding attribute on <xsl:output> specifies the desired encoding. The rest of the XSLT program simply performs an identity copy operation.
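For readers without an XSLT processor at hand, the same "parse, then re-serialize in another encoding" idea can be sketched in Python. This is only an analogy, assuming the standard library's xml.etree.ElementTree in place of the stylesheet; the names SOURCE and reserialize are made up for this illustration. One difference to expect: ElementTree writes decimal character references (&#243;), whereas the processor used for this post wrote a hexadecimal one (&#xf3;).

```python
import xml.etree.ElementTree as ET

# The sample document from this post, as UTF-8 bytes (hypothetical constant name).
SOURCE = b'<?xml version="1.0"?>\n<Name>L\xc3\xb3pez</Name>'


def reserialize(xml_bytes, encoding):
    """Parse the document and write it back out in the requested encoding.

    This plays the role of the identity transform plus <xsl:output encoding=...>.
    """
    root = ET.fromstring(xml_bytes)
    return ET.tostring(root, encoding=encoding)


for enc in ("utf-8", "iso-8859-1", "shift_jis"):
    # repr() of the bytes makes the non-ASCII bytes visible as \x.. escapes.
    print(enc, reserialize(SOURCE, enc))
```

The element content comes out as C3 B3 for utf-8, as F3 for iso-8859-1, and as a character reference for shift_jis.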

Shift_JIS is a character encoding for the Japanese language. 

iso-8859-1 (Latin-1) is a superset of ASCII. It defines 191 printable characters (ASCII defines 95 printable characters among its 128 code points). It contains the characters for most Western European languages.
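The character counts can be checked with a short Python sketch. It assumes that "printable" means "not a control character" (Unicode general category Cc), and it relies on the fact that the 256 byte values of iso-8859-1 map one-to-one onto the first 256 Unicode code points. The function names are made up for this illustration.

```python
import unicodedata


def printable_count(limit):
    """Count code points below `limit` that are not control characters (category Cc)."""
    return sum(1 for cp in range(limit) if unicodedata.category(chr(cp)) != "Cc")


def is_ascii_compatible(encoding):
    """True if every 7-bit ASCII byte decodes to the same character under `encoding`."""
    ascii_bytes = bytes(range(128))
    return ascii_bytes.decode(encoding) == ascii_bytes.decode("ascii")


print(printable_count(128))               # ASCII range: 95
print(printable_count(256))               # iso-8859-1 range: 191
print(is_ascii_compatible("iso-8859-1"))  # True
```

Counted this way, the 191 figure refers to printable characters only; with control codes included, ASCII has 128 code points and iso-8859-1 has 256.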

I applied the XSLT program to the above XML document twice: once specifying encoding="iso-8859-1" and once specifying encoding="Shift_JIS".

Then, using a hex editor I was able to see, at the byte level, the changes that were made to the XML document's encoding.

---------------------------------------------
encoding="utf-8"
L  ó     p  e  z
4C C3 B3 70 65 7A

Two bytes (C3 B3) used to encode ó 
---------------------------------------------
encoding="iso-8859-1"
L  ó  p  e  z
4C F3 70 65 7A

One byte (F3) used to encode ó 
---------------------------------------------
encoding="Shift_JIS"
L  &  #  x  f  3  ;  p  e  z
4C 26 23 78 66 33 3B 70 65 7A

Shift_JIS has no code for ó, so it is written as a numeric character reference (&#xf3;)
---------------------------------------------
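The three byte sequences above can be reproduced without a hex editor. The following Python sketch is an approximation of what an XML serializer does, not the XSLT processor's actual code: it encodes each character and, when the target encoding has no code for it, falls back to a hexadecimal character reference. The helper names are made up for this illustration, and the per-character approach is only suitable for encodings that carry no byte order mark or shift state (it would misbehave for utf-16, for example).

```python
def encode_like_xml_serializer(text, encoding):
    """Encode `text`, replacing unencodable characters with &#x...; references."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(encoding)
        except UnicodeEncodeError:
            # The character does not exist in the target encoding.
            out += "&#x{:x};".format(ord(ch)).encode("ascii")
    return bytes(out)


def hex_dump(data):
    """Format bytes the way a hex editor shows them: 4C C3 B3 ..."""
    return " ".join("{:02X}".format(b) for b in data)


name = "L\u00f3pez"  # "López", written with an escape so the source file stays ASCII
for enc in ("utf-8", "iso-8859-1", "shift_jis"):
    print(enc, hex_dump(encode_like_xml_serializer(name, enc)))
# utf-8      4C C3 B3 70 65 7A
# iso-8859-1 4C F3 70 65 7A
# shift_jis  4C 26 23 78 66 33 3B 70 65 7A
```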

Very cool!

For more info, see this excellent article: http://www.opentag.com/xfaq_enc.htm#enc_conv  

/Roger