Hi Folks,
The characters in this XML document are encoded using UTF-8:
<?xml version="1.0"?>
<Name>López</Name>
Its encoding can be changed to another encoding using this simple XSLT program:
---------------------------------------------------
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="xml"
encoding="Shift_JIS"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
---------------------------------------------------
The encoding attribute on <xsl:output> specifies the desired encoding. The rest of the XSLT program simply performs an identity copy operation.
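For readers without an XSLT processor at hand, the same idea (copy the document unchanged, change only the output encoding) can be sketched with Python's standard library. This is an illustration, not the XSLT above: the name SOURCE is made up for the example, and ElementTree writes unencodable characters as decimal character references rather than hexadecimal ones.

```python
import xml.etree.ElementTree as ET

# The sample document from this post, held as UTF-8 bytes.
# (SOURCE is a name chosen for this sketch.)
SOURCE = '<?xml version="1.0"?>\n<Name>López</Name>'.encode("utf-8")

# Parse once; the tree itself holds Unicode text, independent of any encoding.
root = ET.fromstring(SOURCE)

# Re-serialize the unchanged tree in each target encoding.
as_latin1 = ET.tostring(root, encoding="iso-8859-1")
as_sjis = ET.tostring(root, encoding="Shift_JIS")

print(as_latin1)
print(as_sjis)
```

The parsed tree is the same in both cases; only the serialization step differs, which is the role <xsl:output> plays in the stylesheet.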
Shift_JIS is a character encoding for the Japanese language.
iso-8859-1 (Latin-1) is a superset of ASCII. It defines 191 printable characters (ASCII defines 128 characters, 95 of them printable) and covers most Western European languages.
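The character counts can be checked with a short Python sketch. It assumes "printable" means every code point that is not a control character (Unicode category Cc); the helper name graphic_count is made up for the example.

```python
import unicodedata

# All 128 ASCII byte values, 0x00 through 0x7F.
ascii_bytes = bytes(range(128))

# Superset check: every ASCII byte decodes to the same character in iso-8859-1.
is_superset = ascii_bytes.decode("ascii") == ascii_bytes.decode("iso-8859-1")

def graphic_count(text: str) -> int:
    """Count the characters that are not control characters (category Cc)."""
    return sum(unicodedata.category(ch) != "Cc" for ch in text)

latin1_graphic = graphic_count(bytes(range(256)).decode("iso-8859-1"))
ascii_graphic = graphic_count(ascii_bytes.decode("ascii"))

print(is_superset)     # True
print(latin1_graphic)  # 191
print(ascii_graphic)   # 95
```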
I applied the XSLT program to the above XML document, first specifying encoding="iso-8859-1" and then encoding="Shift_JIS".
Then, using a hex editor, I was able to see, at the byte level, the changes that were made to the XML document's encoding.
---------------------------------------------
encoding="utf-8"
L ó p e z
4C C3 B3 70 65 7A
Two bytes (C3 B3) used to encode ó
---------------------------------------------
encoding="iso-8859-1"
L ó p e z
4C F3 70 65 7A
One byte (F3) used to encode ó
---------------------------------------------
encoding="Shift_JIS"
L & # x f 3 ; p e z
4C 26 23 78 66 33 3B 70 65 7A
Shift_JIS has no code for ó, so the XSLT processor writes it as a character reference (&#xf3;)
---------------------------------------------
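The byte values above can be reproduced without a hex editor. The following is a minimal Python sketch, not part of the XSLT workflow: the helper name hex_bytes is made up for the example, and Python's "xmlcharrefreplace" error handler emits a decimal reference (&#243;) where the XSLT processor above emitted the hexadecimal form (&#xf3;). Both refer to the same character, U+00F3.

```python
TEXT = "López"

def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

utf8 = TEXT.encode("utf-8")
latin1 = TEXT.encode("iso-8859-1")
# Shift_JIS cannot represent ó, so ask Python to substitute a
# numeric character reference instead of raising an error.
sjis = TEXT.encode("shift_jis", errors="xmlcharrefreplace")

print(hex_bytes(utf8))       # 4C C3 B3 70 65 7A
print(hex_bytes(latin1))     # 4C F3 70 65 7A
print(sjis.decode("ascii"))  # L&#243;pez
```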
Very cool!
For more info, see this excellent article: http://www.opentag.com/xfaq_enc.htm#enc_conv
/Roger