Subject: Re: I18N / UTF-8 versus US-ASCII
From: David Carlisle <davidc@xxxxxxxxx>
Date: Tue, 4 Apr 2006 14:01:16 +0100
I wrote
> Of course the other cases where you can not use a restricted encoding
> are cases where the element or attribute names use non-ascii characters.
or in comments or processing instructions or CDATA sections.
An XSL system will just avoid using CDATA sections if it needs to write a
character reference, but even an "identity" transform will die if there
is a non-ASCII character in a comment in the source and the stylesheet
has <xsl:output encoding="US-ASCII"/>
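To illustrate why a character reference is no help inside a comment, here is a small Python sketch using the standard library's minidom parser. The sample string and the function name comment_and_text are made up for illustration. It parses the same reference once in a comment and once in element content; only the second should be expanded.

```python
from xml.dom import minidom

# Hypothetical sample: the same character reference appears once inside a
# comment and once in ordinary element content.
SAMPLE = "<x><!-- &#233; -->&#233;</x>"


def comment_and_text(xml_source):
    """Return (comment data, text data) for the first two children of the root."""
    root = minidom.parseString(xml_source).documentElement
    comment, text = root.childNodes[0], root.childNodes[1]
    return comment.data, text.data


comment_data, text_data = comment_and_text(SAMPLE)
print(repr(comment_data))  # the reference stays as six literal characters
print(repr(text_data))     # the reference is expanded to a single e-acute
```

So a serializer that writes &#233; into a comment has changed the comment's content rather than escaped it, which is why a restricted output encoding leaves it with no lossless option there.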
er....
After writing the above I made a small test file to demonstrate this but.....
I hope this gets through without the non-ASCII character being mangled;
the XML source is supposed to contain a Latin-1 encoded e-acute.
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- é -->
<x/>
and the stylesheet just copies everything:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output encoding="US-ASCII"/>
<xsl:template match="/">
<xsl:copy-of select="."/>
</xsl:template>
</xsl:stylesheet>
Unfortunately I think both Saxon 6 and Saxon 8 get this wrong; I'll
forward this to Saxon's bug reporting list.
$ saxon comment.xml comment.xsl
<?xml version="1.0" encoding="US-ASCII"?><!-- é --><x/>
Saxon 6.5.4 seems to have made the comment into text so that it could use a
character reference for the e-acute.
$ saxon8 comment.xml comment.xsl
Warning: Running an XSLT 1.0 stylesheet with an XSLT 2.0 processor
<?xml version="1.0" encoding="US-ASCII"?><!-- ? --><x/>
Saxon 8.7J keeps the comment but converts the character that can't be
represented in US-ASCII to a ?; I think that it's supposed to moan with
err:SERE0008
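As a rough sketch of the behaviour I'd expect (this is not Saxon's code; the function names and the strict flag are invented for illustration), here is the difference between serializing text, where a character reference is available, and serializing a comment, where it is not:

```python
from xml.sax.saxutils import escape


def serialize_text(value, encoding="us-ascii"):
    """Text nodes: escape markup, then use character references as a fallback."""
    escaped = escape(value)
    return escaped.encode(encoding, errors="xmlcharrefreplace").decode(encoding)


def serialize_comment(value, encoding="us-ascii", strict=True):
    """Comments: no escaping mechanism exists, so either fail or lose data.

    strict=True mimics the error I think the serializer should raise;
    strict=False mimics the '?' substitution seen in the second run.
    """
    try:
        value.encode(encoding)
    except UnicodeEncodeError as exc:
        if strict:
            raise ValueError(
                "SERE0008: character cannot be represented in "
                + encoding + " inside a comment"
            ) from exc
        # Lossy fallback: every unencodable character becomes '?'.
        value = value.encode(encoding, errors="replace").decode(encoding)
    return "<!--" + value + "-->"


print(serialize_text("\u00e9"))
print(serialize_comment(" \u00e9 ", strict=False))
try:
    serialize_comment(" \u00e9 ")
except ValueError as err:
    print(err)
```

The lossy branch silently changes the document, which is why I'd rather see the error.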
David