Subject: Smart Quote Encoding
From: "Roger L. Cauvin" <roger@xxxxxxxxxx>
Date: Wed, 12 Sep 2007 11:56:38 -0500
I am using Saxon 6.3 and trying to transform some XML using a stylesheet.
The XML is a log file that logs incoming text-only e-mail messages. The
messages sometimes contain special/nonstandard characters, such as smart
quotes. If I want to be able to log the verbatim messages yet still be able
to apply XSLT, what is my best strategy?
With XML such as:
<message-received>
<from><![CDATA[John Spong <jspong@xxxxxxxxx>]]></from>
<text><![CDATA[Descartes said, “I think, therefore I am.”]]></text>
</message-received>
(The “ and ” characters are smart quotes.)
I receive the following error when I try to apply transformations:
Fatal error reported by XML parser: illegal XML character U+18
URL: file:/C:/hello/goodbye.log
Line: 8
Column: 116
Error
org.xml.sax.SAXParseException: illegal XML character U+18: illegal XML character U+18
Transformation failed
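To narrow down where such characters occur, a small check along these lines might help. It is only a sketch in Python, separate from the Saxon setup: the names is_xml10_char and find_illegal_xml_chars are made up for illustration, and it assumes the log has already been decoded to text. It flags code points that fall outside the Char production of XML 1.0, which excludes control characters such as U+0018.

```python
def is_xml10_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_illegal_xml_chars(text: str):
    """Return (line, column, code point) for each character XML 1.0 forbids.

    Lines and columns are 1-based. Hypothetical helper, not a library call.
    """
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if not is_xml10_char(ord(ch)):
                hits.append((line_no, col_no, ord(ch)))
    return hits


# Made-up sample with two control characters where quotes were expected.
sample = "<text>Descartes said, \x18I think\x19</text>"
for line_no, col_no, cp in find_illegal_xml_chars(sample):
    print(f"line {line_no}, column {col_no}: U+{cp:04X}")
```

Running it over the decoded log would show whether the file really contains control characters rather than the quote characters themselves.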
The XML file contains the following encoding declaration:
<?xml version="1.0" encoding="ISO-8859-1"?>
I have also tried UTF-8 and US-ASCII encodings, with the same results.
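As a side check on the encoding declaration, here is a minimal Python sketch showing which of these encodings can represent curly double quotes at all. It assumes the smart quotes are U+201C and U+201D, and try_encode is an illustrative helper rather than a library function. Windows-1252 is included only for comparison, since it is a common source of smart quotes.

```python
LEFT, RIGHT = "\u201c", "\u201d"  # assumed: curly double quotes


def try_encode(text: str, encoding: str):
    """Return the encoded bytes, or None if the encoding cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


for enc in ("iso-8859-1", "us-ascii", "utf-8", "cp1252"):
    print(enc, try_encode(LEFT + RIGHT, enc))
```

ISO-8859-1 and US-ASCII have no code points for these characters, while UTF-8 and Windows-1252 do, so changing the declaration alone can only help if it matches the bytes actually written to the log.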
How can I handle arbitrary text and still be able to apply
transformations?
--
Roger L. Cauvin
Cauvin, Inc.
Product Management/Market Research