Subject: CDATA with different encoding
From: Bartolomeo Nicolotti <bnicolotti@xxxxxxxxx>
Date: Wed, 26 Aug 2009 10:05:38 +0200
Hi,
I've searched the archives for CDATA and encoding but found nothing,
so here's my post. It may be trivial, really sorry...
We're receiving XML from a supplier that is declared as ISO-8859-1, as
specified by the XML declaration:
<?xml version="1.0" encoding="ISO-8859-1" ?>
but the element content is encoded in UTF-8.
This would cause the parser to fail:
http://www.w3schools.com/xmL/xml_encoding.asp
So the supplier has surrounded the tag bodies with CDATA.
<tag><![CDATA[ ...utf-8 ... ]]></tag>
Is this correct? I.e. is it possible to have a different encoding inside
a CDATA section from that of the rest of the XML document?
I've googled a bit but haven't found a clear answer.
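To make the question concrete, here is a minimal sketch of the situation.
It is not our production code: it uses the JDK's built-in JAXP parser
rather than XMLBeans, and the class and method names (CdataEncodingDemo,
parseMixed) are made up for the example. It builds a document whose
declaration says ISO-8859-1 while the CDATA content is written as UTF-8
bytes, then prints what the parser returns. As far as I can tell, the
CDATA section is decoded with the declared encoding like everything else.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class CdataEncodingDemo {

    // Hypothetical helper: wraps `body` in a CDATA section, writing the
    // markup as ISO-8859-1 bytes but the body itself as UTF-8 bytes,
    // then parses the result and returns the element's text.
    static String parseMixed(String body) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write("<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?><tag><![CDATA["
                .getBytes(StandardCharsets.ISO_8859_1));
        out.write(body.getBytes(StandardCharsets.UTF_8));   // mismatched encoding
        out.write("]]></tag>".getBytes(StandardCharsets.ISO_8859_1));

        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(out.toByteArray()));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // U+00E8 (e with grave accent) is the two bytes 0xC3 0xA8 in UTF-8.
        String parsed = parseMixed("\u00e8");
        // Print the code points so the result does not depend on the console encoding.
        parsed.chars().forEach(c -> System.out.printf("U+%04X%n", c));
    }
}
```

If the CDATA section really switched encodings, I would expect a single
character U+00E8 back. Instead each UTF-8 byte comes back as its own
ISO-8859-1 character.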
We've built a parser with the latest stable version of XMLBeans, but the
parser complains about characters inside the tags that are UTF-8 but are
illegal in ISO-8859-1.
com.siap.DPKWebServices.Util.OTA_literal_HttpPost.queryHttp caught an exception: 29047814 org.apache.xmlbeans.XmlException
e.toString():org.apache.xmlbeans.XmlException: error: Illegal XML character: 0x1c
org.apache.xmlbeans.impl.piccolo.io.IllegalCharException: Illegal XML character: 0x1c
    at org.apache.xmlbeans.impl.piccolo.xml.XMLReaderReader.read(XMLReaderReader.java:169)
    at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yy_refill(PiccoloLexer.java:3474)
Many thanks
Best regards...
--
Bartolomeo Nicolotti
SIAP s.r.l.
www.siapcn.it
v.S.Albano 13 12049
Trinità (CN) Italy
ph:+39 0172 652553
centralino: +39 0172 652511
fax: +39 0172 652519