Subject: Re: Processing XML with multiple nested CDATA sections
From: "Christopher R. Maden" <crism@xxxxxxxxx>
Date: Thu, 28 Feb 2013 18:59:50 -0500
On 02/28/2013 06:47 PM, dvint@xxxxxxxxx wrote:
> I have an XML file that is an export from a Wiki site. The
> management information for the wiki is in clear XML, but the
> information contained in the pages (actual content) has been
> wrapped in CDATA sections. Some of these CDATA sections have CDATA
> sections in them. I need to extract the content and create
> individual files for each of the pages.
You will need to preprocess the content, I think.
There is no such thing as a nested CDATA section (just like there is
no such thing as a nested comment). The first ]]> ends the currently
open CDATA section, period. It looks like the generating software
gets around this, if your pasted example is accurate, by inserting a
space; the nested sections end in ]] >
(bracket-bracket-space-greaterthan).
You could handle this in XSLT, but I think that's doing it the hard way.
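
For illustration only, here is a minimal Python sketch of one way such a preprocessing step might look. It assumes the export wraps each page body in a single real CDATA section, and that the inner pseudo-sections appear as literal text opening with <![CDATA[ and closing with ]] > (bracket-bracket-space-greaterthan). The element name "page", the attribute "title", and the sample data are hypothetical; the real export will differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: element and attribute names are assumptions,
# not the layout of the real wiki export.
SAMPLE = (
    "<pages>"
    "<page title='Home'><![CDATA[Intro <![CDATA[inner text]] > outro]]></page>"
    "</pages>"
)


def unwrap_inner(text):
    """Drop the literal inner markers left inside an outer CDATA section.

    The XML parser has already consumed the outer <![CDATA[ ... ]]>, so
    the inner opener and the space-padded closer arrive as plain text.
    """
    return text.replace("<![CDATA[", "").replace("]] >", "")


def extract_pages(xml_text):
    """Return a dict mapping each page title to its cleaned content."""
    root = ET.fromstring(xml_text)
    return {
        page.get("title"): unwrap_inner(page.text or "")
        for page in root.iter("page")
    }


pages = extract_pages(SAMPLE)
for title, body in pages.items():
    print(title, "->", body)
```

Each cleaned body could then be written to its own file. If the page content is itself markup that needs parsing, a second pass over the cleaned text would be needed; simply deleting the markers, as done here, is only one possible choice.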
~Chris
--
Chris Maden, text nerd <URL: http://crism.maden.org/ >
FIVE TONS OF FLAX
GnuPG fingerprint: DB08 CF6C 2583 7F55 3BE9 A210 4A51 DBAC 5C5C 3D5E