Gary,
At 09:56 AM 3/14/2006, you wrote:
On 14/03/06, andrew welch <andrew.j.welch@xxxxxxxxx> wrote:

Oh the input is *double* escaped? Fun.

Two available options:

1. Isolate a transformation step to write a file in which the double-escaping is removed, in effect by resolving the "&amp;" entity to "&" so the file presents an honest "&nbsp;" -- then parse as normal. But this step in the pipeline has either to write a file or pass the data through, say, a SAX filter -- it has to serialize the data somehow, for reparsing: it can't work completely within XSLT's world of trees (the logical view). As Mike just observed, you're having to work with the lexical layer of the markup before it represents what it's supposed to represent (what you, but not the computer, know it "actually" represents through the double-escaped entities).

2. Use string processing. Since you're using XSLT 2.0 this is a reasonable option. A regular expression could be used to match the fake entities and turn them into something more useful. Probably this process would have to write a file too, to be parsed again, unless you used some kind of internal lookup table to take the place of the set of entity declarations (which are only available to a parser).

I hesitate to say more, as XSLT 2.0 gives much better facilities for handling such things than 1.0 did. (I could tell you about 1.0 tricks, but why?) But since I haven't tried them out myself, I can only direct your attention to them.

Note that both these approaches assume that your files actually parse. The error message you reported before suggests they don't. But maybe you have the unescaping thing working and need to invoke the entity declarations on the output to get it to parse properly -- that error message was upon parsing the *output*? (You're not the only one confused now.)

Cheers,
Wendell
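
For illustration only, here is a rough sketch of the lookup-table variant of option 2, written in Python rather than XSLT so it can be run standalone (in XSLT 2.0 the regex matching would presumably be done with xsl:analyze-string). The table contents, the sample document, and the names ENTITY_TABLE, FAKE_ENTITY and resolve_fake_entities are all assumptions made up for this example, not anything from the original files. The point it shows: once the parser has resolved "&amp;" to "&", the text holds literal strings like "&nbsp;", and a regex plus a table can map them to characters without a second parse.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical lookup table standing in for the entity declarations
# that only a parser would normally have access to.
ENTITY_TABLE = {
    "nbsp": "\u00a0",    # no-break space
    "eacute": "\u00e9",  # e with acute accent
    "copy": "\u00a9",    # copyright sign
}

# Matches a "fake" entity reference left in the text after the first parse,
# e.g. the literal six characters "&nbsp;".
FAKE_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def resolve_fake_entities(text):
    """Replace fake entity references using ENTITY_TABLE.

    Names missing from the table are left untouched so nothing is lost.
    """
    def substitute(match):
        return ENTITY_TABLE.get(match.group(1), match.group(0))

    return FAKE_ENTITY.sub(substitute, text)


# Sample double-escaped input (invented for this sketch).
doc = ET.fromstring("<p>caf&amp;eacute;&amp;nbsp;au lait &amp;unknown;</p>")

# The parser has already turned "&amp;" into "&", so the text node now
# contains the fake entities as plain strings.
fixed = resolve_fake_entities(doc.text)
```

Because the substitution works on string values taken from the parsed tree, nothing has to be serialized and reparsed. The cost is that the table must be maintained by hand and has to cover every entity name that actually occurs in the data.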