Subject: Re: recursive replacing strings with nodes
From: James Cummings <james+xsl@xxxxxxxxxxxxxxxxx>
Date: Fri, 19 Feb 2010 14:24:59 +0000
On Fri, Feb 19, 2010 at 11:57, Martin Honnen <Martin.Honnen@xxxxxx> wrote:
> Here is a stylesheet trying to solve that
Wow, that seems to do what I want! See comments inline where I try to
understand what is going on (so that when I google for this in a
couple years I can see what I thought was happening!). Thanks
Martin. (Someone off-list sent me a perl script that might accomplish
the same thing... but I'd prefer to do it in XSLT if possible ;-) )
For posterity:
> <xsl:stylesheet
>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>   version="2.0"
>   xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>   xmlns:mf="http://example.com/2010/mf"
>   xmlns:functx="http://www.functx.com"
>   exclude-result-prefixes="xsd mf functx">
>
>   <xsl:function name="functx:escape-for-regex" as="xsd:string">
>     <xsl:param name="arg" as="xsd:string?"/>
>
>     <xsl:sequence select="
>     replace($arg,
>             '(\.|\[|\]|\\|\||\-|\^|\$|\?|\*|\+|\{|\}|\(|\))','\\$1')
>   "/>
>   </xsl:function>
Include functx:escape-for-regex to escape the strings, because I
intentionally made sure some of the strings in my sample had
regex-nasty characters like + and such (my real input does too).
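As a quick sanity check of what the escaping does, here is a small
sketch. The input string is a made-up example, not from my data, and
it assumes the functx function above is in scope:

```xml
<!-- hypothetical call; 'a+b (c)' is an invented test string -->
<xsl:value-of select="functx:escape-for-regex('a+b (c)')"/>
<!-- gives: a\+b \(c\) -->
```

So the +, ( and ) each get a backslash in front, and the result can be
used as a regex that matches the literal string.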
>   <xsl:param name="abbr-url" as="xsd:string"
>     select="'test2010021902.xml'"/>
>   <xsl:variable name="abbr" as="element(abbr)*"
>     select="doc($abbr-url)/root/choice/abbr"/>
Load the abbr elements from the lookup-table file into a variable,
storing them as a sequence of elements (not strings).
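For anyone reading this later, the lookup file is assumed to look
roughly like the sketch below. Only the element names (root, choice,
abbr, expan, w) come from the XPaths in the stylesheet; the
abbreviations and expansions are made-up placeholders:

```xml
<!-- hypothetical lookup table; values are invented for illustration -->
<root>
  <choice>
    <abbr>a+b</abbr>
    <expan><w>alpha</w><w>beta</w></expan>
  </choice>
  <choice>
    <abbr>c.</abbr>
    <expan><w>gamma</w></expan>
  </choice>
</root>
```

With a file like this, $abbr would hold the two abbr elements in
document order, and each one can still reach its own expansion via
its parent.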
Define the mf:replace function:
>   <xsl:function name="mf:replace" as="node()*">
>     <xsl:param name="str" as="xsd:string"/>
>     <xsl:param name="abbr" as="element(abbr)*"/>
which has two parameters: a string and a sequence of abbr elements
>     <xsl:choose>
>       <xsl:when test="$abbr">
>         <xsl:analyze-string select="$str"
>           regex="{functx:escape-for-regex($abbr[1])}">
Analyze the string provided, looking for the first abbr (escaped for
any regex characters)
>           <xsl:matching-substring>
>             <xsl:copy-of select="$abbr[1]/../expan/w"/>
>           </xsl:matching-substring>
when it matches, go up to the abbr's parent (the choice) and copy the
w elements from its expan child
>           <xsl:non-matching-substring>
>             <xsl:sequence select="mf:replace(., $abbr[position() gt 1])"/>
>           </xsl:non-matching-substring>
when it doesn't match, recursively call mf:replace() on that
non-matching piece of the string, passing the rest of the abbr
sequence (everything after the first)
>         </xsl:analyze-string>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:value-of select="$str"/>
>       </xsl:otherwise>
If $abbr is empty (nothing left to look for), just put out the string.
>     </xsl:choose>
>   </xsl:function>
>
standard copy-all template:
>   <xsl:template match="@* | node()">
>     <xsl:copy>
>       <xsl:apply-templates select="@*, node()"/>
>     </xsl:copy>
>   </xsl:template>
>
Every time you come across a seg, kick this off by copying it and, for
its contents, putting out the sequence returned by mf:replace()
>   <xsl:template match="seg">
>     <xsl:copy>
>       <xsl:sequence select="mf:replace(., $abbr)"/>
>     </xsl:copy>
>   </xsl:template>
>
> </xsl:stylesheet>
>
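To convince myself I follow the whole flow, here is a tiny worked
example. Everything in it is invented for illustration: it assumes the
lookup file has a single choice whose abbr is "a+b" and whose expan
holds one w element containing "alpha".

```xml
<!-- hypothetical input fragment -->
<seg>x a+b y</seg>

<!-- what I would expect the seg template to produce, roughly:
     "x " and " y" fall through to xsl:value-of once $abbr is used up,
     and the match is replaced by the copied w element -->
<seg>x <w>alpha</w> y</seg>
```

Note that the seg is atomized to a string before matching, so any
child elements inside a seg would be flattened to their text.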
Thanks Martin!
-James