Subject: Re: mixed content grouping by whitespace
From: James Cummings <james@xxxxxxxxxxxxxxxxx>
Date: Mon, 12 Apr 2010 10:37:17 +0100
On Sun, Apr 11, 2010 at 20:17, Imsieke, Gerrit, le-tex
<gerrit.imsieke@xxxxxxxxx> wrote:
> I applied a two-step process:
> 1. Mark up whitespace using intermediate <seg @type="sep"> </seg>;
> 2. group adjacent WS (and non-WS) nodes, put the non-WS groups in a newly
> created w element.
Two solutions, one from Gerrit and one from Ken. I've got some questions
to check my understanding of Gerrit's.
>   <xsl:template match="tei:seg">
>     <xsl:copy>
This is taking place in an xsl:copy to copy the surrounding tei:seg
element, right?
>       <xsl:variable name="sep">
>         <xsl:apply-templates mode="sep"/>
>       </xsl:variable>
This is the first pass: it wraps each run of whitespace (the matching
substrings) in a seg with @type='sep' and outputs the remaining text
(the non-matching substrings) unchanged. Elements are copied by a
copy-all template.
>       <xsl:for-each-group select="$sep/node()"
>         group-adjacent="boolean(self::tei:seg[@type='sep'])">
This groups the nodes in the variable you've created by a boolean key:
true when the node is one of the seg[@type='sep'] elements you created
from tei:seg/text() to mark the whitespace, false otherwise. (So the key
is just the truth or falsehood of that test? I didn't know you could do
that in group-adjacent.)
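
On the boolean question: group-adjacent takes an XPath expression
rather than a match pattern, and it is evaluated once per item to give
an atomic grouping key, so an xs:boolean works as well as a string or a
number. A minimal sketch of that behaviour, assuming a made-up input of
<list><a/><a/><b/><a/></list> (the element names list, a, b, groups and
group are hypothetical and not from the thread):

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input: <list><a/><a/><b/><a/></list> -->
  <xsl:template match="list">
    <groups>
      <!-- The key is true() for <a> children and false() for anything
           else. A new group starts each time the key value changes. -->
      <xsl:for-each-group select="*" group-adjacent="boolean(self::a)">
        <!-- current-grouping-key() returns the boolean for this run;
             current-group() returns the items that share it. -->
        <group key="{current-grouping-key()}"
               size="{count(current-group())}"/>
      </xsl:for-each-group>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

For that input the sketch yields three groups in document order: key
"true" with 2 items, key "false" with 1 item, key "true" with 1 item.
Unlike group-by, group-adjacent does not merge the two separate runs
of <a>.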
>         <xsl:choose>
>           <xsl:when test="current-grouping-key()">
>             <xsl:value-of select="current-group()"/>
>           </xsl:when>
When the group consists of those whitespace segs, just output the
whitespace itself, so the temporary element vanishes.
>           <xsl:otherwise>
>             <w xmlns="http://www.tei-c.org/ns/1.0">
>               <xsl:apply-templates select="current-group()"/>
>             </w>
>           </xsl:otherwise>
Otherwise, wrap the whole group in a w (word) element.
>   <xsl:template match="tei:seg/text()" mode="sep">
>     <xsl:analyze-string select="." regex="\s+">
>       <xsl:matching-substring>
>         <seg type="sep" xmlns="http://www.tei-c.org/ns/1.0">
>           <xsl:value-of select="."/>
>         </seg>
>       </xsl:matching-substring>
>       <xsl:non-matching-substring>
>         <xsl:value-of select="."/>
>       </xsl:non-matching-substring>
>     </xsl:analyze-string>
>   </xsl:template>
This runs analyze-string with the whitespace regex over the text nodes
directly inside tei:seg: a match is wrapped in a new seg, anything else
is output as it is. The template is in mode 'sep', so it is only applied
while building the sep variable above.
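
Pieced together, the quoted fragments might look like the sketch below.
The closing tags, the identity template, the node() template in mode
'sep' (standing in for the "copy-all" template, which is not quoted
here) and the copying of the attributes of tei:seg are all assumptions
filled in for illustration, not Gerrit's actual code.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

  <!-- Assumed: identity template for everything outside tei:seg. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="tei:seg">
    <xsl:copy>
      <!-- Assumed: keep the attributes of the original seg. -->
      <xsl:copy-of select="@*"/>
      <!-- Pass 1: build a temporary tree with whitespace marked up. -->
      <xsl:variable name="sep">
        <xsl:apply-templates mode="sep"/>
      </xsl:variable>
      <!-- Pass 2: group adjacent separator / non-separator nodes. -->
      <xsl:for-each-group select="$sep/node()"
          group-adjacent="boolean(self::tei:seg[@type='sep'])">
        <xsl:choose>
          <xsl:when test="current-grouping-key()">
            <!-- Whitespace group: emit the text, drop the marker. -->
            <xsl:value-of select="current-group()"/>
          </xsl:when>
          <xsl:otherwise>
            <!-- Non-whitespace group: wrap it in a TEI w element. -->
            <w xmlns="http://www.tei-c.org/ns/1.0">
              <xsl:apply-templates select="current-group()"/>
            </w>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 1, text directly inside tei:seg: mark whitespace runs. -->
  <xsl:template match="tei:seg/text()" mode="sep">
    <xsl:analyze-string select="." regex="\s+">
      <xsl:matching-substring>
        <seg type="sep" xmlns="http://www.tei-c.org/ns/1.0">
          <xsl:value-of select="."/>
        </seg>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

  <!-- Assumed "copy-all" template: in pass 1, any other child node
       (element, comment, PI) is copied whole, so whitespace inside a
       child element is not treated as a word boundary. -->
  <xsl:template match="node()" mode="sep">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

With a made-up input such as
<seg xmlns="http://www.tei-c.org/ns/1.0">ab<hi>c</hi>d e</seg>
this should produce <w>ab<hi>c</hi>d</w>, a space, then <w>e</w>
inside the copied seg: the hi element stays within the first word
because no whitespace separates it from the surrounding text.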
Cool! Thanks Gerrit, I certainly learned something new!
-James