Subject: RE: xslt 2, grouping and more on indexing.
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Fri, 27 Aug 2004 12:29:09 +0100
I think we've had a very similar problem to this before.
David Carlisle suggested first converting the markup to textual delimiters,
then doing the regex processing, then converting the delimiters back to
markup.
I suggested processing each text node independently to add markup.
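In this situation I think the latter approach works better. The call
<xsl:copy-of select="dp:sep(text()[1])"/>
passes only the first text node to the function, so child elements such
as <i> never reach the output. Instead, do an apply-templates (perhaps in
a special mode) that makes a recursive descent of the entry subtree,
copying the elements it meets and applying the regex processing to each
text node that it finds.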
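A rough sketch of that recursive descent, untested against the real data.
The mode name "markup" and the narrowed Appendix pattern are assumptions
made for illustration and are not part of the original stylesheet; the
grouping by level is left out to keep the example short.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Simplified driver: no grouping here, only the per-entry processing. -->
  <xsl:template match="index">
    <index><xsl:apply-templates select="*"/></index>
  </xsl:template>

  <xsl:template match="head">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Process every child node of the entry, not just text()[1]. -->
  <xsl:template match="ientry1 | ientry2 | ientry3">
    <xsl:copy>
      <ent><xsl:apply-templates mode="markup"/></ent>
    </xsl:copy>
  </xsl:template>

  <!-- Elements inside an entry (such as i) are copied and descended into. -->
  <xsl:template match="*" mode="markup">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates mode="markup"/>
    </xsl:copy>
  </xsl:template>

  <!-- Each text node is analysed on its own, so surrounding markup survives.
       Assumption: the Appendix pattern is narrowed to forms like
       "Appendix II (A)" so that it does not run to the end of the text node. -->
  <xsl:template match="text()" mode="markup">
    <xsl:analyze-string select="."
        regex="[0-9][.0-9\-]*|Appendix [IVX]+ \([A-Z]\)">
      <xsl:matching-substring>
        <r><xsl:value-of select="."/></r>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

In the grouping template, each call of the form
<xsl:copy-of select="dp:sep(...)"/> would then become
<ent><xsl:apply-templates mode="markup"/></ent>, since the context item
inside xsl:for-each-group is the first entry of the current group.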
Michael Kay
> -----Original Message-----
> From: David.Pawson@xxxxxxxxxxx [mailto:David.Pawson@xxxxxxxxxxx]
> Sent: 27 August 2004 11:33
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: xslt 2, grouping and more on indexing.
>
>
> Given
>
> <index>
> <head>INDEX</head>
> <ientry1>abbreviation point</ientry1>
> <ientry2>and use of letter sign, 5.6.2, 5.6.6</ientry2>
> <ientry2>in abbreviations, 7.1.1-2</ientry2>
> <ientry2>in designations, 5.6.14</ientry2>
> <ientry2>in italicized abbreviations, 5.5.16</ientry2>
> <ientry2>in personal initials, 7.1.6</ientry2>
> <ientry2>in postcodes, 7.1.6</ientry2>
> <ientry2>in reference abbreviations, 7.3.2-3, 7.3.6</ientry2>
> <ientry2>in references, 7.3.1-4, 7.3.6</ientry2>
> <ientry2>in Welsh, Appendix I (E)</ientry2>
> <ientry3>and elided vowels, 5.1.7</ientry3>
> <ientry3>French, Appendix II (A)</ientry3>
> <ientry3>German, Appendix II (B)</ientry3>
> <ientry3>in foreign ordinal terminations, 6.6.2, 6.7.5</ientry3>
> <ientry1>French, Appendix II (A); see also <i>accented letters</i>;
> <i>accent sign</i>; <i>foreign, words and names</i></ientry1>
>
>
> I need to group by level, i.e. level 2 inside 1, 3 inside 2.
> No problem with xslt 2.
>
> The problem is the markup within the entries.
> I'm processing the 'text' of the entries to mark up the references,
> which spoils the markup (<i> in the example).
>
> current output is
>
> <ientry1>
> <ent>foreign</ent>
> <ientry2>
> <ent>alphabets, see </ent>
> </ientry2>
> <ientry2>
> <ent>ordinal numbers, <r>6.6.2</r>, <r>6.7.5</r>
> </ent>
> </ientry2>
> <ientry1>
> <ent>French, <r>Appendix II (A); see also accented
> letters; accent sign;
> foreign, words and names</r>
> </ent>
> </ientry1>
>
>
>
> Any suggestions please?
> How to use text... and copy-of instead of
> Current code below
>
> <xsl:template match="index">
> <head>INDEX</head>
> <xsl:for-each-group select="*"
> group-starting-with="ientry1" >
> <ientry1>
> <xsl:copy-of select="dp:sep(.)"/>
> <xsl:for-each-group
> select="current-group()[position()>1]"
> group-starting-with="ientry2">
> <ientry2><xsl:copy-of select="dp:sep(text()[1])"/>
> <xsl:for-each-group
> select="current-group()[position()>1]"
> group-starting-with="ientry3">
> <ientry3><xsl:copy-of select="dp:sep(text()[1])"/>
> </ientry3>
> </xsl:for-each-group>
> </ientry2>
> </xsl:for-each-group>
> </ientry1>
> </xsl:for-each-group>
> </xsl:template>
>
>
>
> <xsl:function name="dp:sep">
> <xsl:param name="ientry" as="node()*"/>
> <ent>
> <xsl:analyze-string select="$ientry" regex="([0-9][.0-9\-]*)">
> <xsl:matching-substring>
> <r><xsl:value-of select="regex-group(1)"/></r>
> </xsl:matching-substring>
> <xsl:non-matching-substring>
> <xsl:analyze-string select="." regex="(Appendix .*$)">
> <xsl:matching-substring>
> <r><xsl:value-of select="regex-group(1)"/></r>
> </xsl:matching-substring>
> <xsl:non-matching-substring>
> <xsl:value-of select="."/>
> </xsl:non-matching-substring>
> </xsl:analyze-string>
> </xsl:non-matching-substring>
> </xsl:analyze-string>
> </ent>
> </xsl:function>
>
> Regards DaveP.
>