Subject: RE: User-defined function for linenumber
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 1 Aug 2007 09:39:38 +0100
This feels horrendously inefficient. Why not instead implement a SAX filter
that adds the line number as an extra attribute to every element?
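
A minimal sketch of such a filter, written here with Python's xml.sax module purely for illustration (a Java XMLFilterImpl reading a SAX Locator would follow the same pattern). The attribute name "line" and the helper add_line_numbers are assumptions made for this example, not part of any existing API:

```python
import io
from xml.sax import make_parser
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class LineNumberFilter(XMLFilterBase):
    """Adds a 'line' attribute (hypothetical name) to every element."""

    def setDocumentLocator(self, locator):
        # Keep the locator so startElement can ask for the current line.
        self._locator = locator
        super().setDocumentLocator(locator)

    def startElement(self, name, attrs):
        # Copy the existing attributes and append the parser's line number.
        numbered = dict(attrs.items())
        numbered["line"] = str(self._locator.getLineNumber())
        super().startElement(name, AttributesImpl(numbered))


def add_line_numbers(xml_text):
    """Return xml_text re-serialized with a line attribute on each element."""
    out = io.StringIO()
    reader = LineNumberFilter(make_parser())
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.StringIO(xml_text))
    return out.getvalue()


print(add_line_numbers("<doc>\n<p>one</p>\n<p>two</p>\n</doc>"))
```

With something like this in front of the transformation, the stylesheet can simply read @line on any element, and the document is parsed only once rather than being re-read as text for every lookup.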
Michael Kay
http://www.saxonica.com/
> -----Original Message-----
> From: jesper.tverskov@xxxxxxxxx
> [mailto:jesper.tverskov@xxxxxxxxx] On Behalf Of Jesper Tverskov
> Sent: 01 August 2007 09:11
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: User-defined function for linenumber
>
> Hi list
>
> I am trying to make a user-defined function that can return
> the linenumber of a node (yes I know Saxon has an extension
> function doing the same). So far my solution works for
> element nodes and that is good enough for now.
>
> But I am using the analyze-string element. I would like to
> find a solution not using analyze-string in order to get a
> solution that would also work when the expressions are
> modified and transferred to Schematron. I am not sure if it
> is possible? Some clever REGEX?
>
> If the document does not contain the element node in question
> also as text inside comments, CDATA sections and PIs, I can
> do without analyze-string. I use analyze-string only to
> neutralize false positives simply by deleting all "<"
> found inside comments, CDATA sections and PIs.
>
> Is it possible to do without analyze-string under all circumstances?
>
> My function works like this:
>
> I load the document as unparsed text and delete all "<"
> from comments, CDATA sections and PIs to avoid false
> positives. I then use the node name (e.g.: "p") to split the
> string and make a new string of the items until the node
> number (e.g.: the third "p"). I then count the characters,
> delete all linefeeds, count again, and subtract to get the
> count of linefeeds until the element node in question.
>
> My function looks like this:
>
> <xsl:function name="please:linenumber">
>   <xsl:param name="document-uri"/><!-- similar to document-uri() -->
>   <xsl:param name="node-name"/><!-- e.g.: 'p' -->
>   <xsl:param name="node-number"/><!-- e.g.: '3', that is the third p -->
>   <xsl:variable name="unparsed"
>     select="unparsed-text($document-uri)"/>
>   <xsl:variable name="unparsed2">
>     <xsl:analyze-string select="$unparsed"
>       regex="&lt;!--.*?-->|&lt;!\[CDATA\[.*?\]\]>|&lt;\?.*?\?>"
>       flags="s">
>       <xsl:matching-substring>
>         <xsl:value-of select="replace(., '&lt;', '')"/>
>       </xsl:matching-substring>
>       <xsl:non-matching-substring>
>         <xsl:value-of select="."/>
>       </xsl:non-matching-substring>
>     </xsl:analyze-string>
>   </xsl:variable>
>   <xsl:value-of
>     select="string-length(string-join(subsequence(tokenize($unparsed2,
>       concat('&lt;', $node-name)), 1, $node-number), ' ')) -
>       string-length(replace(string-join(subsequence(tokenize($unparsed2,
>       concat('&lt;', $node-name)), 1, $node-number), ' '), '&#10;', ''))"/>
> </xsl:function>
>
> Cheers
> Jesper Tverskov
> http://www.xmlplease.com