I do not know the answer to the question, and without the data (and perhaps
some further information about your system, such as the default heap size) I
cannot reproduce the problem.
But my first instinct, for no intelligent reason whatsoever, is to use
template application for flow of control:
<xsl:template match="/">
  <xsl:apply-templates select="//*[not(*)][. eq 'DNKK']"/>
</xsl:template>

<xsl:template match="*">
  <result>
    <xsl:sequence select="."/>
    <parent><xsl:value-of select="name(..)"/></parent>
  </result>
</xsl:template>
For all I know that is worse, not better; but it is what I would try first. I
also might try things like
* using <xsl:copy> instead of <xsl:sequence>
* putting the parent name on an attribute of <result> instead of as a child
  element
* actually selecting text nodes, rather than elements
* learning streaming and using EE (as already suggested)
* divide-and-conquer: on a first pass knock out portions of the tree that are
  irrelevant or divide input file into several smaller pieces
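
A rough, untested sketch of the first three variations combined, as an XSLT
2.0 stylesheet. It assumes that each leaf element of interest holds exactly
one text node (no comment or processing instruction splitting the text).
Whether it needs any less memory than the original is an open question, since
without streaming the whole source tree is still built before any template
fires.

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <results>
      <!-- select the text nodes themselves; not(../*) keeps only text
           whose parent element has no element children (a leaf) -->
      <xsl:apply-templates select="//text()[. eq 'DNKK'][not(../*)]"/>
    </results>
  </xsl:template>

  <xsl:template match="text()">
    <!-- name(../..) is the parent of the leaf element, now on an attribute -->
    <result parent="{name(../..)}">
      <!-- shallow-copy the leaf element, then add its attributes and text -->
      <xsl:for-each select="..">
        <xsl:copy>
          <xsl:copy-of select="@*"/>
          <xsl:value-of select="."/>
        </xsl:copy>
      </xsl:for-each>
    </result>
  </xsl:template>

</xsl:stylesheet>

Each hit then comes out as <result parent="..."> wrapping a copy of the leaf
element, rather than with a separate <parent> child.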
________________________________
> Hi Folks,
>
> I have an XSLT program that locates all leaf elements which have the string
> value 'DNKK'. My program outputs the element and the name of its parent:
>
> <xsl:template match="/">
>   <results>
>     <xsl:for-each select="//*[not(*)][. eq 'DNKK']">
>       <result>
>         <xsl:sequence select="."/>
>         <parent><xsl:value-of select="name(..)"/></parent>
>       </result>
>     </xsl:for-each>
>   </results>
> </xsl:template>
>
> The input XML document is large, nearly 5GB.
>
> When I run my program SAXON throws the OutOfMemoryError message shown
> below.
>
> To solve the OutOfMemoryError I could add to my heap space (-Xmx) when I
> invoke Java. But I wonder if there is a way to write my program so that it is
> more efficient (i.e., doesn't require so much memory)?
>
Can you use Saxon EE so that it is worth pondering XSLT 3 with streaming?
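
If Saxon EE is available, a streamed XSLT 3.0 version might look roughly like
the following. This is an untested sketch, not a checked solution. It assumes
that 'DNKK' only ever occurs as the sole text of a leaf element (a streamed
match on a text node cannot look at its siblings to verify that), and that the
processor's streamability analysis accepts the predicate on the text node.
Attributes of the matched element are not copied here.

<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- stream the input; nodes with no matching template are skipped,
       but their descendants are still visited -->
  <xsl:mode streamable="yes" on-no-match="shallow-skip"/>

  <xsl:template match="/">
    <results>
      <xsl:apply-templates/>
    </results>
  </xsl:template>

  <!-- match the text node itself instead of testing each element -->
  <xsl:template match="text()[. eq 'DNKK']">
    <!-- only ancestor names are consulted, which stay available in streaming -->
    <result parent="{name(../..)}">
      <xsl:element name="{name(..)}" namespace="{namespace-uri(..)}">
        <!-- the matched text is known to be 'DNKK' -->
        <xsl:text>DNKK</xsl:text>
      </xsl:element>
    </result>
  </xsl:template>

</xsl:stylesheet>

The design point is that nothing in the matching template needs to look
forward or sideways in the document, so in principle no tree has to be held
in memory.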