Subject: RE: Spotting "cousin marriages" in a tree
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Wed, 28 Jul 2004 23:57:33 +0100
|
You may be hoping for too much. Graph algorithms such as looking for cycles
often have complexity of O(n^2) or worse, whatever language they are
implemented in.
Michael Kay
> -----Original Message-----
> From: Phil Endecott [mailto:spam_from_xslt_list@xxxxxxxxxxxx]
> Sent: 28 July 2004 22:19
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Spotting "cousin marriages" in a tree
>
> Dear XSLT experts,
>
> I have a flat input like this:
>
> <things>
> <thing id="1"> <child idref="2"/> <child idref="3"/> </thing>
> <thing id="2"/>
> <thing id="3"/>
> </things>
>
> I want to match up the ids with the idrefs to build a tree that looks
> like this:
>
> <tree id="1">
> <tree id="2"/>
> <tree id="3"/>
> </tree>
>
> I could do it like this:
>
> <xsl:template match="thing">
> <tree>
> <xsl:for-each select="child">
> <xsl:apply-templates
> select="/things/thing[@id=current()/@idref]"/>
> </xsl:for-each>
> </tree>
> </xsl:template>
>
> But that is O(n^2), so I'd instead do it like this:
>
> <xsl:key name="things-by-id" match="thing" use="@id"/>
> <xsl:template match="thing">
> <tree>
> <xsl:for-each select="child">
> <xsl:apply-templates select="key('things-by-id',@idref)"/>
> </xsl:for-each>
> </tree>
> </xsl:template>
>
> which is O(n log n).
>
> This is all fine. But very rarely my input may not describe a pure
> tree; it may contain "cousin marriages" or more "incestuous"
> relationships like this:
>
> <things>
> <thing id="1"> <child idref="2"/> <child idref="3"/> </thing>
> <thing id="2"> <child idref="4"/> </thing>
> <thing id="3"> <child idref="4"/> </thing>
> <thing id="4"/>
> </things>
>
> If you apply either of the above templates to this, thing 4
> appears in
> the output twice:
>
> <tree id="1">
> <tree id="2">
> <tree id="4"/>
> </tree>
> <tree id="3">
> <tree id="4">
> </tree>
> </tree>
>
> which is very bad in my application. What I need is
> something like this:
>
> <tree id="1">
> <tree id="2">
> <tree id="4"/>
> </tree>
> <tree id="3">
> <error/>
> </tree>
> </tree>
>
> I can't afford for this to become an O(n^2) operation.
>
> The obvious thought is that I need to check at the point of recursion
> whether the thing I am about to process has already been output. But
> there is no "preceding-in-the-output" axis or other way I can
> think of
> to note which things have already been output.
>
> An alternative is to apply a test to the other input elements
> to see if
> other things have the same thing as this one as a child, but
> that fails
> because my input may legitimately contain redundant nodes like this:
>
> <things>
> <thing id="1"> <child idref="2"/> <child idref="3"/> </thing>
> <thing id="2"/>
> <thing id="3"/>
> <thing id="9"> <child idref="2"/> </thing>
> </things>
>
> In this case I want the same output as the very first example, i.e.
> thing 9 doesn't appear in the output at all.
>
> I can do a multi-pass process using exsl:node-set(),
> generating the tree
> with the duplicates in it and then deleting them if they are
> duplicates,
> i.e. something like this:
>
> <xsl:template match="/">
> <xsl:variable name="n">
> <xsl:apply-templates select="things/thing[1]"/> (as above)
> </xsl:variable>
> <xsl:apply-templates select="exsl:node-set($n)"
> mode="remove-dupes"/>
> </xsl:template>
>
> <xsl:template match="tree" mode="remove-dupes">
> <xsl:choose>
> <xsl:when test="preceding::tree[@id=current/@id]">
> <error/>
> </xsl:when>
> <xsl:otherwise>
> <tree id="{@id}">
> <xsl:apply-templates mode="remove-dupes"/>
> </tree>
> </xsl:otherwise>
> </xsl:choose>
> </xsl:template>
>
> There are two problems with this. First, generating the intermediate
> result could run forever if the input is seriously malformed (e.g. a
> loop, child refers to parent); checking as it is generated
> would avoid
> this. Second, the test "preceding::tree[@id=current/@id]" is O(n^2).
> Can I use a key to avoid this? How can a key be applied to an
> exsl:node-set() result?
>
> XSLT often needs some lateral thinking to come up with a good
> solution.
> I hope that someone who's read this far has a trick to
> solve this for me.
>
> Thanks in advance!
>
> --Phil.
|