Subject: RE: xsl:for-each-group: start groups depending on number of group members?
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Mon, 30 Apr 2007 14:28:03 +0100
It's such a high-level description of the problem that it's hard to be
specific about how to tune the performance, but instinctively my reaction
would be to look for a multi-pass approach: preprocess the data to compute
properties of each node that will make the subsequent grouping operation
simpler and more efficient.
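
A minimal sketch of what such a multi-pass approach could look like, not a
tuned solution. Pass 1 copies the candidate nodes into a temporary tree and
records a property on each node once; pass 2 groups over the copies. The
element name "container", the attribute "flagged", the wrapper element
"group" and the test @type = 'x' (standing in for $condition2) are all
hypothetical. Because the copies are siblings only of one another,
preceding-sibling no longer needs the "intersect" restriction.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Pass 1: copy each candidate and record, once per node, the
       property that the grouping test will need later.
       @flagged and @type = 'x' are hypothetical placeholders. -->
  <xsl:template match="*" mode="annotate">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- stand-in for the real $condition2 test on A -->
      <xsl:if test="self::A[@type = 'x']">
        <xsl:attribute name="flagged">yes</xsl:attribute>
      </xsl:if>
      <xsl:copy-of select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 2: group over the annotated copies, where the test
       reduces to a cheap attribute check. -->
  <xsl:template match="container">
    <xsl:variable name="annotated">
      <xsl:apply-templates select="*" mode="annotate"/>
    </xsl:variable>
    <container>
      <xsl:for-each-group select="$annotated/*"
          group-starting-with="B | sub[not(preceding-sibling::A[@flagged])]">
        <group>
          <xsl:copy-of select="current-group()"/>
        </group>
      </xsl:for-each-group>
    </container>
  </xsl:template>

</xsl:stylesheet>
```

One caveat with this design: the second pass sees copies, so the nodes no
longer have their original ancestors or their original siblings. Anything
that depends on the source context has to be recorded during pass 1.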
Michael Kay
http://www.saxonica.com/
> -----Original Message-----
> From: Yves Forkl [mailto:Y.Forkl@xxxxxx]
> Sent: 30 April 2007 14:13
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: xsl:for-each-group: start groups depending
> on number of group members?
>
> Wendell,
>
> you wrote:
>
> > While you can't restrict preceding-sibling to look only at members of
> > the current group, you might be able to get somewhere with either of
> > these approaches:
> >
> > * The XPath 2.0 "intersect" operator can return those members common
> > to two sequences of nodes, so (preceding-sibling::node() intersect
> > current-group()) will return just those members of the current group
> > that are on the preceding-sibling axis relative to the context.
>
> Thank you very much for this hint! The intersection of the
> group members and those not having a preceding sibling of a
> specific sort is what I was looking for. This makes my demo
> template look like:
>
> <xsl:template match="B" mode="groups_at_root_level">
>   <B_new>
>     <xsl:variable name="this_group" select="current-group()"/>
>     <xsl:for-each-group
>         select="$this_group"
>         group-starting-with="
>           B|sub[not($this_group intersect preceding-sibling::A)]">
>       <xsl:apply-templates select="current-group()"/>
>     </xsl:for-each-group>
>   </B_new>
> </xsl:template>
>
>
> > * If, rather than using grouping constructs to select from the nodes
> > in the source, you processed them into temporary trees, you could
> > construct those trees exactly the way you wanted, including nesting
> > elements in such a way that preceding-sibling would be useful. Such as:
> >
> > <xsl:variable name="intermediate">
> >   <xsl:for-each-group select="*" group-by=".">
> >     <group>
> >       <xsl:copy-of select="current-group()"/>
> >     </group>
> >   </xsl:for-each-group>
> > </xsl:variable>
> >
> > <xsl:for-each select="$intermediate/group">
> > ... inside each group element, members of the group appear as
> > siblings ...
> > </xsl:for-each>
>
> That seems to be a neat approach, too, at least from a
> general point of view. However, in my case, the existence of
> preceding siblings is important for determining whether an
> item is allowed to start a group or not. So "unconditionally"
> starting a group on any instance of an element would yield a
> number of groups that would have to be resolved afterwards
> into members of other groups, because only looking at the
> siblings of the group starter will reveal that in fact it
> should not have fulfilled this role. xsl:for-each on "group"
> instances would then be quite difficult: you can't process any
> group until you have processed them all, because you need to make
> sure that you don't miss any "late" member from a group that had
> to be resolved...
>
> Unless you have "unstable" groups, this approach is
> definitely very interesting.
>
>
> > But I'm not sure either of these are actually necessary here. You have
> > only presented your problem in fragmentary form, so it's hard to say;
> > but to get the result you say you want, I'd do something much simpler:
> >
> > [snip]
>
> Thank you (as well as Andrew) for proposing simple and elegant solutions
> that accomplish the basic grouping task. Unfortunately, I can't use them
> because the grouping I'm doing is far more complicated. (E.g. repeated
> grouping based on the same element; grouping highly depends on preceding
> instances; dynamic creation of multiple group containers etc.) Trying to
> leave out the less relevant details, I crafted a demo that would just
> show my minimal requirements, however strange they might seem. Sorry for
> the confusion.
>
> Let me elaborate on my grouping criteria. Rather than just
> matching an element, I always need it to meet some condition,
> so instead of:
>
> group-starting-with="
> B|sub[not($this_group intersect preceding-sibling::A)]"
>
> I actually have something more like:
>
> group-starting-with="
> B|sub[$condition1 and
> not($this_group intersect
> preceding-sibling::A[$condition2])]"
>
> What I am curious about is how I could optimize my stylesheet's runtime
> behaviour (I'm using Saxon 8.8) by computing some values only once, e.g.
> using a variable declared before xsl:for-each-group, given that:
>
> - the negated expression appears several times within the attribute
> value (think of it like duplicating the above code for "sub"), while
> $condition1 appears only once
>
> - the number of instances matching unconstrained
> preceding-sibling::A[$condition2] is rather large, whereas within the
> grouping candidates it is small or zero
>
> - the value of preceding-sibling::A[$condition2] depends, as far as I
> have understood, on the item that xsl:for-each-group currently examines,
> so it can't sensibly be evaluated beforehand
>
> Any ideas on this?
>
> Yves
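
On the question of computing values once: a sketch of one possible
rearrangement, under the assumption that $condition2 depends only on the A
element itself (no position(), last(), or reference to the candidate "sub").
In that case

    $this_group intersect preceding-sibling::A[$condition2]

selects the same nodes as

    $this_group[self::A][$condition2] intersect preceding-sibling::A

and the left operand no longer depends on the item being examined, so it can
be bound to a variable before xsl:for-each-group. The names "container",
"flagged_A" and "group", the outer grouping, and the test @type = 'x'
(standing in for $condition2) are hypothetical. Whether this is actually
faster in Saxon 8.8 would have to be measured.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="container">
    <container>
      <!-- Hypothetical outer grouping, present only so that
           current-group() has a value in this sketch. -->
      <xsl:for-each-group select="*" group-starting-with="B">
        <xsl:variable name="this_group" select="current-group()"/>

        <!-- Evaluated once per outer group: the group members that are
             A elements passing the stand-in for $condition2. -->
        <xsl:variable name="flagged_A"
            select="$this_group[self::A][@type = 'x']"/>

        <B_new>
          <!-- The per-candidate work is now only the axis step and the
               intersection with the (small) precomputed set. -->
          <xsl:for-each-group select="$this_group"
              group-starting-with="
                B | sub[not($flagged_A intersect preceding-sibling::A)]">
            <group>
              <xsl:copy-of select="current-group()"/>
            </group>
          </xsl:for-each-group>
        </B_new>
      </xsl:for-each-group>
    </container>
  </xsl:template>

</xsl:stylesheet>
```

If the negated expression is repeated for several element names, the same
variable can be reused in each predicate, so the stand-in for $condition2 is
written and evaluated in one place only.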