Subject: RE: how to sort a list of xpaths
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Sun, 20 Jan 2008 09:18:00 -0000
You don't actually say what sort order you want, but the implication is that
you want an order such that if one rule subsumes another (in the sense that
A subsumes B if the set of things matched by B is a subset of those matched
by A) then B should precede A in the sort order.
In the example you have given, it doesn't require a schema to determine that
//class subsumes /section/class or that /section/class subsumes
/section/class[@type='BB']. Of course, given a schema you can do more
sophisticated analysis, but I would start with the basics.
Although you've expressed your problem in terms of xpaths, your examples are
all XSLT patterns, and you describe the semantics in terms of matching; so
another useful simplification would be to restrict yourself to patterns.
I think the problem then becomes tractable provided you don't try to be too
clever, for example trying to detect that A[contains(., 'abc')] subsumes
A[starts-with(., 'abc')]. However, it's not going to be easy. Your first
step is to get an XML representation of the expression tree (for example,
use XQueryX, or use the output of Saxon "explain"); then find a sort routine
that uses a callback to compare two items in the sequence; and implement
this callback to compare the two expressions. This would use a number of
rules, for example that EXP subsumes EXP[P] and that //EXP subsumes PATH/EXP.
There might be quite a few of these rules.
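
As a rough illustration of the two rules just mentioned, here is a minimal
Python sketch. It is not taken from Saxon or any other product. It works on
the pattern strings directly rather than on an XML expression tree, and it
only understands paths with a leading / or // and child steps with simple
predicates; wildcards, nested predicates and // in mid-path are out of scope.
The names parse, subsumes and order_rules are made up for the example.
Because subsumption is only a partial order, the sketch repeatedly picks a
rule that subsumes none of the remaining ones, instead of handing a
comparison callback to a general-purpose sort.

```python
import re

def parse(pattern):
    """Return (anywhere, steps) for a simple pattern such as //a/b[@x='1']."""
    # '//' or a relative pattern can match at any depth; a single '/' is absolute.
    anywhere = pattern.startswith("//") or not pattern.startswith("/")
    body = pattern.lstrip("/")
    if "//" in re.sub(r"\[[^\]]*\]", "", body):
        raise ValueError("'//' inside a path is not handled by this sketch")
    steps = []
    for m in re.finditer(r"([^/\[\]]+)((?:\[[^\]]*\])*)", body):
        # Predicates are compared as text, with quote style normalised.
        preds = frozenset(p.strip().replace('"', "'")
                          for p in re.findall(r"\[([^\]]*)\]", m.group(2)))
        steps.append((m.group(1).strip(), preds))
    return anywhere, steps

def step_subsumes(a, b):
    # Rule: EXP subsumes EXP[P] (same name, a's predicates are a subset of b's).
    return a[0] == b[0] and a[1] <= b[1]

def subsumes(a, b):
    """True if everything matched by b is also matched by a (per the two rules)."""
    a_any, a_steps = a
    b_any, b_steps = b
    if len(a_steps) > len(b_steps):
        return False
    # An absolute pattern can only subsume an absolute pattern of equal length.
    if not a_any and (b_any or len(a_steps) != len(b_steps)):
        return False
    # Rule: //EXP subsumes PATH/EXP (compare against the trailing steps of b).
    tail = b_steps[len(b_steps) - len(a_steps):]
    return all(step_subsumes(x, y) for x, y in zip(a_steps, tail))

def order_rules(patterns):
    """Order patterns so that a subsumed rule precedes the rule subsuming it."""
    remaining = [(p, parse(p)) for p in patterns]
    result = []
    while remaining:
        for item in remaining:
            # Take the first rule that strictly subsumes no other remaining rule.
            if not any(other is not item
                       and subsumes(item[1], other[1])
                       and not subsumes(other[1], item[1])
                       for other in remaining):
                result.append(item[0])
                remaining.remove(item)
                break
    return result

rules = ['/section/class[@type="AA"]/entry', '//class',
         '/section/class[@type="BB"]/entry', '//entry',
         '//entry[@type="italic"]']
for rule in order_rules(rules):
    print(rule)
```

With the five rules from the original question, this moves //entry to the
end, after //entry[@type="italic"], and leaves the other rules in their
original relative order.
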
I've actually been thinking of doing this kind of analysis for XSLT template
matching for a while, but there's nothing in Saxon that currently does it,
beyond the basic logic to calculate the default priority of the match
pattern.
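
For comparison, the default priority calculation is much cruder than a
subsumption test. The sketch below is a Python approximation of the XSLT 1.0
default priority rules for a single pattern with no "|" alternatives. It is
not Saxon's code, the function name default_priority is invented here, and
the regular expressions only approximate the QName syntax.

```python
import re

NCNAME = r"[A-Za-z_][\w.-]*"  # rough approximation of an XML NCName

def default_priority(pattern):
    """Approximate XSLT 1.0 default priority of a single (non-union) pattern."""
    p = pattern.strip()
    # A leading child or attribute axis specifier does not affect the priority.
    p = re.sub(r"^(child::|attribute::|@)", "", p)
    # A bare node test such as * or text() gets -0.5.
    if re.fullmatch(r"\*|node\(\)|text\(\)|comment\(\)|processing-instruction\(\)", p):
        return -0.5
    # A namespace wildcard such as xsl:* gets -0.25.
    if re.fullmatch(NCNAME + r":\*", p):
        return -0.25
    # A plain QName, or processing-instruction('target'), gets 0.
    if re.fullmatch(r"(" + NCNAME + r":)?" + NCNAME, p):
        return 0.0
    if re.fullmatch(r"processing-instruction\((['\"]).*?\1\)", p):
        return 0.0
    # Anything else (several steps, predicates, a leading / or //) gets 0.5.
    return 0.5

for pat in ["class", "*", "xsl:*", "//class", '/section/class[@type="AA"]/entry']:
    print(pat, default_priority(pat))
```

Under these rules both //class and /section/class[@type="AA"]/entry get 0.5,
so the default priority alone cannot put the five rules from the question
into the order asked for.
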
Michael Kay
http://www.saxonica.com/
> -----Original Message-----
> From: Mark Hutchinson [mailto:mark@xxxxxxxxxxx]
> Sent: 20 January 2008 07:11
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: how to sort a list of xpaths
>
>
> I have an application that uses xpaths to identify which rule
> should be executed for a given tag in a source file.
>
> eg given a source file of:
> <section>
> <class type="AA">
> <entry>entry text</entry>
> </class>
> <class type="BB">
> <entry type="italic">text text</entry>
> <entry>text text</entry>
> <class>
> <entry>subclass entry text</entry>
> </class>
> </class>
> </section>
>
> My application might have the following rules :
>
> 1. /section/class[@type="AA"]/entry
> 2. //class
> 3. /section/class[@type="BB"]/entry
> 4. //entry
> 5. //entry[@type="italic"]
>
> My application searches the rules from the top down and
> stops checking any further once a match has been
> found, i.e. the first class
> (AA) matches with rule 1 and this is correct. The second
> class (BB) matches with rule 2 which is, obviously, not correct.
>
> Here's my question (finally!) - does anyone have a utility or
> xslt, perl etc that can sort these rules based on a schema?
>
> Regards
> Mark H.