Subject: Re: Question on duplicate node elimination
From: Lars Huttar <lars_huttar@xxxxxxx>
Date: Mon, 23 Aug 2010 15:54:05 -0500
On 8/22/2010 5:12 PM, Hermann Stamm-Wilbrandt wrote:
>> I'm not sure what you find surprising about the results you are seeing.
>> What results would you expect?
> Not surprising.
>
> But how could the algorithm step of "duplicate elimination" be done?
> How can the duplicates be determined and removed, correctly?
>
If I'm understanding your question correctly (are you trying to
implement an XPath processor in XSLT 1.0?), I think it's impossible if
you create the RTF simply using xsl:copy-of. As Mike said, once
you've copied nodes, the copies are distinct; there's no information in
the RTF(s) to distinguish copies of the same node from copies of
identical twins.
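
To make the difference concrete, here is a minimal sketch (not from the
original posts) that counts the parents both ways against the simple.xml
quoted below. It assumes a processor that supports exsl:node-set(); the
output labels are made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Same RTF as in simple.xsl: one copy of the parent per /a/b/c node -->
    <xsl:variable name="rtf">
      <xsl:for-each select="/a/b/c">
        <xsl:copy-of select=".."/>
      </xsl:for-each>
    </xsl:variable>

    <!-- A path step over source nodes yields a node-set, so each
         parent is counted once, by node identity -->
    <xsl:text>parents in source: </xsl:text>
    <xsl:value-of select="count(/a/b/c/..)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Every copy in the RTF is a new node, so nothing collapses -->
    <xsl:text>copies in rtf: </xsl:text>
    <xsl:value-of select="count(exsl:node-set($rtf)/*)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With simple.xml this should report 2 parents in the source and 4 copies
in the RTF: the processor can eliminate duplicates only while it still
holds the original nodes.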
Could you create the RTF using a "special" attribute that preserves the
id of the node you are copying? E.g.
<xsl:attribute name="originalID" namespace="http://hsw.org/specialNamespaceURI">
<xsl:value-of select="generate-id()" />
</xsl:attribute>
Then you could use that originalID attribute to determine which nodes
were identical in the original, and strip out the originalID attribute
after using it.
But I guess this would only work on elements, since only elements can
have attributes...
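
A rough sketch of how that could fit together, reusing the shape of the
simple.xsl quoted below. The hsw prefix binding, the first-occurrence
predicate and the stripping step are my assumptions about one way to do
it, not a tested recipe, and it relies on exsl:node-set() being
available.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    xmlns:hsw="http://hsw.org/specialNamespaceURI">

  <xsl:output omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <!-- Build the RTF: each copied parent carries the id of its original -->
    <xsl:variable name="rtf">
      <xsl:for-each select="/a/b/c">
        <xsl:for-each select="..">
          <xsl:copy>
            <!-- The tag attribute is added before any child content -->
            <xsl:attribute name="hsw:originalID">
              <xsl:value-of select="generate-id()"/>
            </xsl:attribute>
            <xsl:copy-of select="@* | node()"/>
          </xsl:copy>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:variable>

    <!-- Keep only the first copy for each original id,
         then drop the tag attribute from what is emitted -->
    <xsl:for-each select="exsl:node-set($rtf)/*
        [not(@hsw:originalID = preceding-sibling::*/@hsw:originalID)]">
      <xsl:copy>
        <xsl:copy-of select="@*[namespace-uri() !=
            'http://hsw.org/specialNamespaceURI'] | node()"/>
      </xsl:copy>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For simple.xml this should leave two b elements instead of four. The
preceding-sibling test is quadratic in the number of copies, so a key on
the originalID attribute may be preferable for large node-sets, and
depending on the processor a leftover xmlns:hsw declaration may still
appear on the output elements.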
Lars
> Perhaps I was not clear enough with my question.
> How can this step (p. 40 from [1]) be implemented in XPath 1.0 plus
> exslt:node-set():
> A location step identifies a new node-set relative to the context node-set.
> The location step is evaluated against each node in the context node-set,
> and the union of the resulting node-sets becomes the context node-set for
> the next step. Location steps consist of an axis identifier, a node test
> and zero or more predicates (see Figure 3-4). ...
>
>
> [1]
> http://www.theserverside.net/tt/books/addisonwesley/EssentialXML/index.tss
>
> Mit besten Gruessen / Best wishes,
>
> Hermann Stamm-Wilbrandt
> Developer, XML Compiler, L3
> WebSphere DataPower SOA Appliances
> ----------------------------------------------------------------------
> IBM Deutschland Research & Development GmbH
> Vorsitzender des Aufsichtsrats: Martin Jetter
> Geschaeftsfuehrung: Dirk Wittkopp
> Sitz der Gesellschaft: Boeblingen
> Registergericht: Amtsgericht Stuttgart, HRB 243294
>
>
>
> From: Michael Kay <mike@xxxxxxxxxxxx>
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Date: 08/22/2010 11:53 PM
> Subject: Re: Question on duplicate node elimination
>
>
>
> I'm not sure what you find surprising about the results you are seeing.
> What results would you expect?
>
> xsl:copy-of creates a new node. Copying the same node twice creates two
> copies with distinct identity. Is that the issue?
>
> Michael Kay
> Saxonica
>
> On 22/08/2010 22:25, Hermann Stamm-Wilbrandt wrote:
>> Hello,
>>
>> I have a question on duplicate node elimination.
>>
>> From the XPath 1.0 specification:
>> ...
>> * node-set (an unordered collection of nodes without duplicates)
>> ...
>> An initial sequence of steps is composed together with a following step
>> as follows. The initial sequence of steps selects a set of nodes relative
>> to a context node. Each node in that set is used as a context node for
>> the following step. The sets of nodes identified by that step are unioned
>> together. The set of nodes identified by the composition of the steps is
>> this union.
>> ...
>>
>> So "are unioned together" results in a node-set and that does not contain
>> duplicates.
>>
>> Now how can this algorithm step be realized in XPath 1.0 plus the
>> exslt:node-set function?
>> (this would work in browsers with the technique from David Carlisle [1])
>>
>>
>> This is the output of the stylesheet simple.xsl below on the file
>> simple.xml. For the four nodes /a/b/c, their parents are copied into an
>> intermediate result. But xsltproc and xalan show that the four copies
>> are different by their generate-id() values, whereas the first pair and
>> the last pair are representations of the same node.
>>
>> xsltproc            xalan
>> 1: id2659470        1: AbT0
>> 2: id2659470        2: AbT0
>> 3: id2659354        3: AbT1
>> 4: id2659354        4: AbT1
>>
>> 1: id2659234        1: AbT2
>> 2: id2659244        2: AbT3
>> 3: id2659254        3: AbT4
>> 4: id2659264        4: AbT5
>>
>> 1:<b>               1:<b>
>> <c>1</c>            <c>1</c>
>> <c>2</c>            <c>2</c>
>> </b>                </b>
>> 2:<b>               2:<b>
>> <c>1</c>            <c>1</c>
>> <c>2</c>            <c>2</c>
>> </b>                </b>
>> 3:<b>               3:<b>
>> <c>1</c>            <c>1</c>
>> <c>2</c>            <c>2</c>
>> </b>                </b>
>> 4:<b>               4:<b>
>> <c>1</c>            <c>1</c>
>> <c>2</c>            <c>2</c>
>> </b>                </b>
>>
>>
>>
>> $ cat simple.xml
>> <a>
>> <b>
>> <c>1</c>
>> <c>2</c>
>> </b>
>> <b>
>> <c>1</c>
>> <c>2</c>
>> </b>
>> </a>
>> $ cat simple.xsl
>> <xsl:stylesheet version="1.0"
>> xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>> xmlns:exsl="http://exslt.org/common">
>>
>> <xsl:output omit-xml-declaration="yes"/>
>>
>> <xsl:template match="/">
>> <xsl:variable name="rtf">
>> <xsl:for-each select="/a/b/c">
>> <xsl:copy-of select=".."/>
>> </xsl:for-each>
>> </xsl:variable>
>>
>> <xsl:for-each select="/a/b/c">
>> <xsl:value-of select="position()"/><xsl:text>:</xsl:text>
>> <xsl:value-of select="generate-id(..)"/><xsl:text> </xsl:text>
>> </xsl:for-each>
>>
>> <xsl:text> </xsl:text>
>>
>> <xsl:for-each select="exsl:node-set($rtf)/*">
>> <xsl:value-of select="position()"/><xsl:text>:</xsl:text>
>> <xsl:value-of select="generate-id(.)"/><xsl:text> </xsl:text>
>> </xsl:for-each>
>>
>> <xsl:text> </xsl:text>
>>
>> <xsl:for-each select="exsl:node-set($rtf)/*">
>> <xsl:value-of select="position()"/><xsl:text>:</xsl:text>
>> <xsl:copy-of select="."/><xsl:text> </xsl:text>
>> </xsl:for-each>
>> </xsl:template>
>>
>> </xsl:stylesheet>
>> $
>>
>>
>> [1] http://dpcarlisle.blogspot.com/2007/05/exslt-node-set-function.html
>>
>>
>