Subject: Re: Testing 2 XML documents for equality - a solution
From: Mukul Gandhi <mukul_gandhi@xxxxxxxxx>
Date: Wed, 30 Mar 2005 08:40:58 -0800 (PST)
Hi David,
Thanks a lot for your observations.
Please see my responses below your comments.
> I don't think the stylesheet really works.
> For example for attribute nodes you just concatenate
> the names and
> values so even if you could be sure that the order
> of attribute nodes
> was preserved (you can't be sure of this) then
> x="2" and x2="" would be considered equal.
Thanks a lot for pointing out this bug! To correct it, I
propose this alternative code (applied to both
documents):
<xsl:for-each select="$doc1//@*">
  <xsl:value-of select="name()"/><xsl:text>&#10;</xsl:text><xsl:value-of select="."/>
</xsl:for-each>
(i.e. introducing an extra character between the attribute
name and value that is unlikely to occur in the
attribute value, e.g. a newline character)
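
To make the effect of the separator concrete, here is a small Python sketch of the same concatenation idea. It is only a model of the approach, not the stylesheet itself; the function name `fingerprint` and the sample attribute lists are made up for illustration.

```python
def fingerprint(attrs, sep=""):
    """Concatenate name + sep + value for each (name, value) pair, in order."""
    return "".join(name + sep + value for name, value in attrs)

doc1_attrs = [("x", "2")]   # models <e x="2"/>
doc2_attrs = [("x2", "")]   # models <e x2=""/>

# Without a separator both attribute lists collapse to the same string.
collides = fingerprint(doc1_attrs) == fingerprint(doc2_attrs)

# With a newline between name and value the two strings differ.
fixed = fingerprint(doc1_attrs, "\n") != fingerprint(doc2_attrs, "\n")
```

A value that itself contains the separator could still produce a collision, which is why the separator should be a character that is unlikely to appear in attribute values.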
> Also your ignore white space test ignores far too much:
>
> <xsl:for-each
>   select="$doc1//node()[not(normalize-space(self::text()) = '')]">
> <xsl:value-of select="name()" /><xsl:value-of select="." />
>
> consider the 2 document fragments
>
> <x>
> <a/>
> </x>
>
>
> <y>
> <b/>
> </y>
>
> in the first document the nodes x and a and both the
> text nodes all
> satisfy
> normalize-space(self::text())= ''
> so the for-each will be empty.
> Similarly in the second fragment.
>
> so presumably these documents will compare equal,
> which seems strange.
These documents are reported as not equal, so I think I
am right here. For this example, the $doc1//node() path
expression returns 4 nodes (2 element nodes and 2
whitespace-only text nodes). The whitespace-only text
nodes are filtered out by the predicate
[not(normalize-space(self::text()) = '')].
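
The filtering described above (keep element nodes, drop whitespace-only text nodes) can be modelled in Python as a rough sketch. It uses the standard `xml.dom.minidom` module and a hypothetical helper `significant_nodes`. It does not evaluate the XPath predicate itself, so it says nothing about how self::text() behaves for element nodes; it only shows what the two fragments look like once whitespace-only text is ignored.

```python
from xml.dom import minidom

def significant_nodes(xml_text):
    """List (kind, name-or-text) for each descendant node,
    skipping text nodes that contain only XML whitespace."""
    out = []

    def walk(node):
        for child in node.childNodes:
            if child.nodeType == child.TEXT_NODE:
                if child.data.strip(" \t\r\n"):
                    out.append(("text", child.data))
            elif child.nodeType == child.ELEMENT_NODE:
                out.append(("element", child.tagName))
                walk(child)

    walk(minidom.parseString(xml_text))
    return out

frag1 = "<x>\n<a/>\n</x>"
frag2 = "<y>\n<b/>\n</y>"

nodes1 = significant_nodes(frag1)
nodes2 = significant_nodes(frag2)
```

Under this model the two fragments differ, because the element names remain after the whitespace-only text nodes are dropped.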
> Conversely you can not be sure that
> <x a="2" b="3"/> will compare equal to
> <x a="2" b="3"/>
> as the attribute may be reported in one order for
> doc1 and the other
> order for doc2.
I agree that an XML parser is not required to report
attribute nodes in the same order. But I guess we can
reasonably assume that a specific XML parser will
report attributes in a consistent order. It must have a
specific algorithm for this, whose outcome is
predictable. I know I cannot prove this theoretically.
But can you provide any practical evidence of an
XML parser reporting attributes in a different order?
Since both documents are processed by the same
parser, the outcome should always be predictable!
I have tested the same example with a single product
multiple times, and I always get the same result.
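
If relying on the parser's ordering feels too fragile, one option is to sort each element's attributes by name before concatenating, so that the reported order no longer matters. The Python sketch below illustrates the idea with the standard `xml.etree.ElementTree` module; the helper name `attr_fingerprint` is made up, and this is an assumption about how one might do it, not part of the posted stylesheet.

```python
import xml.etree.ElementTree as ET

def attr_fingerprint(xml_text):
    """Collect 'name\\nvalue' strings for every attribute in the document,
    visiting elements in document order and sorting each element's
    attributes by name so the parser's attribute order is irrelevant."""
    root = ET.fromstring(xml_text)
    parts = []
    for elem in root.iter():
        for name in sorted(elem.attrib):
            parts.append(name + "\n" + elem.attrib[name])
    return parts

same = attr_fingerprint('<x a="2" b="3"/>') == attr_fingerprint('<x b="3" a="2"/>')
different = attr_fingerprint('<x a="2" b="3"/>') != attr_fingerprint('<x a="2" b="4"/>')
```

In XSLT the analogous step would be an xsl:sort on name() applied per element, rather than over all attributes of the document at once.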
Regards,
Mukul
> David