I have a large collection of XML documents and want to find and group any duplicates. The obvious but slow way of doing this is to compare every document against every other one. Is there a better approach? In particular, are there any APIs or standards for "hashing" a document, so that duplicates could be identified in much the same way as with a hash table?

Thanks,
Eric
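To make the question concrete, here is a minimal sketch of the kind of hashing I have in mind. It assumes that "duplicate" means "identical after XML canonicalization", and it uses Python's standard-library `xml.etree.ElementTree.canonicalize` (C14N 2.0, Python 3.8+) together with SHA-256. The function names `xml_fingerprint` and `group_duplicates` are made up for illustration, and stripping whitespace around text is an assumption that may not suit every document type.

```python
import hashlib
from collections import defaultdict
from xml.etree.ElementTree import canonicalize


def xml_fingerprint(xml_text):
    """Return a hex digest of the canonical form of one XML document.

    Canonicalization sorts attributes and writes empty elements as
    start/end pairs, so documents that differ only in those respects
    get the same digest. strip_text=True (an assumption here) also
    drops whitespace around text content.
    """
    canonical = canonicalize(xml_data=xml_text, strip_text=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def group_duplicates(docs):
    """Group document names by fingerprint; docs maps name -> XML text.

    Returns only the groups that contain more than one name.
    """
    groups = defaultdict(list)
    for name, text in docs.items():
        groups[xml_fingerprint(text)].append(name)
    return [names for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    sample = {
        "a.xml": '<a x="1" y="2"><b/></a>',
        "b.xml": '<a y="2" x="1">\n  <b></b>\n</a>',
        "c.xml": '<a x="1" y="3"><b/></a>',
    }
    print(group_duplicates(sample))
```

Each document is read once and bucketed by digest, so the work grows linearly with the number of documents instead of with the number of pairs. Documents that land in the same bucket could still be compared directly if a hash collision is a concern.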



