Hi Roland,
At first I thought this wasn't so trivial. But when I tried to implement it, I quickly found out that your request is actually quite easy.

From what I understand, you want to process the 15 documents one by one. If you don't do that, and process them all at once instead, you have two viable options in XSLT 1.0:

1. Process twice: in the first pass, output all identifiers to one file, each with the name of the file it first appeared in. In the second pass, use normal Muenchian grouping on the identifiers (to dedup them) and, in the same run, use the filenames to reopen the sources and select only the blocks with the correct identifier.

2. Process them all at once using the node-set extension function from EXSLT (or, if you use a Microsoft processor, msxsl:node-set) and use Muenchian grouping.

If your files are really large, either option may pose a memory problem. The second option may also take quite a performance hit when it has to do the node-set conversion on all documents. I suggest you test it first on a small set and then try a larger one.

I tried option 2, because that seemed the easiest to implement. In the example below I use a parameter for the input files (set in the XSLT for ease of testing). You probably have your own preferred way to get all documents through the pipeline. Come to think of it, if all you need are copies of these blocks, it almost looks simpler than an attempt in XSLT 2.0 with for-each-group (that's the first time I have ever said something like that, and perhaps the only and last time too ;)

Unfortunately, XPath 1.0 does not allow comments inside an expression. But for clarity, here's a little explanation of the "core" of the little XSLT stylesheet further down:

    (: node set of all documents :)
    exslt:node-set($all-input)
    (: all idTag nodes :)
    //idTag
    (: muenchian grouping :)
    [generate-id(.) = generate-id(key('idtag', .)[1])]
    (: get the parent block :)
    /..
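If you want to watch that expression work before pointing it at real files, here is a minimal sketch that needs no external documents. It is only an illustration under assumptions: the element names blocks, block and data are made up (only idTag comes from your description), and the two inline blocks elements stand in for the contents of two input files. It also assumes a processor that supports exslt:node-set and lets key() work on the converted fragment.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exslt="http://exslt.org/common"
                version="1.0">

  <!-- Same key as in the full stylesheet: index every idTag by its string value -->
  <xsl:key name="idtag" match="idTag" use="." />
  <xsl:output indent="yes" />

  <!-- Hypothetical sample data; each blocks element plays the role of one input file -->
  <xsl:variable name="all-input">
    <blocks>
      <block><idTag>A</idTag><data>from file 1</data></block>
      <block><idTag>B</idTag><data>from file 1</data></block>
    </blocks>
    <blocks>
      <block><idTag>B</idTag><data>from file 2</data></block>
      <block><idTag>C</idTag><data>from file 2</data></block>
    </blocks>
  </xsl:variable>

  <xsl:template match="/">
    <root>
      <!-- Keep an idTag only if it is the first one with that value,
           then step up to its parent block and copy that -->
      <xsl:copy-of select="
          exslt:node-set($all-input)
          //idTag
          [generate-id(.) = generate-id(key('idtag', .)[1])]
          /.." />
    </root>
  </xsl:template>

</xsl:stylesheet>
```

Run against any well-formed XML document (the source is ignored), this should copy three blocks into root: A and B from the first group and C from the second. The B block from the second group is dropped, because key('idtag', .)[1] returns the first idTag with that value in document order.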
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exslt="http://exslt.org/common"
    version="1.0">

    <xsl:key name="idtag" match="idTag" use="." />
    <xsl:output indent="yes" />

    <xsl:param name="input">
        <file href="muenchian-multipledocs1.xml" />
        <file href="muenchian-multipledocs2.xml" />
        <file href="muenchian-multipledocs3.xml" />
        <file href="muenchian-multipledocs4.xml" />
    </xsl:param>

    <xsl:template match="/" name="main">
        <xsl:variable name="all-input">
            <xsl:apply-templates select="exslt:node-set($input)/*" />
        </xsl:variable>
        <root>
            <xsl:copy-of select="
                exslt:node-set($all-input)
                //idTag
                [generate-id(.) = generate-id(key('idtag', .)[1])]
                /.."/>
        </root>
    </xsl:template>

    <xsl:template match="file">
        <xsl:copy-of select="document(@href)" />
    </xsl:template>

</xsl:stylesheet>

Have fun with it!

Cheers,
-- Abel Braaksma

Meyer, Roland 1. (NSN - DE/Germany - MiniMD) wrote:
Hi,