Thanks to all who sent suggestions. It looked like David's suggestion of
using Xalan's no-caching feature would let me move forward, but the thing
still ground to a halt. So for now I'm following Charles' line and have
written a PHP script to do the job. Too bad, though; I was looking forward
to being able to say I'd done the whole job in XSL. Ultimately I see I need
to get further into JAXP and learn to do these things properly. My
conclusion is that even where XSL isn't the right tool for full-scale
production, it's an awfully handy prototyping tool.
Peter
Peter Binkley
Digital Initiatives Technology Librarian
Information Technology Services
4-30 Cameron Library
University of Alberta Libraries
Edmonton, Alberta
Canada T6G 2J8
Phone: (780) 492-3743
Fax: (780) 492-9243
e-mail: peter.binkley@xxxxxxxxxxx
> -----Original Message-----
> From: Michael Kay [mailto:mhk@xxxxxxxxx]
> Sent: Wednesday, April 09, 2003 1:39 PM
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: RE: 10,000 document()'s
>
>
> I would suggest writing a SAX filter that invokes the XSLT
> transformations (one transformation for each file) via JAXP,
> gets the result back in a StringWriter, and adds an element
> containing the word count to the output stream.
>
> Michael Kay
> Software AG
> home: Michael.H.Kay@xxxxxxxxxxxx
> work: Michael.Kay@xxxxxxxxxxxxxx
>
> > -----Original Message-----
> > From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of
> > Peter Binkley
> > Sent: 08 April 2003 17:06
> > To: 'xsl-list@xxxxxxxxxxxxxxxxxxxxxx'
> > Subject: 10,000 document()'s
> >
> >
> > I need advice on how to tackle this problem: I've got a file
> > that contains a list of about 10,000 other files, and I want
> > to process the list so as to add a wordcount for each of the
> > external files. Something like this:
> >
> > Input:
> >
> > <files>
> > <file>
> > <filename>/path/to/file/2844942.xml</filename>
> > </file>
> > <file> .... </file>
> > </files>
> >
> > Output:
> >
> > <files>
> > <file>
> > <filename>/path/to/file/2844942.xml</filename>
> > <wordcount>2938</wordcount>
> > </file>
> > <file> .... </file>
> > </files>
> >
> > The obvious approach is to use a for-each loop that includes
> > a variable that opens the external file using a document()
> > call. The problem is that the process inevitably runs out of
> > memory, both with Saxon and Xalan. It seems that the
> > variables are passing out of scope and being destroyed as
> > they should, but I gather from a posting by Michael Kay
> > (http://www.biglist.com/lists/xsl-list/archives/200212/msg00507.html)
> > that all of those document() source trees are remaining in
> > memory throughout the transformation, adding up to megabytes
> > of data.
> >
> > Can anyone suggest a strategy? The process doesn't have to be
> > fast, it just has to finish.
> >
> > Peter Binkley
> > Digital Initiatives Technology Librarian
> > Information Technology Services
> > 4-30 Cameron Library
> > University of Alberta Libraries
> > Edmonton, Alberta
> > Canada T6G 2J8
> > Phone: (780) 492-3743
> > Fax: (780) 492-9243
> > e-mail: peter.binkley@xxxxxxxxxxx
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list