

<snip/>

> It appears that what is needed is:
> - also 4 kinds of sizes
> - a means to read ahead in the SAX events
> To achieve this, I intend to write a cache that could store some events
> (limited to 100 or 1000, or whatever you set as a default parameter);
> thus, when a size is requested, the engine keeps reading the input
> until the information is known, then the step is evaluated, and later
> the stored events are fetched.
> This is a smart strategy because it is not limited to count(): it applies
> to any operation that needs to read further ahead, so a predicate that
> contains following-sibling:: may also be considered. The idea is to use
> the cache only when it is explicitly required (putting a whole document
> in a cache wouldn't be SAX, but DOM). The events could also be cached in
> a tree fragment; I don't know yet what the best way to achieve this is.
> 
> Of course, there are cases where the expected information is not
> reachable within the cache size limit, or is lost because it has
> already been read, but it will help in many other cases.
> As I have some code that pours SAX events into DOM trees, I'll
> provide a smart means to match a pattern on the SAX input and process
> the subtree with full XPath capabilities; this might be very helpful
> for very large documents.
> 
> What do you think of such a strategy?
> Did you do something similar in Saxon?

<snip/>


You might want to take a look at XML Path Event API (XPEA) [1] by Karl 
Waclawek. Also, there is some work in .NET on caching-based XmlReader
implementations (I think Oleg or Dare had a blog post on this). In
general, decoupling the caching allowed for a fair amount of flexibility
down the road.
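
To make the decoupling point concrete, here is a rough sketch of a bounded
read-ahead buffer that sits between the event source and the consumer. It is
not taken from XPEA, Saxon, or the .NET work; the class name LookaheadCache,
the (kind, name) event tuples, and the count_ahead method are all made up for
illustration. It also does not track nesting depth, so the "stop" test is
only reliable when the closing element cannot appear nested inside itself.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator, Optional, Tuple

# Hypothetical event shape: (kind, name), e.g. ("start", "item").
Event = Tuple[str, str]


class LookaheadCache:
    """Bounded read-ahead buffer over a stream of SAX-like events."""

    def __init__(self, events: Iterable[Event], limit: int = 100) -> None:
        self._source: Iterator[Event] = iter(events)
        # Events already read from the source but not yet delivered.
        self._pending: Deque[Event] = deque()
        self._limit = limit

    def count_ahead(
        self,
        matches: Callable[[Event], bool],
        stop: Callable[[Event], bool],
    ) -> Optional[int]:
        """Count matching events before the first `stop` event.

        Returns None when the answer is not reachable within the limit.
        Nothing is consumed: every event read here stays in the buffer.
        """
        count = 0
        # First look at what an earlier look-ahead already buffered.
        for event in self._pending:
            if stop(event):
                return count
            if matches(event):
                count += 1
        # Then read further, but never hold more than `limit` events.
        while len(self._pending) < self._limit:
            event = next(self._source, None)
            if event is None:
                return count  # end of input, so the count is final
            self._pending.append(event)
            if stop(event):
                return count
            if matches(event):
                count += 1
        return None

    def __iter__(self) -> "LookaheadCache":
        return self

    def __next__(self) -> Event:
        # Deliver buffered events first, in their original order.
        if self._pending:
            return self._pending.popleft()
        return next(self._source)
```

The consumer iterates over the cache as it would over the raw event stream
and calls count_ahead only when a step actually needs a size, so the cost of
buffering is paid only on demand. Because the buffer is a separate object,
it could later be swapped for a tree-fragment store without touching the
consumer.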

Cheers,
Jeff Rafter

[1]http://sourceforge.net/projects/xpea
