--- Michael Kay <mike@s...> wrote:
> > my observations - for small messages and documents it all seems to be
> > fast enough (haven't had any performance issues to cause me
> > to measure it).
>
> One thing I discovered (or rather, which Wolfgang Hoschek pointed out to me)
> is that if you are parsing lots of small documents you can get a big saving
> by reusing the XML parser rather than instantiating a new one for each
> parse. This might be parser-specific, but certainly for Xerces there seems
> to be a big initialisation cost.

Indeed. Many of the internal data structures (element stacks, attribute maps, string canonicalization maps, grammar caches) can be reused, as long as something binds them together. This is often most conveniently done by allowing either the parser instances (SAX in general) or the factories that create them (StAX) to be reused. The difference in performance can be 3x for small documents (I noticed this when comparing SAX and StAX implementations: without recycling, Xerces seemed to perform badly).

The interesting thing is that, when used in the 'correct' way, Xerces is a very fast XML parser, even though performance is not its top goal (if I understand correctly, its priorities are correctness, complete support for standards, etc.).

-+ Tatu +-
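To illustrate the reuse pattern described above, here is a minimal sketch using only the standard JAXP (SAX) and StAX APIs in the JDK. It is not code from the thread: the class name `ParserReuse`, the `ElementCounter` handler and the sample documents are made up for illustration, and how much reuse actually saves will depend on the parser implementation in use.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ParserReuse {

    /** Hypothetical handler: counts start tags, standing in for real work. */
    static final class ElementCounter extends DefaultHandler {
        int count;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            count++;
        }
    }

    /** SAX: a single SAXParser instance is reused for every document. */
    public static int countWithSax(List<String> docs) throws Exception {
        // Created once, outside the loop, so setup cost is paid only once.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        ElementCounter counter = new ElementCounter();
        for (String doc : docs) {
            parser.parse(new InputSource(new StringReader(doc)), counter);
        }
        return counter.count;
    }

    /** StAX: readers are per-document, so the factory is what gets reused. */
    public static int countWithStax(List<String> docs) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance(); // created once
        int count = 0;
        for (String doc : docs) {
            XMLStreamReader reader =
                factory.createXMLStreamReader(new StringReader(doc));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        count++;
                    }
                }
            } finally {
                reader.close();
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        List<String> docs = Arrays.asList("<a><b/></a>", "<c/>");
        System.out.println("SAX elements:  " + countWithSax(docs));
        System.out.println("StAX elements: " + countWithStax(docs));
    }
}
```

The point of the sketch is only where the objects are created: the parser (SAX) or the factory (StAX) lives outside the per-document loop. A parser instance should not be shared between threads, so in a multi-threaded setting one instance per thread is the usual assumption.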



