

>
>> On the standard textual XML front: As has been noted, Xerces and
>> Woodstox can be made to run quite fast, but in practice, few
>> people know how to configure them accordingly, and to do so
>> reliably, and without conformance compromises.
>
> A red herring.  Xerces' defaults are an issue unrelated to the  
> merits of stimulating software developers to use modern C++  
> features instead of sticking to slow 90's features.
>
> (In any case, these optimisations are potentially also applicable  
> to binary XML parsing as well as to real XML processing.)
>
>> Most users can't afford to study the complex reliability vs.
>> performance interactions of myriads of more or less static tuning
>> knobs.
>
> Same fish.

Fish or not, it reflects the priorities of reality. It's not good  
enough to just provide low-level infrastructure regardless of  
usability concerns.

The bottom line is that more often than not, parser performance
problems are the result of folks using the parser with an inappropriate
configuration. Why? Because typically the APIs are a huge, complex
mess, designed with little regard for clarity or performance.
As a user, how do I cache DTDs or schemas? How can I safely
reuse parser data structures in an efficient, thread-safe,
memory-bounded manner? How do I deal with parsers that implement poorly
specified, ambiguous "standard" interfaces in varying, more
or less subtle ways? If it isn't obvious how to take full advantage
of a parser's theoretical performance capabilities, it mostly won't
happen in reality, and no amount of internal SSE optimizations will
change that.
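
To make the caching and reuse questions concrete, here is a minimal
sketch using only the standard JAXP interfaces. The class name
ReusableParser and its methods are hypothetical, invented for
illustration; they are not part of Xerces, Woodstox, or JAXP. It
assumes the schema text is available as a string, compiles it once
into a javax.xml.validation.Schema (which JAXP documents as
thread-safe), and keeps one DocumentBuilder per thread, since
DocumentBuilder itself is not guaranteed to be thread-safe.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: compile the schema once, reuse one builder per thread. */
final class ReusableParser {
    // Compiled once and shared by all threads.
    private final Schema schema;
    // One builder per thread; memory use is bounded by the number of live threads.
    private final ThreadLocal<DocumentBuilder> builders;

    ReusableParser(String xsdText) throws SAXException {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        this.schema = sf.newSchema(new StreamSource(new StringReader(xsdText)));
        this.builders = ThreadLocal.withInitial(this::newBuilder);
    }

    private DocumentBuilder newBuilder() {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true); // required for XML Schema validation
            f.setSchema(schema);       // validate against the cached schema
            return f.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    Document parse(String xml) throws SAXException, IOException {
        DocumentBuilder b = builders.get();
        b.reset(); // back to the factory-created state; this also clears the error handler
        b.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // treat validation errors as failures instead of ignoring them
            }
        });
        return b.parse(new InputSource(new StringReader(xml)));
    }
}
```

A caller would create one ReusableParser per schema and share it
across threads. That this takes some forty lines and several
non-obvious decisions (namespace awareness, the reset call, the error
handler) is rather the point being made above.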

A new performance-oriented parser implementation must come with a
straightforward API, or else it will matter little in practice.

Wolfgang.
