Here's my take on it... Note that what I'm saying here reflects the necessities of supporting really bad C++ implementations, not my personal feelings. If it were up to me, I'd say use every modern service of C++, and those who don't have a compliant C++ implementation would have a good reason to get one. But, by an unfortunate decision, I was not made the ruler of the world... Go figure!

1) I don't mind that we just start off with SAX2, I guess. It perhaps makes sense this late in the game to just concentrate on SAX2.

2) We would prefer that all data come out of the SAX interfaces as raw wchar_t strings. This is the most flexible mechanism and does not lock people into using any particular implementation of a string object. It also has the highest potential performance for those folks who never need to put it into anything more formal than a raw array.

3) We agree with the basic desire to avoid object ownership issues, but wouldn't worry about them if they are well documented. Object ownership is just a fundamental issue in C++, and if you don't understand it you are probably going to blow your own foot off no matter what.

4) We would be concerned about some of the SAX2 stuff wrt setting features (I think it's features) via an abstracted object interface, because it's a little bit sticky. It can be done, but the question still arises: where does the desirability of being the same as the Java interface end, and the desirability of having a very natural interface for your own language begin? I.e. just don't make it so Java-esque that it requires a lot of trickery to make it work in C++. Don't require some common base class.

5) If you wanted to templatize the interface over the character type, we wouldn't mind particularly. But, considering that any implementation of the interface would *always* use the same instantiation, why bother? Just typedef the character type and let each implementation drive it.
It's not likely that a particular build of a particular implementation would need to change this on the fly, right?

6) The issue of handler ownership is something we punted on. As far as we are concerned, handlers installed on the SAXParser belong to the caller, because in most cases one object implements a number of handlers.

7) The names of the handler methods need to be unambiguous to avoid problems. So DocType handlers should use DocTypeCharacters() or DTDCharacters() or whatever, and Document handlers should use DocCharacters() or some such thing. It's just not worth the paranoia over how implementations would deal with multiple mixed-in interfaces having methods with the same name. If the processing should be common, the class implementing both handlers can delegate to a private method.

8) I disagree with the contention that unsigned shouldn't be used in interfaces. If the thing being modeled is unsigned, use unsigned, because you are modeling the type desired. I would personally typedef (by logical usage) all of the fundamental types used by the interfaces and let the implementation drive them.

9) APIs such as getType() or getValue() should return a "const wchar_t*" so that the caller uses the returned value directly. The overhead of copying the return value (and having to clean it up) would probably be unacceptable. (Actually, wchar_t would be some defined type that is driven by the implementation.) Yes, this involves ownership issues, but as I said, this is fundamental to C++, so people should probably just 'get over it' :-)

10) I believe that it's better to have the interfaces remain pure virtual and provide a HandlerBase. This lets people who want to be sure that they've overridden everything be told so by the compiler, and it allows selective overriding by using HandlerBase where desired.

11) The class names (since we can't afford to use C++ namespaces) should be expanded to include a SAX prefix to avoid clashes.
So SAXParser, SAXLocator, SAXAttributeList, and so on.

12) We added reset() methods to all the handlers. The reason is that, at the start of a new parse operation, each handler might need to reset its internal state. We assume that the handlers might be completely unknown to the code that kicks off the parse, and we didn't want them to have to assume that the order of events wouldn't change over time (i.e. we didn't want them to just pick what they think will be the first event and reset from that).

That's all I can think of at the moment. I haven't had enough time to look at SAX2 closely, so I don't know what might be problematic for us in the C++ world. But I still think that it's good enough to just pick up at SAX2, as long as SAX2 can be reconciled with the needs of the C++ world.

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
roddey@u...

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
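
To make points 5 through 12 a bit more concrete, here is a rough sketch of how they could fit together. It is only an illustration under stated assumptions, not an actual SAX C++ binding: every name in it (SAXChar, SAXCount, SAXDocumentHandler, SAXDocTypeHandler, SAXHandlerBase, CharCounter, parseSample, runSampleTwice) is hypothetical, and the "parser" just fires a fixed sequence of events instead of reading XML. The post speaks of reset() on every handler; the sketch gives each interface a distinctly named reset method purely to stay consistent with point 7.

```cpp
#include <cassert>

// Points 5 and 8: typedef the fundamental types by logical usage so that
// each implementation can drive them. wchar_t/unsigned are assumptions.
typedef wchar_t      SAXChar;
typedef unsigned int SAXCount;

// Points 7, 10 and 11: pure virtual interfaces, SAX-prefixed class names,
// and method names that cannot collide when both are mixed in.
class SAXDocumentHandler {
public:
    virtual ~SAXDocumentHandler() {}
    virtual void resetDocument() = 0;   // point 12: called before a parse
    virtual void startElement(const SAXChar* name) = 0;
    virtual void docCharacters(const SAXChar* chars, SAXCount length) = 0;
};

class SAXDocTypeHandler {
public:
    virtual ~SAXDocTypeHandler() {}
    virtual void resetDocType() = 0;    // point 12: called before a parse
    virtual void docTypeCharacters(const SAXChar* chars, SAXCount length) = 0;
};

// Point 10: a base with do-nothing defaults for selective overriding.
class SAXHandlerBase : public SAXDocumentHandler, public SAXDocTypeHandler {
public:
    virtual void resetDocument() {}
    virtual void startElement(const SAXChar*) {}
    virtual void docCharacters(const SAXChar*, SAXCount) {}
    virtual void resetDocType() {}
    virtual void docTypeCharacters(const SAXChar*, SAXCount) {}
};

// Points 6 and 7: one object implements both handlers and delegates the
// common processing to a private method.
class CharCounter : public SAXHandlerBase {
public:
    CharCounter() : fTotal(0) {}
    virtual void resetDocument() { fTotal = 0; }
    virtual void docCharacters(const SAXChar* c, SAXCount n) { count(c, n); }
    virtual void docTypeCharacters(const SAXChar* c, SAXCount n) { count(c, n); }
    SAXCount total() const { return fTotal; }
private:
    void count(const SAXChar*, SAXCount length) { fTotal += length; }
    SAXCount fTotal;
};

// Point 6: the parser stores handler pointers but never deletes them.
class SAXParser {
public:
    SAXParser() : fDoc(0), fDocType(0) {}
    void setDocumentHandler(SAXDocumentHandler* h) { fDoc = h; }
    void setDocTypeHandler(SAXDocTypeHandler* h) { fDocType = h; }

    // Stand-in for a real parse: reset every handler first, then fire
    // a fixed sequence of events (2 DTD characters, 5 document characters).
    void parseSample() {
        static const SAXChar dtdText[] = L"ab";
        static const SAXChar docText[] = L"hello";
        if (fDocType) fDocType->resetDocType();
        if (fDoc)     fDoc->resetDocument();
        if (fDocType) fDocType->docTypeCharacters(dtdText, 2);
        if (fDoc)     fDoc->startElement(L"doc");
        if (fDoc)     fDoc->docCharacters(docText, 5);
    }
private:
    SAXDocumentHandler* fDoc;
    SAXDocTypeHandler*  fDocType;
};

// Usage: the caller owns the handler and installs it for both roles.
SAXCount runSampleTwice() {
    CharCounter counter;
    SAXParser parser;
    parser.setDocumentHandler(&counter);
    parser.setDocTypeHandler(&counter);
    parser.parseSample();
    assert(counter.total() == 7);
    parser.parseSample();   // the resets keep the second run from accumulating
    return counter.total();
}
```

Because the resets happen before any event is delivered, the handler never has to guess which callback arrives first, which is the concern behind point 12. Someone who wants the compiler to flag a missed override would derive from the two pure virtual interfaces directly rather than from the hypothetical SAXHandlerBase.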