As I've written before, I've been working on a callback-based streaming XML parser that is sort of DOM-like, specifically for reading application data from XML files where you know what the object hierarchy is. At first I tried layering my work over expat, to no avail: expat uses a push model and I needed a pull model, so that a subelement could parse its subtree by itself.

Now that I am almost finished with the first cut and about to release it, I was explaining the solution to a colleague, who said I should have tried to build it on expat anyway. The only way I could have done that and kept the sub-element parsing model I want is to have expat parse the entire document into one big internal memory buffer. One of the advantages of my solution is that you only need a small file buffer, since it's streaming. With the expat approach, a data file with 100,000 elements in it would have to be stored entirely in memory, along with a couple of extra megabytes of housekeeping data. My colleague said "so what?".

So, here is my plea for feedback about memory usage concerns. My current solution works as designed and streams into a small buffer, but it only supports ASCII, doesn't validate, and relies on C-style callback functions, which can require slightly more code. If I weren't worried about the memory, I could rewrite my design on top of expat (gaining all of its benefits, and presumably validation in the future) and provide an optional DOM-like interface (without all the extra DOM mumbo-jumbo). It would also be possible to combine the streaming and the parsing, and throw away the housekeeping data when the elements are no longer needed (such as when we've moved on to the next subelement tree).

What do people think? Spare the memory and provide a simpler (and slightly less capable) solution, or store the entire thing in memory, use the nice stuff in expat, and give more features?

--
Paul Miller - stele@f...

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message; unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message; subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
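
For readers unfamiliar with the push/pull distinction in the message above, the pull model (where the code handling a subelement asks the reader for events until its own subtree ends) might look roughly like the sketch below. This is not the parser described in the post, whose code is not shown here. Every name in it (`Reader`, `next_event`, `parse_point`, `sum_points`, the `<point>` document shape) is hypothetical, and for brevity it reads from an in-memory string, where a streaming version would refill a small file buffer instead. Like the parser in the post, it assumes ASCII input and does no validation; it also ignores attributes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pull-style reader: ASCII only, no attributes, no validation. */
typedef enum { EV_START, EV_END, EV_TEXT, EV_EOF } EventType;

typedef struct {
    const char *pos;  /* next unread byte; a streaming reader would refill a small buffer */
    char name[32];    /* element name for EV_START / EV_END */
    char text[64];    /* character data for EV_TEXT */
} Reader;

/* Copy at most cap-1 bytes from [s, e) into dst and NUL-terminate. */
static void copy_span(char *dst, size_t cap, const char *s, const char *e) {
    size_t n;
    assert(cap > 0 && s <= e);
    n = (size_t)(e - s);
    if (n >= cap) n = cap - 1;
    memcpy(dst, s, n);
    dst[n] = '\0';
}

/* Pull the next event. The caller, not the parser, drives the loop. */
static EventType next_event(Reader *r) {
    const char *s, *e;
    if (*r->pos == '\0') return EV_EOF;
    if (*r->pos == '<') {
        int closing = (r->pos[1] == '/');
        s = r->pos + (closing ? 2 : 1);
        e = strchr(s, '>');
        if (e == NULL) return EV_EOF;  /* malformed input: stop */
        copy_span(r->name, sizeof r->name, s, e);
        r->pos = e + 1;
        return closing ? EV_END : EV_START;
    }
    e = strchr(r->pos, '<');
    if (e == NULL) e = r->pos + strlen(r->pos);
    copy_span(r->text, sizeof r->text, r->pos, e);
    r->pos = e;
    return EV_TEXT;
}

typedef struct { int x, y; } Point;

/* Called after <point> was read; consumes its own subtree up to </point>. */
static int parse_point(Reader *r, Point *p) {
    char field[32] = "";
    for (;;) {
        switch (next_event(r)) {
        case EV_START:
            copy_span(field, sizeof field, r->name, r->name + strlen(r->name));
            break;
        case EV_TEXT:
            if (strcmp(field, "x") == 0) p->x = atoi(r->text);
            else if (strcmp(field, "y") == 0) p->y = atoi(r->text);
            break;
        case EV_END:
            if (strcmp(r->name, "point") == 0) return 1;
            field[0] = '\0';  /* left a child element */
            break;
        case EV_EOF:
            return 0;  /* input ended inside <point> */
        }
    }
}

/* Top level: hand each <point> to its own handler, keeping only one Point at a time. */
int sum_points(const char *xml, int *count) {
    Reader r;
    Point p;
    EventType ev;
    int sum = 0;
    r.pos = xml;
    r.name[0] = r.text[0] = '\0';
    *count = 0;
    while ((ev = next_event(&r)) != EV_EOF) {
        if (ev == EV_START && strcmp(r.name, "point") == 0) {
            p.x = p.y = 0;
            if (!parse_point(&r, &p)) break;
            sum += p.x + p.y;
            ++*count;
        }
    }
    return sum;
}
```

Calling `sum_points` on a document of `<point>` elements visits each one in turn and keeps only a single `Point` alive at any moment, which is the memory property the post is weighing against building on expat. With a push parser, `parse_point` could not loop on its own like this; it would have to be split into handlers that remember their position between callbacks.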



