----- Original Message -----
From: Paul Miller <stele@f...>

> I'm trying to develop a tag-based front-end to expat and having no luck.
> I'd like to be able to parse an XML document in nestable chunks, by
> calling into a nestable parser. In other words, I'd like to start
> parsing, then branch to a function to handle a specific element, parsing
> in there until that element is closed, then fall back out of the
> function to continue parsing the rest of the document.

I take it that you want to be able to ignore part of the document and process only the pieces you are interested in. Is that right? Then each piece would be well-formed XML if it were enclosed in a root element.

You don't need to literally do what you have suggested, that is, "parse in there...". You do need to handle the elements of different pieces differently. Three approaches come to mind.

1) Preprocess to extract just the pieces you want, wrap them in root elements so they are complete documents, then run expat (or whatever) separately on them using SAX. The preprocessing should be fast and easy, and perhaps could be done using regular expressions or SAX. Alternatively, if the XML is relatively simple, don't wrap the fragments, and process them using regular expressions instead. (Search the archives of this list for the last few months to find a reference to "shallow parsing using regular expressions".)

2) You really are talking about a state machine, I think. That is, once you have reached the right piece of the document, you switch to a different manner of handling the elements (they will still parse the same; it's just the handling that would be different). So you could explicitly maintain a state variable and have the SAX (or whatever) callbacks behave differently according to the state. This would be conceptually simple but might be a pain to implement, depending on how many different element handlers you will use.
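
To make approach 2 concrete, here is a minimal sketch using Python's standard `xml.parsers.expat` binding rather than the C API. The function name `collect_text_inside`, the `target` parameter, and the sample element names are all hypothetical, chosen only for illustration. A single state variable (the nesting depth inside the element of interest) decides how every callback behaves.

```python
import xml.parsers.expat


def collect_text_inside(xml_text, target):
    """Return the text content of each <target> element, ignoring the rest.

    Hypothetical helper: 'depth' is the explicit state variable.
    depth == 0 means "outside the interesting element"; depth > 0 means
    "inside it", counting nested children so we know when it closes.
    """
    state = {"depth": 0, "buf": [], "results": []}

    def start(name, attrs):
        if state["depth"] > 0:
            state["depth"] += 1          # a child of the interesting element
        elif name == target:
            state["depth"] = 1           # state change: entering the element
            state["buf"] = []

    def end(name):
        if state["depth"] > 0:
            state["depth"] -= 1
            if state["depth"] == 0:      # state change: element just closed
                state["results"].append("".join(state["buf"]))

    def chars(data):
        if state["depth"] > 0:           # text is only kept while inside
            state["buf"].append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return state["results"]


if __name__ == "__main__":
    doc = "<doc><skip>x</skip><item>a<b>b</b>c</item><item>d</item></doc>"
    print(collect_text_inside(doc, "item"))
```

With more than a couple of states, the `if` chains in each callback grow quickly, which is the implementation pain mentioned above.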
3) Again as a state machine, you could use function pointers to specify the callbacks, and when you change state you change the function pointers to point to different handlers. I don't know whether you would have to modify expat to do this or not, but the changes should be minor if needed.

Regards,

Tom Passin

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message; unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message; subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
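
As an illustration of approach 3 above, the following sketch swaps the callbacks themselves when the state changes. It uses Python's standard `xml.parsers.expat` binding, where handlers are plain attributes that can be reassigned while parsing is in progress; it is not the C code under discussion. The names `parse_with_handler_swap`, `target`, and the `"outer"`/`"inner"` labels are hypothetical.

```python
import xml.parsers.expat


def parse_with_handler_swap(xml_text, target):
    """Record start tags, labelling those seen inside <target> as 'inner'.

    Hypothetical sketch: entering <target> installs a dedicated handler
    pair; when <target> closes, the original handlers are restored.
    """
    parser = xml.parsers.expat.ParserCreate()
    events = []

    def outer_start(name, attrs):
        if name == target:
            enter_target()               # state change: swap the callbacks
        else:
            events.append(("outer", name))

    def outer_end(name):
        pass                             # nothing to do outside the target

    def enter_target():
        depth = 1                        # we are inside <target> itself

        def inner_start(name, attrs):
            nonlocal depth
            depth += 1
            events.append(("inner", name))

        def inner_end(name):
            nonlocal depth
            depth -= 1
            if depth == 0:               # <target> closed: fall back out
                parser.StartElementHandler = outer_start
                parser.EndElementHandler = outer_end

        parser.StartElementHandler = inner_start
        parser.EndElementHandler = inner_end

    parser.StartElementHandler = outer_start
    parser.EndElementHandler = outer_end
    parser.Parse(xml_text, True)
    return events


if __name__ == "__main__":
    doc = "<doc><a/><item><b/><c><d/></c></item><e/></doc>"
    for event in parse_with_handler_swap(doc, "item"):
        print(event)
```

Each element-specific handler pair stays small because it never has to test a global state, and the same pattern nests: an inner handler can install a further pair and restore itself afterwards.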