I really like what you've done, but the language you're using is not
XPath (nor is it a subset of XPath), and I see a problem here. (I think
I also have a solution of sorts to that problem, and I'll describe it in
my next letter.)

Rgds.Paul.

----- Original Message -----
From: "Niels Peter Strandberg" <nielspeter@n...>
To: <xml-dev@l...>
Sent: Wednesday, November 28, 2001 5:40 AM
Subject: SaxXPathFragmentFilter - Reduse large DOM trees using a SAX XPath cutter!

> I have made an experimental SAX XMLFilter. It allows you to "filter" out
> the information in an xml document that you want to work with - using
> xpath - and skip the rest. You can place the filter anywhere in your
> application where an XMLFilter can be used.
>
> - I don't know if this has already been done by others?
>
> The whole idea is to "filter" out the fragments of the xml document
> that you specify using an xpath expression, e.g.
> SaxXPathFragmentFilter(saxparser, "/cellphone/*/model[@id='1234']",
> "result"). Build a dom tree from the result, or why not feed the sax
> events into an xslt transformer and do some xslt transformations.
>
> The big win is that you don't have to build a large dom tree if you
> only need part of the information in a large xml document. You just
> specify which fragments you want using xpath, and the result will be a
> much smaller dom tree, which requires less processing, memory etc.
>
> Let us say that you have a large document with spare parts for Volvo
> vehicles. You want to make a list of engine parts for the S70 car model.
> What you do is specify the xpath (location path) that you want to cut
> out from the document, e.g. "/catalog/cars/s70/parts/engine".
>
>     // imports assumed for the SAX 2 and JDOM 1.x APIs
>     import org.jdom.Document;
>     import org.jdom.input.SAXHandler;
>     import org.xml.sax.XMLReader;
>     import org.xml.sax.helpers.XMLReaderFactory;
>
>     // your sax parser here
>     XMLReader parser =
>         XMLReaderFactory.createXMLReader(
>             "org.apache.xerces.parsers.SAXParser");
>
>     // Get instances of your handlers
>     SAXHandler jdomsaxhandler = new SAXHandler();
>
>     String xpath = "/catalog/cars/s70/parts/engine";
>     String rootName = "s70engineparts"; // this will be the new root.
>
>     // set SaxXPathFragmentFilter
>     SaxXPathFragmentFilter xpathfilter =
>         new SaxXPathFragmentFilter(parser, xpath, rootName);
>     xpathfilter.setContentHandler(jdomsaxhandler);
>
>     // Parse the document
>     xpathfilter.parse(uri);
>
>     // get the Document
>     Document doc = jdomsaxhandler.getDocument();
>
>
> This SaxXPathFragmentFilter is purely experimental. It is spaghetti code.
> I just sat down with an idea and started to code, and the code is not
> very pretty. It needs to be rewritten.
>
>
> The xpath support is very limited for now. Here is the xpath you can use
> today with this filter:
>
>   "/a/b"                        - An absolute path.
>   "/a/*/c"                      - An absolute path where the second
>                                   element ("*") can be any element.
>   "/a/*/c[@att='value']"        - If element c has an attribute att
>                                   with the value 'value'.
>   "/a/*/c[contains='value']"    - If the first child node of element c
>                                   is a text node that contains 'value'.
>   "/a/*/c[starts-with='value']" - If the first child node of element c
>                                   is a text node that starts with 'value'.
>   "/a/*/c[ends-with='value']"   - If the first child node of element c
>                                   is a text node that ends with 'value'.
>   "/a/*/c['value']"             - If the first child node of element c
>                                   is a text node that is 'value'.
>   "/a/*/c[is='value']"          - As above.
>
> As you can see, the xpath options are very limited. But I think that when
> I find a way to implement the "//" pattern, the filter will be even more
> powerful.
>
> I have problems building a dom tree from the result using xerces
> and saxon. But with jdom it works great. This needs to be fixed.
>
> You cannot rely on the result always being correct, so don't use
> this in any application; just use it for experimentation.
>
> You can find the code at:
> http://www.npstrandberg.com/projects/saxxpathfragmentfilter/saxxpathfragmentfilter.tar.gz
>
> My goal with this filter is to keep it reliable, simple, fast and
> clean. If you want to contribute to this project, you will be
> welcome. The filter will be released under some kind of open-source
> license (if we get that far!).
>
> Test it and give me some feedback on what you think.
>
>
> Regards, Niels Peter Strandberg
>
>
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>
>
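
The path cutting described in the quoted message can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the SaxXPathFragmentFilter code: the class name PathCutFilter and the helper collectElementNames are hypothetical, only absolute paths with "*" steps are handled (no predicates, no "//"), and the parser is whatever the JDK's SAXParserFactory returns rather than Xerces by name. The filter keeps a list of the currently open element names, compares it with the path steps, and forwards SAX events only while it is inside a matching element, wrapped in a new root.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch; PathCutFilter is NOT the SaxXPathFragmentFilter source.
class PathCutFilter extends XMLFilterImpl {
    private final String[] steps;                         // "/a/*/c" -> {"a", "*", "c"}
    private final String rootName;                        // wrapper element around all fragments
    private final List<String> path = new ArrayList<>();  // names of the currently open elements
    private int matchDepth = -1;                          // depth of the matched element, -1 when outside

    PathCutFilter(XMLReader parent, String absolutePath, String rootName) {
        super(parent);
        this.steps = absolutePath.substring(1).split("/");
        this.rootName = rootName;
    }

    // True when the open-element path has the same length as the pattern
    // and every step is either "*" or equal to the element name.
    private boolean pathMatches() {
        if (path.size() != steps.length) return false;
        for (int i = 0; i < steps.length; i++) {
            if (!steps[i].equals("*") && !steps[i].equals(path.get(i))) return false;
        }
        return true;
    }

    @Override
    public void startDocument() throws SAXException {
        super.startDocument();
        super.startElement("", rootName, rootName, new AttributesImpl());
    }

    @Override
    public void endDocument() throws SAXException {
        super.endElement("", rootName, rootName);
        super.endDocument();
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        path.add(qName);
        if (matchDepth < 0 && pathMatches()) matchDepth = path.size();
        if (matchDepth >= 0) super.startElement(uri, local, qName, atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (matchDepth >= 0) super.endElement(uri, local, qName);
        if (path.size() == matchDepth) matchDepth = -1;   // leaving the matched element
        path.remove(path.size() - 1);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (matchDepth >= 0) super.characters(ch, start, length);
    }

    // Hypothetical helper: runs the filter over an XML string and returns the
    // names of the elements that reach the downstream handler.
    static List<String> collectElementNames(String xml, String absolutePath, String rootName) {
        final List<String> names = new ArrayList<>();
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            PathCutFilter filter = new PathCutFilter(parser, absolutePath, rootName);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName, Attributes atts) {
                    names.add(qName);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return names;
    }
}
```

In place of the collecting handler, any ContentHandler (a JDOM SAXHandler, for instance) could be set on the filter, as in the quoted usage example. Because the sketch matches on the full list of open element names, a "//" step would need a different matching strategy.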



