On 26/01/2011 15:56, Costello, Roger L. wrote:
> Hi Folks,
>
> It is my understanding that there are 3 flavors of regular expression parsers [1]:
>
> 1. Nondeterministic Finite Automaton (NFA)
>
> 2. Deterministic Finite Automaton (DFA)
>
> 3. Backtracking
>
> Which flavor of regular expression parser does SAXON use?
>

Saxon uses the DFA algorithm described in

http://www.ltg.ed.ac.uk/~ht/XML_Europe_2003.html

modified by a system of counters to handle minOccurs/maxOccurs constraints, which is inspired by subsequent work by Thompson and Tobin:

http://www.cogsci.ed.ac.uk/~ht/XTech_2006_paper.pdf

but does not follow it slavishly.

I'm not sure about your three categories, by the way. I think that if you use an NFA then you need some kind of backtracking (either that or you investigate multiple forward paths in parallel, which amounts to the same thing).

Michael Kay
Saxonica
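
To illustrate the last point about NFAs, here is a minimal sketch in Python. It is not Saxon's code; the automaton, the state numbers, and the function names are all hypothetical, chosen only to show that exploring one path at a time with backtracking and tracking all reachable states in parallel accept the same strings.

```python
# Hypothetical NFA for the regular expression (a|b)*ab, with no epsilon moves.
# The transition table and state numbers are illustrative, not taken from Saxon.
NFA = {
    (0, "a"): {0, 1},   # nondeterministic choice: stay in 0 or move to 1
    (0, "b"): {0},
    (1, "b"): {2},
}
START = 0
ACCEPTING = {2}


def accepts_backtracking(text, state=START, pos=0):
    """Follow one transition at a time; on failure, back up and try the next."""
    if pos == len(text):
        return state in ACCEPTING
    for nxt in sorted(NFA.get((state, text[pos]), ())):
        if accepts_backtracking(text, nxt, pos + 1):
            return True
    return False


def accepts_parallel(text):
    """Track the set of all states reachable so far, one symbol at a time."""
    current = {START}
    for ch in text:
        current = {nxt for s in current for nxt in NFA.get((s, ch), ())}
        if not current:
            return False  # no path survives this symbol
    return bool(current & ACCEPTING)


if __name__ == "__main__":
    for sample in ["ab", "abab", "ba", ""]:
        print(repr(sample), accepts_backtracking(sample), accepts_parallel(sample))
```

Both functions agree on every input; they differ only in the order in which the alternative paths are explored.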