> ___Note that a SystemLiteral can be
> parsed without scanning for markup.___
> I wonder what the last sentence really wants to say.
>
> I think I must have missed certain background knowledge about it.
>
> Could you give me some explanation?
>
> Thank you!

Well, I can't tell you why this sentence is _there_, but I suspect it is referring for the most part to entity literal values. Entity literals may contain markup constructs, as in:

    <!ENTITY foo "<foo></foo>">

What the sentence is saying is that you can "parse" this literal without scanning for the markup. While this is technically true, there are some caveats.

First, an entity literal must be well-formed on its own when it is referenced (i.e., &foo;...), but only when it is referenced (this one I learned from Richard Tobin). So technically you can parse without scanning for markup, but eventually you need to insert the literal at the point where it is referenced and then check that it is well-formed (which can be done by simply doing a standard parse and checking that the start and end tags are well balanced, which I learned from Bob Foster).

Another important caveat is that while parsing without scanning for markup, you do still need to scan for references: character references and parameter entity references need to be expanded, and you need to check that regular entity references actually refer to declared entities (even if the entity you are currently parsing is never referenced). Though very few parsers actually go to the trouble on that point...

I am babbling... I spent a ton of time on this about a month and a half ago... it is useful stuff. If you want to read more, simply look for a bunch of silly questions from me and look at the very brilliant responses from the smart folks...

Cheers,
Jeff Rafter
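
To make the points in the message above a little more concrete, here is a minimal Python sketch; it is an illustration under stated assumptions, not how any particular parser does it. The helper names (read_literal, scan_references, is_well_formed_when_referenced) are hypothetical, the Name pattern is simplified to ASCII, and the well-formedness check simply wraps the text in a dummy element and hands it to the standard library parser. It only reports references; it does not expand them or check them against declarations.

```python
import re
import xml.etree.ElementTree as ET

# Simplified, ASCII-only approximation of an XML Name (an assumption).
_NAME = r"[A-Za-z_:][A-Za-z0-9_:.\-]*"

# Character references, general entity references, parameter entity references.
_REFERENCE = re.compile(
    rf"&#x(?P<hex>[0-9A-Fa-f]+);"
    rf"|&#(?P<dec>[0-9]+);"
    rf"|&(?P<general>{_NAME});"
    rf"|%(?P<param>{_NAME});"
)


def read_literal(text, start):
    """Read a quoted literal beginning at text[start].

    Only the matching quote ends the literal; '<' and '>' are treated as
    ordinary characters, so no markup scanning happens here.
    Returns (literal, index just past the closing quote).
    """
    quote = text[start]
    if quote not in "\"'":
        raise ValueError("literal must start with a quote character")
    end = text.index(quote, start + 1)
    return text[start + 1:end], end + 1


def scan_references(literal):
    """List the references found in a literal, without expanding entities."""
    found = []
    for match in _REFERENCE.finditer(literal):
        if match.group("hex") is not None:
            found.append(("char", chr(int(match.group("hex"), 16))))
        elif match.group("dec") is not None:
            found.append(("char", chr(int(match.group("dec")))))
        elif match.group("general") is not None:
            found.append(("general", match.group("general")))
        else:
            found.append(("parameter", match.group("param")))
    return found


def is_well_formed_when_referenced(replacement_text):
    """Check balance by a standard parse inside a dummy wrapper element.

    Assumes the text contains no references to undeclared entities, which
    this parser would also reject.
    """
    try:
        ET.fromstring("<wrapper>" + replacement_text + "</wrapper>")
        return True
    except ET.ParseError:
        return False


# Usage: the literal is read without looking at the markup inside it;
# the balance check is deferred until the entity is actually referenced.
declaration = '<!ENTITY foo "<foo></foo>">'
literal, _ = read_literal(declaration, declaration.index('"'))
balanced = is_well_formed_when_referenced(literal)
```

Splitting the work this way mirrors the two caveats: the reference scan happens when the declaration is read, while the markup check waits for a reference such as &foo;.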