On Wed, 28 Aug 2013 10:23:59 -0700, Lauren Wood <lauren@t...> wrote:

>The hard part with these fixes is knowing when to stop. The law of
>diminishing returns kicks in fairly quickly on error conditions,

Yes, it does. But not as quickly as the Draconian error handling that is the practice for XML, as opposed to MicroXML:

https://dvcs.w3.org/hg/microxml/raw-file/tip/spec/microxml.html#characters

  4.2 Parser Conformance
  ...
  A MicroXML parser MAY perform error correction, by providing an
  abstract data model even for sequences of bytes that are not
  conforming MicroXML documents. It MUST, however, still comply with
  the requirement of the first paragraph to report that the sequence
  of bytes is not a conforming MicroXML document.

We find that the ability to report and continue, rather than fall over dead at the first error, saves us a lot of time. When we get all the errors reported on the first pass of a new doc, we can correct them all at once, instead of one at a time with reprocessing in between.

>especially when the schema isn't constrained. For example, it's much
>easier to correctly correct the missing end tags when the schema is
>constrained (e.g., you at least know which elements are meant to be
>empty, and which not).

We have two layers here. The parser just creates a data model, so it is concerned only with the text being well-formed MicroXML. Then the processor sees whether it is also sensible.

If for example the slash is omitted at the end of an empty-element tag, the parser would treat it as a start tag, and provide an end tag before the next end tag it saw. That would put the intervening text inside the empty element, where it would probably not be sensible. But the processor, which knows what elements should be empty based on their properties, would complain too. With the well-formedness error from the XML parser, immediately followed by the "Non-text element <data> has text:" complaint from the processor, the writer has a very good idea of what happened.
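To make the two layers concrete, here is a minimal Python sketch of the report-and-continue idea. It is an illustration only, not the DITA2Go code and not a conforming MicroXML parser: it handles a toy tag syntax with no attributes, and the names `parse`, `check`, `Element`, and the `empty_names` set are all hypothetical. The parser supplies a missing end tag and logs the error; the processor then flags text inside an element that is supposed to be empty.

```python
import re
from dataclasses import dataclass, field

# Toy syntax (an assumption): <name>, </name>, <name/>, or a run of text.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)\s*(/?)>|([^<]+)")

@dataclass
class Element:
    name: str
    line: int
    children: list = field(default_factory=list)  # Element or str items

def parse(text):
    """Layer 1: build a tree, collecting every error instead of stopping."""
    errors = []
    root = Element("#root", 1)
    stack = [root]
    pos = 0
    while pos < len(text):
        line = text.count("\n", 0, pos) + 1
        m = TOKEN.match(text, pos)
        if m is None:
            # A "<" that does not start a tag: report it and keep it as text.
            errors.append((line, "stray '<' treated as text"))
            stack[-1].children.append("<")
            pos += 1
            continue
        pos = m.end()
        closing, name, selfclose, chars = m.groups()
        if chars is not None:
            stack[-1].children.append(chars)
        elif closing:
            if name not in [e.name for e in stack[1:]]:
                errors.append((line, f"end tag </{name}> has no start tag; ignored"))
                continue
            # Supply end tags for anything still open inside <name>.
            while stack[-1].name != name:
                unclosed = stack.pop()
                errors.append((line, f"missing end tag for <{unclosed.name}> "
                                     f"(opened on line {unclosed.line}); "
                                     f"supplied before </{name}>"))
            stack.pop()
        else:
            el = Element(name, line)
            stack[-1].children.append(el)
            if not selfclose:
                stack.append(el)
    last_line = text.count("\n") + 1
    for unclosed in reversed(stack[1:]):
        errors.append((last_line, f"missing end tag for <{unclosed.name}> "
                                  f"(opened on line {unclosed.line}); "
                                  f"supplied at end of input"))
    return root, errors

def check(node, empty_names, errors):
    """Layer 2: complain about text inside elements that should be empty."""
    for child in node.children:
        if isinstance(child, Element):
            if child.name in empty_names:
                text = "".join(c for c in child.children if isinstance(c, str)).strip()
                if text:
                    errors.append((child.line,
                                   f"Non-text element <{child.name}> has text: {text!r}"))
            check(child, empty_names, errors)

# The writer meant <data/> but left off the slash.
source = "<doc>\n<title>Demo</title>\n<data>hello\n</doc>\n"
root, errors = parse(source)
check(root, {"data"}, errors)
for line, message in sorted(errors):
    print(f"line {line}: {message}")
```

Both complaints come out of a single pass, each with a source line number, so the missing slash can be found and fixed without a second run in between.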
>In my experience if your parser makes the wrong
>choice and therefore 'corrects' the wrong thing, or corrects it in the
>wrong way, the resulting mess can be difficult to fix properly. Of
>course, depending on your downstream processing, that may or may not matter.

If you get an error from parser or processor, it is best to act on it immediately and fix the source of the problem. Since we are talking about processing times that are typically seconds, not even minutes, what happened downstream doesn't really matter.

For example, the module that reads the MicroXML produces a parse tree as a file that the next module, that writes the HTML or Word file, digests. If the XML reader reports a problem, you don't even bother looking at the output the second module made five seconds later. And the error report is hard to overlook; we put it up in your editor right in front of you at the end of processing. With source line numbers. ;-)

-- Jeremy H. Griffith <jeremy@o...>
   DITA2Go site: http://www.dita2go.com/