On 12/9/13 9:39 PM, Amelia A Lewis wrote:

Yes, you should. The shape of the world does actually differ for many cases.

I should probably avoid this argument. *sigh*

No, you can't do any of those things. However, there is an enormous class of tasks for which those issues simply do not matter.

On Mon, 09 Dec 2013 17:08:14 -0500, Simon St.Laurent wrote:

Yes, it's true that writing applications that apply regular expressions or other text processing to "complete" XML can be dangerous. That doesn't mean that people doing that are stupid or poorly trained, however, and neither does it mean that they haven't tried their local XML toolsets first and found them wanting.

Simon, I'm afraid that I have to differ with you. Anyone who uses regular expressions for a grammar that relies extensively on parity is either stupid or poorly trained. Sure, you can do text processing (== processing of element names, attribute names, attribute values, and text node contents (without distinguishing reliably between them)) using regular expressions. You can't reliably establish XML structure, because the syntax of XML is specified by a grammar that cannot be handled by a finite automaton, that is not a regular grammar.

They arise for two reasons:

1) People are performing tasks that are simple enough that even those drawbacks will not get them in trouble.
2) People are applying these tools to subsets of XML for which these issues are unlikely to apply.

Yes, in general, XML is capable of infinite headache-inducement for those foolish enough to approach it with regexes or pretty much any tools that were not written as XML parsers. My time spent writing a markup parser taught me many of them. However, the subset of cases is common enough that condescension is foolish.

Yes. But vast quantities of processing work in contexts where there are no screws involved, just nails. That is even true for... gasp... markup. (And yeah, natural language processing is hard. That's not a surprise.)

Using regular expressions to handle XML (except in specialized circumstances, possibly including "s/Soviet Union/Russian Federation/g", but almost certainly not including "s/soviet/russian/gi" because the latter (apart from demonstrating a lamentable historical illiteracy (speaking as a formally-trained historian of the Soviet Union, once upon a time)) is too apt to change attribute or element names) is, to follow the pattern of analogy common in recent threads, about the equivalent of handing a carpenter framing lumber and screws and watching him whip out his ... hammer. A carpenter who does so (except in specialized circumstances) is aptly regarded as stupid or poorly trained (generically: not competent to handle the problem). More to the point, the structure such a carpenter creates is going to *fail*, which means it is appropriate for other carpenters to say "that ain't right."

I get that people on an XML list freak out when people don't follow all the rules we think we've established. We need to find a better way to handle our freaking out than sputtering about "either stupid or poorly trained" people who "don’t have the inclination, patience or capability to fully understand your language of choice." It makes us look bad, not them. It hurts our cause(s), and doesn't help theirs.

That attitude is exactly why I've largely given up speaking about XML to broader audiences and retreated to "markup". It doesn't carry the elitist baggage or visions of infinite complexity.

Thanks,
--
Simon St.Laurent
http://simonstl.com/
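
The "s/soviet/russian/gi" case discussed in the thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample document `DOC` and the helper names `regex_replace` and `text_only_replace` are invented here and do not come from either poster. The first helper substitutes blindly over the raw markup; the second parses the document with the standard-library `xml.etree.ElementTree` and substitutes only inside text content.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = '<soviet era="soviet"><note>The Soviet Union and soviet policy.</note></soviet>'

PATTERN = re.compile(r"soviet", re.IGNORECASE)


def regex_replace(xml_text):
    """Blind, case-insensitive substitution over the raw markup."""
    return PATTERN.sub("russian", xml_text)


def text_only_replace(xml_text):
    """Parse first, then substitute only inside text content."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # .text is the content before the first child; .tail is the
        # content that follows the element's end tag.
        if elem.text:
            elem.text = PATTERN.sub("russian", elem.text)
        if elem.tail:
            elem.tail = PATTERN.sub("russian", elem.tail)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(regex_replace(DOC))      # element and attribute names change too
    print(text_only_replace(DOC))  # markup is left alone
```

On a document whose element and attribute names never match the pattern, the two helpers should agree, which is roughly the "nails, not screws" subset the reply describes.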