
  • To: xml-dev@l...
  • Subject: RE: If XML is too hard for a programmer, perhaps he'd be better off as a crossing guard
  • From: Sean McGrath <sean.mcgrath@p...>
  • Date: Sat, 29 Mar 2003 06:06:33 +0000

[Jeff Lowery]
 >This thread sounds more like an argument for full-document validation prior
 >to processing, or at the very least making sure you've done document version
 >checking.  Once that happens, regexing should be fine (assuming the
 >programmer understands the schema or specification the version info is based
 >on).

Not so. A valid document can be, lexically speaking, from outer space. You
*do* need to worry if regexps are being used, even for valid documents.

Example Pulse:

<!DOCTYPE pulse [
<!ELEMENT pulse (#PCDATA)>
<!ENTITY LetsHaveOneOfThese "2">
]>
<pulse      >
&#55;<!-- Ode to the lump of green putty, I found in my armpit, one 
mid-summer morning.-->
<![CDATA[ 0]]>
&LetsHaveOneOfThese;
</pulse
 >
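
To make the point concrete, here is a minimal sketch in Python. The pattern name NAIVE_PATTERN and the pattern itself are assumptions standing in for the kind of regexp a programmer might write against "clean" pulse documents; they are not from any real codebase. The comment inside the document is shortened. The regexp finds nothing, while a parser resolves the character reference, the CDATA section and the entity into plain text.

```python
import re
import xml.etree.ElementTree as ET

# The pulse document from the message, with the comment shortened.
DOC = """<!DOCTYPE pulse [
<!ELEMENT pulse (#PCDATA)>
<!ENTITY LetsHaveOneOfThese "2">
]>
<pulse      >
&#55;<!-- Ode to the lump of green putty -->
<![CDATA[ 0]]>
&LetsHaveOneOfThese;
</pulse
 >"""

# Hypothetical naive pattern: assumes tight tags and literal digits.
NAIVE_PATTERN = re.compile(r"<pulse>\s*(\d+)\s*</pulse>")
naive_match = NAIVE_PATTERN.search(DOC)

# A real parser expands the character reference, the CDATA section and
# the internal entity, and drops the comment, before returning the text.
parsed_text = ET.fromstring(DOC).text
digits = parsed_text.split()

print("regexp match:", naive_match)
print("parser text pieces:", digits)
```

The document is valid against its own DTD, so validation alone would not have warned the regexp author about any of this.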

What I'd love to see would be an XML Lint tool: a parser with the ability
to produce messages on stderr if it sees stuff in the XML that could trip
up regexp processing.

Perl types could then knock themselves out with a beautifully crafted
McCarthy conditional as a guard command to their regexp :-)
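
A rough sketch of what such a lint tool might look like, using Python's xml.parsers.expat. The function name lint_for_regexp_hazards and the choice of what counts as a hazard are assumptions made for illustration. It assumes the input is UTF-8 text, and it does not report character references such as &#55;, which a fuller tool would also need to catch.

```python
import sys
import xml.parsers.expat


def lint_for_regexp_hazards(xml_text):
    """Return warnings for constructs a naive regexp is likely to miss."""
    data = xml_text.encode("utf-8")  # byte offsets below index into this
    parser = xml.parsers.expat.ParserCreate("UTF-8")
    warnings = []

    def warn(message):
        warnings.append("line %d: %s" % (parser.CurrentLineNumber, message))

    def on_start(name, attrs):
        # An attribute-free tag should read "<name>" or "<name/>".
        after = parser.CurrentByteIndex + 1 + len(name.encode("utf-8"))
        if not attrs and data[after:after + 1] not in (b">", b"/"):
            warn("whitespace inside start tag <%s>" % name)

    def on_end(name):
        # Only look at explicit end tags, not empty-element tags.
        start = parser.CurrentByteIndex
        after = start + 2 + len(name.encode("utf-8"))
        if data[start:start + 2] == b"</" and data[after:after + 1] != b">":
            warn("whitespace inside end tag </%s>" % name)

    def on_entity_decl(name, is_parameter, value, base, system_id,
                       public_id, notation):
        warn("entity declared: &%s; may appear in content" % name)

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CommentHandler = lambda text: warn("comment in document")
    parser.StartCdataSectionHandler = lambda: warn("CDATA section")
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(data, True)
    return warnings


# Usage: lint a small sample and report on stderr.
SAMPLE = """<!DOCTYPE pulse [
<!ENTITY LetsHaveOneOfThese "2">
]>
<pulse      >
<!-- a comment --><![CDATA[ 0]]>&LetsHaveOneOfThese;
</pulse
 >"""

for warning in lint_for_regexp_hazards(SAMPLE):
    print(warning, file=sys.stderr)
```

An empty result could then serve as the guard: run the regexp only when the lint pass reports nothing, and fall back to a real parser otherwise.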

regards,
Sean

http://seanmcgrath.blogspot.com
