Re: regular expressions


To: xml-dev@l...
Subject: Re: regular expressions
From: Bob Foster <bob@o...>
Date: Thu, 29 Jan 2004 19:38:01 -0600
Cc: David Tolpin <dvd@d...>
In-reply-to: <4019A6ED.3000803@o...>
References: <200401292002.i0TK2kHX080757@a...> <4019A6ED.3000803@o...>
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007

David Tolpin wrote:

 >>>    s-pattern="""
 >>>      comment = "\(([^\(\)\\]|\\.)*\)"
 >>>      atom = "[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+"
 >>>      atoms = atom "(\." atom ")*"
 >>>      [...]
 >>>
 >>>Why isn't it done?
 >>
 >>
 >>HyLex used a similar syntax for regular expressions.
 >>I've always wondered why the idea never caught on elsewhere.
 >>(Then again, none of the ideas from HyTime ever really
 >>caught on...)
 >
 >
 > In fact, I've implemented it in an extension datatype library for my
 > Relax NG validator; it is only 70 lines of code in Scheme, after all. Proved
 > to be very useful for debugging.

Very clever. But a naive implementation would just recursively 
concatenate the strings to make a single regex string. Could you 
elaborate on the debugging advantage, i.e., how it makes it easier for a 
schema writer to debug regular expressions?
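
To make the "naive implementation" concrete, here is a rough sketch in 
Python. It is not David's Scheme code, which isn't shown in this thread. 
The RULES table and the expand() helper are hypothetical names, and 
wrapping each referenced rule in a non-capturing group is my own 
assumption, made so that a substituted rule keeps its precedence.

```python
import re

# Hypothetical in-memory form of the quoted rules. Each rule is a list of
# parts: ("lit", text) is a raw regex fragment, ("ref", name) names a rule.
RULES = {
    "comment": [("lit", r"\(([^\(\)\\]|\\.)*\)")],
    "atom": [("lit", r"[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+")],
    "atoms": [("ref", "atom"), ("lit", r"(\."), ("ref", "atom"), ("lit", r")*")],
}


def expand(name, rules, active=()):
    """Recursively concatenate a rule's parts into one regex string."""
    if name in active:
        raise ValueError(f"rule refers to itself: {name}")
    pieces = []
    for kind, value in rules[name]:
        if kind == "lit":
            pieces.append(value)
        else:
            # Assumption: group the substituted rule so that a following
            # quantifier or alternation applies to the whole rule.
            pieces.append("(?:" + expand(value, rules, active + (name,)) + ")")
    return "".join(pieces)


ATOMS = re.compile(expand("atoms", RULES))
COMMENT = re.compile(expand("comment", RULES))
```

With this, ATOMS.fullmatch("David.Tolpin") succeeds and 
ATOMS.fullmatch("David..Tolpin") does not, but all the rule names are 
gone by the time the regex engine sees the pattern.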

Jeni Tennison used the same idea with a slightly different syntax in her 
DTLL proposal (I've lost the URL). Her idea had the added twist that an 
application could receive the results of the regular expression parse as 
a structured result, e.g., through a SAX API. Thus, using your example, 
the string "(David Tolpen)David.Tolpin@n..." might produce the 
'infoset':

<start>
   <comment>(David Tolpen)</comment>
   <local-part>
     <atoms>
       <atom>David</atom>.<atom>Tolpin</atom>
     </atoms>
    </local-part>@<domain>
     <atoms>
       <atom>nospam</atom>.<atom>net</atom>
     </atoms>
    </domain>
</start>
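
A minimal sketch of how such a structured result could be produced, 
again in Python. This is not DTLL syntax and not a SAX API; it just 
returns a flat XML string. The start, local-part and domain rules are 
assumptions inferred from the element names above (they fall in the 
elided "[...]" part of the quoted pattern), the address uses the domain 
shown in the infoset, and GRAMMAR, match_seq() and parse() are 
hypothetical names. Matching is greedy with no backtracking.

```python
import re
from xml.sax.saxutils import escape

ATOM = r"[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+"
COMMENT = r"\(([^\(\)\\]|\\.)*\)"

# Items: ("re", pattern), ("ref", rule), ("opt", items), ("star", items).
# Only "comment", "atom" and "atoms" come from the quoted pattern; the
# other three rules are assumed for illustration.
GRAMMAR = {
    "start": [("opt", [("ref", "comment")]), ("ref", "local-part"),
              ("re", "@"), ("ref", "domain")],
    "comment": [("re", COMMENT)],
    "local-part": [("ref", "atoms")],
    "domain": [("ref", "atoms")],
    "atoms": [("ref", "atom"), ("star", [("re", r"\."), ("ref", "atom")])],
    "atom": [("re", ATOM)],
}


def match_seq(items, text, pos):
    """Match items in order from pos; return (new_pos, xml) or None."""
    out = []
    for kind, arg in items:
        if kind == "re":
            m = re.compile(arg).match(text, pos)
            if m is None:
                return None
            out.append(escape(m.group(0)))
            pos = m.end()
        elif kind == "ref":
            # A rule reference becomes an element named after the rule.
            result = match_seq(GRAMMAR[arg], text, pos)
            if result is None:
                return None
            pos, inner = result
            out.append(f"<{arg}>{inner}</{arg}>")
        elif kind == "opt":
            result = match_seq(arg, text, pos)
            if result is not None:
                pos, inner = result
                out.append(inner)
        else:  # "star": repeat while the body matches and consumes input
            while True:
                result = match_seq(arg, text, pos)
                if result is None or result[0] == pos:
                    break
                pos, inner = result
                out.append(inner)
    return pos, "".join(out)


def parse(text):
    """Return the XML string for text, or None unless all of it matches."""
    result = match_seq([("ref", "start")], text, 0)
    if result is None or result[0] != len(text):
        return None
    return result[1]
```

Calling parse("(David Tolpen)David.Tolpin@nospam.net") yields the same 
elements as above, without the indentation. Emitting SAX events instead 
of string pieces at the same points would give the streaming form.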

This still seems a fruitful avenue to explore.

Bob Foster
http://xmlbuddy.com/
