[Haskell-cafe] Re: Interest in helping w/ Haskell standard

Wolfgang Jeltsch wolfgang at jeltsch.net
Sat Oct 15 06:51:24 EDT 2005


On Friday, 14 October 2005 at 16:25, you wrote:
> On Fri, Oct 14, 2005 at 04:20:24PM +0200, Wolfgang Jeltsch
> <wolfgang at jeltsch.net> wrote:
>
> > I always couldn't understand why one has to write regular
> > expressions as strings
>
> Because the language used inside these strings is standard,
> multi-language, widely used and documented?

Well, in my opinion, the standard regexp syntax is rather awkward, so
diverging from the standard might be a good thing.  However, my proposal was
not about introducing a new syntax.  If I had just wanted a different syntax, I
would have used strings for representing regexps as well.  My main point is
not to use strings for representing regexps at runtime, because this means that
parsing is done at runtime.  This might result in a loss of efficiency.  In
addition, no syntax checks can be done at compile time.  The situation gets
worse if you try to manipulate regular expressions.

Now let's consider using an algebraic datatype for regexps:

	data RegExp
		= Empty | Single Char | RegExp :+: RegExp | RegExp :|: RegExp | Iter RegExp

Manipulating regular expressions now becomes easy and safe: you simply cannot
create "syntactically incorrect regular expressions", since at runtime you
don't deal with syntax at all.
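To illustrate what manipulating such values could look like, here is a minimal sketch.  The datatype is the one given above; the fixity declarations, the combinators oneOf and oneOrMore, and the naive backtracking matcher (residuals, matches) are my own assumptions for illustration, not part of any existing library.

```haskell
-- Assumed fixities: sequencing binds tighter than alternation.
infixr 5 :+:
infixr 4 :|:

data RegExp
    = Empty               -- matches only the empty string
    | Single Char         -- matches exactly this character
    | RegExp :+: RegExp   -- sequence
    | RegExp :|: RegExp   -- alternative
    | Iter RegExp         -- zero or more repetitions
    deriving (Eq, Show)

-- Hypothetical combinators: regexps are built by ordinary functions,
-- so there is no string syntax that could be wrong.
oneOf :: [Char] -> RegExp
oneOf = foldr1 (:|:) . map Single   -- assumes a non-empty list

oneOrMore :: RegExp -> RegExp
oneOrMore r = r :+: Iter r

-- All remainders of the input after the regexp has matched a prefix.
residuals :: RegExp -> String -> [String]
residuals Empty      s = [s]
residuals (Single c) (x : xs) | c == x = [xs]
residuals (Single _) _ = []
residuals (r :+: q)  s = concatMap (residuals q) (residuals r s)
residuals (r :|: q)  s = residuals r s ++ residuals q s
residuals (Iter r)   s =
    -- Only recurse when r consumed input, so the matcher terminates.
    s : [ s'' | s' <- residuals r s
              , length s' < length s
              , s'' <- residuals (Iter r) s' ]

-- A regexp matches a string if some remainder is empty.
matches :: RegExp -> String -> Bool
matches r = any null . residuals r

number :: RegExp
number = oneOrMore (oneOf ['0' .. '9'])
```

Loaded into GHCi, matches number "2005" evaluates to True and matches number "20x5" to False.  The matcher is deliberately simple and can take exponential time; the point is only that number is assembled from plain values and functions.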

In addition, the usage of a special datatype can provide more flexibility.  
Representing regexps as strings means that regexps can only denote sets of 
strings.  In contrast, the above datatype could easily be extended to allow 
arbitrary lists instead of just strings:

	data RegExp token
		= Empty | Single token | RegExp token :+: RegExp token | ...
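As a sketch of what the generalised type buys you, the following assumes that the elided constructors simply mirror those of the first datatype.  The Event type, the session regexp and the matcher are hypothetical and exist only to show a regexp over something other than characters.

```haskell
infixr 5 :+:
infixr 4 :|:

-- Assumption: the "..." above stands for the remaining constructors
-- of the Char-only datatype, generalised to an arbitrary token type.
data RegExp token
    = Empty
    | Single token
    | RegExp token :+: RegExp token
    | RegExp token :|: RegExp token
    | Iter (RegExp token)
    deriving (Eq, Show)

-- All remainders of the input after the regexp has matched a prefix.
-- Only equality on tokens is required.
residuals :: Eq token => RegExp token -> [token] -> [[token]]
residuals Empty      s = [s]
residuals (Single t) (x : xs) | t == x = [xs]
residuals (Single _) _ = []
residuals (r :+: q)  s = concatMap (residuals q) (residuals r s)
residuals (r :|: q)  s = residuals r s ++ residuals q s
residuals (Iter r)   s =
    -- Only recurse when r consumed input, so the matcher terminates.
    s : [ s'' | s' <- residuals r s
              , length s' < length s
              , s'' <- residuals (Iter r) s' ]

matches :: Eq token => RegExp token -> [token] -> Bool
matches r = any null . residuals r

-- Hypothetical token type: events in a file-handling protocol.
data Event = Open | Write | Close
    deriving (Eq, Show)

-- A valid session: one Open, any number of Writes, one Close.
session :: RegExp Event
session = Single Open :+: Iter (Single Write) :+: Single Close
```

Here matches session [Open, Write, Write, Close] evaluates to True, while matches session [Open, Close, Write] evaluates to False.  A string-based representation has no obvious way to express session at all.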

If you really need a Perl-like syntax for regular expressions, the strings 
representing the regexps should be parsed at compile-time and transformed 
into expressions of a special regexp datatype like the one above.

However, I don't like the idea of extending the language with a special regexp
syntax.  Why treat a specific, albeit common, syntax for a special case of
regexps (string-only) specially?  What about things other than regexps?  Should
they also get a language extension?

I'd say that the better way would be to use Template Haskell for this purpose:

	myRegExp = $(regExp "[a-z0-9]")

This way, special syntaxes are not hard-wired into the language but can be 
activated by importing a corresponding module.
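A rough sketch of how such a regExp function might be written follows.  Everything here is an assumption made for illustration: the module name RegExpTH, the helper parseRegExp, and the tiny syntax subset it accepts (literal characters, classes such as [a-z0-9], and a postfix '*').  It relies on the DeriveLift extension to turn the parsed value into a Template Haskell expression.

```haskell
{-# LANGUAGE TemplateHaskell, DeriveLift #-}
module RegExpTH (RegExp (..), regExp, parseRegExp) where

import Language.Haskell.TH (Exp, Q)
import Language.Haskell.TH.Syntax (Lift)

infixr 5 :+:
infixr 4 :|:

data RegExp
    = Empty | Single Char | RegExp :+: RegExp | RegExp :|: RegExp | Iter RegExp
    deriving (Eq, Show, Lift)

-- Runs at compile time inside a splice: a malformed regexp becomes a
-- compile error, and the parsed value is embedded in the program.
regExp :: String -> Q Exp
regExp src = case parseRegExp src of
    Just r  -> [| r |]
    Nothing -> fail ("malformed regular expression: " ++ show src)

-- Parser for the assumed subset; Nothing signals a syntax error.
parseRegExp :: String -> Maybe RegExp
parseRegExp = fmap (foldr (:+:) Empty) . items

-- A sequence of atoms, each optionally followed by '*'.
items :: String -> Maybe [RegExp]
items [] = Just []
items s  = do
    (atom, rest) <- parseAtom s
    case rest of
        '*' : rest' -> (Iter atom :) <$> items rest'
        _           -> (atom :) <$> items rest

-- One atom: a character class or a single literal character.
parseAtom :: String -> Maybe (RegExp, String)
parseAtom ('[' : s) =
    case break (== ']') s of
        (body, ']' : rest) -> do
            cs <- classChars body
            if null cs then Nothing
                       else Just (foldr1 (:|:) (map Single cs), rest)
        _ -> Nothing                      -- missing closing bracket
parseAtom (c : s)
    | c `elem` "[]*" = Nothing            -- stray metacharacter
    | otherwise      = Just (Single c, s)
parseAtom [] = Nothing

-- Expand the body of a class, e.g. "a-c9" to "abc9".
classChars :: String -> Maybe [Char]
classChars (a : '-' : b : s)
    | a <= b    = ([a .. b] ++) <$> classChars s
    | otherwise = Nothing                 -- reversed range
classChars (c : s) = (c :) <$> classChars s
classChars []      = Just []
```

Because of Template Haskell's stage restriction, the splice has to live in a different module from regExp: a module that enables TemplateHaskell and imports RegExpTH can then write myRegExp = $(regExp "[a-z0-9]") as above, and something like $(regExp "[a-") would be rejected when that module is compiled.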

Best wishes,
Wolfgang

