[Haskell] regular expression syntax - perl ain't got nothin on haskell

Ganesh Sittampalam ganesh at earth.li
Mon Mar 8 18:25:24 EST 2004


On Tue, 24 Feb 2004 07:18:58 -0800 (PST), Hal Daume III <hdaume at ISI.EDU>
wrote:

>just as another sample point...
>
>i write 99% of my code in either haskell or perl.  haskell tends to be for 
>the longer programs, perl tends to be for the shorter ones, though the 
>decision is primarily made for only one reason:
>
>  - if the overhead to write the string processing code in haskell
>    is outweighed by the overall length of the program, use haskell.
>    otherwise, use perl.
>
>i would be very very happy to abandon perl altogether, but, for the most 
>part, this isn't a niche haskell has been able to fit well into yet.

Another sample point:

I hacked together a perl script to do a particular task in about 30 minutes,
including fixing algorithmic issues with the problem I wanted to solve.

I then thought I'd try porting it to Haskell; I started out by doing the
really dumb conversion of mutable variables to IORefs, hashes to FiniteMaps,
and Perl regular expressions to Text.Regex (i.e. POSIX extended regexps). I'd
forgotten about this thread at the time, otherwise I might have tried one of
the cleverer options.
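
For readers who haven't seen this style, here is a minimal sketch of what
such a direct translation tends to look like. It is not the script from
this post: the word-counting task and the name countWordsIO are made up for
illustration, and Data.Map stands in for the older FiniteMap.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)
import qualified Data.Map as Map

-- Perl original (illustrative):  my %count; $count{$_}++ for @words;
-- Direct translation: the hash becomes a Map held in an IORef.
countWordsIO :: [String] -> IO (Map.Map String Int)
countWordsIO ws = do
  ref <- newIORef Map.empty                       -- the "mutable hash"
  mapM_ (\w -> modifyIORef' ref (Map.insertWith (+) w 1)) ws
  readIORef ref                                   -- explicit read at the end

main :: IO ()
main = do
  counts <- countWordsIO (words "a b a c b a")
  print (Map.toList counts)   -- prints [("a",3),("b",2),("c",1)]
```

Every access to the "hash" goes through newIORef, modifyIORef' or
readIORef, which is where the extra noise relative to Perl comes from.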

Some observations:

(1) It took me several hours to get it working, mostly because debugging
was difficult. First I got an unhelpful type error message from GHC, and
then had trouble making the code I had developed with GHC 5 work with GHC 6
so that I could show someone else the problem. Next, a syntax error in one
of my regular expressions led to a run-time error with no information about
which regular expression was at fault or where in it the error was. Finally,
debugging semantic problems with the regular expressions wasn't very
pleasant.
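
One way to make the anonymous run-time error in (1) easier to track down is
to force each pattern to compile up front and tag any failure with a name.
The sketch below only shows the shape of that idea. compileToy is a
hypothetical stand-in for a real regex compiler (it merely checks that
parentheses balance), and compileLabelled is likewise an invented name, not
part of Text.Regex.

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)

-- Hypothetical stand-in for a regex compiler that throws on a bad pattern.
-- It accepts any pattern whose parentheses are balanced.
compileToy :: String -> String
compileToy pat
  | balanced 0 pat = pat
  | otherwise      = error "unbalanced parentheses"
  where
    balanced :: Int -> String -> Bool
    balanced n []         = n == 0
    balanced n ('(' : cs) = balanced (n + 1) cs
    balanced n (')' : cs) = n > 0 && balanced (n - 1) cs
    balanced n (_ : cs)   = balanced n cs

-- Force compilation now, and attach a label and the pattern to any failure.
compileLabelled :: String -> String -> IO (Either String String)
compileLabelled label pat = do
  r <- try (evaluate (compileToy pat))
  return $ case r of
    Right re             -> Right re
    Left (ErrorCall msg) ->
      Left (label ++ ": bad pattern " ++ show pat ++ ": " ++ msg)

main :: IO ()
main = do
  ok  <- compileLabelled "dateRe" "(a|b)c"   -- compiles
  bad <- compileLabelled "nameRe" "(a|b"     -- fails, and says which one
  print ok
  print bad
```

Swapping compileToy for a real compile function should work the same way,
provided that function reports bad patterns by throwing an ErrorCall. That
is an assumption to check against the library in use.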

(2) The code took three times as long to run. Profiling suggests that the time
is being wasted in the regexp matches; quite possibly the main cost is in
marshalling Haskell strings to C strings. The comments in Text.Regex.Posix
suggest a PackedString interface should be provided; I should try making one
and seeing if things are better.
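
To illustrate the marshalling cost guessed at in (2): a Haskell String is a
linked list of characters, so each call into C has to copy it into a fresh
buffer first. The sketch below uses C's strlen as a stand-in for a regex
match (an assumption for illustration only, as are the names lenEachTime
and lenMarshalOnce) and contrasts copying on every call with copying once
and reusing the buffer, which is roughly what a packed-string interface
would allow.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- strlen stands in for "some C function that reads a string".
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Copies the String into a new C buffer on each of the n calls.
lenEachTime :: String -> Int -> IO Int
lenEachTime s n = do
  rs <- mapM (\_ -> withCString s c_strlen) [1 .. n]
  return (sum (map fromIntegral rs))

-- Copies the String once and reuses the same buffer for all n calls.
lenMarshalOnce :: String -> Int -> IO Int
lenMarshalOnce s n = withCString s $ \cs -> do
  rs <- mapM (\_ -> c_strlen cs) [1 .. n]
  return (sum (map fromIntegral rs))

main :: IO ()
main = do
  a <- lenEachTime "hello world" 1000
  b <- lenMarshalOnce "hello world" 1000
  print (a, b)   -- same totals; only the amount of copying differs
```

Whether this accounts for the slowdown in the ported script would need
measuring. The profile only points at the regexp matches as a whole.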

(3) The code was twice as long. Mostly this was for obvious reasons; the
translation of mutable variables to IORefs leads to some overhead in reading
from them, and perl has nice syntax for manipulating hashes.
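
As a rough sketch of how some of the length in (3) can be recovered, an
accumulating loop over a mutable hash can often be written as a pure fold,
with no IORef to create, update or read back. The word-counting task and
the name countWords are made up for illustration, and Data.Map stands in
for FiniteMap.

```haskell
import Data.List (foldl')
import qualified Data.Map as Map

-- Perl (illustrative):  $count{$_}++ for @words;
-- Pure version: thread the map through a strict left fold.
countWords :: [String] -> Map.Map String Int
countWords = foldl' (\m w -> Map.insertWith (+) w 1 m) Map.empty

main :: IO ()
main = print (Map.toList (countWords (words "to be or not to be")))
-- prints [("be",2),("not",1),("or",1),("to",2)]
```

This only helps where the mutation is a simple accumulation. State that is
updated from many places in a script may still be easier to port with
IORefs first.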

I don't really have any point, except that it would be nice if it hadn't
turned out that Perl was clearly the better choice :-/

Ganesh

