replacing guile with haskell?

Thu Oct 23 19:10:35 EDT 2003

At 20:40 21/10/03 -0400, ajb at spamcop.net wrote:
>I've done it for regular expressions (e.g. lex, alex etc), but not for
>regexps (e.g. Text.Regex).  This particular terminology overload annoys
>me no end, by the way.
>
>I checked the code into haskell-libs.  It hasn't propagated through to
>the web view of the repository yet, but you should eventually be able
>to find it here:
>
>     http://cvs.sourceforge.net/viewcvs.py/haskell-libs/libs/text/
>
>Look for Dfa.lhs.

Thanks for this... it turns out I have an immediate use for it.  I've 
downloaded it but I'm having a little trouble figuring out how to drive 
it.  Is there any description anywhere?

Absent that, this is what I am figuring (am I getting this right?):

+ The supplied regex is a value of type Re t, which describes an expression 
to match a sequence of tokens of type t.  In common use, t might be Char.

+ The constructors for Re are:
     ReOr [Re t]    -- matches any one of the Re's in the list
   | ReCat [Re t]   -- matches a sequence of the RE's in the list
   | ReStar (Re t)  -- matches zero or more of the given Re
   | RePlus (Re t)  -- matches one or more of the given Re
   | ReOpt (Re t)   -- matches zero or one of the given Re
   | ReTerm [t]     -- matches a sequence of tokens exactly matching
                       the given list.

That last is very much guesswork:  is it true that
     ReTerm ts = ReCat $ map (\t ReTerm [t]) ts
?

A function to construct an (Re Char) from a simple textual representation 
would be handy.  Maybe I'll tackle that.

...

Having got an Re value, matchRe is a function that applies it (after 
compilation) to a sequence of 't', returning True if the expression is 
matched, otherwise False.

Is this about right?

There's another function matchRe2, which seems to do something recursive 
with the regular expression but I can't figure out what.  Is it just a 
different implementation strategy?  It does seem to give the same answers.

...

I also noticed this comment in the code:
[[
Utility typeclasses for enforcing all the constraints we need
on our monad's free type variables.  Note that this requires both
-fglasgow-exts and -fallow-undecidable-instances in GHC to work
properly.
]]

I was wondering if the use of -fallow-undecidable-instances might cause 
problems with Hugs, though it does seem to work.

I'm also wondering if there's a way to use this to match a leading 
subsequence (rather than the entire sequence of tokens supplied), and to 
discover which parts of the regexp have been matched.

...

[later]

I've also looked at the Haskell Dynamic Lexer Engine
(http://www.nondot.org/sabre/Projects/HaskellLexer/) that Derek pointed out 
... it looks a closer fit to my current goals, though I do like the 
"purity" of your approach (i.e. its focus on the core engine).

#g

------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact