[Haskell-cafe] Efficient parallel regular expressions

Krasimir Angelov kr.angelov at gmail.com
Wed Nov 5 03:35:16 EST 2008

Hi Martijn,

If you are brave enough to implement a DFA with all the required
optimisations, then you might want to look at:


This is a compiler for a language called JAPE. In the language you
define a set of rules where the left-hand side
is a regular expression and the right-hand side is Java code. The
compiler itself is implemented in Haskell.
It includes code to build a DFA from the set of regexps, including the
determinization and minimization steps.
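
To make the idea concrete, here is a minimal sketch of how several regexps can be merged into one DFA while still remembering which rule matched. It is not the code from the JAPE compiler; every name in it (Regex, build, combine, determinize, matching, firstMatch, exampleRules) is made up for illustration, and it assumes each rule must match the whole input line. Each rule's final NFA state is tagged with the rule's index, and a DFA state accepts the union of the tags of the NFA states it contains. Minimization is left out to keep it short.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- A tiny regex AST (hypothetical; a real client would parse regex syntax).
data Regex = Chr Char | Seq Regex Regex | Alt Regex Regex | Star Regex

type State = Int
type Edge  = (State, Maybe Char, State)   -- Nothing marks an epsilon move

-- build r s n: NFA fragment for r starting at state s, using fresh ids
-- from n upwards. Returns (edges, end state, next fresh id).
build :: Regex -> State -> Int -> ([Edge], State, Int)
build (Chr c)   s n = ([(s, Just c, n)], n, n + 1)
build (Seq a b) s n =
  let (ea, m, n1) = build a s n
      (eb, e, n2) = build b m n1
  in (ea ++ eb, e, n2)
build (Alt a b) s n =
  let (ea, xa, n1) = build a s n
      (eb, xb, n2) = build b s n1
  in ((xa, Nothing, n2) : (xb, Nothing, n2) : ea ++ eb, n2, n2 + 1)
build (Star a)  s n =            -- n becomes the loop head
  let (ea, x, n1) = build a n (n + 1)
  in ((s, Nothing, n) : (x, Nothing, n) : ea, n, n1)

data NFA = NFA { nfaEdges :: [Edge], nfaAccept :: M.Map State Int }

-- All rules share start state 0; rule i's end state is tagged with i.
combine :: [Regex] -> NFA
combine rs = go (zip [0 ..] rs) 1 [] M.empty
  where
    go [] _ es acc = NFA es acc
    go ((i, r) : rest) n es acc =
      let (er, end, n') = build r 0 n
      in go rest n' (er ++ es) (M.insert end i acc)

-- Epsilon closure, iterated to a fixed point.
closure :: NFA -> S.Set State -> S.Set State
closure nfa ss =
  let ss' = S.union ss (S.fromList
              [t | (s, Nothing, t) <- nfaEdges nfa, s `S.member` ss])
  in if ss' == ss then ss else closure nfa ss'

step :: NFA -> S.Set State -> Char -> S.Set State
step nfa ss c = closure nfa (S.fromList
  [t | (s, Just c', t) <- nfaEdges nfa, c' == c, s `S.member` ss])

type DState = S.Set State
data DFA = DFA
  { dfaStart :: DState
  , dfaTrans :: M.Map (DState, Char) DState
  , dfaRules :: M.Map DState (S.Set Int)   -- rules accepted in each state
  }

-- Subset construction with a simple worklist.
determinize :: NFA -> DFA
determinize nfa = go [start] (S.singleton start) M.empty
  where
    start    = closure nfa (S.singleton 0)
    alphabet = S.toList (S.fromList [c | (_, Just c, _) <- nfaEdges nfa])
    tags d   = S.fromList
      [i | s <- S.toList d, Just i <- [M.lookup s (nfaAccept nfa)]]
    go [] seen trans =
      DFA start trans (M.fromList [(d, tags d) | d <- S.toList seen])
    go (d : work) seen trans =
      let moves  = [(c, t) | c <- alphabet, let t = step nfa d c, not (S.null t)]
          new    = S.toList (S.fromList
                     [t | (_, t) <- moves, t `S.notMember` seen])
          trans' = foldr (\(c, t) -> M.insert (d, c) t) trans moves
      in go (new ++ work) (foldr S.insert seen new) trans'

-- One pass over the line; returns the indices of all rules matching it.
matching :: DFA -> String -> S.Set Int
matching dfa = go (dfaStart dfa)
  where
    go d []       = M.findWithDefault S.empty d (dfaRules dfa)
    go d (c : cs) = maybe S.empty (`go` cs) (M.lookup (d, c) (dfaTrans dfa))

-- The first (lowest-numbered) matching rule, if any.
firstMatch :: DFA -> String -> Maybe Int
firstMatch dfa = S.lookupMin . matching dfa

-- Example rules: 0 = ab(c*), 1 = (a|b)*, 2 = abc
exampleRules :: [Regex]
exampleRules =
  [ Seq (Seq (Chr 'a') (Chr 'b')) (Star (Chr 'c'))
  , Star (Alt (Chr 'a') (Chr 'b'))
  , Seq (Chr 'a') (Seq (Chr 'b') (Chr 'c'))
  ]
```

With dfa = determinize (combine exampleRules), matching dfa "abc" yields the set {0,2} and firstMatch dfa "ab" yields Just 0, so the index can be used to look up the action to run. The DFA is built once; each input line is then scanned a single time regardless of how many rules there are.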

I wrote the compiler a few years ago. You can take the code and change
it, or reimplement it yourself. A DFA definitely guarantees that
matching time is always linear in the length of the input, while with Parsec you have to be


On Tue, Nov 4, 2008 at 6:05 PM, Martijn van Steenbergen
<martijn at van.steenbergen.nl> wrote:
> Hello all,
>
> For my mud client Yogurt (see hackage) I'm currently working on
> improving the efficiency of the hooks. Right now several hooks, each
> consisting of a regex and an action can be active at the same time.
> Every time a line of input is available (usually several times a second)
> I run the line through all the available regexes and execute the first
> matching action.
>
> I figured this is not the cleverest approach and it'd be better if I
> |'ed all regexes into one big DFA. However, how do I then find out which
> of the original hooks matched and so which action to execute?
>
> As far as I know there's no way to do that with Text.Regex. Alex looks
> promising but is really only an executable and doesn't offer an API.
> I've also found mr. João Saraiva's HaLex but I don't know if that was
> meant to be used seriously.
>
> Does anyone have any experience with this? What's the best way to
> achieve this?
>
> Thanks much,
>
> Martijn.
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
