[Haskell-cafe] Haskell performance when it comes to regex?

Alfredo Di Napoli alfredo.dinapoli at gmail.com
Mon May 22 07:48:08 UTC 2017


Hi Bram,

you might be interested in the “regex” package from my colleague Chris
Dornan:

http://regex.uk/

I know some proper performance work still needs to be done, but I would be
curious to hear your experience report ;)

Alfredo

On 19 May 2017 at 18:52, Bram Neijt <bneijt at gmail.com> wrote:

> Thank you!
>
> I already changed to Text instead, but I thought the regex was already
> memoized by GHC, so that should not be a problem.
>
> I'm trying regex-applicative now, maybe that will help, but it takes
> some time to figure out the syntax. I'll also try to see if
> precompilation helps.
>
> Greetings,
>
> Bram
>
>
>
> On Fri, May 19, 2017 at 1:17 PM, Станислав Черничкин
> <schernichkin at gmail.com> wrote:
> > Try to use Text or ByteString instead of String. Try the compile and
> > execute functions
> > (http://hackage.haskell.org/package/regex-tdfa-1.2.1/docs/Text-Regex-TDFA-ByteString.html),
> > and make sure the regex gets compiled only once.
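
[Editor's sketch, not part of the original message.] The "compile once, execute many times" advice above could look roughly like the following with regex-tdfa's ByteString API. The pattern, the sample lines, and the helper names compileOnce and matchesLine are assumptions made for illustration; they are not taken from hanon's Mapper.hs.

```haskell
module Main (main) where

import qualified Data.ByteString.Char8 as B
import Text.Regex.TDFA (defaultCompOpt, defaultExecOpt)
import Text.Regex.TDFA.ByteString (Regex, compile, execute)

-- Hypothetical helper: compile a pattern with the default options.
-- A Left value carries the parser's error message for a bad pattern.
compileOnce :: B.ByteString -> Either String Regex
compileOnce = compile defaultCompOpt defaultExecOpt

-- Hypothetical helper: run an already compiled Regex against one line.
-- Taking the Regex as an argument means the pattern is never re-parsed here.
matchesLine :: Regex -> B.ByteString -> Bool
matchesLine re line =
  case execute re line of
    Right (Just _) -> True
    _              -> False

main :: IO ()
main =
  case compileOnce (B.pack "[0-9]+") of
    Left err -> putStrLn ("bad pattern: " ++ err)
    Right re -> do
      -- The single compiled Regex is shared across all lines.
      let inputLines = map B.pack ["order 42", "no digits", "id 7"]
      print (map (matchesLine re) inputLines)
```

The point of the design is that the Regex value is built outside the per-line loop and passed in explicitly, so sharing does not depend on whether GHC happens to float a pattern-string conversion out of the loop.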
> >
> > 2017-05-16 12:12 GMT+03:00 Bram Neijt <bneijt at gmail.com>:
> >>
> >> Dear reader,
> >>
> >> I decided to do a little project which is a simple search and replace
> >> program for large text files.
> >>
> >> Written in Haskell, it does a few different regex matches on each line
> >> and stores them in a leveldb key-value store to create a
> >> consistent/reviewable search-replace index. It should provide for some
> >> simple/brute-force anonymization of data and therefore I called it
> >> hanon (sorry, could not think of a better name).
> >>
> >> https://github.com/BigDataRepublic/hanon
> >>
> >> The code works, but I've done some benchmarking to compare it with
> >> Python and the code is about 80x slower than doing the same thing in
> >> Python, making it useless for larger data files.
> >>
> >> I'm obviously doing something wrong.
> >>
> >> Could you give me tips on improving the performance of this code?
> >> Probably mainly looking at
> >>
> >> https://github.com/BigDataRepublic/hanon/blob/master/src/Mapper.hs
> >>
> >> where the regex code lives?
> >>
> >> Greetings,
> >>
> >> Bram
> >> _______________________________________________
> >> Haskell-Cafe mailing list
> >> To (un)subscribe, modify options or view archives go to:
> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> >> Only members subscribed via the mailman list are allowed to post.
> >
> >
> >
> >
> > --
> > Sincerely, Stanislav Chernichkin.