[Haskell-cafe] Haskell performance when it comes to regex?

Rein Henrichs rein.henrichs at gmail.com
Wed May 24 23:43:35 UTC 2017


I recommend benchmarking with criterion and using GHC profiling, so you know
where the slowdown actually is before trying to optimize anything.
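
As a rough illustration of what such a benchmark could look like, here is a minimal sketch. It assumes the criterion package is installed; anonymizeLine and sampleLine are hypothetical stand-ins for the per-line work, not code taken from hanon.

```haskell
module Main (main) where

import Criterion.Main (bench, bgroup, defaultMain, nf)
import Data.Char (isDigit)

-- Hypothetical stand-in for the per-line work: mask every digit.
-- Swap in the real function you want to measure.
anonymizeLine :: String -> String
anonymizeLine = map (\c -> if isDigit c then '#' else c)

-- Made-up sample input, for illustration only.
sampleLine :: String
sampleLine = "user 12345 called 555-0100 at 09:15"

main :: IO ()
main = defaultMain
  [ bgroup "anonymize"
      [ -- nf forces the whole result, so laziness does not hide the work
        bench "one line"   $ nf anonymizeLine sampleLine
      , bench "1000 lines" $ nf (map anonymizeLine) (replicate 1000 sampleLine)
      ]
  ]
```

For the profiling half, building with -prof -fprof-auto and running the program with +RTS -p should write a .prof file listing the cost centres, which is usually enough to see whether the time goes into the regex code, the String handling, or the leveldb calls.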

On Tue, May 23, 2017 at 9:26 AM David Fox <dsf at seereason.com> wrote:

> I have been surprised at how rarely switching to Text or ByteString makes
> things significantly faster.  If you do this you should look at
> Data.ByteString.Builder or Data.Text.Lazy.Builder.
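
A small sketch of the Builder approach mentioned above, assuming the output is assembled line by line. renderLines is a hypothetical helper name, and the input lines are made up.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as BL

-- Hypothetical helper: combine all output lines into a single Builder,
-- instead of appending strict ByteStrings one at a time.
renderLines :: [B.ByteString] -> BB.Builder
renderLines = foldMap (\l -> BB.byteString l <> BB.char7 '\n')

main :: IO ()
main = BL.putStr (BB.toLazyByteString (renderLines ["alpha", "beta"]))
```

The idea is that the Builder is only run once, at the end, so intermediate concatenations do not each allocate a new buffer.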
>
> On Fri, May 19, 2017 at 4:17 AM, Станислав Черничкин <
> schernichkin at gmail.com> wrote:
>
>> Try using Text or ByteString instead of String. Try the compile and
>> execute functions (
>> http://hackage.haskell.org/package/regex-tdfa-1.2.1/docs/Text-Regex-TDFA-ByteString.html),
>> and make sure each regex is compiled only once.
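
A minimal sketch of the compile-once pattern described above, assuming the regex-tdfa package. compileOnce and matchingLines are hypothetical names, and the digit pattern is only an illustration, not one of hanon's patterns.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString.Char8 as B
import Text.Regex.TDFA (defaultCompOpt, defaultExecOpt)
import Text.Regex.TDFA.ByteString (Regex, compile, execute)

-- Compile the pattern a single time, up front.
compileOnce :: B.ByteString -> Either String Regex
compileOnce = compile defaultCompOpt defaultExecOpt

-- Reuse the already-compiled Regex for every line.
matchingLines :: Regex -> [B.ByteString] -> [B.ByteString]
matchingLines re = filter hit
  where
    hit line = case execute re line of
      Right (Just _) -> True
      _              -> False

main :: IO ()
main =
  case compileOnce "[0-9]+" of
    Left err -> putStrLn ("bad regex: " ++ err)
    Right re -> mapM_ B.putStrLn (matchingLines re ["abc", "a1", "42 is here"])
```

The point of the split is that the Regex value is built outside the per-line loop, so the pattern is not rebuilt from its source text for every input line.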
>>
>> 2017-05-16 12:12 GMT+03:00 Bram Neijt <bneijt at gmail.com>:
>>
>>> Dear reader,
>>>
>>> I decided to do a little project which is a simple search and replace
>>> program for large text files.
>>>
>>> Written in Haskell, it does a few different regex matches on each line
>>> and stores them in a leveldb key-value store to create a
>>> consistent/reviewable search-replace index. It should provide for some
>>> simple/brute-force anonymization of data and therefore I called it
>>> hanon (sorry, could not think of a better name).
>>>
>>> https://github.com/BigDataRepublic/hanon
>>>
>>> The code works, but I've done some benchmarking to compare it with
>>> Python, and the code is about 80x slower than doing the same thing in
>>> Python, making it useless for larger data files.
>>>
>>> I'm obviously doing something wrong.
>>>
>>> Could you give me tips on improving the performance of this code?
>>> Probably mainly looking at
>>>
>>> https://github.com/BigDataRepublic/hanon/blob/master/src/Mapper.hs
>>>
>>> where the regex code lives?
>>>
>>> Greetings,
>>>
>>> Bram
>>> _______________________________________________
>>> Haskell-Cafe mailing list
>>> To (un)subscribe, modify options or view archives go to:
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>>> Only members subscribed via the mailman list are allowed to post.
>>
>>
>>
>>
>> --
>> Sincerely, Stanislav Chernichkin.
>>