[Haskell-cafe] How to benckmark a Word64 -> Bool?
Charles-Pierre Astolfi
cpa at crans.org
Wed Feb 19 20:47:51 UTC 2014
Thanks Erik!
I was kind of hoping for such a benchmark. On my system, the C version is
consistently twice as fast as the Haskell version. (I tried -fllvm, but it
doesn't change perf)
Then I added bang patterns for the argument of only012 and in the tuple
(p,q) and it now has the same runtime as C.
Just for fun, I tried to void writing explicit recursion and used only
combinators and this Haskell code is still as competitive as C, so the real
bottleneck really was the laziness in the tuple/argument.
http://lpaste.net/100153
In fact, this Haskell version takes 8 sec vs 10 sec for the C version on my
laptop (input: 1000000000).
I tried using Data.List.Stream, I couldn't get any performance gain,
probably because the code is very close to optimal.
So remember kids:
- Bang pattern the heck out of your code (not a silver bullet, but
definitely worth trying).
- Don't rely on too much IO for benchmark.
- Don't be afraid of combinators even if you want performance.
- Ask on haskell-cafe, chaps here are very helpful :)
--
Cp
On Wed, Feb 19, 2014 at 8:39 PM, Erik Rantapaa <erantapaa at gmail.com> wrote:
> I think a lot is being obscured by the IO subsystem.
>
> If you print just a count of the only012 numbers I get very different
> results.
>
> Here are the versions I compared:
>
> http://lpaste.net/100142
>
> and on my system the C version is about 10x faster.
>
>
> On Wednesday, February 19, 2014 12:49:36 PM UTC-6, Charles-Pierre Astolfi
> wrote:
>
>> Switching to quotRem gave no measurable improvements.
>> After switching to ByteString, the code now runs in 9 seconds, which
>> outperforms my C version. But honestly, I have no idea why.
>>
>> New code:
>> http://lpaste.net/100136
>>
>> $ ghc --make -O3 303only012.hs && time ./303only012 50000000 > /dev/null
>> ./303only012 50000000 > /dev/null 9.72s user 0.21s system 90% cpu 10.961
>> total
>>
>> @Alois, I'm not sure how criterion can help compare my code with the C
>> version, since in the C version I cannot measure the exec time of only012
>> only. What did you have in mind?
>>
>> Thanks everyone!
>>
>>
>> --
>> Cp
>>
>>
>> On Wed, Feb 19, 2014 at 7:24 PM, Levent Erkok <erk... at gmail.com> wrote:
>>
>>> Also, prefer quotRem over divMod as the former is faster. See here:
>>> http://stackoverflow.com/questions/339719/when-is-
>>> the-difference-between-quotrem-and-divmod-useful
>>>
>>>
>>> On Wed, Feb 19, 2014 at 10:19 AM, Gregory Collins <
>>> gr... at gregorycollins.net> wrote:
>>>
>>>>
>>>> On Wed, Feb 19, 2014 at 9:36 AM, Charles-Pierre Astolfi <c... at crans.org
>>>> > wrote:
>>>>
>>>>> So there's a difference, but I'm not sure if it's related to my
>>>>> algorithm or related to IO/RTS.
>>>>
>>>>
>>>> Text.Printf is slower than the C version (and uses String). Use
>>>> Data.ByteString.putStr instead.
>>>>
>>>>
>>>>
>>>> --
>>>> Gregory Collins <gr... at gregorycollins.net>
>>>>
>>>> _______________________________________________
>>>> Haskell-Cafe mailing list
>>>> Haskel... at haskell.org
>>>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>>>
>>>>
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20140219/474d5d6f/attachment.html>
More information about the Haskell-Cafe
mailing list