[Haskell-beginners] let x = x in x (GHC.Prim)
John Ky
newhoggy at gmail.com
Wed Mar 23 21:36:58 UTC 2016
Hi Rahul,
The reason I thought that maybe GHC didn't call the CPU instruction was due
to my benchmarking, which showed that the built-in popCount was slower than
the pure Haskell Broadword implementation:
main = defaultMain
[ env setupEnv (\bv -> bgroup "Rank"
[ bench "PopCnt1 Broadword - Once" (nf (map (\n -> getCount
(PC1BW.popCount1 (DVS.take n bv)))) [0, 1000..100000])
, bench "PopCnt1 GHC - Once" (nf (map (\n -> getCount
(PC1GHC.popCount1 (DVS.take n bv)))) [0, 1000..100000])
] )
]
Benchmarking results follow:
benchmarking Rank/PopCnt1 Broadword - Once
time 18.49 ms (17.89 ms .. 19.08 ms)
0.994 R² (0.989 R² .. 0.998 R²)
mean 19.62 ms (19.22 ms .. 20.04 ms)
std dev 1.026 ms (846.2 μs .. 1.315 ms)
variance introduced by outliers: 21% (moderately inflated)
benchmarking Rank/PopCnt1 GHC - Once
time 36.80 ms (36.25 ms .. 37.57 ms)
0.998 R² (0.996 R² .. 1.000 R²)
mean 37.14 ms (36.72 ms .. 37.97 ms)
std dev 1.100 ms (650.6 μs .. 1.680 ms)
This makes the built-in nearly two times slower than the pure Haskell.
That's how I got into the ghc-prim code.
This is the broadword implementation I was using:
popCount1 x0 = ((x3 * 0x0101010101010101) .>. 56)
where
x1 = x0 - ((x0 .&. 0xaaaaaaaaaaaaaaaa) .>. 1)
x2 = (x1 .&. 0x3333333333333333) + ((x1 .>. 2) .&. 0x3333333333333333)
x3 = (x2 + (x2 .>. 4)) .&. 0x0f0f0f0f0f0f0f0f
Maybe all I need to do is compile with some flags or switch the backend or
something? Any ideas?
Cheers,
-John
On Thu, 24 Mar 2016 at 08:28 John Ky <newhoggy at gmail.com> wrote:
> Hi Rahul,
>
> I see mention of the popcount instruction in nativeGen/X86/CodeGen.hs. In
> particular it looks like it only activates if sse4_2 is activated? Maybe
> all I need to do is activate this somehow?
>
> Cheers,
>
> -John
>
> On Wed, 23 Mar 2016 at 12:07 <rahulmutt at gmail.com> wrote:
>
>> Hi John,
>>
>> ghc-prim is just a stub package generated for the purpose of
>> documentation. All primops are defined in a low level language called Cmm
>> in GHC. If you want to make it even faster, you'll need to learn Cmm and
>> update the definition in GHC. If you want to a specialized implementation
>> for x86 systems, you may need to modify the NCG (Native Code Generator)
>> which requires a knowledge of assembly language.
>>
>> Hope that helps!
>> Rahul Muttineni
>>
>> Sent from my BlackBerry 10 smartphone.
>> *From: *John Ky
>> *Sent: *Wednesday 23 March 2016 4:40 AM
>> *To: *The Haskell-Beginners Mailing List - Discussion of primarily
>> beginner-level topics related to Haskell
>> *Reply To: *The Haskell-Beginners Mailing List - Discussion of primarily
>> beginner-level topics related to Haskell
>> *Subject: *[Haskell-beginners] let x = x in x (GHC.Prim)
>>
>> Hello Haskellers,
>>
>> I'm trying to write a faster popCount function for x86 systems.
>>
>> I tried cloning the ghc-prim package and repurposing it for my own needs,
>> but it isn't working as hoped.
>>
>> In particular, popCnt64# was implemented in GHC.Prim as:
>>
>> popCnt64# = let x = x in x
>>
>> Which shouldn't terminate. Yet when I call it, it magically finds the C
>> implementation in hs_popcnt64 and returns the correct value.
>>
>> My cloned project doesn't behave that way. Instead it doesn't terminate
>> as I would expect.
>>
>> Anyone know what's happening here, if there is a way to make this work or
>> tell me if I'm going about this completely the wrong way?
>>
>> Cheers,
>>
>> -John
>>
>>
>> _______________________________________________
>> Beginners mailing list
>> Beginners at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/beginners/attachments/20160323/d0fffcdb/attachment.html>
More information about the Beginners
mailing list