[Haskell-cafe] Haskell translation and transformation
Jaro Reinders
jaro.reinders at gmail.com
Tue Apr 27 12:57:58 UTC 2021
> My question is: does the ghc use primitive types automatically when possible?
> Otherwise, I cannot explain the same times... Or, to my big surprise, using primitives
> does not save memory and computation time, really?
GHC mainly uses inlining to transform high-level code into code that uses
primitive types. Given a piece of code like:
acker :: Int -> Int -> Int
acker 0 n = n + 1
acker m 0 = acker (m - 1) 1
acker m n = acker (m - 1) (acker m (n - 1))
GHC will inline (+) and (-). For Int, these functions are defined as:
(I# x) + (I# y) = I# (x +# y)
(I# x) - (I# y) = I# (x -# y)
So it transforms to something like:
acker 0 (I# n) = I# (n +# 1#)
acker (I# m) 0 = acker (I# (m -# 1#)) 1
acker (I# m) (I# n) = acker (I# (m -# 1#)) (acker (I# m) (I# (n -# 1#)))
And what I think is called the worker/wrapper transformation can then turn it
into something like:
acker (I# m) (I# n) = I# (acker# m n)
acker# 0# n = n +# 1#
acker# m 0# = acker# (m -# 1#) 1#
acker# m n = acker# (m -# 1#) (acker# m (n -# 1#))
This completely eliminates the boxing and unboxing (the I#s) in the tight loop.
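To make this concrete, here is a hand-written sketch of roughly what that
transformation produces. It is an illustration only, not GHC's actual output:
the names ackerW and acker# are made up for this example, and it needs the
MagicHash extension to mention Int# directly.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), Int#, (+#), (-#))

-- Boxed version, as in the original question.
acker :: Int -> Int -> Int
acker 0 n = n + 1
acker m 0 = acker (m - 1) 1
acker m n = acker (m - 1) (acker m (n - 1))

-- Hypothetical hand-written wrapper: unboxes both arguments once
-- and boxes the final result once.
ackerW :: Int -> Int -> Int
ackerW (I# m) (I# n) = I# (acker# m n)

-- Hypothetical hand-written worker: the recursion only ever sees
-- unboxed Int# values, so no I# is allocated inside the loop.
acker# :: Int# -> Int# -> Int#
acker# 0# n = n +# 1#
acker# m 0# = acker# (m -# 1#) 1#
acker# m n = acker# (m -# 1#) (acker# m (n -# 1#))
```

Loaded into GHCi, ackerW and acker should agree on small inputs, for example
both give 9 for the arguments 2 and 3.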
> And the other question is about reasoning during translation and code generation,
> what is the reason the code is so slow? I would guess that forcing primitive types
> and strict evaluation would produce a code with comparable time to C code... The
> difference seems to be like the one between compiled code to executable and to
> low level virtual machine code, which is interpreted then.
I think it is mainly that in small tight loops like your Ackermann function,
low-level optimizations become much more important.
On godbolt you can easily compare the produced assembly:
Haskell https://godbolt.org/z/nejqWsq9z (acker is Main_$wacker_info)
C https://godbolt.org/z/WKW7vcvfb (GCC 9.3 is easier to read than 10.2)
The blocks on the main hot code path, like:
c4pc_info:
movq 8(%rbp),%rax
decq %rax
addq $16,%rbp
movq %rbx,%rsi
movq %rax,%r14
And
movq $c4pc_info,-16(%rbp)
decq %rsi
movq %r14,%rax
movq %rax,-8(%rbp)
addq $-16,%rbp
jmp Main_$wacker_info
just seem very awkward when compared to the GCC assembly. But I'm not
knowledgeable enough about GHC's internals to know why this awkward code is
generated.
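If you would rather inspect the same output locally than on godbolt, a sketch
like the following should work. The flags are standard GHC dump flags, but the
main function and its arguments are my own additions for the example, and the
exact names of the dump files depend on the GHC version.

```haskell
-- Ask GHC to optimise and to write the optimised Core (-ddump-simpl) and
-- the generated assembly (-ddump-asm) to files instead of the terminal.
{-# OPTIONS_GHC -O2 -ddump-simpl -ddump-asm -dsuppress-all -ddump-to-file #-}
module Main (main) where

acker :: Int -> Int -> Int
acker 0 n = n + 1
acker m 0 = acker (m - 1) 1
acker m n = acker (m - 1) (acker m (n - 1))

-- Small arguments chosen only so the program finishes instantly.
main :: IO ()
main = print (acker 2 3)
```

With optimisation on, the Core dump should contain a worker named $wacker
that takes and returns Int#, which corresponds to Main_$wacker_info in the
assembly.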
Cheers,
Jaro