[Haskell-cafe] Haskell translation and transformation

Jaro Reinders jaro.reinders at gmail.com
Tue Apr 27 12:57:58 UTC 2021


> My question is: does the ghc use primitive types automatically when possible?
> Otherwise, I cannot explain the same times... Or, to my big surprise, using primitives
> does not save memory and computation time, really?

GHC mainly uses inlining to transform high level code into code that uses 
primitive types. Given a piece of code like:

acker :: Int -> Int -> Int
acker 0 n = n + 1
acker m 0 = acker (m - 1) 1
acker m n = acker (m - 1) (acker m (n - 1))

GHC will inline (+) and (-). The definition of these functions is:

(I# x) + (I# y) = I# (x +# y)
(I# x) - (I# y) = I# (x -# y)

So it transforms to something like:

acker 0 (I# n) = I# (n +# 1#)
acker (I# m) 0 = acker (I# (m -# 1#)) 1
acker (I# m) (I# n) = acker (I# (m -# 1#)) (acker (I# m) (I# (n -# 1#)))

And what I think is called the worker wrapper transformation can transform it 
into something like:

acker (I# m) (I# n) = acker# m n

acker# 0# n = n +# 1#
acker# m 0# = acker# (m -# 1#) 1#
acker# m n = acker# (m -# 1#) (acker# m (n -# 1#))

Which completely eliminates the boxing and unboxing (the I#s) in the tight loop.

> And the other question is about reasoning during translation and code generation,
> what is the reason the code is so slow? I would guess that forcing primitive types
> and strict evaluation would produce a code with comparable time to C code... The
> difference seems to be like the one between compiled code to executable and to
> low level virtual machine code, which is interpreted then.

I think it is mainly that in small tight loops like your ackermann function, 
low level optimizations become much more important.

On godbolt you can easily compare the produced assembly:

Haskell https://godbolt.org/z/nejqWsq9z (acker is Main_$wacker_info)
C 	https://godbolt.org/z/WKW7vcvfb (GCC 9.3 is easier to read than 10.2)

The main hot code path blocks like:

c4pc_info:
         movq 8(%rbp),%rax
         decq %rax
         addq $16,%rbp
         movq %rbx,%rsi
         movq %rax,%r14

And

         movq $c4pc_info,-16(%rbp)
         decq %rsi
         movq %r14,%rax
         movq %rax,-8(%rbp)
         addq $-16,%rbp
         jmp Main_$wacker_info

Just seem very awkward when compared to to the GCC assembly. But I'm not 
knowledgeable enough about GHC's internals to know why this awkward code is 
generated.

Cheers,

Jaro


More information about the Haskell-Cafe mailing list