[Haskell-cafe] ST Vector / STRef -- Performances and allocation

Sat Jun 18 20:37:52 UTC 2016

Hello.

I read a reddit post [0] which ends in a debate about haskell
performances. Someone gives a C implementation which runs in 15ms on
some dataset. I wanted to learn about mutable vector and ST and got
something in haskell which runs in 50ms. Code is here [1].

[0] https://www.reddit.com/r/haskell/comments/4ojsn0/counting_inversions_haskell/
[1] https://github.com/guibou/inversionsCount/blob/syntax/inversions.hs

When I'm profiling the code, I'm surprised to see more than 50% of the
%alloc in function which, from my point of view, must not allocate.

For example, this function :

inc :: forall s. STRef s Int -> ST s ()
inc v = modifySTRef' v (+1)

Is responsible for 10% of the allocation in this program. I was hoping
that the "unboxing magic" of GHC may have replaced my Int somewhere by
an unboxed one, but I guess the allocation means that the STRef is
somehow pointing to a heap allocated Int.

Do you know a way to remove theses allocations?

As a second question, if you see any other way to improve this
implementation, I'll be happy to learn something new ;)

Finally, if you read my code, you'll see that I implemented a small
DSL around ST / STRef / MVector to make my main function more
readable. How do people write readable code in real life ?

Especially, I was annoyed when I tried to convert this piece of C code :

if (x1 < mid && (x2 == sz || tmp[x1] <= p[x2])) {
    a;
} else {
   b;
}

In my haskell code, x1, x2 are STRef, and tmp and p are MVector. This
was highly painful to write in haskell and I had to write something
such as:

x1 <- readSTRef x1'
x2 <- readSTRef x2'

cond <- if x1 < mid
                 then if x2 == sz
                           then return True
                           else do
                                tmp_x1 <- Vector.read tmp x1
                                p_x2 <_ Vector.read p x2
                                return (tmp_x1 <= p_x2)
                 else return False
if cond
   then a
   else b

This is painful, complex and error prone, so is there another solution ?

Thank you.