suboptimal ghc code generation in IO vs equivalent pure code case

Sat May 14 20:21:38 UTC 2016

On 14/05/16 02:31 PM, Harendra Kumar wrote:
> The difference seems to be entirely due to memory pressure. At list size 1000, the pure version and the IO version perform equally. But as the size of the list increases, the pure version scales linearly while the IO version degrades exponentially. Here are the execution times per list element, in ns, as the list size increases:
>
> Size of list  Pure  IO
> 1000          8.7   8.3
> 10000         8.7   18
> 100000        8.8   63
> 1000000       9.3   786
>
> This seems to be due to increased GC activity in the IO case. The GC stats for list size 1 million are:
>
> IO case:    %GC time  66.1%  (61.1% elapsed)
> Pure case:  %GC time   2.6%  (3.3% elapsed)
>
> Not sure if there is a way to write this code in the IO monad that reduces this overhead.

Something to be aware of is that GHC currently can't pass multiple return values in registers (that may not be a 100% accurate statement, but it is a reasonable high-level summary; see the ticket for details):

https://ghc.haskell.org/trac/ghc/ticket/2289

This can bite you in the IO monad, since having to pass around the world state token turns single return values into multiple return values (i.e., the new state token plus the returned value).
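
To make that shape concrete, here is a minimal sketch of what an IO action looks like once the IO newtype is unwrapped. The names RawIO, incrPure, incrRaw and incrIO are made up for illustration and are not from the code under discussion. The sketch only shows the types involved; it does not reproduce or measure the slowdown.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

import GHC.Exts (RealWorld, State#)
import GHC.IO (IO (..))

-- Hypothetical alias: the function type that GHC's IO newtype wraps.
type RawIO a = State# RealWorld -> (# State# RealWorld, a #)

-- Pure version: a single return value.
incrPure :: Int -> Int
incrPure x = x + 1

-- The same computation in "raw IO" form: it takes the state token and
-- returns the (unchanged) token together with the result.
incrRaw :: Int -> RawIO Int
incrRaw x s = (# s, x + 1 #)

-- Wrapping the raw function with the IO constructor gives an ordinary IO action.
incrIO :: Int -> IO Int
incrIO x = IO (incrRaw x)

main :: IO ()
main = do
  r <- incrIO 41
  print (r, incrPure 41)
```

Both versions compute the same number; the only difference shown here is that the IO one returns a token/value pair rather than a bare value.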

I haven't actually dug into your code to see if this is part of the problem, but figured I would mention it.

Cheers!  -Tyson