[Haskell-cafe] GHC worker/wrapper on multiple return values
Tyson Whitehead
twhitehead at gmail.com
Thu Feb 18 03:59:57 UTC 2016
Recently discovered GHC generates sub-optimal code when you have
- two levels of loops in the IO monad
- the inner loop doesn't do that much work
- the outer loop does a lot of work
From dumping core I believe the issue is the worker/wrapper transformation doesn't produce return value unboxing (from the inner loop to the outer in this case) when there are multiple return values (as you windup with the IO monad due to passing around the State# RealWorld behind the scenes).
Here is some simple code that demonstrates the issue. With a single return value
{-# LANGUAGE BangPatterns #-}
module Main (main) where
main :: IO ()
main = do
let a = l1 0 100000000
print a
l1 :: Int -> Int -> Int
l1 !a !n =
case n>0 of
True -> let m = l2 0 n 3
in l1 (a+m) (n-1)
False -> a
l2 :: Int -> Int -> Int -> Int
l2 !a !n !m =
case m>0 of
True -> l2 (a+n) (n-1) (m-1)
False -> a
you get nice clean core that performs no allocations and runs beautifully. Doing the same thing within the IO monad (effectively lifting the return value above into a unboxed tupple)
{-# LANGUAGE BangPatterns, MagicHash, UnboxedTuples #-}
module Main (main) where
main :: IO ()
main = do
a <- l1 0 100000000
print a
l1 :: Int -> Int -> IO Int
l1 !a !n = do
case n>0 of
True -> do m <- l2 0 n 3
l1 (a+m) (n-1)
False -> return a
l2 :: Int -> Int -> Int -> IO Int
l2 !a !n !m = do
case m>0 of
True -> l2 (a+n) (n-1) (m-1)
False -> return a
you don't get boxing and unboxing the return value between l1 and l2. This has quite a dramatic effect when l2 doesn't do the that much work (massive amounts of additional memory allocation and a 2x slow down).
$ time ./Simple-pure
14999999850000000
51,912 bytes allocated in the heap
...
real 0m0.322s
...
$ time ./Simple-io
14999999850000000
1,600,051,936 bytes allocated in the heap
...
real 0m0.634s
Should I open a ticket against GHC about this issue? Possibly it wouldn't be that hard to get the work/wrapper stuff to work its unboxing magic on multiple return values too?
Cheers! -Tyson
[*] https://gist.github.com/twhitehead/5744eee28bcde85b9fa4
More information about the Haskell-Cafe
mailing list