Common subexpressions are not optimized to single one?

ajb at spamcop.net ajb at spamcop.net
Tue Dec 2 20:55:42 EST 2003


G'day all.

Quoting Koji Nakahara <yu- at div.club.ne.jp>:

> I'm wondering about the optimization of ghc. I thought that since
> functions are pure in haskell, compiler can/do factorize common
> subexpressions when it optimizes a code.

The short answer is "no".  While the optimisation preserves semantics,
it will not, in general, preserve pragmatics.

A classic (and contrived) example is this:

        slow_pow2 :: Int -> Int
        slow_pow2 n
          = length (subsets [1..n])
          where
            subsets [] = [[]]
            subsets (x:xs) = subsets xs ++ [ x:xs' | xs' <- subsets xs ]

On my machine, slow_pow2 32 runs (very slowly; I didn't bother to let
it finish).  Optimising the common subexpression on the last line (i.e.
the recursive call) exhausts the heap.

In general, automatically optimising common subexpressions is only a
good idea if the compiler can prove that the expression optimised has
a bounded size, and even then, it's not always a good idea.  Some
libraries which use unsafePerformIO creatively actually rely on this
behaviour.  I've written code, for example, which uses does something
like this:

        {-# NOINLINE fooRef1 #-}
        fooRef1 :: IORef Foo
        fooRef1 = unsafePerformIO (newIORef makeAFoo)

        {-# NOINLINE fooRef1 #-}
        fooRef2 :: IORef Foo
        fooRef2 = unsafePerformIO (newIORef makeAFoo)

        fooOp1 :: IO ()
        fooOp1 = modifyIORef fooRef1 something >> return ()

        fooOp2 :: IO ()
        fooOp2 = modifyIORef fooRef2 somethingElse >> return ()

This is perfectly safe (as long as the compiler respects NOINLINE!), but
would break horribly if fooRef1 and fooRef2 were "optimised".

Cheers,
Andrew Bromage


More information about the Haskell-Cafe mailing list