[Haskell-cafe] Safer common subexpression elimination

Joachim Breitner mail at joachim-breitner.de
Thu Nov 25 11:32:21 CET 2010


Hi,

although semantically it could, ghc does not do common subexpression
elimination (CSE), at least not in every case. The canonical example
where it would be bad is the function (with div to avoid numerical
casting):

> avg :: [Int] -> Int
> avg n = (sum [0..n] `div` length [0..n])

The reason why [0..n] is not shared is because then it would be kept in
memory completely after the evaluation of sum, because length will still
need it, possibly overflowing the memory. In the non-shared form,
though, the garbage collector can clean up directly behind sum and
length, and the program runs in constant space.

Note that the shared expression would be very large compared to the
thunks.

Now consider the program:

> avg' :: [Int] -> (Int, Int)
> avg' n = (sum [0..n] `div` length [0..n], length [0..n])

I’d say that CSE to 
> avg n = (sum [0..n] `div` l, l)
>   where l = length [0..n]
is safe, because the shared expression (an Int) is smaller than the
thunks and the compiler can tell.

So I wonder:
 * Is sharing values of type Int (and Bool and similar small values)
always safe?
 * If so: does GHC already do that?
 * Would it be technically possible?
 * Is there an established theory that can tell, for a sharing
candidate, whether it is safe to share it?

Thanks,
Joachim

-- 
Joachim "nomeata" Breitner
  mail: mail at joachim-breitner.de | ICQ# 74513189 | GPG-Key: 4743206C
  JID: nomeata at joachim-breitner.de | http://www.joachim-breitner.de/
  Debian Developer: nomeata at debian.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part
Url : http://www.haskell.org/pipermail/haskell-cafe/attachments/20101125/35ad6ce5/attachment.bin


More information about the Haskell-Cafe mailing list