[Haskell-cafe] Optimizations for list comprehension
Simon Peyton Jones
simonpj at microsoft.com
Wed Aug 21 15:47:58 UTC 2019
I think what is happening is that
* In “inside” there are two distinct lists [0..n-1], each used once; they are fused with their consumers.
* In “outside” there is one list [0..n-1], which is used twice. GHC is paranoid about duplicating work (and would be right to do so if producing the list was expensive), so it does not fuse the list with its two consumers.
In this case GHC’s paranoia is not justified. It’d be better to duplicate the production of [0..n-1] so that it can fuse with its consumers.
One way to address this problem might be “cheapBuild”. There’s a ticket about this: https://gitlab.haskell.org/ghc/ghc/issues/7206. I see that a year ago the TL;DR was “Conclusion: let's do it. Ensuring that a Note gives a clear explanation, and points to this ticket.” But no one yet has.
Maybe someone might see if the cheapBuild idea really does solve this particular case.
Simon
From: Haskell-Cafe <haskell-cafe-bounces at haskell.org> On Behalf Of William Yager
Sent: 19 August 2019 15:35
To: Emilio Francesquini <francesquini at gmail.com>
Cc: Haskell Cafe <haskell-cafe at haskell.org>
Subject: Re: [Haskell-cafe] Optimizations for list comprehension
I'm not sure which exact optimizations are responsible, but based on --ddump-simple,
* "inside" is not allocating any lists at all. It's just a couple loops over unboxed ints
* "outside" is actually allocating a (single) list data structure and has an inner loop and an outer loop, both of which traverse the list
GHC seems to be too aggressive about sharing "range" in "outside". Adding a unit argument to "range" makes both functions go fast.
On Mon, Aug 19, 2019 at 8:14 PM Emilio Francesquini <francesquini at gmail.com<mailto:francesquini at gmail.com>> wrote:
Hello Cafe,
While investigating a performance problem I stumbled upon what I eventually reduced to the example below:
module Main where
import Data.Time.Clock
outside :: Int -> Int
outside n =
sum [i + j | i <- range, j <- range]
where
range = [0..n-1]
inside :: Int -> Int
inside n =
sum [i + j | i <- [0..n-1], j <- [0..n-1]]
main :: IO ()
main = do
t0 <- getCurrentTime
print $ inside 10000
t1 <- getCurrentTime
print $ outside 10000
t2 <-getCurrentTime
print (diffUTCTime t1 t0)
print (diffUTCTime t2 t1)
Compiling with -O2, up to GHC 8.2.2, both `inside` and `outside` functions would take the same amount of time to execute. Somewhere between GHC 8.2.2 and 8.6.4 something changed (possibly some new optimization) making `inside` run ~4x faster on my machine. With LLVM the difference is even bigger.
It is not that `outside` got slower, but that `inside` got much faster. I'm curious to what optimizations might be happening to the `inside` function that would not fire on the outside function.
Any hints?
Best regards,
Emilio
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe<https://nam06.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmail.haskell.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fhaskell-cafe&data=02%7C01%7Csimonpj%40microsoft.com%7C2fb10c8d6105435beb9b08d724b27515%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637018221197211672&sdata=c4XWHdztJyFMMmPddKC%2B3wxgPb03%2B9yAiZHImzjpYBM%3D&reserved=0>
Only members subscribed via the mailman list are allowed to post.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20190821/62a8420f/attachment.html>
More information about the Haskell-Cafe
mailing list