[Haskell-cafe] Reasoning about performance

Wed Sep 4 02:09:28 CEST 2013

Well for one thing, note that allPairs3 produces the result in reverse
order:

>>> allPairs1 "abc"
[('a','b'),('a','c'),('b','c')]
>>> allPairs2 "abc"
[('a','b'),('a','c'),('b','c')]
>>> allPairs3 "abc"
[('b','c'),('a','c'),('a','b')]

allPairs2 uses "guarded recursion" which the optimizer probably likes,
although I don't quite see why that version would use twice as much memory
as allPairs3.

See also:
http://www.haskell.org/haskellwiki/Tail_recursion

Here's a fun alternative for you to benchmark, using an old trick. I kind
of doubt that this one will optimize as nicely as the others, but I am by
no means an optimization guru:

allPairsS :: [a] -> [(a, a)]
allPairsS xs = go xs [] where
  go [] = id
  go (y:ys) = (map (\a -> (y, a)) ys ++) . go xs

Further reading:
http://www.haskell.org/haskellwiki/Difference_list

-- Dan Burton

On Tue, Sep 3, 2013 at 3:28 PM, Scott Pakin <pakin at lanl.gov> wrote:

> I'm a Haskell beginner, and I'm baffled trying to reason about code
> performance, at least with GHC.  For a program I'm writing I needed to
> find all pairs of elements of a list.  That is, given the list "ABCD"
> I wanted to wind up with the list
> [('A','B'),('A','C'),('A','D')**,('B','C'),('B','D'),('C','D')**], where
> the order of elements in the resulting list is immaterial, but I don't
> want both ('A', 'B') and ('B', 'A'), for example.
>
> My first attempt looked like this:
>
>     allPairs1 :: [a] -> [(a, a)]
>     allPairs1 []     = []
>     allPairs1 (x:xs) = map (\a  -> (x, a)) xs ++ allPairs1 xs
>
> Benchmarking with ghci's ":set +s" reveals the following performance
> (median of three runs shown):
>
>     $ ghci
>     GHCi, version 7.6.3: http://www.haskell.org/ghc/  :? for help
>     Loading package ghc-prim ... linking ... done.
>     Loading package integer-gmp ... linking ... done.
>     Loading package base ... linking ... done.
>     Prelude> :!ghc -c -O2 allpairs.hs
>     Prelude> :load allpairs
>     Ok, modules loaded: AllPairs.
>     Prelude AllPairs> :m +Control.DeepSeq
>     Prelude Control.DeepSeq AllPairs> :show modules
>     AllPairs         ( allpairs.hs, allpairs.o )
>     Prelude Control.DeepSeq AllPairs> :set +s
>     Prelude Control.DeepSeq AllPairs> deepseq (allPairs1 [1..10000]) True
>     True
>     (4.85 secs, 4004173984 bytes)
>
> A colleague suggested getting rid of the list append as follows:
>
>     allPairs2 :: [a] -> [(a, a)]
>     allPairs2 [] = []
>     allPairs2 (x:xs) = allPairs2' x xs xs
>       where allPairs2' x []     []     = []
>             allPairs2' x (y:ys) zs     = (x,y) : allPairs2' x ys zs
>             allPairs2' x []     (z:zs) = allPairs2' z zs zs
>
> Not surprisingly, that runs faster:
>
>     Prelude Control.DeepSeq AllPairs> deepseq (allPairs2 [1..10000]) True
>     True
>     (2.14 secs, 4403686376 bytes)
>
> I then figured I could do even better by rewriting the code
> tail-recursively:
>
>     allPairs3 :: [a] -> [(a, a)]
>     allPairs3 [] = []
>     allPairs3 (x:xs) = allPairs3' x xs xs []
>       where allPairs3' :: a -> [a] -> [a] -> [(a, a)] -> [(a, a)]
>             allPairs3' h (x:xs) ys     result = allPairs3' h xs ys ((h,
> x):result)
>             allPairs3' _ []     (y:ys) result = allPairs3' y ys ys result
>             allPairs3' _ []     []     result = result
>
> This takes half the memory as the above (yay!) but runs substantially
> *slower* (and appears to be thrashing my computer's memory system):
>
>     Prelude Control.DeepSeq AllPairs> deepseq (allPairs3 [1..10000]) True
>     True
>     (10.58 secs, 2403820152 bytes)
>
> What gives?  Why does my tail-recursive implementation run even slower
> and presumably produce more garbage than my initial, naive
> implementation?  Is there some optimization GHC is performing for
> allPairs1 and allPairs2 that it can't apply to allPairs3?
>
> Similarly, how should I, as a newcomer who wants to write efficient
> Haskell code, reason about whether a code change is likely to improve
> rather than degrade performance?  Guidelines like "per-iteration
> appends = bad" and "tail recursion = good" are great, but the
> preceding data indicate that something subtler is taking precedence
> over those guidelines with respect to code performance.
>
> Thanks in advance,
> -- Scott
>
> ______________________________**_________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/**mailman/listinfo/haskell-cafe<http://www.haskell.org/mailman/listinfo/haskell-cafe>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20130903/f4dbcbc8/attachment.htm>