<div dir="auto">On mobile, but I don't see how you're compiling? Is it at least with -O1? </div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Dec 23, 2020, 18:26 Keith <<a href="mailto:keith.wygant@gmail.com">keith.wygant@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>There's no benefit to optimizing default Foldable methods for data types that already have specialized methods.<br><br>foldMap' does not care about the nesting of the structure like foldl', so it's a better default choice.<br><br>What I worry more about is that getSum . foldl' (\ z x -> z <> Sum x) mempty is compiling to different code than foldl' (+) 0.<br>—<br>Sent from my phone with K-9 Mail.<br><br><div class="gmail_quote">On December 23, 2020 9:52:00 PM UTC, Viktor Dukhovni <<a href="mailto:ietf-dane@dukhovni.org" target="_blank" rel="noreferrer">ietf-dane@dukhovni.org</a>> wrote:<blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<pre dir="auto">On Wed, Dec 23, 2020 at 02:08:51PM +0100, Merijn Verstraaten wrote:<br><br><blockquote class="gmail_quote" style="margin:0pt 0pt 1ex 0.8ex;border-left:1px solid #729fcf;padding-left:1ex">Also, note that the benchmark from the original email:<br>1) uses ghci, rendering it meaningless as it's not remotely<br>representative of performance of compiled code<br></blockquote><br>While indeed List does not use the default definition, when that<br>definition is used for List, the results with optimised compiled code<br>are even more stark. When summing [0..1_000_000_000] using the<br>foldMap'-based definition of `sum`:<br><br> main :: IO ()<br> main = getArgs >>= \case<br> [n] -> go (read n)<br> _ -> go 1_000_000_000<br> where<br> go :: Int -> IO ()<br> go n = print $ getSum #. F.foldMap' Sum $ [0..n]<br><br>The list elements end up allocated on the heap and I get:<br><br> $ ./sum1 +RTS -s<br> 500000000500000000<br> 72,000,051,888 bytes allocated in the heap<br> 571,120 bytes copied during GC<br> 44,376 bytes maximum residency (2 sample(s))<br> 29,352 bytes maximum slop<br> 5 MiB total memory in use (0 MB lost due to fragmentation)<br><br> INIT time 0.000s ( 0.000s elapsed)<br> MUT time 6.996s ( 6.988s elapsed)<br> GC time 0.036s ( 0.043s elapsed)<br> EXIT time 0.000s ( 0.002s elapsed)<br> Total time 7.033s ( 7.034s elapsed)<br><br>while with the List instance of foldl':<br><br> main :: IO ()<br> main = getArgs >>= \case<br> [n] -> go (read n)<br> _ -> go 1_000_000_000<br> where<br> go :: Int -> IO ()<br> go n = print $ F.foldl' (+) 0 $ [0..n]<br><br>the computation avoids heap allocation:<br><br> $ ./sum2 +RTS -s<br> 500000000500000000<br> 51,816 bytes allocated in the heap<br> 3,320 bytes copied during GC<br> 44,376 bytes maximum residency (1 sample(s))<br> 25,256 bytes maximum slop<br> 5 MiB total memory in use (0 MB lost due to fragmentation)<br><br> INIT time 0.000s ( 0.000s elapsed)<br> MUT time 0.346s ( 0.346s elapsed)<br> GC time 0.001s ( 0.001s elapsed)<br> EXIT time 0.000s ( 0.004s elapsed)<br> Total time 0.347s ( 0.351s elapsed)<br><br><blockquote class="gmail_quote" style="margin:0pt 0pt 1ex 0.8ex;border-left:1px solid #729fcf;padding-left:1ex">2) uses the list instance of Foldable which does *not* use the<br>defaults shown in the original email, but uses explicit definitions.<br></blockquote><br>Yes, the "List" instance of `sum` does not use the default definition.<br>That instance shows comparable performance for `sum` and `foldl'` when<br>compiled optimised.<br><br><blockquote class="gmail_quote" style="margin:0pt 0pt 1ex 0.8ex;border-left:1px solid #729fcf;padding-left:1ex">The foldMap' code shown (but not used!) *is* strict, so would,<br>presumably perform comparably to foldl'. Unlike the foldl version of<br>sum that Foldable for lists currently uses.<br></blockquote><br>So my question is basically whether the default is *generally* the more<br>appropriate choice. It clearly is not the more efficient choice for<br>Lists (more precisely, lazily generated iterators).<br><br>The difference mostly goes away when the data structure in question is<br>already fully realised in memory (as with e.g. strict maps, ...)<br><br>But I am skeptical that the `foldMap'` defintion is a better default,<br>are there real cases where it is actually better?<br></pre></blockquote></div></div>_______________________________________________<br>
Libraries mailing list<br>
<a href="mailto:Libraries@haskell.org" target="_blank" rel="noreferrer">Libraries@haskell.org</a><br>
<a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries" rel="noreferrer noreferrer" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries</a><br>
</blockquote></div>