<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="markdown-here-wrapper" data-md-url="Thunderbird"
      style="font-family: Cantarell,Carlito,Calibri,DejaVu
      Sans,Trebuchet MS,Verdana,sans-serif;">
      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">On 2018-02-27 08:19 AM, Shao Cheng wrote:</p>
      <blockquote style="margin: 1.2em 0px;border-left: 4px solid
        rgb(221, 221, 221); padding: 0px 1em; color: rgb(119, 119, 119);
        quotes: none;">
        <p style="margin: 0px 0px 0.75em ! important;">Coming back to
          your use case, you may try avoid using raw lists and switch to
          unboxed vectors, turn on -O2 and rely on stream fusion of the
          vector package. That will result in a considerable speedup.</p>
      </blockquote>
      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">I looked at the Core that’s generated, and
        there’s no need for vectors. Fusion already happens: there’s no
        use of lists at all, and unboxed types are used. The code boils
        down to a single recursive function:</p>
      <pre style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;font-size: 1em; line-height: 1.2em;margin: 1.2em 0px;"><code class="hljs language-haskell" style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;white-space: pre; overflow: auto; border-radius: 3px; border: 1px solid rgb(0, 0, 0); background-color: rgb(68, 68, 68); color: rgb(0, 204, 0); padding: 0.5em 0.7em; display: block ! important;display: block; overflow-x: auto; padding: 0.5em; background: rgb(35, 36, 31) none repeat scroll 0% 0%; -moz-text-size-adjust: none;color: rgb(248, 248, 242);"><span class="hljs-title" style="color: rgb(166, 226, 46);">let</span> go i sum = <span class="hljs-keyword" style="color: rgb(249, 38, 114);">case</span> i <span class="hljs-keyword" style="color: rgb(249, 38, 114);">of</span>
        <span class="hljs-number" style="color: rgb(174, 129, 255);">100000000</span> -> sum + <span class="hljs-number" style="color: rgb(174, 129, 255);">200000000</span>
        _ -> go (i + <span class="hljs-number" style="color: rgb(174, 129, 255);">1</span>) (sum + i * <span class="hljs-number" style="color: rgb(174, 129, 255);">2</span>)
<span class="hljs-title" style="color: rgb(166, 226, 46);">in</span> go <span class="hljs-number" style="color: rgb(174, 129, 255);">1</span> <span class="hljs-number" style="color: rgb(174, 129, 255);">0</span>
</code></pre>
      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">except that the types are unboxed. The
        following complete program compiles down to almost identical
        core when compiled without optimization:</p>
      <pre style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;font-size: 1em; line-height: 1.2em;margin: 1.2em 0px;"><code class="hljs language-haskell" style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;white-space: pre; overflow: auto; border-radius: 3px; border: 1px solid rgb(0, 0, 0); background-color: rgb(68, 68, 68); color: rgb(0, 204, 0); padding: 0.5em 0.7em; display: block ! important;display: block; overflow-x: auto; padding: 0.5em; background: rgb(35, 36, 31) none repeat scroll 0% 0%; -moz-text-size-adjust: none;color: rgb(248, 248, 242);"><span class="hljs-pragma" style="color: rgb(248, 248, 242);">{-# LANGUAGE MagicHash #-}</span>

<span class="hljs-import"><span class="hljs-keyword" style="color: rgb(249, 38, 114);">import</span> GHC.Exts</span>

<span class="hljs-title" style="color: rgb(166, 226, 46);">main</span> = print $ <span class="hljs-type" style="color: rgb(230, 219, 116);">I</span># value
  <span class="hljs-keyword" style="color: rgb(249, 38, 114);">where</span>
    value =
        <span class="hljs-keyword" style="color: rgb(249, 38, 114);">let</span> go :: <span class="hljs-type" style="color: rgb(230, 219, 116);">Int</span># -> <span class="hljs-type" style="color: rgb(230, 219, 116);">Int</span># -> <span class="hljs-type" style="color: rgb(230, 219, 116);">Int</span>#
            go i sum = <span class="hljs-keyword" style="color: rgb(249, 38, 114);">case</span> i <span class="hljs-keyword" style="color: rgb(249, 38, 114);">of</span>
                <span class="hljs-number" style="color: rgb(174, 129, 255);">100000000</span># -> sum +# <span class="hljs-number" style="color: rgb(174, 129, 255);">200000000</span>#
                _ -> go (i +# <span class="hljs-number" style="color: rgb(174, 129, 255);">1</span>#) (sum +# i *# <span class="hljs-number" style="color: rgb(174, 129, 255);">2</span>#)
        <span class="hljs-keyword" style="color: rgb(249, 38, 114);">in</span> go <span class="hljs-number" style="color: rgb(174, 129, 255);">1</span># <span class="hljs-number" style="color: rgb(174, 129, 255);">0</span>#
</code></pre>
      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">I think that’s impressive even if it isn’t
        reduced to a single number at compile time. Execution time on my
        lowly i5 is only 50 ms.</p>
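      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">For comparison, here is a boxed sketch of the
        same loop. The names <code style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;">sumDoubled</code>
        and <code style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;">n</code>
        are made up for illustration and are not in the generated Core.
        Making the bound a parameter lets the result be checked against
        the closed form n * (n + 1). The sketch assumes n is at least 1
        and a 64-bit Int:</p>
      <pre style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;font-size: 1em; line-height: 1.2em;margin: 1.2em 0px;"><code class="hljs language-haskell" style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;white-space: pre; overflow: auto; border-radius: 3px; border: 1px solid rgb(0, 0, 0); background-color: rgb(68, 68, 68); color: rgb(0, 204, 0); padding: 0.5em 0.7em; display: block ! important;display: block; overflow-x: auto; padding: 0.5em; background: rgb(35, 36, 31) none repeat scroll 0% 0%; -moz-text-size-adjust: none;color: rgb(248, 248, 242);">{-# LANGUAGE BangPatterns #-}

-- Same recursion as the loop in the message, but on boxed Int and with
-- the upper bound n as a parameter. Assumes n >= 1: like the original
-- loop, it only stops when the counter hits the bound exactly.
sumDoubled :: Int -> Int
sumDoubled n = go 1 0
  where
    go :: Int -> Int -> Int
    go !i !acc
      | i == n    = acc + 2 * n
      | otherwise = go (i + 1) (acc + i * 2)

main :: IO ()
main = do
  -- The loop should agree with the closed form n * (n + 1).
  print (all (\n -> sumDoubled n == n * (n + 1)) [1 .. 1000])
  -- Smaller bound than the message's 100000000, so it is also quick in GHCi.
  print (sumDoubled 1000000)
</code></pre>
      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">With the bound from the message, the closed
        form gives 100000000 * 100000001 = 10000000100000000, which still
        fits comfortably in a 64-bit Int.</p>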
      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">BTW, GHC 8 no longer has the option for
        exporting external Core (<code style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;">-fext-core</code>),
        but there’s a wonderful plugin package called <a
          href="https://github.com/yav/dump-core"><code style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;">dump-core</code></a>
        that produces HTML output with colouring and interactivity. You
        just install it from Hackage and use the extra options it
        provides.</p>
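      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">Plain-text Core is also available without
        any plugin, through GHC’s built-in dump flags, which can be set
        per module. A minimal sketch follows. The source program is not
        shown in this message, so the list pipeline and the name <code style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;">doubledSum</code>
        are guesses, and the bound is smaller than the 100000000 used
        above:</p>
      <pre style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;font-size: 1em; line-height: 1.2em;margin: 1.2em 0px;"><code class="hljs language-haskell" style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;white-space: pre; overflow: auto; border-radius: 3px; border: 1px solid rgb(0, 0, 0); background-color: rgb(68, 68, 68); color: rgb(0, 204, 0); padding: 0.5em 0.7em; display: block ! important;display: block; overflow-x: auto; padding: 0.5em; background: rgb(35, 36, 31) none repeat scroll 0% 0%; -moz-text-size-adjust: none;color: rgb(248, 248, 242);">{-# OPTIONS_GHC -O2 -ddump-simpl -dsuppress-all -ddump-to-file #-}
-- -ddump-simpl dumps the Core after the simplifier has run,
-- -dsuppress-all hides most of the annotations, and -ddump-to-file
-- sends the dump to a .dump-simpl file instead of the terminal.

-- Assumed source program (a guess): double each number, then sum.
-- Swap in 100000000 to match the bound in the loop above.
doubledSum :: Int
doubledSum = sum (map (* 2) [1 .. 1000000])

main :: IO ()
main = print doubledSum
</code></pre>
      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">If fusion fires as described above, the dump
        should contain a worker loop on unboxed integers rather than any
        list construction, though the exact output varies between GHC
        versions.</p>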
      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">It seems to me that GCC’s compile-time
        evaluation of this loop is a special case that matches the kind
        of thing that often crops up in C. I assume it’s not capable of
        doing that for every expression that could be evaluated at
        compile time, so a more complicated and realistic example would
        probably defeat it. After all, GHC could in theory evaluate any
        pure value (CAF) at compile time if it chose to, but that’s
        usually not what you want.</p>
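      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">When compile-time evaluation is what you
        want, one way to ask for it explicitly is a Template Haskell
        splice. This is a sketch of that separate technique, not
        something GHC does on its own; the name <code style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;">precomputed</code>
        and the small bound are chosen purely for illustration:</p>
      <pre style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;font-size: 1em; line-height: 1.2em;margin: 1.2em 0px;"><code class="hljs language-haskell" style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;white-space: pre; overflow: auto; border-radius: 3px; border: 1px solid rgb(0, 0, 0); background-color: rgb(68, 68, 68); color: rgb(0, 204, 0); padding: 0.5em 0.7em; display: block ! important;display: block; overflow-x: auto; padding: 0.5em; background: rgb(35, 36, 31) none repeat scroll 0% 0%; -moz-text-size-adjust: none;color: rgb(248, 248, 242);">{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH.Syntax (lift)

-- The splice evaluates the sum while the module is being compiled and
-- embeds the result as an integer literal. The bound is kept small
-- (an arbitrary choice) so that compilation stays quick.
precomputed :: Int
precomputed = $(lift (sum (map (* 2) [1 .. 1000 :: Int])))

main :: IO ()
main = print precomputed
</code></pre>
      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">The cost moves into every build of the
        module, which is one reason doing this automatically for every
        CAF would rarely be a good trade.</p>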
      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">Also, it’s worth noting that due to
        Haskell’s lazy evaluation, a pure value (CAF) will never be
        evaluated more than once at runtime, which isn’t something you
        get with C.</p>
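      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">The evaluate-once behaviour can be observed
        with <code style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;">Debug.Trace</code>.
        A small sketch, with an illustrative CAF named <code style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;">total</code>
        and a small bound (both are assumptions, not part of the program
        discussed above):</p>
      <pre style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;font-size: 1em; line-height: 1.2em;margin: 1.2em 0px;"><code class="hljs language-haskell" style="font-family: Consolas,Inconsolata,Andale Mono,DejaVu Sans Mono,Courier,monospace; font-size: 10pt;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;white-space: pre; overflow: auto; border-radius: 3px; border: 1px solid rgb(0, 0, 0); background-color: rgb(68, 68, 68); color: rgb(0, 204, 0); padding: 0.5em 0.7em; display: block ! important;display: block; overflow-x: auto; padding: 0.5em; background: rgb(35, 36, 31) none repeat scroll 0% 0%; -moz-text-size-adjust: none;color: rgb(248, 248, 242);">import Debug.Trace (trace)

-- A CAF: a top-level value that takes no arguments. The trace message
-- goes to stderr each time the right-hand side is actually evaluated.
total :: Int
total = trace "evaluating total" (sum (map (* 2) [1 .. 1000]))

main :: IO ()
main = do
  print total  -- first use forces the thunk, so the trace fires here
  print total  -- second use reads the already computed value
</code></pre>
      <p style="font-family: inherit; font-size: inherit;margin: 0px 0px
        0.75em ! important;">The trace message appears once on stderr,
        while the value is printed twice on stdout.</p>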
    </div>
  </body>
</html>