<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>Hi Chris,</p>

    <p>It has been considered in the past. There are some traces in the

      wiki:

      <a class="moz-txt-link-freetext" href="https://gitlab.haskell.org/ghc/ghc/-/wikis/replacing-gmp-notes">https://gitlab.haskell.org/ghc/ghc/-/wikis/replacing-gmp-notes</a></p>

    <p>>> The suggestion discussed by <a

href="http://www.haskell.org/pipermail/glasgow-haskell-users/2006-August/010670.html"

        rel="nofollow noreferrer noopener" target="_blank">John Meacham</a>,

      <a

href="http://www.haskell.org/pipermail/glasgow-haskell-users/2006-August/010664.html"

        rel="nofollow noreferrer noopener" target="_blank"> Lennart

        Augustsson</a>, <a

href="http://www.haskell.org/pipermail/glasgow-haskell-users/2006-August/010677.html"

        rel="nofollow noreferrer noopener" target="_blank"> Simon Marlow</a>

      and <a

href="http://www.haskell.org/pipermail/glasgow-haskell-users/2006-August/010687.html"

        rel="nofollow noreferrer noopener" target="_blank"> Bulat

        Ziganshin</a> was to change the representation of Integer so the

      Int# does the work of S# and J#: the Int# could be either a

      pointer to the Bignum library array of limbs or, if the number of

      significant digits could fit into say, 31 bits, to use the extra

      bit as an indicator of that fact and hold the entire value in the

      Int#, thereby saving the memory from S# and J#.</p>

    <p>It's not trivial because it requires a new runtime representation

      that is dynamically boxed or not.</p>

    <p>> An unboxed sum might be an improvement? e.g. (# Int# |

      ByteArray# #) -- would this "kind of" approximate the approach

      described? I don't have a good intuition of what the memory layout

      would be like.</p>

    <p>After the unariser pass, the unboxed sum becomes an unboxed

      tuple: (# Int# {-tag-}, Int#, ByteArray# #)<br>

      The two fields don't overlap because they don't have the same slot

      type.<br>

    </p>

    <p>In my early experiments before implementing ghc-bignum,

      performance got worse in some cases with this encoding iirc. It

      may be worth checking again if someone has time to do it :).

      Nowadays it should be easier as we can define pattern synonyms

      with INLINE pragmas to replace Integer's constructors.</p>

    <p>Another issue we have with Integer/Natural is that we have to

      mark most operations NOINLINE to support constant-folding. To be

      fair benchmarks should take this into account.<br>

    </p>

    <p>Cheers,<br>

      Sylvain<br>

    </p>

    <br>

    <div class="moz-cite-prefix">On 08/03/2021 18:13, Chris Done wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:942965df-67a7-4101-ab18-567550af34a1@www.fastmail.com">

      <meta http-equiv="content-type" content="text/html; charset=UTF-8">

      <title></title>

      <style type="text/css">p.MsoNormal,p.MsoNoSpacing{margin:0}</style>

      <div>Hi all,<br>

      </div>

      <div><br>

      </div>

      <div>In OCaml's implementation, they use a well known 63-bit

        representation of ints to distinguish whether a given machine

        word is either a pointer or to be interpreted as an integer.<br>

      </div>

      <div><br>

      </div>

      <div>I was wondering whether anyone had considered the performance

        benefits of doing this for the venerable Integer type in

        Haskell? I.e. if the Int fits in 63-bits, just shift it and do

        regular arithmetic. If the result ever exceeds 63-bits, allocate

        a GMP/integer-simple integer and return a pointer to it. This

        way, for most applications--in my experience--integers don't

        really ever exceed 64-bit, so you would (possibly) pay a smaller

        cost than the pointer chasing involved in bignum arithmetic.

        Assumption: it's cheaper to do more CPU instructions than to

        allocate or wait for mainline memory. <br>

      </div>

      <div><br>

      </div>

      <div>This would need assistance from the GC to be able to

        recognize said bit flag.<br>

      </div>

      <div><br>

      </div>

      <div>As I understand the current implementation of integer-gimp,

        they also try to use an Int64 where possible using a constructor

        (<a

href="https://hackage.haskell.org/package/integer-gmp-1.0.3.0/docs/src/GHC.Integer.Type.html#Integer"

          moz-do-not-send="true">https://hackage.haskell.org/package/integer-gmp-1.0.3.0/docs/src/GHC.Integer.Type.html#Integer</a>),

        but I believe that the compiled code will still pointer chase

        through the constructor. Simple addition or subtraction, for

        example, is 24 times slower in Integer than in Int for 1000000

        iterations:</div>

      <div><br>

      </div>

      <div><a href="https://github.com/haskell-perf/numbers#addition"

          moz-do-not-send="true">https://github.com/haskell-perf/numbers#addition</a></div>

      <div><br>

      </div>

      <div>An unboxed sum might be an improvement? e.g. (# Int# |

        ByteArray# #) -- would this "kind of" approximate the approach

        described? I don't have a good intuition of what the memory

        layout would be like.<br>

      </div>

      <div><br>

      </div>

      <div>Just pondering.</div>

      <div><br>

      </div>

      <div>Cheers,<br>

      </div>

      <div><br>

      </div>

      <div>Chris</div>

      <br>

      <fieldset class="mimeAttachmentHeader"></fieldset>

      <pre class="moz-quote-pre" wrap="">_______________________________________________

ghc-devs mailing list

<a class="moz-txt-link-abbreviated" href="mailto:ghc-devs@haskell.org">ghc-devs@haskell.org</a>

<a class="moz-txt-link-freetext" href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a>

</pre>

    </blockquote>

  </body>

</html>