[GHC] #14465: Performance of Natural

GHC ghc-devs at haskell.org
Tue Nov 14 22:02:30 UTC 2017


#14465: Performance of Natural
-------------------------------------+-------------------------------------
           Reporter:  Bodigrim       |             Owner:  (none)
               Type:  feature        |            Status:  new
  request                            |
           Priority:  normal         |         Milestone:
          Component:  Compiler       |           Version:  8.2.1
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  None/Unknown
  Unknown/Multiple                   |
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 Recently I tried to use `Natural` instead of `Integer` in one of my
 projects. I expected no difference or even a minor performance boost
 (since `Natural` does not have to worry about a sign). But in fact it
 caused a slowdown.

 A constant of type `Integer` will be evaluated to a low-level
 representation (`S#` / `Jp#` / `Jn#`) during `CorePrep` stage. Nothing of
 this kind happens to constant values of type `Natural`:

 {{{#!hs
 import Numeric.Natural
 one :: Natural
 one = fromInteger 1
 }}}

 is translated to

 {{{#!hs
 one1 :: Integer
 one1 = 1 -- will be converted by CorePrep to S# 1#

 one :: Natural
 one
   = case one1 of {
       S# i#_a2c6 ->
         case tagToEnum# (>=# i#_a2c6 0#) of {
           False -> underflowError;
           True -> NatS# (int2Word# i#_a2c6)
         };
       Jp# dt_a2cg ->
         case uncheckedIShiftRL# (sizeofByteArray# dt_a2cg) 3# of {
           __DEFAULT ->
             case sizeofByteArray# dt_a2cg of {
               __DEFAULT -> NatJ# dt_a2cg;
               0# -> underflowError
             };
           1# ->
             case indexWordArray# dt_a2cg 0# of wild2_a2ck { __DEFAULT ->
             NatS# wild2_a2ck
             }
         };
       Jn# ipv_a2cn -> underflowError
     }
 }}}

 This is not bad itself, if `one` is a top-level definition. At the end of
 the day a thunk will be replaced by its value, computed exactly once. But
 suppose we have written

 {{{#!hs
 import Numeric.Natural
 plusOne :: Natural -> Natural
 plusOne n = n + 1
 }}}

 The corresponding Core looks this way:

 {{{#!hs
 plusOne :: Natural -> Natural
 plusOne
   = \ (n_auS :: Natural) ->
       case 1 of {
         S# i#_a2dA ->
           case tagToEnum# (>=# i#_a2dA 0#) of {
             False -> case underflowError of wild2_00 { };
             True -> plusNatural n_auS (NatS# (int2Word# i#_a2dA))
           };
         Jp# dt_a2dI ->
           case uncheckedIShiftRL# (sizeofByteArray# dt_a2dI) 3# of {
             __DEFAULT ->
               case sizeofByteArray# dt_a2dI of {
                 __DEFAULT -> plusNatural n_auS (NatJ# dt_a2dI);
                 0# -> case underflowError of wild4_00 { }
               };
             1# ->
               case indexWordArray# dt_a2dI 0# of wild2_a2dM { __DEFAULT ->
               plusNatural n_auS (NatS# wild2_a2dM)
               }
           };
         Jn# ipv_a2dP -> case underflowError of wild1_00 { }
       }
 }}}

 It looks expensive to pattern match `1` repeatedly, at every call to
 `plusOne`.

 Another deficiency of `Natural` is that no constant folding is done. Even
 `2 * 2` results in 50 lines of Core:

 {{{#!hs
 twoTimesTwo2 :: Integer
 twoTimesTwo2 = 2

 twoTimesTwo1 :: Natural
 twoTimesTwo1
   = case twoTimesTwo2 of {
       S# i#_a2u3 ->
         case tagToEnum# (>=# i#_a2u3 0#) of {
           False -> underflowError;
           True -> NatS# (int2Word# i#_a2u3)
         };
       Jp# dt_a2ub ->
         case uncheckedIShiftRL# (sizeofByteArray# dt_a2ub) 3# of {
           __DEFAULT ->
             case sizeofByteArray# dt_a2ub of {
               __DEFAULT -> NatJ# dt_a2ub;
               0# -> underflowError
             };
           1# ->
             case indexWordArray# dt_a2ub 0# of wild2_a2uf { __DEFAULT ->
             NatS# wild2_a2uf
             }
         };
       Jn# ipv_a2ui -> underflowError
     }

 twoTimesTwo :: Natural
 twoTimesTwo
   = case twoTimesTwo2 of {
       S# i#_a2u3 ->
         case tagToEnum# (>=# i#_a2u3 0#) of {
           False -> case underflowError of wild2_00 { };
           True -> $fNumNatural_$c* twoTimesTwo1 (NatS# (int2Word#
 i#_a2u3))
         };
       Jp# dt_a2ub ->
         case uncheckedIShiftRL# (sizeofByteArray# dt_a2ub) 3# of {
           __DEFAULT ->
             case sizeofByteArray# dt_a2ub of {
               __DEFAULT -> $fNumNatural_$c* twoTimesTwo1 (NatJ# dt_a2ub);
               0# -> case underflowError of wild4_00 { }
             };
           1# ->
             case indexWordArray# dt_a2ub 0# of wild2_a2uf { __DEFAULT ->
             $fNumNatural_$c* twoTimesTwo1 (NatS# wild2_a2uf)
             }
         };
       Jn# ipv_a2ui -> case underflowError of wild1_00 { }
     }
 }}}

 This is not surprising, since constant folding of `Integer` works only due
 to a special Core literal `LitInteger` and a set of hardcoded
 `PrelRules.builtinIntegerRules` (and `NOINLINE` pragmas in
 `GHC.Integer.Type`).

 ----

 Is it reasonable to make `Natural` a first-class Core citizen? I suppose a
 new `LitNatural` node and decent amount of copy-paste will do. Or,
 possibly, we may reuse `LitInteger Integer Type` with appropriate `Type`
 to avoid some duplication of code.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14465>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list