[Haskell-cafe] Basic list exercise

Sun Mar 26 19:07:03 UTC 2023

On Sun, Mar 26, 2023 at 10:59:13AM -0700, Todd Wilson wrote:

> Is there an overhead penalty to be paid for using functions here, or
> is it compiled away into Prolog-like tail calls that fill in the
> missing parts of the data structure?

Efficient implementations of functional langauges make extensive use of
tail call optimisation.

I don't know whether looking at the GHC-generated "Core" for your
implementation will answer any of your questions, but here it is:

  Haskell:
    runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs
      where
        run x [] = ([], [])
        run x (y:ys) = if x <= y
                       then let (us, vs) = run y ys
                            in (y:us, vs)
                       else ([], y:ys)

  Core (with some added comments):
    -- Unpack the "Ord" dictionary to extract just the required "<="
    -- function and call the "$wruns" worker ("ww3" is "<=", and "ds"
    -- is the list to be transformed:
    --
    runs :: forall a. Ord a => [a] -> [[a]]
    runs
      = \ (@a) ($dOrd :: Ord a) (ds :: [a]) ->
          case $dOrd of { C:Ord ww ww1 ww2 ww3 ww4 ww5 ww6 ww7 ->
          $wruns ww3 ds
          }

    Rec {
    $wruns :: forall {a}. (a -> a -> Bool) -> [a] -> [[a]]
    $wruns
      = \ (@a) (ww :: a -> a -> Bool) (ds :: [a]) ->
          case ds of {
            [] -> [];           -- runs [] = []
            : x xs ->           -- runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs
              let {
                ds1 :: ([a], [a])  -- A lazily evaluated thunk for (run x xs)
                ds1
                  = letrec {
                      -- Internal recursion in "run" returns strict unboxed pairs
                      -- (on the stack) avoiding heap or thunk allocation for the tuple.
                      $wrun :: a -> [a] -> (# [a], [a] #)
                      $wrun
                        = \ (x1 :: a) (ds2 :: [a]) ->
                            case ds2 of wild1 {         -- (y:ys) is "wild1"
                              [] -> (# [], [] #);       -- run x [] = ([], [])
                              : y ys ->
                                case ww x1 y of {       -- x <= y ?
                                  False -> (# [], wild1 #);  -- else ([], y:ys)
                                  True ->                    -- then let (us, vs) = run y ys in (y:us, vs)
                                    let {
                                      ds3 :: ([a], [a])      -- A "thunk" for (run y ys) evaluated lazily
                                      ds3 = case $wrun y ys of { (# ww1, ww2 #) -> (ww1, ww2) } } in
                                    (# : y (case ds3 of { (us, vs) -> us }),
                                       case ds3 of { (us, vs) -> vs } #)
                                }
                            }; } in
                    case $wrun x xs of { (# ww1, ww2 #) -> (ww1, ww2) } } in
              : (: x (case ds1 of { (ys, zs) -> ys }))
                (case ds1 of { (ys, zs) -> $wruns ww zs })
          }
    end Rec }

So for a non-empty cons-cell (: x ...) the result is a new cons cell:

    : (x : (fst ds1)) (runs (snd ds1))
           ---------   --------------

in which both underline parts are computed lazily (on demand) from the
thunk "ds1":

    λ> head $ head $ runs $ 42 : undefined
    42

When we do want the successor of the first element, we look no futher
than necessary:

    λ> head $ runs $ 42 : 0 : undefined
    [42]

    λ> take 2 $ head $ runs $ 42 : 43 : undefined
    [42,43]

Does this help?  FWIW, the (simplified) Core output was generated via:

    hscore() {
        ghc -ddump-simpl -dsuppress-idinfo -dsuppress-coercions -dsuppress-type-applications -dsuppress-uniques -dsuppress-module-prefixes "$@"
    }
    hscore -O2 Runs.hs > Runs.core

-- 
    Viktor.