[Haskell-cafe] Basic list exercise

Mon Mar 27 00:00:34 UTC 2023

On Sun, Mar 26, 2023 at 04:24:09PM -0700, Todd Wilson wrote:

> >   Core (with some added comments):
> >     -- Unpack the "Ord" dictionary to extract just the required "<="
> >     -- function and call the "$wruns" worker ("ww3" is "<=", and "ds"
> >     -- is the list to be transformed:
> >     --
> >     runs :: forall a. Ord a => [a] -> [[a]]
> >     runs
> >       = \ (@a) ($dOrd :: Ord a) (ds :: [a]) ->
> >           case $dOrd of { C:Ord ww ww1 ww2 ww3 ww4 ww5 ww6 ww7 ->
> >           $wruns ww3 ds
> >           }
> >
> >     Rec {
> >     $wruns :: forall {a}. (a -> a -> Bool) -> [a] -> [[a]]
> >     $wruns
> >       = \ (@a) (ww :: a -> a -> Bool) (ds :: [a]) ->
> >           case ds of {
> >             [] -> [];           -- runs [] = []
> >             : x xs ->           -- runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs
> >               let {
> >                 ds1 :: ([a], [a])  -- A lazily evaluated thunk for (run x xs)
> >                 ds1
> >                   = letrec {
> >                       -- Internal recursion in "run" returns strict unboxed pairs
> >                       -- (on the stack) avoiding heap or thunk allocation for the tuple.
> >                       $wrun :: a -> [a] -> (# [a], [a] #)
> >                       $wrun
> >                         = \ (x1 :: a) (ds2 :: [a]) ->
> >                             case ds2 of wild1 {         -- (y:ys) is "wild1"
> >                               [] -> (# [], [] #);       -- run x [] = ([], [])
> >                               : y ys ->
> >                                 case ww x1 y of {       -- x <= y ?
> >                                   False -> (# [], wild1 #);  -- else ([], y:ys)
> >                                   True ->                    -- then let (us, vs) = run y ys in (y:us, vs)
> >                                     let {
> >                                       ds3 :: ([a], [a])      -- A "thunk" for (run y ys) evaluated lazily
> >
> 
> Why doesn't ds3 have an explicitly unboxed pair type, and does that have
> any performance implications? For example, ...

Precisely because "ds3" must be evaluated lazily, it can't be an unboxed
pair (which are always strictly evaluated).

> >                                       ds3 = case $wrun y ys of { (# ww1, ww2 #) -> (ww1, ww2) } } in
> >                                     (# : y (case ds3 of { (us, vs) -> us }),
> >                                        case ds3 of { (us, vs) -> vs } #)

> Granted I'm not that familiar with Core, but It sure looks like this code
> breaks apart pairs (with the equivalent of fst and snd) and rebuilds them

The inner (recursive) invocation of "run" must also be lazily evaluated,
so yes, its output needs to be boxed as a pair.

> > When we do want the successor of the first element, we look no futher
> > than necessary:
> >
> >     λ> head $ runs $ 42 : 0 : undefined
> >     [42]
> >
> >     λ> take 2 $ head $ runs $ 42 : 43 : undefined
> >     [42,43]
> >
> > Does this help?
> 
> Yes, it does, thanks, although I was aware of this aspect of the laziness
> of my code from the beginning and was concerned more with how the output
> lists were built.

As with all lists, they are built via the (:) constructor from a head
element and a tail.  GHC reuses any full list or tail it can reuse, and
constructs new cons cells that are not already in hand.

Because these are pure functions working with immutable data, to
construct an initial segment of a list we must build a new list, we
can't truncate the original original in place, it is immutable.

Therefore, the original list will be picked apart and reassembled.

With "linear Haskell" there are in some cases opportunities to mutate
certain objects in place, because they are sure to not have any other
references.  But that isn't the case here.

So even `runs [0..10]` has to pick apart and reassemble the list.  I
hope I understood correctly what you're getting at with the concern
about building and rebuilding.

It looks to me like the Core code does exactly as much building and
re-building as required by laziness, and the result can be consumed in a
single pass in constant space (multi-pass use naturally memoises the
result).

-- 
    Viktor.