Selector thunks again

Simon Peyton Jones simonpj at microsoft.com
Mon Mar 9 21:33:41 UTC 2020


David

Yes, quite right.  Do make a ticket.  But it’s not easy to see a truly robust way to fix this.

Selector thunks look like
              a = case x of (p,q) -> p

You example has thunks of the form
              a = case x of (_,b)  -> case b of (c,_) -> c

We want this thunk to vanish entirely if x is bound to a pair whose second component is a pair.  But, if x is bound to a pair, whose second component is not yet evaluated, we want the thunk to partially vanish, becoming
              a = case b of (c,_) -> c
where b is the second component of x.

This seems hard in general.

You could imagine translating the example into
              b = case x of (_,b) -> b
              a = case b of (c,_) -> c

Here I have built two thunks rather than one, but each is (independently) a selector thunk.  We get just the right thing happening if x is evaluated but its second component is note. Hooray.  What’s not nice is that execution is slower in the case where the selector-thunk mechanism doesn’t fire.

Another possibility.  Make selector thunks carry a kind of “path” indicating which fields to pick out as they select a nested component of the data structure.  A kind of selector-thunk chain.

Another possibility: for suitable selector-like thunks, compile special code that is run by the garbage collector.

This deserves a ticket and a wiki page. Feel free to plunder the above.

Simon

From: ghc-devs <ghc-devs-bounces at haskell.org> On Behalf Of David Feuer
Sent: 09 March 2020 17:28
To: ghc-devs <ghc-devs at haskell.org>
Subject: Selector thunks again

The fragility of this feature remains frustrating. A few days ago, I wrote this code for building a complete binary tree from its breadth-first traversal. (This is an improvement of a version by Will Ness.)

data Tree a
  = Empty
  | Node a (Tree a) (Tree a)
  deriving Show

-- An infinite list
data IL a = a :< IL a
infixr 5 :<

bft :: [a] -> Tree a
bft xs = tree
  where
    tree :< subtrees = go xs subtrees

    go :: [a] -> IL (Tree a) -> IL (Tree a)
    go (a : as) ~(b1 :< ~(b2 :< bs)) = Node a b1 b2 :< go as bs
    go [] _ = fix (Empty :<)

When GHC compiles the lazy patterns, we get something essentially like this:

    go (a : as) ys =
      Node a
        (case ys of b1 :< _ -> b1)
        (case ys of _ :< b2 :< _ -> b2)
      :<
      go as (case ys of _ :< _ :< bs -> bs)

Now `case ys of b1 :< _ -> b1` is a selector thunk, which is cool. The GC can reduce it as soon as either of the other thunks is forced. But neither of the other two case expressions is a selector thunk, so neither will ever be reduced by the GC. If I consume the result tree using an inorder traversal, for example, then all the elements in the left subtree of the root will remain live until I start to consume the right subtree of the root.

I can instead write this:

   go (a : as) ys = Node a b1 b2 :< go as bs
      where
         {-# NOINLINE b2bs #-}
         b1 :< b2bs = ys
         b2 :< bs = b2bs

Now all the suspended selections are selector thunks, so things should clean up nicely. There are still three problems, though. The first is that this is harder to read. The second is that now we have four suspended selections instead of three. Finally, if b1 is not the first one forced, we'll need to force two thunks instead of one.

Can't we do any better?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20200309/5bfed9de/attachment.html>


More information about the ghc-devs mailing list