Containers and strictness continued

Claus Reinke claus.reinke at talk21.com
Fri Jul 9 14:00:58 EDT 2010


> for my work on the containers I need to settle some questions about 
> strictness.

Thanks again for driving this.

+1 for consistent strictness in keys
+1 for adding strict folds

Since I've been wondering about this problem for a while,
I would like to add two closely related suggestions:

(a) the useability of several higher-order functions in the API
    can be improved by splitting them into wrapper and worker
    (the wrapper gets inlined away, leaving an inlined worker
    for which the functional parameter is fixed, so that the
    strictness analyser can do a better job; see GHC's
    implementations of list folds)

(b) the only reason not to add strict-in-value versions of
    (Int)Map has been code duplication; it turns out that
    it might be possible to use the standard SPECIALIZE
    pragma to do most of the work for us, using nothing
    more than FlexibleInstances and MultiParamTypeClasses
    (no functional dependencies, no type families).

    The idea would be to provide both IntMap and IntMap',
    the latter being strict in the keys. There would be one
    type class abstractly representing IntMap constructors
    and (pattern match) views, and each of the API functions
    would be implemented once, in terms of this class. Then
    each of these abstract implementations would be
    specialized to both IntMap and IntMap'.

    Appended below is a concrete example of this abstract
    idea, a module with both strict (L1) and non-strict (L2)
    lists, an abstract interface (L) and a single definition
    of map (mapL) that is specialized for both use cases.
    For comparison, handcoded versions of mapL1 and
    mapL2 are also included.

    This module is small enough that one can compare
    the core output (-ddump-simpl), and it looks as if
    the -O2 compiled core for the specialized versions
    of mapL matches that for the handcoded versions.
    No manual duplication, just pragmas, same code
    and API, just different types for the two use cases.

(a) could be included directly, (b) is just offered for
discussion at the moment. If there are no hidden
traps in (b), I would like to see it implemented as
well, to reduce the frequent space issues with Maps
and large datasets.

Note that the combination of (a) and (b) could
replace some of the proposed special-purpose
strict functions, keeping the API simpler.

Claus

-----------------

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}

module Main(mapL1,mapL2,mapL,main) where

data L1 a = N1 | C1 !a (L1 a) deriving Show

-- mapL specialised to L1 should be as good as mapL1
mapL1 f N1       = N1
mapL1 f (C1 a l) = C1 (f a) (mapL1 f l)

data L2 a = N2 | C2 a (L2 a) deriving Show

-- mapL specialised to L2 should be as good as mapL2
mapL2 f N2       = N2
mapL2 f (C2 a l) = C2 (f a) (mapL2 f l)

class L l a where
  -- constructors
  n :: l a
  c :: a -> l a -> l a
  -- view (abstract pattern match)
  v :: l a -> b -> (a -> l a -> b) -> b

mapL :: (L l a,L l b) => (a -> b) -> l a -> l b
{-# SPECIALIZE mapL :: (a -> b) -> L1 a -> L1 b #-}
{-# SPECIALIZE mapL :: (a -> b) -> L2 a -> L2 b #-}
mapL f l = (v l) n (\x xs->c (f x) (mapL f xs))

instance L L1 a where
  n = N1
  c = C1
  v N1       n c = n
  v (C1 a l) n c = c a l

instance L L2 a where
  n = N2
  c = C2
  v N2       n c = n
  v (C2 a l) n c = c a l

main = do
  print $ mapL (+1) $ C1 0 $ C1 1 $ C1 2 $ N1
  print $ mapL (+1) $ C2 0 $ C2 1 $ C2 2 $ N2

  print $ mapL1 (+1) $ C1 0 $ C1 1 $ C1 2 $ N1
  print $ mapL2 (+1) $ C2 0 $ C2 1 $ C2 2 $ N2
 




More information about the Libraries mailing list