Using associated data types to create unpacked data structures

Johan Tibell johan.tibell at gmail.com
Thu Aug 12 06:13:20 EDT 2010


On Thu, Aug 12, 2010 at 11:28 AM, Simon Marlow <marlowsd at gmail.com> wrote:

> On 11/08/2010 17:03, Johan Tibell wrote:
>
>> Inspired by the generic maps example at
>>
>> http://www.haskell.org/haskellwiki/GHC/Indexed_types
>>
>> I tried to use associated data types to create a generic finite map that
>> unpacks both the key and value into the leaf data constructor.
>>
>
> What you're trying to do is have the compiler generate a whole module for
> you, including a datatype specialised to certain type paramters, and
> operations over that type.  Just defining a few of the operations isn't
> enough: they need to be inlined everywhere, essentially you need to
> recompile Data.Map for each instance.
>

There needs to be some amount of code generation, but much of the
implementation can still be shared. I previously tried to defined the type
class as

    {-# LANGUAGE MultiParamTypeClasses, TypeFamilies #-}
    module Ex2 where

    import Prelude hiding (lookup)

    data MapView k v = TipView
                     | BinView {-# UNPACK #-} !Size !k !v !(Map k v) !(Map k
v)

    class Unbox k v where
        data Map k v :: *
        tip :: Map k v
        bin :: Size -> k -> v -> Map k v -> Map k v -> Map k v
        view :: Map k v -> MapView k v

    type Size = Int

    lookup :: (Ord k, Unbox k v) => k -> Map k v -> Maybe v
    lookup k m = case view m of
        TipView -> Nothing
        BinView _ kx x l r -> case compare k kx of
            LT -> lookup k l
            GT -> lookup k r
            EQ -> Just x
    {-# INLINE lookup #-}

Calling lookup from a different module at a know type gives exactly the Core
you'd like to see (unpacked types, no MapView constructors).

The compiler would still have to generate the associated data type instance
and the method implementations.

So I agree it would be nice if this happened automatically, behind the
> scenes, by virtue of just mentioning "Map Int Double" (though it would still
> have to be a typeclass of course, so that you can write polymorphic
> functions over Maps).  Automatic specialisation of this kind can be done by
> JIT runtimes (e.g. the .NET CLR), because there the code generation and
> caching of instances can be put under control of the runtime.  Here we would
> have to do it in the compiler, and the difficulty is that the compiler needs
> to support separate compilation.
>

C++ supports automatic instantiation and separate compilation. We'd have to
included the needed information in the .hi files so we can generate the
right instances at the usage site. Perhaps there are other problems (that
are somehow solved by the C++ compiler) that I'm not considering.


> Rather than try to solve this problem in one go, I would go for a low-tech
> approach for now: write a TH library to generate the code, and ask the user
> to declare the versions they need.  To make a particular version, the user
> would say something like
>
>  module MapIntDouble (module MapIntDouble) where
>  import TibbeMagicMapGenerator
>  make_me_a_map ...
>
> there's no type class of course, so you can't write functions that work
> over all specialised Maps.  But this at least lets you generate optimised
> maps for only a little boilerplate, and get the performance boost you were
> after.
>

This doesn't quite work though as two MapIntDouble defined in two different
libraries are incompatible. This is essentially the same problem as with
instance collisions.

Cheers,
Johan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/glasgow-haskell-users/attachments/20100812/681de0eb/attachment-0001.html


More information about the Glasgow-haskell-users mailing list