Implementation idea for unboxed polymorphic types

Ömer Sinan Ağacan omeragacan at gmail.com
Sun Nov 15 15:08:32 UTC 2015


I had started working on exactly the same thing at some point. I had a
TemplateHaskell-based implementation which _almost_ worked.

The problem was that the syntax was very, very heavy. Because I had to use
quotes for _every_ constructor application(with explicitly passed types).
(because I had a specialized constructor for every instantiation of this
generic type)

Another problem was that because of how TemplateHaskell quotes evaluated, I
couldn't use a `List Int` where `List` is a template without first manually
adding a line for generating specialized version of `List` on `Int`.

When all of these combined it became very hard to use. But it was a
proof-of-concept and I think it worked.

(Code is horrible so I won't share it here :) I had to maintain a state shared
with different TH quote evaluations etc.)

2015-11-15 5:26 GMT-05:00 Francesco Mazzoli <f at mazzo.li>:
> (A nicely rendered version of this email can be found at <https://gist.github.com/bitonic/52cfe54a2dcdbee1b7f3>)
>
> ## Macro types
>
> I very often find myself wanting unboxed polymorphic types
> (e.g. types that contain `UNPACK`ed type variables). I find
> it extremely frustrating that it's easier to write fast _and_
> generic code in C++ than in Haskell.
>
> I'd like to submit to the mailing list a very rough proposal
> on how this could be achieved in a pretty straightforward way
> in GHC.
>
> The proposal is meant to be a proof of concept, just to show that
> this could be done rather easily. I did not think about a nice
> interface or the implementation details in GHC. My goal is to
> check the feasibility of this plan with GHC developers.
>
> I'll call such types "macro types", since their effect is similar
> to defining a macro that defines a new type for each type
> variable instantiation.
>
> Consider
>
> ```
> data #Point a = Point
>   { x :: {-# UNPACK #-} !a
>   , y :: {-# UNPACK #-} !a
>   }
> ```
>
> This definition defines the macro type `#Point`, with one parameter
> `a`.
>
> Macro types definition would be allowed only for single-constructor
> records. The intent is that if we mention `#Point Double`, it will
> be equivalent to
>
> ```
> data PointDouble = PointDouble
>   { x :: {-# UNPACK #-} !Double
>   , y :: {-# UNPACK #-} !Double
>   }
> ```
>
> To use `#Point` generically, the following type class would be
> generated:
>
> ```
> class PointFamily a where
>   data #Point a :: * -- Family of types generated by @data #Point a at .
>   #Point :: a -> a -> #Point a -- Constructor.
>   #x :: #Point a -> a -- Projection @x at .
>   #y :: #Point a -> a -- Projection @y at .
> ```
>
> Thi type class lets us work with `#Point`s generically, for example
>
> ```
> distance :: (PointFamily a, Fractional a) => #Point a -> #Point a -> a
> distance p1 p2 =
>   let dx = #x p1 - #x p2
>       dy = #y p1 - #y p2
>   in sqrt (dx*dx + dy*dy)
> ```
>
> Internally, for every type appearing for `a`, e.g. `#Point Double`,
> a new type equivalent to the `PointDouble` above would be generated
> by GHC, with the corresponding instance
>
> ```
> instance PointFamily Double where
>   data #Point Double = PointDouble
>   #x = x
>   #y = x
> ```
>
> If it's not possible to instantiate `#Point` with the provided type
> (for example because the type is not `UNPACK`able, e.g.
> `#Point (Maybe A)`), GHC would throw an error.
>
> Note that we can compile `distance` in its polymorphic version
> (as opposed to C++ templates, where template functions _must_ be
> instantiated at every use). The polymorphic `distance` would
> require a call to "virtual functions" `#x` and `#y`, as provided by
> the `PointFamily` dictionary. But if we use
> `INLINE` or `SPECIALIZE` pragmas the virtual calls to `#x` and `#y`
> would disappear, making this as efficient as if we were to define
> `distance` on the manually defined `PointDouble`. Compiler hints
> would be put in place to always inline functions using macro types,
> if possible.
>
> Note that the inlining is only important so that the `PointFamily`
> dictionary disappears, e.g. functions containing recursive
> helpers are fine, such as
>
> ```
> {-# INLINE leftmost #-}
> leftmost :: forall a. (PointFamily a, Ord a) => [#Point a] -> #Point a
> leftmost [] = error "leftmost: no points"
> leftmost (p0 : ps0) = go p0 ps0
>   where
>     go :: #Point a -> [#Point a] -> Point# a
>     go candidate (p : ps) =
>       if #x p < #x candidate
>         then go p ps
>         else go candidate ps
> ```
>
> It might be worth considering throwing a warning when a top-level
> definition whose type contains a macro type cannot be inlined, since
> the main performance benefit of using macro types would be lost.
>
> We can define instances for these types as normal, for instance
>
> ```
> instance (Show a, PointFamily a) => Show (#Point a) where
>   {-# INLINE show #-}
>   show pt = "Point{x = " ++ #x pt ++ ", y = " ++ #y pt ++ "}"
> ```
>
> `deriving` support could also be added.
>
> ## Further ideas
>
> ### Hide or remove `PointFamily` from the user
>
> In the examples above `PointFamily` is manipulated explicitely
> (e.g. in the type signature for `distance`).
> In most cases the right constraint could be generated
> automatically by GHC, although I think direct access to the
> type class would be beneficial (history shows that direct
> access to these facilities is good, see upcoming explicit
> type applications).
>
> Maybe the type class associated to a macro type should not even
> exist -- for example we could simply represent `#Point` as a type
> family and treat construction and destruction as built-in syntax
> (the `#` prefix).
>
> ### Pattern matching
>
> Sugar could be added to pattern match, e.g.
>
> ```
> foo :: Point# a -> ...
> distance (Point# x1 y1) = ...
>   or
> dinstance Point#{..} = ... -- #x and #y are available
> ```
>
> ### No "record types" limitation
>
> Instead of restricting ourselves to single-constructor records,
> we could simply generate
>
> ```
> data Point a = Point a
>   { x :: !a
>   , y :: !a
>   }
>
> class PointFamily a where
>   data Point# a :: *
>   destruct :: Point# a -> Point a
>   construct :: Point a -> Point# a
> ```
>
> However, I think it would be harder to guarantee the well-behavedness
> of the inlined functions if we had this intermediate type. I also
> don't think macro types would be very useful beyond polymorphic
> unboxed types.
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


More information about the ghc-devs mailing list