Implementation idea for unboxed polymorphic types

Richard Eisenberg eir at cis.upenn.edu
Mon Nov 16 00:41:19 UTC 2015


After reading Francesco's original post, I immediately thought of Ömer's proposed approach, of using Template Haskell to produce the right data family instances. But I don't see why you'd need quoting at constructor calls. Couldn't you just have a type class like `PointFamily`? I'd be more interested to see client code in Ömer's version than the TH generation code.

The TH approach would seem to require having a fixed set of specializations, which is a downside. But I'm not sure it's so much of a downside that the approach is unusable.

Richard

On Nov 15, 2015, at 10:08 AM, Ömer Sinan Ağacan <omeragacan at gmail.com> wrote:

> I had started working on exactly the same thing at some point. I had a
> TemplateHaskell-based implementation which _almost_ worked.
> 
> The problem was that the syntax was very, very heavy. Because I had to use
> quotes for _every_ constructor application(with explicitly passed types).
> (because I had a specialized constructor for every instantiation of this
> generic type)
> 
> Another problem was that because of how TemplateHaskell quotes evaluated, I
> couldn't use a `List Int` where `List` is a template without first manually
> adding a line for generating specialized version of `List` on `Int`.
> 
> When all of these combined it became very hard to use. But it was a
> proof-of-concept and I think it worked.
> 
> (Code is horrible so I won't share it here :) I had to maintain a state shared
> with different TH quote evaluations etc.)
> 
> 2015-11-15 5:26 GMT-05:00 Francesco Mazzoli <f at mazzo.li>:
>> (A nicely rendered version of this email can be found at <https://gist.github.com/bitonic/52cfe54a2dcdbee1b7f3>)
>> 
>> ## Macro types
>> 
>> I very often find myself wanting unboxed polymorphic types
>> (e.g. types that contain `UNPACK`ed type variables). I find
>> it extremely frustrating that it's easier to write fast _and_
>> generic code in C++ than in Haskell.
>> 
>> I'd like to submit to the mailing list a very rough proposal
>> on how this could be achieved in a pretty straightforward way
>> in GHC.
>> 
>> The proposal is meant to be a proof of concept, just to show that
>> this could be done rather easily. I did not think about a nice
>> interface or the implementation details in GHC. My goal is to
>> check the feasibility of this plan with GHC developers.
>> 
>> I'll call such types "macro types", since their effect is similar
>> to defining a macro that defines a new type for each type
>> variable instantiation.
>> 
>> Consider
>> 
>> ```
>> data #Point a = Point
>>  { x :: {-# UNPACK #-} !a
>>  , y :: {-# UNPACK #-} !a
>>  }
>> ```
>> 
>> This definition defines the macro type `#Point`, with one parameter
>> `a`.
>> 
>> Macro types definition would be allowed only for single-constructor
>> records. The intent is that if we mention `#Point Double`, it will
>> be equivalent to
>> 
>> ```
>> data PointDouble = PointDouble
>>  { x :: {-# UNPACK #-} !Double
>>  , y :: {-# UNPACK #-} !Double
>>  }
>> ```
>> 
>> To use `#Point` generically, the following type class would be
>> generated:
>> 
>> ```
>> class PointFamily a where
>>  data #Point a :: * -- Family of types generated by @data #Point a at .
>>  #Point :: a -> a -> #Point a -- Constructor.
>>  #x :: #Point a -> a -- Projection @x at .
>>  #y :: #Point a -> a -- Projection @y at .
>> ```
>> 
>> Thi type class lets us work with `#Point`s generically, for example
>> 
>> ```
>> distance :: (PointFamily a, Fractional a) => #Point a -> #Point a -> a
>> distance p1 p2 =
>>  let dx = #x p1 - #x p2
>>      dy = #y p1 - #y p2
>>  in sqrt (dx*dx + dy*dy)
>> ```
>> 
>> Internally, for every type appearing for `a`, e.g. `#Point Double`,
>> a new type equivalent to the `PointDouble` above would be generated
>> by GHC, with the corresponding instance
>> 
>> ```
>> instance PointFamily Double where
>>  data #Point Double = PointDouble
>>  #x = x
>>  #y = x
>> ```
>> 
>> If it's not possible to instantiate `#Point` with the provided type
>> (for example because the type is not `UNPACK`able, e.g.
>> `#Point (Maybe A)`), GHC would throw an error.
>> 
>> Note that we can compile `distance` in its polymorphic version
>> (as opposed to C++ templates, where template functions _must_ be
>> instantiated at every use). The polymorphic `distance` would
>> require a call to "virtual functions" `#x` and `#y`, as provided by
>> the `PointFamily` dictionary. But if we use
>> `INLINE` or `SPECIALIZE` pragmas the virtual calls to `#x` and `#y`
>> would disappear, making this as efficient as if we were to define
>> `distance` on the manually defined `PointDouble`. Compiler hints
>> would be put in place to always inline functions using macro types,
>> if possible.
>> 
>> Note that the inlining is only important so that the `PointFamily`
>> dictionary disappears, e.g. functions containing recursive
>> helpers are fine, such as
>> 
>> ```
>> {-# INLINE leftmost #-}
>> leftmost :: forall a. (PointFamily a, Ord a) => [#Point a] -> #Point a
>> leftmost [] = error "leftmost: no points"
>> leftmost (p0 : ps0) = go p0 ps0
>>  where
>>    go :: #Point a -> [#Point a] -> Point# a
>>    go candidate (p : ps) =
>>      if #x p < #x candidate
>>        then go p ps
>>        else go candidate ps
>> ```
>> 
>> It might be worth considering throwing a warning when a top-level
>> definition whose type contains a macro type cannot be inlined, since
>> the main performance benefit of using macro types would be lost.
>> 
>> We can define instances for these types as normal, for instance
>> 
>> ```
>> instance (Show a, PointFamily a) => Show (#Point a) where
>>  {-# INLINE show #-}
>>  show pt = "Point{x = " ++ #x pt ++ ", y = " ++ #y pt ++ "}"
>> ```
>> 
>> `deriving` support could also be added.
>> 
>> ## Further ideas
>> 
>> ### Hide or remove `PointFamily` from the user
>> 
>> In the examples above `PointFamily` is manipulated explicitely
>> (e.g. in the type signature for `distance`).
>> In most cases the right constraint could be generated
>> automatically by GHC, although I think direct access to the
>> type class would be beneficial (history shows that direct
>> access to these facilities is good, see upcoming explicit
>> type applications).
>> 
>> Maybe the type class associated to a macro type should not even
>> exist -- for example we could simply represent `#Point` as a type
>> family and treat construction and destruction as built-in syntax
>> (the `#` prefix).
>> 
>> ### Pattern matching
>> 
>> Sugar could be added to pattern match, e.g.
>> 
>> ```
>> foo :: Point# a -> ...
>> distance (Point# x1 y1) = ...
>>  or
>> dinstance Point#{..} = ... -- #x and #y are available
>> ```
>> 
>> ### No "record types" limitation
>> 
>> Instead of restricting ourselves to single-constructor records,
>> we could simply generate
>> 
>> ```
>> data Point a = Point a
>>  { x :: !a
>>  , y :: !a
>>  }
>> 
>> class PointFamily a where
>>  data Point# a :: *
>>  destruct :: Point# a -> Point a
>>  construct :: Point a -> Point# a
>> ```
>> 
>> However, I think it would be harder to guarantee the well-behavedness
>> of the inlined functions if we had this intermediate type. I also
>> don't think macro types would be very useful beyond polymorphic
>> unboxed types.
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs



More information about the ghc-devs mailing list