Implementation idea for unboxed polymorphic types
Richard Eisenberg
eir at cis.upenn.edu
Mon Nov 16 00:41:19 UTC 2015
After reading Francesco's original post, I immediately thought of Ömer's proposed approach, of using Template Haskell to produce the right data family instances. But I don't see why you'd need quoting at constructor calls. Couldn't you just have a type class like `PointFamily`? I'd be more interested to see client code in Ömer's version than the TH generation code.
The TH approach would seem to require having a fixed set of specializations, which is a downside. But I'm not sure it's so much of a downside that the approach is unusable.
Richard
On Nov 15, 2015, at 10:08 AM, Ömer Sinan Ağacan <omeragacan at gmail.com> wrote:
> I had started working on exactly the same thing at some point. I had a
> TemplateHaskell-based implementation which _almost_ worked.
>
> The problem was that the syntax was very, very heavy. Because I had to use
> quotes for _every_ constructor application(with explicitly passed types).
> (because I had a specialized constructor for every instantiation of this
> generic type)
>
> Another problem was that because of how TemplateHaskell quotes evaluated, I
> couldn't use a `List Int` where `List` is a template without first manually
> adding a line for generating specialized version of `List` on `Int`.
>
> When all of these combined it became very hard to use. But it was a
> proof-of-concept and I think it worked.
>
> (Code is horrible so I won't share it here :) I had to maintain a state shared
> with different TH quote evaluations etc.)
>
> 2015-11-15 5:26 GMT-05:00 Francesco Mazzoli <f at mazzo.li>:
>> (A nicely rendered version of this email can be found at <https://gist.github.com/bitonic/52cfe54a2dcdbee1b7f3>)
>>
>> ## Macro types
>>
>> I very often find myself wanting unboxed polymorphic types
>> (e.g. types that contain `UNPACK`ed type variables). I find
>> it extremely frustrating that it's easier to write fast _and_
>> generic code in C++ than in Haskell.
>>
>> I'd like to submit to the mailing list a very rough proposal
>> on how this could be achieved in a pretty straightforward way
>> in GHC.
>>
>> The proposal is meant to be a proof of concept, just to show that
>> this could be done rather easily. I did not think about a nice
>> interface or the implementation details in GHC. My goal is to
>> check the feasibility of this plan with GHC developers.
>>
>> I'll call such types "macro types", since their effect is similar
>> to defining a macro that defines a new type for each type
>> variable instantiation.
>>
>> Consider
>>
>> ```
>> data #Point a = Point
>> { x :: {-# UNPACK #-} !a
>> , y :: {-# UNPACK #-} !a
>> }
>> ```
>>
>> This definition defines the macro type `#Point`, with one parameter
>> `a`.
>>
>> Macro types definition would be allowed only for single-constructor
>> records. The intent is that if we mention `#Point Double`, it will
>> be equivalent to
>>
>> ```
>> data PointDouble = PointDouble
>> { x :: {-# UNPACK #-} !Double
>> , y :: {-# UNPACK #-} !Double
>> }
>> ```
>>
>> To use `#Point` generically, the following type class would be
>> generated:
>>
>> ```
>> class PointFamily a where
>> data #Point a :: * -- Family of types generated by @data #Point a at .
>> #Point :: a -> a -> #Point a -- Constructor.
>> #x :: #Point a -> a -- Projection @x at .
>> #y :: #Point a -> a -- Projection @y at .
>> ```
>>
>> Thi type class lets us work with `#Point`s generically, for example
>>
>> ```
>> distance :: (PointFamily a, Fractional a) => #Point a -> #Point a -> a
>> distance p1 p2 =
>> let dx = #x p1 - #x p2
>> dy = #y p1 - #y p2
>> in sqrt (dx*dx + dy*dy)
>> ```
>>
>> Internally, for every type appearing for `a`, e.g. `#Point Double`,
>> a new type equivalent to the `PointDouble` above would be generated
>> by GHC, with the corresponding instance
>>
>> ```
>> instance PointFamily Double where
>> data #Point Double = PointDouble
>> #x = x
>> #y = x
>> ```
>>
>> If it's not possible to instantiate `#Point` with the provided type
>> (for example because the type is not `UNPACK`able, e.g.
>> `#Point (Maybe A)`), GHC would throw an error.
>>
>> Note that we can compile `distance` in its polymorphic version
>> (as opposed to C++ templates, where template functions _must_ be
>> instantiated at every use). The polymorphic `distance` would
>> require a call to "virtual functions" `#x` and `#y`, as provided by
>> the `PointFamily` dictionary. But if we use
>> `INLINE` or `SPECIALIZE` pragmas the virtual calls to `#x` and `#y`
>> would disappear, making this as efficient as if we were to define
>> `distance` on the manually defined `PointDouble`. Compiler hints
>> would be put in place to always inline functions using macro types,
>> if possible.
>>
>> Note that the inlining is only important so that the `PointFamily`
>> dictionary disappears, e.g. functions containing recursive
>> helpers are fine, such as
>>
>> ```
>> {-# INLINE leftmost #-}
>> leftmost :: forall a. (PointFamily a, Ord a) => [#Point a] -> #Point a
>> leftmost [] = error "leftmost: no points"
>> leftmost (p0 : ps0) = go p0 ps0
>> where
>> go :: #Point a -> [#Point a] -> Point# a
>> go candidate (p : ps) =
>> if #x p < #x candidate
>> then go p ps
>> else go candidate ps
>> ```
>>
>> It might be worth considering throwing a warning when a top-level
>> definition whose type contains a macro type cannot be inlined, since
>> the main performance benefit of using macro types would be lost.
>>
>> We can define instances for these types as normal, for instance
>>
>> ```
>> instance (Show a, PointFamily a) => Show (#Point a) where
>> {-# INLINE show #-}
>> show pt = "Point{x = " ++ #x pt ++ ", y = " ++ #y pt ++ "}"
>> ```
>>
>> `deriving` support could also be added.
>>
>> ## Further ideas
>>
>> ### Hide or remove `PointFamily` from the user
>>
>> In the examples above `PointFamily` is manipulated explicitely
>> (e.g. in the type signature for `distance`).
>> In most cases the right constraint could be generated
>> automatically by GHC, although I think direct access to the
>> type class would be beneficial (history shows that direct
>> access to these facilities is good, see upcoming explicit
>> type applications).
>>
>> Maybe the type class associated to a macro type should not even
>> exist -- for example we could simply represent `#Point` as a type
>> family and treat construction and destruction as built-in syntax
>> (the `#` prefix).
>>
>> ### Pattern matching
>>
>> Sugar could be added to pattern match, e.g.
>>
>> ```
>> foo :: Point# a -> ...
>> distance (Point# x1 y1) = ...
>> or
>> dinstance Point#{..} = ... -- #x and #y are available
>> ```
>>
>> ### No "record types" limitation
>>
>> Instead of restricting ourselves to single-constructor records,
>> we could simply generate
>>
>> ```
>> data Point a = Point a
>> { x :: !a
>> , y :: !a
>> }
>>
>> class PointFamily a where
>> data Point# a :: *
>> destruct :: Point# a -> Point a
>> construct :: Point a -> Point# a
>> ```
>>
>> However, I think it would be harder to guarantee the well-behavedness
>> of the inlined functions if we had this intermediate type. I also
>> don't think macro types would be very useful beyond polymorphic
>> unboxed types.
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
More information about the ghc-devs
mailing list