Implementation idea for unboxed polymorphic types

Ömer Sinan Ağacan omeragacan at gmail.com
Tue Jan 5 16:05:24 UTC 2016


That's a really good question, I think. I tried to make it working here:
https://gist.github.com/osa1/00597c24a79816c7ef90/

In that code, just assume whenever you see a type or constructor with Ubx
prefix it's just magically get unboxed.

I added lots of inline comments about problems and results etc. but in short,
we have a problem that this idea doesn't solve: GHC doesn't generate
specialized functions at all. So if I have a function that's too big to inline
or if I simply don't want to inline for some reason, we're out of luck.

(I don't understand this restriction, specialization without inlining would be
very useful in lots of cases, I think)

Second thing is ideally we shouldn't be using unboxed types in the code
manually. For example, in the `distance` functions we should really write
`distance :: Point a => Point a -> a`. But this means that `Unbox` typeclass
needs to be magical, GHC should recognize it and specialize functions using
Unbox dictionaries.

2015-11-17 10:54 GMT-05:00 Alexey Vagarenko <vagarenko at gmail.com>:
> At the moment, GHC does not support type families over kind #,
> but if it did, would this code do the trick
> https://gist.github.com/vagarenko/077c6dd73cd610269aa9 ?
>
> 2015-11-16 22:32 GMT+05:00 Ömer Sinan Ağacan <omeragacan at gmail.com>:
>>
>> > But I don't see why you'd need quoting at constructor calls. Couldn't
>> > you
>> > just have a type class like `PointFamily`?
>>
>> This is exactly right, my memory has failed me. My initial implementation
>> didn't use the type family trick, I had further attempts that use type
>> families
>> but honestly I don't remember how good it worked. This was quite a while
>> ago.
>>
>> 2015-11-15 19:41 GMT-05:00 Richard Eisenberg <eir at cis.upenn.edu>:
>> > After reading Francesco's original post, I immediately thought of Ömer's
>> > proposed approach, of using Template Haskell to produce the right data
>> > family instances. But I don't see why you'd need quoting at constructor
>> > calls. Couldn't you just have a type class like `PointFamily`? I'd be more
>> > interested to see client code in Ömer's version than the TH generation code.
>> >
>> > The TH approach would seem to require having a fixed set of
>> > specializations, which is a downside. But I'm not sure it's so much of a
>> > downside that the approach is unusable.
>> >
>> > Richard
>> >
>> > On Nov 15, 2015, at 10:08 AM, Ömer Sinan Ağacan <omeragacan at gmail.com>
>> > wrote:
>> >
>> >> I had started working on exactly the same thing at some point. I had a
>> >> TemplateHaskell-based implementation which _almost_ worked.
>> >>
>> >> The problem was that the syntax was very, very heavy. Because I had to
>> >> use
>> >> quotes for _every_ constructor application(with explicitly passed
>> >> types).
>> >> (because I had a specialized constructor for every instantiation of
>> >> this
>> >> generic type)
>> >>
>> >> Another problem was that because of how TemplateHaskell quotes
>> >> evaluated, I
>> >> couldn't use a `List Int` where `List` is a template without first
>> >> manually
>> >> adding a line for generating specialized version of `List` on `Int`.
>> >>
>> >> When all of these combined it became very hard to use. But it was a
>> >> proof-of-concept and I think it worked.
>> >>
>> >> (Code is horrible so I won't share it here :) I had to maintain a state
>> >> shared
>> >> with different TH quote evaluations etc.)
>> >>
>> >> 2015-11-15 5:26 GMT-05:00 Francesco Mazzoli <f at mazzo.li>:
>> >>> (A nicely rendered version of this email can be found at
>> >>> <https://gist.github.com/bitonic/52cfe54a2dcdbee1b7f3>)
>> >>>
>> >>> ## Macro types
>> >>>
>> >>> I very often find myself wanting unboxed polymorphic types
>> >>> (e.g. types that contain `UNPACK`ed type variables). I find
>> >>> it extremely frustrating that it's easier to write fast _and_
>> >>> generic code in C++ than in Haskell.
>> >>>
>> >>> I'd like to submit to the mailing list a very rough proposal
>> >>> on how this could be achieved in a pretty straightforward way
>> >>> in GHC.
>> >>>
>> >>> The proposal is meant to be a proof of concept, just to show that
>> >>> this could be done rather easily. I did not think about a nice
>> >>> interface or the implementation details in GHC. My goal is to
>> >>> check the feasibility of this plan with GHC developers.
>> >>>
>> >>> I'll call such types "macro types", since their effect is similar
>> >>> to defining a macro that defines a new type for each type
>> >>> variable instantiation.
>> >>>
>> >>> Consider
>> >>>
>> >>> ```
>> >>> data #Point a = Point
>> >>>  { x :: {-# UNPACK #-} !a
>> >>>  , y :: {-# UNPACK #-} !a
>> >>>  }
>> >>> ```
>> >>>
>> >>> This definition defines the macro type `#Point`, with one parameter
>> >>> `a`.
>> >>>
>> >>> Macro types definition would be allowed only for single-constructor
>> >>> records. The intent is that if we mention `#Point Double`, it will
>> >>> be equivalent to
>> >>>
>> >>> ```
>> >>> data PointDouble = PointDouble
>> >>>  { x :: {-# UNPACK #-} !Double
>> >>>  , y :: {-# UNPACK #-} !Double
>> >>>  }
>> >>> ```
>> >>>
>> >>> To use `#Point` generically, the following type class would be
>> >>> generated:
>> >>>
>> >>> ```
>> >>> class PointFamily a where
>> >>>  data #Point a :: * -- Family of types generated by @data #Point a at .
>> >>>  #Point :: a -> a -> #Point a -- Constructor.
>> >>>  #x :: #Point a -> a -- Projection @x at .
>> >>>  #y :: #Point a -> a -- Projection @y at .
>> >>> ```
>> >>>
>> >>> Thi type class lets us work with `#Point`s generically, for example
>> >>>
>> >>> ```
>> >>> distance :: (PointFamily a, Fractional a) => #Point a -> #Point a -> a
>> >>> distance p1 p2 =
>> >>>  let dx = #x p1 - #x p2
>> >>>      dy = #y p1 - #y p2
>> >>>  in sqrt (dx*dx + dy*dy)
>> >>> ```
>> >>>
>> >>> Internally, for every type appearing for `a`, e.g. `#Point Double`,
>> >>> a new type equivalent to the `PointDouble` above would be generated
>> >>> by GHC, with the corresponding instance
>> >>>
>> >>> ```
>> >>> instance PointFamily Double where
>> >>>  data #Point Double = PointDouble
>> >>>  #x = x
>> >>>  #y = x
>> >>> ```
>> >>>
>> >>> If it's not possible to instantiate `#Point` with the provided type
>> >>> (for example because the type is not `UNPACK`able, e.g.
>> >>> `#Point (Maybe A)`), GHC would throw an error.
>> >>>
>> >>> Note that we can compile `distance` in its polymorphic version
>> >>> (as opposed to C++ templates, where template functions _must_ be
>> >>> instantiated at every use). The polymorphic `distance` would
>> >>> require a call to "virtual functions" `#x` and `#y`, as provided by
>> >>> the `PointFamily` dictionary. But if we use
>> >>> `INLINE` or `SPECIALIZE` pragmas the virtual calls to `#x` and `#y`
>> >>> would disappear, making this as efficient as if we were to define
>> >>> `distance` on the manually defined `PointDouble`. Compiler hints
>> >>> would be put in place to always inline functions using macro types,
>> >>> if possible.
>> >>>
>> >>> Note that the inlining is only important so that the `PointFamily`
>> >>> dictionary disappears, e.g. functions containing recursive
>> >>> helpers are fine, such as
>> >>>
>> >>> ```
>> >>> {-# INLINE leftmost #-}
>> >>> leftmost :: forall a. (PointFamily a, Ord a) => [#Point a] -> #Point a
>> >>> leftmost [] = error "leftmost: no points"
>> >>> leftmost (p0 : ps0) = go p0 ps0
>> >>>  where
>> >>>    go :: #Point a -> [#Point a] -> Point# a
>> >>>    go candidate (p : ps) =
>> >>>      if #x p < #x candidate
>> >>>        then go p ps
>> >>>        else go candidate ps
>> >>> ```
>> >>>
>> >>> It might be worth considering throwing a warning when a top-level
>> >>> definition whose type contains a macro type cannot be inlined, since
>> >>> the main performance benefit of using macro types would be lost.
>> >>>
>> >>> We can define instances for these types as normal, for instance
>> >>>
>> >>> ```
>> >>> instance (Show a, PointFamily a) => Show (#Point a) where
>> >>>  {-# INLINE show #-}
>> >>>  show pt = "Point{x = " ++ #x pt ++ ", y = " ++ #y pt ++ "}"
>> >>> ```
>> >>>
>> >>> `deriving` support could also be added.
>> >>>
>> >>> ## Further ideas
>> >>>
>> >>> ### Hide or remove `PointFamily` from the user
>> >>>
>> >>> In the examples above `PointFamily` is manipulated explicitely
>> >>> (e.g. in the type signature for `distance`).
>> >>> In most cases the right constraint could be generated
>> >>> automatically by GHC, although I think direct access to the
>> >>> type class would be beneficial (history shows that direct
>> >>> access to these facilities is good, see upcoming explicit
>> >>> type applications).
>> >>>
>> >>> Maybe the type class associated to a macro type should not even
>> >>> exist -- for example we could simply represent `#Point` as a type
>> >>> family and treat construction and destruction as built-in syntax
>> >>> (the `#` prefix).
>> >>>
>> >>> ### Pattern matching
>> >>>
>> >>> Sugar could be added to pattern match, e.g.
>> >>>
>> >>> ```
>> >>> foo :: Point# a -> ...
>> >>> distance (Point# x1 y1) = ...
>> >>>  or
>> >>> dinstance Point#{..} = ... -- #x and #y are available
>> >>> ```
>> >>>
>> >>> ### No "record types" limitation
>> >>>
>> >>> Instead of restricting ourselves to single-constructor records,
>> >>> we could simply generate
>> >>>
>> >>> ```
>> >>> data Point a = Point a
>> >>>  { x :: !a
>> >>>  , y :: !a
>> >>>  }
>> >>>
>> >>> class PointFamily a where
>> >>>  data Point# a :: *
>> >>>  destruct :: Point# a -> Point a
>> >>>  construct :: Point a -> Point# a
>> >>> ```
>> >>>
>> >>> However, I think it would be harder to guarantee the well-behavedness
>> >>> of the inlined functions if we had this intermediate type. I also
>> >>> don't think macro types would be very useful beyond polymorphic
>> >>> unboxed types.
>> >>> _______________________________________________
>> >>> ghc-devs mailing list
>> >>> ghc-devs at haskell.org
>> >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>> >> _______________________________________________
>> >> ghc-devs mailing list
>> >> ghc-devs at haskell.org
>> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>> >
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
>


More information about the ghc-devs mailing list