Fwd: Restricted sums in BoxedRep

David Feuer david.feuer at gmail.com
Thu Oct 15 20:08:16 UTC 2020


This might be lost in the noise for MVar and TVar, but for arrays, there
are tremendous advantages to cutting out the extra heap objects.

On Thu, Oct 15, 2020, 1:10 PM Ryan Yates <fryguybob at gmail.com> wrote:

> I haven't been following this discussion too closely, but in my research
> work I found that some of the benefits that I wanted in this direction were
> already there with pointer tagging.
>
> On Thu, Oct 15, 2020 at 12:59 PM David Feuer <david.feuer at gmail.com>
> wrote:
>
>> Yes, that's something quite different. We'd need a whole different heap
>> object type for such MVars and TVars. My approach covers the case where the
>> unboxed thing can only take on a few values, for some value of "a few"
>> which, depending on implementation, may or may not be very small. If the
>> nulls point to actual heap objects in pre-allocated pinned memory (say),
>> then up to 64 or so might be reasonable. If they point to "invalid" address
>> space, then the numbers could go up a good bit.
>>
>> On Thu, Oct 15, 2020, 12:50 PM Carter Schonwald <
>> carter.schonwald at gmail.com> wrote:
>>
>>> Indeed, I mean things that aren’t pointery, and could be represented by
>>> a tvar paired with a mutable byte array or mvar with mutable byte array,
>>> but which we’d want considered as a single heap object from the rts/gc
>>> perspective.
>>>
>>> On Thu, Oct 15, 2020 at 11:58 AM David Feuer <david.feuer at gmail.com>
>>> wrote:
>>>
>>>> Sorry, unlifted, not unboxed...
>>>>
>>>> On Thu, Oct 15, 2020, 11:57 AM David Feuer <david.feuer at gmail.com>
>>>> wrote:
>>>>
>>>>> Putting unboxed things in TVar, MVar, etc., is part of Andrew Martin's
>>>>> accepted BoxedRep proposal.
>>>>>
>>>>> On Thu, Oct 15, 2020, 11:44 AM Carter Schonwald <
>>>>> carter.schonwald at gmail.com> wrote:
>>>>>
>>>>>> A related idea that came up recently and is perhaps simpler ties into
>>>>>> this via the lens of having unboxed Mvars/tvars (even if it’s restricted to
>>>>>> just things we can embed in a word64#)
>>>>>>
>>>>>> This came up in
>>>>>> https://gitlab.haskell.org/ghc/ghc/-/issues/18798#note_307410, where
>>>>>> viktor had millions of independent mvars holding what’s essentially a
>>>>>> strict unit ()!
>>>>>>
>>>>>> The motivation in this later scenario is that in high concurrency
>>>>>> settings, the less trivial stuff the gc needs to trace under updates, the
>>>>>> better ghc scales.
>>>>>>
>>>>>> This may not be a use case david has in mind, but certainly seems
>>>>>> related.
>>>>>>
>>>>>> Phrased more succinctly: gc perf dominates large heap / many core
>>>>>> computation in Haskell via sensitivity to allocation volume / mutation
>>>>>> volume (to ensure generational hypothesis stays valid), and providing tools
>>>>>> to incrementally reduce the pressure with local changes would be good.
>>>>>>
>>>>>> So I’d propose / suggest that a baby step towards what david asks
>>>>>> would be for us to work out some manner of unboxed tvar/mvar ref machinery
>>>>>> that supports unboxed values.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Oct 15, 2020 at 5:32 AM Andreas Klebinger <
>>>>>> klebinger.andreas at gmx.at> wrote:
>>>>>>
>>>>>>> From a implementors perspective my main questions would be:
>>>>>>>
>>>>>>> * How big is the benefit in practice? How many use cases are there?
>>>>>>> * How bad are the costs? (Runtime overhead, rts complexity, ...)
>>>>>>>
>>>>>>> The details of how this would be exposed to a user would are
>>>>>>> important.
>>>>>>> But if the costs are too high for the drawbacks then it becomes a
>>>>>>> moot point.
>>>>>>>
>>>>>>>
>>>>>>> David Feuer schrieb am 14.10.2020 um 22:21:
>>>>>>>
>>>>>>> Forwarded from Andrew Martin below. I think we want more than just
>>>>>>> Maybe (more than one null), but the nesting I described is certainly more
>>>>>>> convenience than necessity.
>>>>>>>
>>>>>>> ---------- Forwarded message ---------
>>>>>>> From: Andrew Martin <andrew.thaddeus at gmail.com>
>>>>>>> Date: Wed, Oct 14, 2020, 4:14 PM
>>>>>>> Subject: Re: Restricted sums in BoxedRep
>>>>>>> To: David Feuer <david.feuer at gmail.com>
>>>>>>>
>>>>>>>
>>>>>>> You'll have to forward this to the ghc-devs list to share it with
>>>>>>> others since I'm not currently subscribed to it, but I've had this same
>>>>>>> thought before. It is discussed at
>>>>>>> https://github.com/andrewthad/impure-containers/issues/12. Here's
>>>>>>> the relevant excerpt:
>>>>>>>
>>>>>>>> Relatedly, I was thinking the other day that after finishing
>>>>>>>> implementing
>>>>>>>> https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-pointer-rep.rst,
>>>>>>>> I should really look at seeing if it's possible to add this
>>>>>>>> maybe-of-a-lifted value trick straight to GHC. I think that with:
>>>>>>>>
>>>>>>>> data RuntimpRep
>>>>>>>>   = BoxedRep Levity
>>>>>>>>   | MaybeBoxedRep Levity
>>>>>>>>   | IntRep
>>>>>>>>   | ...
>>>>>>>>
>>>>>>>> data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v)
>>>>>>>>
>>>>>>>> This doesn't have the nesting issues because the kind system
>>>>>>>> prevents nesting. But anyway, back to the original question. I would
>>>>>>>> recommend not using Maybe.Unsafe and using unpacked-maybe instead.
>>>>>>>> The latter is definitely safe, and it only costs an extra machine word of
>>>>>>>> space in each data constructor it gets used in, and it doesn't introduce
>>>>>>>> more indirections.
>>>>>>>>
>>>>>>>
>>>>>>> On Tue, Oct 13, 2020 at 5:47 PM David Feuer <david.feuer at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Null pointers are widely known to be a lousy language feature in
>>>>>>>> general, but there are certain situations where they're *really* useful for
>>>>>>>> compact representation. For example, we define
>>>>>>>>
>>>>>>>>     newtype TMVar a = TMVar (TVar (Maybe a))
>>>>>>>>
>>>>>>>> We don't, however, actually use the fact that (Maybe a) is lifted.
>>>>>>>> So we could represent this much more efficiently using something like
>>>>>>>>
>>>>>>>>     newtype TMVar a = TMVar (TVar a)
>>>>>>>>
>>>>>>>> where Nothing is represented by a distinguished "null" pointer.
>>>>>>>>
>>>>>>>> While it's possible to implement this sort of thing in user code
>>>>>>>> (with lots of fuss and care), it's not very nice at all. What I'd really
>>>>>>>> like to be able to do is represent certain kinds of sums like this natively.
>>>>>>>>
>>>>>>>> Now that we're getting BoxedRep, I think we can probably make it
>>>>>>>> happen. The trick is to add a special Levity constructor representing sums
>>>>>>>> of particular shapes. Specifically, we can represent a type like this if it
>>>>>>>> is a possibly-nested sum which, when flattened into a single sum, consists
>>>>>>>> of some number of nullary tuples and at most one Lifted or Unlifted type.
>>>>>>>> Then we can have (inline) primops to convert between the BoxedRep and the
>>>>>>>> sum-of-sums representations.
>>>>>>>>
>>>>>>>> Anyone have thoughts on details for what the Levity constructor
>>>>>>>> arguments might look like?
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -Andrew Thaddeus Martin
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> ghc-devs mailing listghc-devs at haskell.orghttp://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> ghc-devs mailing list
>>>>>>> ghc-devs at haskell.org
>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>>>>>>
>>>>>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20201015/247c494b/attachment.html>


More information about the ghc-devs mailing list