From johan.tibell at gmail.com Tue Sep 1 05:14:22 2015 From: johan.tibell at gmail.com (Johan Tibell) Date: Mon, 31 Aug 2015 22:14:22 -0700 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> Message-ID: Works for me. On Mon, Aug 31, 2015 at 3:50 PM, Ryan Yates wrote: > Any time works for me. > > Ryan > > On Mon, Aug 31, 2015 at 6:11 PM, Ryan Newton wrote: > > Dear Edward, Ryan Yates, and other interested parties -- > > > > So when should we meet up about this? > > > > May I propose the Tues afternoon break for everyone at ICFP who is > > interested in this topic? We can meet out in the coffee area and > congregate > > around Edward Kmett, who is tall and should be easy to find ;-). > > > > I think Ryan is going to show us how to use his new primops for combined > > array + other fields in one heap object? > > > > On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett wrote: > >> > >> Without a custom primitive it doesn't help much there, you have to store > >> the indirection to the mask. > >> > >> With a custom primitive it should cut the on heap root-to-leaf path of > >> everything in the HAMT in half. A shorter HashMap was actually one of > the > >> motivating factors for me doing this. It is rather astoundingly > difficult to > >> beat the performance of HashMap, so I had to start cheating pretty > badly. ;) > >> > >> -Edward > >> > >> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell > >> wrote: > >>> > >>> I'd also be interested to chat at ICFP to see if I can use this for my > >>> HAMT implementation. > >>> > >>> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett > wrote: > >>>> > >>>> Sounds good to me. 
Right now I'm just hacking up composable accessors > >>>> for "typed slots" in a fairly lens-like fashion, and treating the set > of > >>>> slots I define and the 'new' function I build for the data type as > its API, > >>>> and build atop that. This could eventually graduate to > template-haskell, but > >>>> I'm not entirely satisfied with the solution I have. I currently > distinguish > >>>> between what I'm calling "slots" (things that point directly to > another > >>>> SmallMutableArrayArray# sans wrapper) and "fields" which point > directly to > >>>> the usual Haskell data types because unifying the two notions meant > that I > >>>> couldn't lift some coercions out "far enough" to make them vanish. > >>>> > >>>> I'll be happy to run through my current working set of issues in > person > >>>> and -- as things get nailed down further -- in a longer lived medium > than in > >>>> personal conversations. ;) > >>>> > >>>> -Edward > >>>> > >>>> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton > wrote: > >>>>> > >>>>> I'd also love to meet up at ICFP and discuss this. I think the array > >>>>> primops plus a TH layer that lets (ab)use them many times without > too much > >>>>> marginal cost sounds great. And I'd like to learn how we could be > either > >>>>> early users of, or help with, this infrastructure. > >>>>> > >>>>> CC'ing in Ryan Scot and Omer Agacan who may also be interested in > >>>>> dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. > student > >>>>> who is currently working on concurrent data structures in Haskell, > but will > >>>>> not be at ICFP. > >>>>> > >>>>> > >>>>> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates > >>>>> wrote: > >>>>>> > >>>>>> I completely agree. I would love to spend some time during ICFP and > >>>>>> friends talking about what it could look like. My small array for > STM > >>>>>> changes for the RTS can be seen here [1]. 
It is on a branch > somewhere > >>>>>> between 7.8 and 7.10 and includes irrelevant STM bits and some > >>>>>> confusing naming choices (sorry), but should cover all the details > >>>>>> needed to implement it for a non-STM context. The biggest surprise > >>>>>> for me was following small array too closely and having a word/byte > >>>>>> offset miss-match [2]. > >>>>>> > >>>>>> [1]: > >>>>>> > https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut > >>>>>> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 > >>>>>> > >>>>>> Ryan > >>>>>> > >>>>>> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett > >>>>>> wrote: > >>>>>> > I'd love to have that last 10%, but its a lot of work to get there > >>>>>> > and more > >>>>>> > importantly I don't know quite what it should look like. > >>>>>> > > >>>>>> > On the other hand, I do have a pretty good idea of how the > >>>>>> > primitives above > >>>>>> > could be banged out and tested in a long evening, well in time for > >>>>>> > 7.12. And > >>>>>> > as noted earlier, those remain useful even if a nicer typed > version > >>>>>> > with an > >>>>>> > extra level of indirection to the sizes is built up after. > >>>>>> > > >>>>>> > The rest sounds like a good graduate student project for someone > who > >>>>>> > has > >>>>>> > graduate students lying around. Maybe somebody at Indiana > University > >>>>>> > who has > >>>>>> > an interest in type theory and parallelism can find us one. =) > >>>>>> > > >>>>>> > -Edward > >>>>>> > > >>>>>> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates > >>>>>> > wrote: > >>>>>> >> > >>>>>> >> I think from my perspective, the motivation for getting the type > >>>>>> >> checker involved is primarily bringing this to the level where > >>>>>> >> users > >>>>>> >> could be expected to build these structures. 
It is reasonable to
> >>>>>> >> think that there are people who want to use STM (a context with
> >>>>>> >> mutation already) to implement a straightforward data structure that
> >>>>>> >> avoids the extra indirection penalty. There should be some places
> >>>>>> >> where knowing that things are field accesses rather than array
> >>>>>> >> indexing could be helpful, but I think GHC is good right now about
> >>>>>> >> handling constant offsets. In my code I don't do any bounds checking
> >>>>>> >> as I know I will only be accessing my arrays with constant indexes. I
> >>>>>> >> make wrappers for each field access and leave all the unsafe stuff in
> >>>>>> >> there. When things go wrong though, the compiler is no help. Maybe
> >>>>>> >> Template Haskell that generates the appropriate wrappers is the right
> >>>>>> >> direction to go.
> >>>>>> >> There is another benefit for me when working with these as arrays in
> >>>>>> >> that it is quite simple and direct (given the hoops already jumped
> >>>>>> >> through) to play with alignment. I can ensure two pointers are never
> >>>>>> >> on the same cache-line by just spacing things out in the array.
> >>>>>> >>
> >>>>>> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett
> >>>>>> >> wrote:
> >>>>>> >> > They just segfault at this level. ;)
> >>>>>> >> >
> >>>>>> >> > Sent from my iPhone
> >>>>>> >> >
> >>>>>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton
> >>>>>> >> > wrote:
> >>>>>> >> >
> >>>>>> >> > You presumably also save a bounds check on reads by hard-coding the
> >>>>>> >> > sizes?
> >>>>>> >> >
> >>>>>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett <ekmett at gmail.com>
> >>>>>> >> > wrote:
> >>>>>> >> >>
> >>>>>> >> >> Also there are 4 different "things" here, basically depending on
> >>>>>> >> >> two independent questions:
> >>>>>> >> >>
> >>>>>> >> >> a.) 
if you want to shove the sizes into the info table, and > >>>>>> >> >> b.) if you want cardmarking. > >>>>>> >> >> > >>>>>> >> >> Versions with/without cardmarking for different sizes can be > >>>>>> >> >> done > >>>>>> >> >> pretty > >>>>>> >> >> easily, but as noted, the infotable variants are pretty > >>>>>> >> >> invasive. > >>>>>> >> >> > >>>>>> >> >> -Edward > >>>>>> >> >> > >>>>>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett < > ekmett at gmail.com> > >>>>>> >> >> wrote: > >>>>>> >> >>> > >>>>>> >> >>> Well, on the plus side you'd save 16 bytes per object, which > >>>>>> >> >>> adds up > >>>>>> >> >>> if > >>>>>> >> >>> they were small enough and there are enough of them. You get > a > >>>>>> >> >>> bit > >>>>>> >> >>> better > >>>>>> >> >>> locality of reference in terms of what fits in the first > cache > >>>>>> >> >>> line of > >>>>>> >> >>> them. > >>>>>> >> >>> > >>>>>> >> >>> -Edward > >>>>>> >> >>> > >>>>>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton > >>>>>> >> >>> > >>>>>> >> >>> wrote: > >>>>>> >> >>>> > >>>>>> >> >>>> Yes. And for the short term I can imagine places we will > >>>>>> >> >>>> settle with > >>>>>> >> >>>> arrays even if it means tracking lengths unnecessarily and > >>>>>> >> >>>> unsafeCoercing > >>>>>> >> >>>> pointers whose types don't actually match their siblings. > >>>>>> >> >>>> > >>>>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed > >>>>>> >> >>>> sized > >>>>>> >> >>>> array > >>>>>> >> >>>> objects *other* than using them to fake structs? (Much to > >>>>>> >> >>>> derecommend, as > >>>>>> >> >>>> you mentioned!) > >>>>>> >> >>>> > >>>>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett > >>>>>> >> >>>> > >>>>>> >> >>>> wrote: > >>>>>> >> >>>>> > >>>>>> >> >>>>> I think both are useful, but the one you suggest requires a > >>>>>> >> >>>>> lot more > >>>>>> >> >>>>> plumbing and doesn't subsume all of the usecases of the > >>>>>> >> >>>>> other. 
> >>>>>> >> >>>>>
> >>>>>> >> >>>>> -Edward
> >>>>>> >> >>>>>
> >>>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton
> >>>>>> >> >>>>> wrote:
> >>>>>> >> >>>>>>
> >>>>>> >> >>>>>> So that primitive is an array-like thing (same pointed type,
> >>>>>> >> >>>>>> unbounded length) with extra payload.
> >>>>>> >> >>>>>>
> >>>>>> >> >>>>>> I can see how we can do without structs if we have arrays,
> >>>>>> >> >>>>>> especially with the extra payload at front. But wouldn't the
> >>>>>> >> >>>>>> general solution for structs be one that allows new user data
> >>>>>> >> >>>>>> type defs for # types?
> >>>>>> >> >>>>>>
> >>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett
> >>>>>> >> >>>>>> wrote:
> >>>>>> >> >>>>>>>
> >>>>>> >> >>>>>>> Some form of MutableStruct# with a known number of words and a
> >>>>>> >> >>>>>>> known number of pointers is basically what Ryan Yates was
> >>>>>> >> >>>>>>> suggesting above, but where the word counts were stored in the
> >>>>>> >> >>>>>>> objects themselves.
> >>>>>> >> >>>>>>>
> >>>>>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd
> >>>>>> >> >>>>>>> likely want to be something we build in addition to MutVar#
> >>>>>> >> >>>>>>> rather than a replacement.
> >>>>>> >> >>>>>>>
> >>>>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build
> >>>>>> >> >>>>>>> info tables that knew them, and typechecker support, for
> >>>>>> >> >>>>>>> instance, it'd get rather invasive. 
> >>>>>> >> >>>>>>> > >>>>>> >> >>>>>>> Also, a number of things that we can do with the 'sized' > >>>>>> >> >>>>>>> versions > >>>>>> >> >>>>>>> above, like working with evil unsized c-style arrays > >>>>>> >> >>>>>>> directly > >>>>>> >> >>>>>>> inline at the > >>>>>> >> >>>>>>> end of the structure cease to be possible, so it isn't > even > >>>>>> >> >>>>>>> a pure > >>>>>> >> >>>>>>> win if we > >>>>>> >> >>>>>>> did the engineering effort. > >>>>>> >> >>>>>>> > >>>>>> >> >>>>>>> I think 90% of the needs I have are covered just by > adding > >>>>>> >> >>>>>>> the one > >>>>>> >> >>>>>>> primitive. The last 10% gets pretty invasive. > >>>>>> >> >>>>>>> > >>>>>> >> >>>>>>> -Edward > >>>>>> >> >>>>>>> > >>>>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton > >>>>>> >> >>>>>>> > >>>>>> >> >>>>>>> wrote: > >>>>>> >> >>>>>>>> > >>>>>> >> >>>>>>>> I like the possibility of a general solution for mutable > >>>>>> >> >>>>>>>> structs > >>>>>> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why > >>>>>> >> >>>>>>>> it's hard. > >>>>>> >> >>>>>>>> > >>>>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of > >>>>>> >> >>>>>>>> object > >>>>>> >> >>>>>>>> identity problems. But what about directly supporting an > >>>>>> >> >>>>>>>> extensible set of > >>>>>> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even > >>>>>> >> >>>>>>>> replacing) > >>>>>> >> >>>>>>>> MutVar#? That > >>>>>> >> >>>>>>>> may be too much work, but is it problematic otherwise? > >>>>>> >> >>>>>>>> > >>>>>> >> >>>>>>>> Needless to say, this is also critical if we ever want > >>>>>> >> >>>>>>>> best in > >>>>>> >> >>>>>>>> class > >>>>>> >> >>>>>>>> lockfree mutable structures, just like their Stm and > >>>>>> >> >>>>>>>> sequential > >>>>>> >> >>>>>>>> counterparts. 
> >>>>>> >> >>>>>>>> > >>>>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones > >>>>>> >> >>>>>>>> wrote: > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it > into a > >>>>>> >> >>>>>>>>> short > >>>>>> >> >>>>>>>>> article. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC > Trac, > >>>>>> >> >>>>>>>>> and > >>>>>> >> >>>>>>>>> maybe > >>>>>> >> >>>>>>>>> make a ticket for it. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> Thanks > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> Simon > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] > >>>>>> >> >>>>>>>>> Sent: 27 August 2015 16:54 > >>>>>> >> >>>>>>>>> To: Simon Peyton Jones > >>>>>> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs > >>>>>> >> >>>>>>>>> Subject: Re: ArrayArrays > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified > >>>>>> >> >>>>>>>>> invariant. It > >>>>>> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or > >>>>>> >> >>>>>>>>> ByteArray#'s. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> While those live in #, they are garbage collected > >>>>>> >> >>>>>>>>> objects, so > >>>>>> >> >>>>>>>>> this > >>>>>> >> >>>>>>>>> all lives on the heap. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when > >>>>>> >> >>>>>>>>> it has > >>>>>> >> >>>>>>>>> to > >>>>>> >> >>>>>>>>> deal with nested arrays. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a > better > >>>>>> >> >>>>>>>>> thing. 
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> The Problem
> >>>>>> >> >>>>>>>>> -----------------
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Consider the scenario where you write a classic doubly-linked
> >>>>>> >> >>>>>>>>> list in Haskell.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL))
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers
> >>>>>> >> >>>>>>>>> on the heap.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~>
> >>>>>> >> >>>>>>>>> Maybe DLL ~> DLL
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> That is 3 levels of indirection.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with
> >>>>>> >> >>>>>>>>> -funbox-strict-fields or UNPACK.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and
> >>>>>> >> >>>>>>>>> worsening our representation. 
> >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> but now we're still stuck with a level of indirection > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> This means that every operation we perform on this > >>>>>> >> >>>>>>>>> structure > >>>>>> >> >>>>>>>>> will > >>>>>> >> >>>>>>>>> be about half of the speed of an implementation in most > >>>>>> >> >>>>>>>>> other > >>>>>> >> >>>>>>>>> languages > >>>>>> >> >>>>>>>>> assuming we're memory bound on loading things into > cache! > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> Making Progress > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> ---------------------- > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> I have been working on a number of data structures > where > >>>>>> >> >>>>>>>>> the > >>>>>> >> >>>>>>>>> indirection of going from something in * out to an > object > >>>>>> >> >>>>>>>>> in # > >>>>>> >> >>>>>>>>> which > >>>>>> >> >>>>>>>>> contains the real pointer to my target and coming back > >>>>>> >> >>>>>>>>> effectively doubles > >>>>>> >> >>>>>>>>> my runtime. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put > >>>>>> >> >>>>>>>>> the > >>>>>> >> >>>>>>>>> MutVar# > >>>>>> >> >>>>>>>>> onto the mutable list when we dirty it. There is a well > >>>>>> >> >>>>>>>>> defined > >>>>>> >> >>>>>>>>> write-barrier. 
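[Editor's note: the trimmed, Nil-constructor representation discussed above can be sketched as ordinary runnable Haskell. This is a minimal boxed version; the `Int` payload is an addition for illustration (the DLL in the thread carries no data), and `newNode`/`link`/`toList` are hypothetical helper names.]

```haskell
import Data.IORef

-- The Nil-constructor representation from the thread, plus an Int
-- payload (an assumption) so a traversal is observable.
data DLL = DLL !Int !(IORef DLL) !(IORef DLL) | Nil

-- Allocate an isolated node: one heap object for the constructor,
-- plus one MutVar# per IORef field (the remaining indirection).
newNode :: Int -> IO DLL
newNode x = DLL x <$> newIORef Nil <*> newIORef Nil

-- Link a before b: update a's next pointer and b's prev pointer.
link :: DLL -> DLL -> IO ()
link a@(DLL _ _ nxt) b@(DLL _ prv _) = do
  writeIORef nxt b
  writeIORef prv a
link _ _ = return ()

-- Walk forward collecting payloads; each step still chases
-- DLL ~> MutVar# RealWorld DLL ~> DLL, as described above.
toList :: DLL -> IO [Int]
toList Nil           = return []
toList (DLL x _ nxt) = (x :) <$> (readIORef nxt >>= toList)

main :: IO ()
main = do
  [a, b, c] <- mapM newNode [1, 2, 3]
  link a b
  link b c
  print =<< toList a   -- prints [1,2,3]
```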
> >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> I could change out the representation to use > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> I can just store two pointers in the MutableArray# > every > >>>>>> >> >>>>>>>>> time, > >>>>>> >> >>>>>>>>> but > >>>>>> >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the > >>>>>> >> >>>>>>>>> amount of > >>>>>> >> >>>>>>>>> distinct > >>>>>> >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 > >>>>>> >> >>>>>>>>> per > >>>>>> >> >>>>>>>>> object to 2. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get > to > >>>>>> >> >>>>>>>>> the > >>>>>> >> >>>>>>>>> array > >>>>>> >> >>>>>>>>> object and then chase it to the next DLL and chase that > >>>>>> >> >>>>>>>>> to the > >>>>>> >> >>>>>>>>> next array. I > >>>>>> >> >>>>>>>>> do get my two pointers together in memory though. 
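[Editor's note: the `MutableArray#`-backed representation above compiles as-is with stable primops from GHC.Exts. The sketch below assumes a two-slot layout with slot 0 for the previous node and slot 1 for the next node, and the helper names `newNode`/`next`/`setNext`/`same` are illustrative, not from the thread.]

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO (..))

-- The node *is* a two-slot mutable array; slot layout
-- (0 = prev, 1 = next) is an assumption.
data DLL = DLL (MutableArray# RealWorld DLL) | Nil

-- A fresh node with both slots pointing at Nil.
newNode :: IO DLL
newNode = IO $ \s -> case newArray# 2# Nil s of
  (# s', m #) -> (# s', DLL m #)

-- Read slot 1 to follow the forward link.
next :: DLL -> IO DLL
next (DLL m) = IO $ \s -> readArray# m 1# s
next Nil     = return Nil

-- Write slot 1 to set the forward link.
setNext :: DLL -> DLL -> IO ()
setNext (DLL m) n = IO $ \s -> case writeArray# m 1# n s of
  s' -> (# s', () #)
setNext Nil _ = return ()

-- Object identity via pointer equality on the underlying arrays.
same :: DLL -> DLL -> Bool
same (DLL a) (DLL b) = isTrue# (sameMutableArray# a b)
same Nil     Nil     = True
same _       _       = False

main :: IO ()
main = do
  a <- newNode
  b <- newNode
  setNext a b
  c <- next a
  print (same b c)   -- prints True
```

The `DLL` wrapper here is exactly the kind of thing GHC erases when the code is strict enough, which is the point being made in the thread.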
I'm
> >>>>>> >> >>>>>>>>> paying for a card marking table as well, which I don't
> >>>>>> >> >>>>>>>>> particularly need with just two pointers, but we can shed that
> >>>>>> >> >>>>>>>>> with the "SmallMutableArray#" machinery added back in 7.10,
> >>>>>> >> >>>>>>>>> which is just the old array code as a new data type, which can
> >>>>>> >> >>>>>>>>> speed things up a bit when you don't have very big arrays:
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> But what if I wanted my object itself to live in # and have two
> >>>>>> >> >>>>>>>>> mutable fields and be able to share the same write barrier?
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types.
> >>>>>> >> >>>>>>>>> What if we have one # -> * wrapper on the outside to deal with
> >>>>>> >> >>>>>>>>> the impedance mismatch between the imperative world and
> >>>>>> >> >>>>>>>>> Haskell, and then just let the ArrayArray#'s hold other
> >>>>>> >> >>>>>>>>> arrayarrays.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld)
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Now I need to make up a new Nil, which I can just make be a
> >>>>>> >> >>>>>>>>> special MutableArrayArray# I allocate on program startup. 
I can
> >>>>>> >> >>>>>>>>> even abuse pattern synonyms. Alternately I can exploit the
> >>>>>> >> >>>>>>>>> internals further to make this cheaper.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and
> >>>>>> >> >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding
> >>>>>> >> >>>>>>>>> and next entry in the linked list.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into
> >>>>>> >> >>>>>>>>> a strict world, and everything there lives in #.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> next :: DLL -> IO DLL
> >>>>>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# m 1# s of
> >>>>>> >> >>>>>>>>>   (# s', n #) -> (# s', DLL n #)
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to
> >>>>>> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty
> >>>>>> >> >>>>>>>>> easily when they are known strict and you chain operations of
> >>>>>> >> >>>>>>>>> this sort!
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Cleaning it Up
> >>>>>> >> >>>>>>>>> ------------------
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Now I have one outermost indirection pointing to an array that
> >>>>>> >> >>>>>>>>> points directly to other arrays. 
> >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> I'm stuck paying for a card marking table per object, > but > >>>>>> >> >>>>>>>>> I can > >>>>>> >> >>>>>>>>> fix > >>>>>> >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and > >>>>>> >> >>>>>>>>> using a > >>>>>> >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me > >>>>>> >> >>>>>>>>> store a > >>>>>> >> >>>>>>>>> mixture of > >>>>>> >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data > >>>>>> >> >>>>>>>>> structure. > >>>>>> >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing > >>>>>> >> >>>>>>>>> the > >>>>>> >> >>>>>>>>> existing > >>>>>> >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one > >>>>>> >> >>>>>>>>> of the > >>>>>> >> >>>>>>>>> arguments it > >>>>>> >> >>>>>>>>> takes. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> This is almost ideal, but not quite. I often have > fields > >>>>>> >> >>>>>>>>> that > >>>>>> >> >>>>>>>>> would > >>>>>> >> >>>>>>>>> be best left unboxed. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can > >>>>>> >> >>>>>>>>> currently > >>>>>> >> >>>>>>>>> at > >>>>>> >> >>>>>>>>> best point one of the entries of the SmallMutableArray# > >>>>>> >> >>>>>>>>> at a > >>>>>> >> >>>>>>>>> boxed or at a > >>>>>> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove > the > >>>>>> >> >>>>>>>>> int in > >>>>>> >> >>>>>>>>> question in > >>>>>> >> >>>>>>>>> there. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> e.g. 
if I were to implement a hash-array-mapped-trie I > >>>>>> >> >>>>>>>>> need to > >>>>>> >> >>>>>>>>> store masks and administrivia as I walk down the tree. > >>>>>> >> >>>>>>>>> Having to > >>>>>> >> >>>>>>>>> go off to > >>>>>> >> >>>>>>>>> the side costs me the entire win from avoiding the > first > >>>>>> >> >>>>>>>>> pointer > >>>>>> >> >>>>>>>>> chase. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we > >>>>>> >> >>>>>>>>> could > >>>>>> >> >>>>>>>>> construct that had n words with unsafe access and m > >>>>>> >> >>>>>>>>> pointers to > >>>>>> >> >>>>>>>>> other heap > >>>>>> >> >>>>>>>>> objects, one that could put itself on the mutable list > >>>>>> >> >>>>>>>>> when any > >>>>>> >> >>>>>>>>> of those > >>>>>> >> >>>>>>>>> pointers changed then I could shed this last factor of > >>>>>> >> >>>>>>>>> two in > >>>>>> >> >>>>>>>>> all > >>>>>> >> >>>>>>>>> circumstances. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> Prototype > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> ------------- > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> Over the last few days I've put together a small > >>>>>> >> >>>>>>>>> prototype > >>>>>> >> >>>>>>>>> implementation with a few non-trivial imperative data > >>>>>> >> >>>>>>>>> structures > >>>>>> >> >>>>>>>>> for things > >>>>>> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem > >>>>>> >> >>>>>>>>> and > >>>>>> >> >>>>>>>>> order-maintenance. 
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> https://github.com/ekmett/structs
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Notable bits:
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of
> >>>>>> >> >>>>>>>>> link-cut trees in this style.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that
> >>>>>> >> >>>>>>>>> make it go fast.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost
> >>>>>> >> >>>>>>>>> all the references to the LinkCut or Object data constructor
> >>>>>> >> >>>>>>>>> get optimized away, and we're left with beautiful strict code
> >>>>>> >> >>>>>>>>> directly mutating our underlying representation.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a short
> >>>>>> >> >>>>>>>>> article.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> -Edward
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones
> >>>>>> >> >>>>>>>>> wrote:
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Just to say that I have no idea what is going on in this
> >>>>>> >> >>>>>>>>> thread. What is ArrayArray? What is the issue in general? Is
> >>>>>> >> >>>>>>>>> there a ticket? 
Is
> >>>>>> >> >>>>>>>>> there a wiki page?
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be a
> >>>>>> >> >>>>>>>>> good thing.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Simon
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On
> >>>>>> >> >>>>>>>>> Behalf Of Edward Kmett
> >>>>>> >> >>>>>>>>> Sent: 21 August 2015 05:25
> >>>>>> >> >>>>>>>>> To: Manuel M T Chakravarty
> >>>>>> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs
> >>>>>> >> >>>>>>>>> Subject: Re: ArrayArrays
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would
> >>>>>> >> >>>>>>>>> be very handy as well. 
> >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> Consider right now if I have something like an > >>>>>> >> >>>>>>>>> order-maintenance > >>>>>> >> >>>>>>>>> structure I have: > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray > s) > >>>>>> >> >>>>>>>>> {-# > >>>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} > !(MutVar > >>>>>> >> >>>>>>>>> s > >>>>>> >> >>>>>>>>> (Upper s)) > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper > s)) > >>>>>> >> >>>>>>>>> {-# > >>>>>> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} > !(MutVar > >>>>>> >> >>>>>>>>> s > >>>>>> >> >>>>>>>>> (Lower s)) {-# > >>>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> The former contains, logically, a mutable integer and > two > >>>>>> >> >>>>>>>>> pointers, > >>>>>> >> >>>>>>>>> one for forward and one for backwards. The latter is > >>>>>> >> >>>>>>>>> basically > >>>>>> >> >>>>>>>>> the same > >>>>>> >> >>>>>>>>> thing with a mutable reference up pointing at the > >>>>>> >> >>>>>>>>> structure > >>>>>> >> >>>>>>>>> above. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> On the heap this is an object that points to a > structure > >>>>>> >> >>>>>>>>> for the > >>>>>> >> >>>>>>>>> bytearray, and points to another structure for each > >>>>>> >> >>>>>>>>> mutvar which > >>>>>> >> >>>>>>>>> each point > >>>>>> >> >>>>>>>>> to the other 'Upper' structure. So there is a level of > >>>>>> >> >>>>>>>>> indirection smeared > >>>>>> >> >>>>>>>>> over everything. 
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward link
> >>>>>> >> >>>>>>>>> from the structure below to the structure above.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Converted into ArrayArray#s I'd get
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s)
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and
> >>>>>> >> >>>>>>>>> the next 2 slots pointing to the previous and next objects,
> >>>>>> >> >>>>>>>>> represented just as their MutableArrayArray#s. I can use
> >>>>>> >> >>>>>>>>> sameMutableArrayArray# on these for object identity, which
> >>>>>> >> >>>>>>>>> lets me check for the ends of the lists by tying things back
> >>>>>> >> >>>>>>>>> on themselves.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> and below that
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s)
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> is similar, with an extra MutableArrayArray# slot pointing up
> >>>>>> >> >>>>>>>>> to an upper structure. 
> >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> I can then write a handful of combinators for getting > out > >>>>>> >> >>>>>>>>> the > >>>>>> >> >>>>>>>>> slots > >>>>>> >> >>>>>>>>> in question, while it has gained a level of indirection > >>>>>> >> >>>>>>>>> between > >>>>>> >> >>>>>>>>> the wrapper > >>>>>> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that > >>>>>> >> >>>>>>>>> one can > >>>>>> >> >>>>>>>>> be basically > >>>>>> >> >>>>>>>>> erased by ghc. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> Unlike before I don't have several separate objects on > >>>>>> >> >>>>>>>>> the heap > >>>>>> >> >>>>>>>>> for > >>>>>> >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# > for > >>>>>> >> >>>>>>>>> the > >>>>>> >> >>>>>>>>> object itself, > >>>>>> >> >>>>>>>>> and the MutableByteArray# that it references to carry > >>>>>> >> >>>>>>>>> around the > >>>>>> >> >>>>>>>>> mutable > >>>>>> >> >>>>>>>>> int. > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> The only pain points are > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> 1.) the aforementioned limitation that currently > prevents > >>>>>> >> >>>>>>>>> me > >>>>>> >> >>>>>>>>> from > >>>>>> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or > Array > >>>>>> >> >>>>>>>>> into an > >>>>>> >> >>>>>>>>> ArrayArray > >>>>>> >> >>>>>>>>> leaving me in a little ghetto disconnected from the > rest > >>>>>> >> >>>>>>>>> of > >>>>>> >> >>>>>>>>> Haskell, > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> and > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us > >>>>>> >> >>>>>>>>> avoid the > >>>>>> >> >>>>>>>>> card marking overhead. 
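[The slot combinators described above might look roughly like this. This is a sketch only: the Upper wrapper, the slot indices, and the function names are illustrative, and it assumes the MutableArrayArray# primops exposed by GHC.Prim as of GHC 7.10.]

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Slots where

import GHC.Prim
import GHC.Types (IO (..))

-- One # -> * wrapper; everything behind it is unlifted.
data Upper = Upper (MutableArrayArray# RealWorld)

-- Hypothetical layout: slot 0 holds the MutableByteArray# "field" with the
-- mutable int; slots 1 and 2 hold the previous and next Upper "slots".
prevUpper :: Upper -> IO Upper
prevUpper (Upper arr) = IO $ \s ->
  case readMutableArrayArrayArray# arr 1# s of
    (# s', p #) -> (# s', Upper p #)

nextUpper :: Upper -> IO Upper
nextUpper (Upper arr) = IO $ \s ->
  case readMutableArrayArrayArray# arr 2# s of
    (# s', n #) -> (# s', Upper n #)

setNextUpper :: Upper -> Upper -> IO ()
setNextUpper (Upper arr) (Upper n) = IO $ \s ->
  (# writeMutableArrayArrayArray# arr 2# n s, () #)
```

[The Upper wrapper is known strict at use sites, so GHC can typically erase it, which is the "vanish" behavior described above.]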
> >>>>>> >> >>>>>>>>> These objects are all small, 3-4 pointers wide. Card
> >>>>>> >> >>>>>>>>> marking doesn't help.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Alternately I could just try to do really evil things and
> >>>>>> >> >>>>>>>>> convert the whole mess to SmallArrays and then figure out how to
> >>>>>> >> >>>>>>>>> unsafeCoerce my way to glory, stuffing the #'d references to the other
> >>>>>> >> >>>>>>>>> arrays directly into the SmallArray as slots, removing the limitation we
> >>>>>> >> >>>>>>>>> see here by aping the MutableArrayArray# s API, but that gets really
> >>>>>> >> >>>>>>>>> really dangerous!
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the
> >>>>>> >> >>>>>>>>> altar of speed here, but I'd like to be able to let the GC move them
> >>>>>> >> >>>>>>>>> and collect them, which rules out simpler Ptr and Addr based solutions.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> -Edward
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty
> >>>>>> >> >>>>>>>>> wrote:
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> That's an interesting idea.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Manuel
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> > Edward Kmett :
> >>>>>> >> >>>>>>>>> >
> >>>>>> >> >>>>>>>>> > Would it be possible to add unsafe primops to add
> >>>>>> >> >>>>>>>>> > Array# and
> >>>>>> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#?
The fact that > >>>>>> >> >>>>>>>>> > the > >>>>>> >> >>>>>>>>> > ArrayArray# entries > >>>>>> >> >>>>>>>>> > are all directly unlifted avoiding a level of > >>>>>> >> >>>>>>>>> > indirection for > >>>>>> >> >>>>>>>>> > the containing > >>>>>> >> >>>>>>>>> > structure is amazing, but I can only currently use it > >>>>>> >> >>>>>>>>> > if my > >>>>>> >> >>>>>>>>> > leaf level data > >>>>>> >> >>>>>>>>> > can be 100% unboxed and distributed among > ByteArray#s. > >>>>>> >> >>>>>>>>> > It'd be > >>>>>> >> >>>>>>>>> > nice to be > >>>>>> >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff > >>>>>> >> >>>>>>>>> > down at > >>>>>> >> >>>>>>>>> > the leaves to > >>>>>> >> >>>>>>>>> > hold lifted contents. > >>>>>> >> >>>>>>>>> > > >>>>>> >> >>>>>>>>> > I accept fully that if I name the wrong type when I > go > >>>>>> >> >>>>>>>>> > to > >>>>>> >> >>>>>>>>> > access > >>>>>> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd > >>>>>> >> >>>>>>>>> > do that > >>>>>> >> >>>>>>>>> > if i tried to > >>>>>> >> >>>>>>>>> > use one of the members that held a nested ArrayArray# > >>>>>> >> >>>>>>>>> > as a > >>>>>> >> >>>>>>>>> > ByteArray# > >>>>>> >> >>>>>>>>> > anyways, so it isn't like there is a safety story > >>>>>> >> >>>>>>>>> > preventing > >>>>>> >> >>>>>>>>> > this. > >>>>>> >> >>>>>>>>> > > >>>>>> >> >>>>>>>>> > I've been hunting for ways to try to kill the > >>>>>> >> >>>>>>>>> > indirection > >>>>>> >> >>>>>>>>> > problems I get with Haskell and mutable structures, > and > >>>>>> >> >>>>>>>>> > I > >>>>>> >> >>>>>>>>> > could shoehorn a > >>>>>> >> >>>>>>>>> > number of them into ArrayArrays if this worked. 
> >>>>>> >> >>>>>>>>> > > >>>>>> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of > >>>>>> >> >>>>>>>>> > unnecessary > >>>>>> >> >>>>>>>>> > indirection compared to c/java and this could reduce > >>>>>> >> >>>>>>>>> > that pain > >>>>>> >> >>>>>>>>> > to just 1 > >>>>>> >> >>>>>>>>> > level of unnecessary indirection. > >>>>>> >> >>>>>>>>> > > >>>>>> >> >>>>>>>>> > -Edward > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > _______________________________________________ > >>>>>> >> >>>>>>>>> > ghc-devs mailing list > >>>>>> >> >>>>>>>>> > ghc-devs at haskell.org > >>>>>> >> >>>>>>>>> > > >>>>>> >> >>>>>>>>> > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> _______________________________________________ > >>>>>> >> >>>>>>>>> ghc-devs mailing list > >>>>>> >> >>>>>>>>> ghc-devs at haskell.org > >>>>>> >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >>>>>> >> >>>>>>> > >>>>>> >> >>>>>>> > >>>>>> >> >>>>> > >>>>>> >> >>> > >>>>>> >> >> > >>>>>> >> > > >>>>>> >> > > >>>>>> >> > _______________________________________________ > >>>>>> >> > ghc-devs mailing list > >>>>>> >> > ghc-devs at haskell.org > >>>>>> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >>>>>> >> > > >>>>>> > > >>>>>> > > >>>>> > >>>>> > >>>> > >>>> > >>>> _______________________________________________ > >>>> ghc-devs mailing list > >>>> ghc-devs at haskell.org > >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >>>> > >>> > >> > > > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eir at cis.upenn.edu Tue Sep 1 06:45:40 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Mon, 31 Aug 2015 23:45:40 -0700 Subject: more releases Message-ID: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> Hi devs, An interesting topic came up over dinner tonight: what if GHC made more releases? As an extreme example, we could release a new point version every time a bug fix gets merged to the stable branch. This may be a terrible idea. But what's stopping us from doing so? The biggest objection I can see is that we would want to make sure that users' code would work with the new version. Could the Stackage crew help us with this? If they run their nightly build with a release candidate and diff against the prior results, we would get a pretty accurate sense of whether the bugfix is good. If this test succeeds, why not release? Would it be hard to automate the packaging/posting process? The advantage to more releases is that it gets bugfixes in more hands sooner. What are the disadvantages? Richard PS: I'm not 100% sold on this idea. But I thought it was interesting enough to raise a broader discussion. From ekmett at gmail.com Tue Sep 1 06:50:14 2015 From: ekmett at gmail.com (Edward Kmett) Date: Mon, 31 Aug 2015 23:50:14 -0700 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> Message-ID: Works for me. On Mon, Aug 31, 2015 at 10:14 PM, Johan Tibell wrote: > Works for me. > > On Mon, Aug 31, 2015 at 3:50 PM, Ryan Yates wrote: > >> Any time works for me. >> >> Ryan >> >> On Mon, Aug 31, 2015 at 6:11 PM, Ryan Newton wrote: >> > Dear Edward, Ryan Yates, and other interested parties -- >> > >> > So when should we meet up about this? >> > >> > May I propose the Tues afternoon break for everyone at ICFP who is >> > interested in this topic? 
We can meet out in the coffee area and >> congregate >> > around Edward Kmett, who is tall and should be easy to find ;-). >> > >> > I think Ryan is going to show us how to use his new primops for combined >> > array + other fields in one heap object? >> > >> > On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett wrote: >> >> >> >> Without a custom primitive it doesn't help much there, you have to >> store >> >> the indirection to the mask. >> >> >> >> With a custom primitive it should cut the on heap root-to-leaf path of >> >> everything in the HAMT in half. A shorter HashMap was actually one of >> the >> >> motivating factors for me doing this. It is rather astoundingly >> difficult to >> >> beat the performance of HashMap, so I had to start cheating pretty >> badly. ;) >> >> >> >> -Edward >> >> >> >> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell >> >> wrote: >> >>> >> >>> I'd also be interested to chat at ICFP to see if I can use this for my >> >>> HAMT implementation. >> >>> >> >>> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett >> wrote: >> >>>> >> >>>> Sounds good to me. Right now I'm just hacking up composable accessors >> >>>> for "typed slots" in a fairly lens-like fashion, and treating the >> set of >> >>>> slots I define and the 'new' function I build for the data type as >> its API, >> >>>> and build atop that. This could eventually graduate to >> template-haskell, but >> >>>> I'm not entirely satisfied with the solution I have. I currently >> distinguish >> >>>> between what I'm calling "slots" (things that point directly to >> another >> >>>> SmallMutableArrayArray# sans wrapper) and "fields" which point >> directly to >> >>>> the usual Haskell data types because unifying the two notions meant >> that I >> >>>> couldn't lift some coercions out "far enough" to make them vanish. 
>> >>>> >> >>>> I'll be happy to run through my current working set of issues in >> person >> >>>> and -- as things get nailed down further -- in a longer lived medium >> than in >> >>>> personal conversations. ;) >> >>>> >> >>>> -Edward >> >>>> >> >>>> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton >> wrote: >> >>>>> >> >>>>> I'd also love to meet up at ICFP and discuss this. I think the >> array >> >>>>> primops plus a TH layer that lets (ab)use them many times without >> too much >> >>>>> marginal cost sounds great. And I'd like to learn how we could be >> either >> >>>>> early users of, or help with, this infrastructure. >> >>>>> >> >>>>> CC'ing in Ryan Scot and Omer Agacan who may also be interested in >> >>>>> dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. >> student >> >>>>> who is currently working on concurrent data structures in Haskell, >> but will >> >>>>> not be at ICFP. >> >>>>> >> >>>>> >> >>>>> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates >> >>>>> wrote: >> >>>>>> >> >>>>>> I completely agree. I would love to spend some time during ICFP >> and >> >>>>>> friends talking about what it could look like. My small array for >> STM >> >>>>>> changes for the RTS can be seen here [1]. It is on a branch >> somewhere >> >>>>>> between 7.8 and 7.10 and includes irrelevant STM bits and some >> >>>>>> confusing naming choices (sorry), but should cover all the details >> >>>>>> needed to implement it for a non-STM context. The biggest surprise >> >>>>>> for me was following small array too closely and having a word/byte >> >>>>>> offset miss-match [2]. 
>> >>>>>> >> >>>>>> [1]: >> >>>>>> >> https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut >> >>>>>> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 >> >>>>>> >> >>>>>> Ryan >> >>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett >> >>>>>> wrote: >> >>>>>> > I'd love to have that last 10%, but its a lot of work to get >> there >> >>>>>> > and more >> >>>>>> > importantly I don't know quite what it should look like. >> >>>>>> > >> >>>>>> > On the other hand, I do have a pretty good idea of how the >> >>>>>> > primitives above >> >>>>>> > could be banged out and tested in a long evening, well in time >> for >> >>>>>> > 7.12. And >> >>>>>> > as noted earlier, those remain useful even if a nicer typed >> version >> >>>>>> > with an >> >>>>>> > extra level of indirection to the sizes is built up after. >> >>>>>> > >> >>>>>> > The rest sounds like a good graduate student project for someone >> who >> >>>>>> > has >> >>>>>> > graduate students lying around. Maybe somebody at Indiana >> University >> >>>>>> > who has >> >>>>>> > an interest in type theory and parallelism can find us one. =) >> >>>>>> > >> >>>>>> > -Edward >> >>>>>> > >> >>>>>> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates > > >> >>>>>> > wrote: >> >>>>>> >> >> >>>>>> >> I think from my perspective, the motivation for getting the type >> >>>>>> >> checker involved is primarily bringing this to the level where >> >>>>>> >> users >> >>>>>> >> could be expected to build these structures. it is reasonable >> to >> >>>>>> >> think that there are people who want to use STM (a context with >> >>>>>> >> mutation already) to implement a straight forward data structure >> >>>>>> >> that >> >>>>>> >> avoids extra indirection penalty. There should be some places >> >>>>>> >> where >> >>>>>> >> knowing that things are field accesses rather then array >> indexing >> >>>>>> >> could be helpful, but I think GHC is good right now about >> handling >> >>>>>> >> constant offsets. 
In my code I don't do any bounds checking as >> I >> >>>>>> >> know >> >>>>>> >> I will only be accessing my arrays with constant indexes. I >> make >> >>>>>> >> wrappers for each field access and leave all the unsafe stuff in >> >>>>>> >> there. When things go wrong though, the compiler is no help. >> >>>>>> >> Maybe >> >>>>>> >> template Haskell that generates the appropriate wrappers is the >> >>>>>> >> right >> >>>>>> >> direction to go. >> >>>>>> >> There is another benefit for me when working with these as >> arrays >> >>>>>> >> in >> >>>>>> >> that it is quite simple and direct (given the hoops already >> jumped >> >>>>>> >> through) to play with alignment. I can ensure two pointers are >> >>>>>> >> never >> >>>>>> >> on the same cache-line by just spacing things out in the array. >> >>>>>> >> >> >>>>>> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett > > >> >>>>>> >> wrote: >> >>>>>> >> > They just segfault at this level. ;) >> >>>>>> >> > >> >>>>>> >> > Sent from my iPhone >> >>>>>> >> > >> >>>>>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton >> >>>>>> >> > wrote: >> >>>>>> >> > >> >>>>>> >> > You presumably also save a bounds check on reads by >> hard-coding >> >>>>>> >> > the >> >>>>>> >> > sizes? >> >>>>>> >> > >> >>>>>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett < >> ekmett at gmail.com> >> >>>>>> >> > wrote: >> >>>>>> >> >> >> >>>>>> >> >> Also there are 4 different "things" here, basically >> depending on >> >>>>>> >> >> two >> >>>>>> >> >> independent questions: >> >>>>>> >> >> >> >>>>>> >> >> a.) if you want to shove the sizes into the info table, and >> >>>>>> >> >> b.) if you want cardmarking. >> >>>>>> >> >> >> >>>>>> >> >> Versions with/without cardmarking for different sizes can be >> >>>>>> >> >> done >> >>>>>> >> >> pretty >> >>>>>> >> >> easily, but as noted, the infotable variants are pretty >> >>>>>> >> >> invasive. 
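[The wrapper-per-field style Ryan describes can be sketched with the primitive package, whose readArray/writeArray do no bounds checking. The layout and names here are hypothetical; index 8 merely spaces the two pointers a cache line apart on a 64-bit heap, as described above.]

```haskell
import Control.Monad.Primitive (PrimState)
import Data.Primitive.Array (MutableArray, readArray, writeArray)

-- A node is one MutableArray; "fields" live at fixed, unchecked indices.
type Node a = MutableArray (PrimState IO) a

readNext :: Node a -> IO a
readNext n = readArray n 0   -- no bounds check: the index is a known constant

readPrev :: Node a -> IO a
readPrev n = readArray n 8   -- 8 * 8 bytes apart, so next/prev sit on
                             -- separate cache lines

writeNext :: Node a -> a -> IO ()
writeNext n = writeArray n 0
```

[When a constant index is wrong here, nothing catches it at compile time, which is exactly the gap the Template Haskell wrapper generation mentioned above would close.]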
>> >>>>>> >> >> >> >>>>>> >> >> -Edward >> >>>>>> >> >> >> >>>>>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett < >> ekmett at gmail.com> >> >>>>>> >> >> wrote: >> >>>>>> >> >>> >> >>>>>> >> >>> Well, on the plus side you'd save 16 bytes per object, which >> >>>>>> >> >>> adds up >> >>>>>> >> >>> if >> >>>>>> >> >>> they were small enough and there are enough of them. You >> get a >> >>>>>> >> >>> bit >> >>>>>> >> >>> better >> >>>>>> >> >>> locality of reference in terms of what fits in the first >> cache >> >>>>>> >> >>> line of >> >>>>>> >> >>> them. >> >>>>>> >> >>> >> >>>>>> >> >>> -Edward >> >>>>>> >> >>> >> >>>>>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton >> >>>>>> >> >>> >> >>>>>> >> >>> wrote: >> >>>>>> >> >>>> >> >>>>>> >> >>>> Yes. And for the short term I can imagine places we will >> >>>>>> >> >>>> settle with >> >>>>>> >> >>>> arrays even if it means tracking lengths unnecessarily and >> >>>>>> >> >>>> unsafeCoercing >> >>>>>> >> >>>> pointers whose types don't actually match their siblings. >> >>>>>> >> >>>> >> >>>>>> >> >>>> Is there anything to recommend the hacks mentioned for >> fixed >> >>>>>> >> >>>> sized >> >>>>>> >> >>>> array >> >>>>>> >> >>>> objects *other* than using them to fake structs? (Much to >> >>>>>> >> >>>> derecommend, as >> >>>>>> >> >>>> you mentioned!) >> >>>>>> >> >>>> >> >>>>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett >> >>>>>> >> >>>> >> >>>>>> >> >>>> wrote: >> >>>>>> >> >>>>> >> >>>>>> >> >>>>> I think both are useful, but the one you suggest requires >> a >> >>>>>> >> >>>>> lot more >> >>>>>> >> >>>>> plumbing and doesn't subsume all of the usecases of the >> >>>>>> >> >>>>> other. 
>> >>>>>> >> >>>>> >> >>>>>> >> >>>>> -Edward >> >>>>>> >> >>>>> >> >>>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton >> >>>>>> >> >>>>> >> >>>>>> >> >>>>> wrote: >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> So that primitive is an array like thing (Same pointed >> type, >> >>>>>> >> >>>>>> unbounded >> >>>>>> >> >>>>>> length) with extra payload. >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> I can see how we can do without structs if we have >> arrays, >> >>>>>> >> >>>>>> especially >> >>>>>> >> >>>>>> with the extra payload at front. But wouldn't the general >> >>>>>> >> >>>>>> solution >> >>>>>> >> >>>>>> for >> >>>>>> >> >>>>>> structs be one that that allows new user data type defs >> for >> >>>>>> >> >>>>>> # >> >>>>>> >> >>>>>> types? >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> wrote: >> >>>>>> >> >>>>>>> >> >>>>>> >> >>>>>>> Some form of MutableStruct# with a known number of words >> >>>>>> >> >>>>>>> and a >> >>>>>> >> >>>>>>> known >> >>>>>> >> >>>>>>> number of pointers is basically what Ryan Yates was >> >>>>>> >> >>>>>>> suggesting >> >>>>>> >> >>>>>>> above, but >> >>>>>> >> >>>>>>> where the word counts were stored in the objects >> >>>>>> >> >>>>>>> themselves. >> >>>>>> >> >>>>>>> >> >>>>>> >> >>>>>>> Given that it'd have a couple of words for those counts >> >>>>>> >> >>>>>>> it'd >> >>>>>> >> >>>>>>> likely >> >>>>>> >> >>>>>>> want to be something we build in addition to MutVar# >> rather >> >>>>>> >> >>>>>>> than a >> >>>>>> >> >>>>>>> replacement. >> >>>>>> >> >>>>>>> >> >>>>>> >> >>>>>>> On the other hand, if we had to fix those numbers and >> build >> >>>>>> >> >>>>>>> info >> >>>>>> >> >>>>>>> tables that knew them, and typechecker support, for >> >>>>>> >> >>>>>>> instance, it'd >> >>>>>> >> >>>>>>> get >> >>>>>> >> >>>>>>> rather invasive. 
>> >>>>>> >> >>>>>>> >> >>>>>> >> >>>>>>> Also, a number of things that we can do with the 'sized' >> >>>>>> >> >>>>>>> versions >> >>>>>> >> >>>>>>> above, like working with evil unsized c-style arrays >> >>>>>> >> >>>>>>> directly >> >>>>>> >> >>>>>>> inline at the >> >>>>>> >> >>>>>>> end of the structure cease to be possible, so it isn't >> even >> >>>>>> >> >>>>>>> a pure >> >>>>>> >> >>>>>>> win if we >> >>>>>> >> >>>>>>> did the engineering effort. >> >>>>>> >> >>>>>>> >> >>>>>> >> >>>>>>> I think 90% of the needs I have are covered just by >> adding >> >>>>>> >> >>>>>>> the one >> >>>>>> >> >>>>>>> primitive. The last 10% gets pretty invasive. >> >>>>>> >> >>>>>>> >> >>>>>> >> >>>>>>> -Edward >> >>>>>> >> >>>>>>> >> >>>>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton >> >>>>>> >> >>>>>>> >> >>>>>> >> >>>>>>> wrote: >> >>>>>> >> >>>>>>>> >> >>>>>> >> >>>>>>>> I like the possibility of a general solution for >> mutable >> >>>>>> >> >>>>>>>> structs >> >>>>>> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why >> >>>>>> >> >>>>>>>> it's hard. >> >>>>>> >> >>>>>>>> >> >>>>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of >> >>>>>> >> >>>>>>>> object >> >>>>>> >> >>>>>>>> identity problems. But what about directly supporting >> an >> >>>>>> >> >>>>>>>> extensible set of >> >>>>>> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even >> >>>>>> >> >>>>>>>> replacing) >> >>>>>> >> >>>>>>>> MutVar#? That >> >>>>>> >> >>>>>>>> may be too much work, but is it problematic otherwise? >> >>>>>> >> >>>>>>>> >> >>>>>> >> >>>>>>>> Needless to say, this is also critical if we ever want >> >>>>>> >> >>>>>>>> best in >> >>>>>> >> >>>>>>>> class >> >>>>>> >> >>>>>>>> lockfree mutable structures, just like their Stm and >> >>>>>> >> >>>>>>>> sequential >> >>>>>> >> >>>>>>>> counterparts. 
>> >>>>>> >> >>>>>>>> >> >>>>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >> >>>>>> >> >>>>>>>> wrote: >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it >> into a >> >>>>>> >> >>>>>>>>> short >> >>>>>> >> >>>>>>>>> article. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC >> Trac, >> >>>>>> >> >>>>>>>>> and >> >>>>>> >> >>>>>>>>> maybe >> >>>>>> >> >>>>>>>>> make a ticket for it. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Thanks >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Simon >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >> >>>>>> >> >>>>>>>>> Sent: 27 August 2015 16:54 >> >>>>>> >> >>>>>>>>> To: Simon Peyton Jones >> >>>>>> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >> >>>>>> >> >>>>>>>>> Subject: Re: ArrayArrays >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified >> >>>>>> >> >>>>>>>>> invariant. It >> >>>>>> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or >> >>>>>> >> >>>>>>>>> ByteArray#'s. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> While those live in #, they are garbage collected >> >>>>>> >> >>>>>>>>> objects, so >> >>>>>> >> >>>>>>>>> this >> >>>>>> >> >>>>>>>>> all lives on the heap. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> They were added to make some of the DPH stuff fast >> when >> >>>>>> >> >>>>>>>>> it has >> >>>>>> >> >>>>>>>>> to >> >>>>>> >> >>>>>>>>> deal with nested arrays. 
>> >>>>>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better
>> >>>>>> >> >>>>>>>>> thing.
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> The Problem
>> >>>>>> >> >>>>>>>>> -----------------
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> Consider the scenario where you write a classic
>> >>>>>> >> >>>>>>>>> doubly-linked list in Haskell.
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL))
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3
>> >>>>>> >> >>>>>>>>> pointers on the heap.
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL)
>> >>>>>> >> >>>>>>>>> ~> Maybe DLL ~> DLL
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> That is 3 levels of indirection.
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with
>> >>>>>> >> >>>>>>>>> -funbox-strict-fields or UNPACK
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL
>> >>>>>> >> >>>>>>>>> and worsening our representation.
>> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> but now we're still stuck with a level of indirection >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> This means that every operation we perform on this >> >>>>>> >> >>>>>>>>> structure >> >>>>>> >> >>>>>>>>> will >> >>>>>> >> >>>>>>>>> be about half of the speed of an implementation in >> most >> >>>>>> >> >>>>>>>>> other >> >>>>>> >> >>>>>>>>> languages >> >>>>>> >> >>>>>>>>> assuming we're memory bound on loading things into >> cache! >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Making Progress >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> ---------------------- >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> I have been working on a number of data structures >> where >> >>>>>> >> >>>>>>>>> the >> >>>>>> >> >>>>>>>>> indirection of going from something in * out to an >> object >> >>>>>> >> >>>>>>>>> in # >> >>>>>> >> >>>>>>>>> which >> >>>>>> >> >>>>>>>>> contains the real pointer to my target and coming back >> >>>>>> >> >>>>>>>>> effectively doubles >> >>>>>> >> >>>>>>>>> my runtime. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put >> >>>>>> >> >>>>>>>>> the >> >>>>>> >> >>>>>>>>> MutVar# >> >>>>>> >> >>>>>>>>> onto the mutable list when we dirty it. There is a >> well >> >>>>>> >> >>>>>>>>> defined >> >>>>>> >> >>>>>>>>> write-barrier. 
>> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> I could change out the representation to use >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> I can just store two pointers in the MutableArray# >> every >> >>>>>> >> >>>>>>>>> time, >> >>>>>> >> >>>>>>>>> but >> >>>>>> >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the >> >>>>>> >> >>>>>>>>> amount of >> >>>>>> >> >>>>>>>>> distinct >> >>>>>> >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from >> 3 >> >>>>>> >> >>>>>>>>> per >> >>>>>> >> >>>>>>>>> object to 2. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> I still have to go out to the heap from my DLL and >> get to >> >>>>>> >> >>>>>>>>> the >> >>>>>> >> >>>>>>>>> array >> >>>>>> >> >>>>>>>>> object and then chase it to the next DLL and chase >> that >> >>>>>> >> >>>>>>>>> to the >> >>>>>> >> >>>>>>>>> next array. I >> >>>>>> >> >>>>>>>>> do get my two pointers together in memory though. 
>> >>>>>> >> >>>>>>>>> I'm paying for a card marking table as well, which I don't
>> >>>>>> >> >>>>>>>>> particularly need with just two pointers, but we can shed that with the
>> >>>>>> >> >>>>>>>>> "SmallMutableArray#" machinery added back in 7.10, which is just the old
>> >>>>>> >> >>>>>>>>> array code as a new data type, which can speed things up a bit when you
>> >>>>>> >> >>>>>>>>> don't have very big arrays:
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> But what if I wanted my object itself to live in # and
>> >>>>>> >> >>>>>>>>> have two mutable fields and be able to share the same write barrier?
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array
>> >>>>>> >> >>>>>>>>> types. What if we have one # -> * wrapper on the outside to deal with the
>> >>>>>> >> >>>>>>>>> impedance mismatch between the imperative world and Haskell, and then just
>> >>>>>> >> >>>>>>>>> let the ArrayArray#'s hold other arrayarrays.
>> >>>>>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld)
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make be
>> >>>>>> >> >>>>>>>>> a special MutableArrayArray# I allocate on program startup. I can even
>> >>>>>> >> >>>>>>>>> abuse pattern synonyms. Alternately I can exploit the internals further
>> >>>>>> >> >>>>>>>>> to make this cheaper.
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and
>> >>>>>> >> >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding and next
>> >>>>>> >> >>>>>>>>> entry in the linked list.
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me'
>> >>>>>> >> >>>>>>>>> into a strict world, and everything there lives in #.
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> next :: DLL -> IO DLL
>> >>>>>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# m 1# s of
>> >>>>>> >> >>>>>>>>>   (# s', n #) -> (# s', DLL n #)
>> >>>>>> >> >>>>>>>>>
>> >>>>>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that
>> >>>>>> >> >>>>>>>>> code to keep things unboxed. The 'DLL' wrappers get removed pretty
>> >>>>>> >> >>>>>>>>> easily when they are known strict and you chain operations of this sort!
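[Tying off the ends with a distinguished sentinel, as described in the message above, might be sketched like so. The slot count, the NOINLINE top-level `nil`, and the use of sameMutableArrayArray# for the identity test are illustrative assumptions, not the thread's actual code.]

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Prim
import GHC.Types (IO (..), isTrue#)
import System.IO.Unsafe (unsafePerformIO)

data DLL = DLL (MutableArrayArray# RealWorld)

-- A special MutableArrayArray# allocated once at program startup.
nil :: DLL
nil = unsafePerformIO $ IO $ \s ->
  case newArrayArray# 3# s of
    (# s', m #) -> (# s', DLL m #)
{-# NOINLINE nil #-}

-- End-of-list check by object identity, which is what
-- sameMutableArrayArray# gives us.
isNil :: DLL -> Bool
isNil (DLL m) = case nil of
  DLL n -> isTrue# (sameMutableArrayArray# n m)
```

[Tying the end nodes back onto `nil` (or onto themselves) keeps every slot a valid MutableArrayArray#, so the reads never see an out-of-family object.]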
>> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Cleaning it Up >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> ------------------ >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Now I have one outermost indirection pointing to an >> array >> >>>>>> >> >>>>>>>>> that >> >>>>>> >> >>>>>>>>> points directly to other arrays. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> I'm stuck paying for a card marking table per object, >> but >> >>>>>> >> >>>>>>>>> I can >> >>>>>> >> >>>>>>>>> fix >> >>>>>> >> >>>>>>>>> that by duplicating the code for MutableArrayArray# >> and >> >>>>>> >> >>>>>>>>> using a >> >>>>>> >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me >> >>>>>> >> >>>>>>>>> store a >> >>>>>> >> >>>>>>>>> mixture of >> >>>>>> >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data >> >>>>>> >> >>>>>>>>> structure. >> >>>>>> >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing >> >>>>>> >> >>>>>>>>> the >> >>>>>> >> >>>>>>>>> existing >> >>>>>> >> >>>>>>>>> SmallMutableArray# primitives to change the kind of >> one >> >>>>>> >> >>>>>>>>> of the >> >>>>>> >> >>>>>>>>> arguments it >> >>>>>> >> >>>>>>>>> takes. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> This is almost ideal, but not quite. I often have >> fields >> >>>>>> >> >>>>>>>>> that >> >>>>>> >> >>>>>>>>> would >> >>>>>> >> >>>>>>>>> be best left unboxed. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> was able to unpack the Int, but we lost that. 
We can >> >>>>>> >> >>>>>>>>> currently >> >>>>>> >> >>>>>>>>> at >> >>>>>> >> >>>>>>>>> best point one of the entries of the >> SmallMutableArray# >> >>>>>> >> >>>>>>>>> at a >> >>>>>> >> >>>>>>>>> boxed or at a >> >>>>>> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove >> the >> >>>>>> >> >>>>>>>>> int in >> >>>>>> >> >>>>>>>>> question in >> >>>>>> >> >>>>>>>>> there. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I >> >>>>>> >> >>>>>>>>> need to >> >>>>>> >> >>>>>>>>> store masks and administrivia as I walk down the tree. >> >>>>>> >> >>>>>>>>> Having to >> >>>>>> >> >>>>>>>>> go off to >> >>>>>> >> >>>>>>>>> the side costs me the entire win from avoiding the >> first >> >>>>>> >> >>>>>>>>> pointer >> >>>>>> >> >>>>>>>>> chase. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we >> >>>>>> >> >>>>>>>>> could >> >>>>>> >> >>>>>>>>> construct that had n words with unsafe access and m >> >>>>>> >> >>>>>>>>> pointers to >> >>>>>> >> >>>>>>>>> other heap >> >>>>>> >> >>>>>>>>> objects, one that could put itself on the mutable list >> >>>>>> >> >>>>>>>>> when any >> >>>>>> >> >>>>>>>>> of those >> >>>>>> >> >>>>>>>>> pointers changed then I could shed this last factor of >> >>>>>> >> >>>>>>>>> two in >> >>>>>> >> >>>>>>>>> all >> >>>>>> >> >>>>>>>>> circumstances. 
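The workaround described above — pointing one slot at a MutableByteArray# and shoving the Int in there — might read as follows. This is an editor's sketch: slot 0 holding the byte array and the Int living at index 0 are assumptions, and the extra read is exactly the pointer chase the mail is complaining about:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO(..))

data DLLInt = DLLInt (MutableArrayArray# RealWorld)

-- Assumed layout: slot 0 holds a MutableByteArray# carrying the Int.
getInt :: DLLInt -> IO Int
getInt (DLLInt m) = IO $ \s ->
  case readMutableByteArrayArray# m 0# s of   -- first chase: fetch the byte array
    (# s1, mba #) -> case readIntArray# mba 0# s1 of
      (# s2, i #) -> (# s2, I# i #)

setInt :: DLLInt -> Int -> IO ()
setInt (DLLInt m) (I# i) = IO $ \s ->
  case readMutableByteArrayArray# m 0# s of
    (# s1, mba #) -> case writeIntArray# mba 0# i s1 of
      s2 -> (# s2, () #)
```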
>> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Prototype >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> ------------- >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Over the last few days I've put together a small >> >>>>>> >> >>>>>>>>> prototype >> >>>>>> >> >>>>>>>>> implementation with a few non-trivial imperative data >> >>>>>> >> >>>>>>>>> structures >> >>>>>> >> >>>>>>>>> for things >> >>>>>> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling >> problem >> >>>>>> >> >>>>>>>>> and >> >>>>>> >> >>>>>>>>> order-maintenance. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> https://github.com/ekmett/structs >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Notable bits: >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an >> implementation >> >>>>>> >> >>>>>>>>> of >> >>>>>> >> >>>>>>>>> link-cut >> >>>>>> >> >>>>>>>>> trees in this style. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying >> guts >> >>>>>> >> >>>>>>>>> that >> >>>>>> >> >>>>>>>>> make >> >>>>>> >> >>>>>>>>> it go fast. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, >> >>>>>> >> >>>>>>>>> almost >> >>>>>> >> >>>>>>>>> all >> >>>>>> >> >>>>>>>>> the references to the LinkCut or Object data >> constructor >> >>>>>> >> >>>>>>>>> get >> >>>>>> >> >>>>>>>>> optimized away, >> >>>>>> >> >>>>>>>>> and we're left with beautiful strict code directly >> >>>>>> >> >>>>>>>>> mutating our >> >>>>>> >> >>>>>>>>> underlying >> >>>>>> >> >>>>>>>>> representation.
>> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it >> into a >> >>>>>> >> >>>>>>>>> short >> >>>>>> >> >>>>>>>>> article. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> -Edward >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones >> >>>>>> >> >>>>>>>>> wrote: >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Just to say that I have no idea what is going on in >> this >> >>>>>> >> >>>>>>>>> thread. >> >>>>>> >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is >> >>>>>> >> >>>>>>>>> there a >> >>>>>> >> >>>>>>>>> ticket? Is >> >>>>>> >> >>>>>>>>> there a wiki page? >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket >> would >> >>>>>> >> >>>>>>>>> be a >> >>>>>> >> >>>>>>>>> good >> >>>>>> >> >>>>>>>>> thing. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Simon >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] >> On >> >>>>>> >> >>>>>>>>> Behalf >> >>>>>> >> >>>>>>>>> Of >> >>>>>> >> >>>>>>>>> Edward Kmett >> >>>>>> >> >>>>>>>>> Sent: 21 August 2015 05:25 >> >>>>>> >> >>>>>>>>> To: Manuel M T Chakravarty >> >>>>>> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs >> >>>>>> >> >>>>>>>>> Subject: Re: ArrayArrays >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> When (ab)using them for this purpose, >> SmallArrayArray's >> >>>>>> >> >>>>>>>>> would be >> >>>>>> >> >>>>>>>>> very handy as well.
>> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Consider right now if I have something like an >> >>>>>> >> >>>>>>>>> order-maintenance >> >>>>>> >> >>>>>>>>> structure I have: >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} >> !(MutableByteArray s) >> >>>>>> >> >>>>>>>>> {-# >> >>>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} >> !(MutVar >> >>>>>> >> >>>>>>>>> s >> >>>>>> >> >>>>>>>>> (Upper s)) >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper >> s)) >> >>>>>> >> >>>>>>>>> {-# >> >>>>>> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} >> !(MutVar >> >>>>>> >> >>>>>>>>> s >> >>>>>> >> >>>>>>>>> (Lower s)) {-# >> >>>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> The former contains, logically, a mutable integer and >> two >> >>>>>> >> >>>>>>>>> pointers, >> >>>>>> >> >>>>>>>>> one for forward and one for backwards. The latter is >> >>>>>> >> >>>>>>>>> basically >> >>>>>> >> >>>>>>>>> the same >> >>>>>> >> >>>>>>>>> thing with a mutable reference up pointing at the >> >>>>>> >> >>>>>>>>> structure >> >>>>>> >> >>>>>>>>> above. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> On the heap this is an object that points to a >> structure >> >>>>>> >> >>>>>>>>> for the >> >>>>>> >> >>>>>>>>> bytearray, and points to another structure for each >> >>>>>> >> >>>>>>>>> mutvar which >> >>>>>> >> >>>>>>>>> each point >> >>>>>> >> >>>>>>>>> to the other 'Upper' structure. So there is a level of >> >>>>>> >> >>>>>>>>> indirection smeared >> >>>>>> >> >>>>>>>>> over everything. 
>> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> So this is a pair of doubly linked lists with an >> upward >> >>>>>> >> >>>>>>>>> link >> >>>>>> >> >>>>>>>>> from >> >>>>>> >> >>>>>>>>> the structure below to the structure above. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Converted into ArrayArray#s I'd get >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> w/ the first slot being a pointer to a >> MutableByteArray#, >> >>>>>> >> >>>>>>>>> and >> >>>>>> >> >>>>>>>>> the >> >>>>>> >> >>>>>>>>> next 2 slots pointing to the previous and next >> >>>>>> >> >>>>>>>>> objects, >> >>>>>> >> >>>>>>>>> represented >> >>>>>> >> >>>>>>>>> just as their MutableArrayArray#s. I can use >> >>>>>> >> >>>>>>>>> sameMutableArrayArray# on these >> >>>>>> >> >>>>>>>>> for object identity, which lets me check for the ends >> of >> >>>>>> >> >>>>>>>>> the >> >>>>>> >> >>>>>>>>> lists by tying >> >>>>>> >> >>>>>>>>> things back on themselves. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> and below that >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot >> pointing >> >>>>>> >> >>>>>>>>> up to >> >>>>>> >> >>>>>>>>> an >> >>>>>> >> >>>>>>>>> upper structure.
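Spelled out, the accessors for the converted Upper could look like this — an editor's sketch that assumes the slot order described above (byte array in slot 0, previous in slot 1, next in slot 2), with sameMutableArrayArray# supplying the object-identity test for spotting the tied-back list ends:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.ST (ST(..))

-- Assumed slot layout: 0 = MutableByteArray# label,
-- 1 = previous Upper, 2 = next Upper.
data Upper s = Upper (MutableArrayArray# s)

nextUpper :: Upper s -> ST s (Upper s)
nextUpper (Upper m) = ST $ \s -> case readMutableArrayArray# m 2# s of
  (# s', n #) -> (# s', Upper n #)

-- Pointer equality on the underlying arrays.
sameUpper :: Upper s -> Upper s -> Bool
sameUpper (Upper a) (Upper b) = isTrue# (sameMutableArrayArray# a b)

-- The ends of the list tie their link back to themselves.
atEnd :: Upper s -> ST s Bool
atEnd u = do
  n <- nextUpper u
  pure (sameUpper u n)
```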
>> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> I can then write a handful of combinators for getting >> out >> >>>>>> >> >>>>>>>>> the >> >>>>>> >> >>>>>>>>> slots >> >>>>>> >> >>>>>>>>> in question, while it has gained a level of >> indirection >> >>>>>> >> >>>>>>>>> between >> >>>>>> >> >>>>>>>>> the wrapper >> >>>>>> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that >> >>>>>> >> >>>>>>>>> one can >> >>>>>> >> >>>>>>>>> be basically >> >>>>>> >> >>>>>>>>> erased by ghc. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Unlike before I don't have several separate objects on >> >>>>>> >> >>>>>>>>> the heap >> >>>>>> >> >>>>>>>>> for >> >>>>>> >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# >> for >> >>>>>> >> >>>>>>>>> the >> >>>>>> >> >>>>>>>>> object itself, >> >>>>>> >> >>>>>>>>> and the MutableByteArray# that it references to carry >> >>>>>> >> >>>>>>>>> around the >> >>>>>> >> >>>>>>>>> mutable >> >>>>>> >> >>>>>>>>> int. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> The only pain points are >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> 1.) the aforementioned limitation that currently >> prevents >> >>>>>> >> >>>>>>>>> me >> >>>>>> >> >>>>>>>>> from >> >>>>>> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or >> Array >> >>>>>> >> >>>>>>>>> into an >> >>>>>> >> >>>>>>>>> ArrayArray >> >>>>>> >> >>>>>>>>> leaving me in a little ghetto disconnected from the >> rest >> >>>>>> >> >>>>>>>>> of >> >>>>>> >> >>>>>>>>> Haskell, >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> and >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us >> >>>>>> >> >>>>>>>>> avoid the >> >>>>>> >> >>>>>>>>> card marking overhead. 
These objects are all small, >> 3-4 >> >>>>>> >> >>>>>>>>> pointers >> >>>>>> >> >>>>>>>>> wide. Card >> >>>>>> >> >>>>>>>>> marking doesn't help. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Alternately I could just try to do really evil things >> and >> >>>>>> >> >>>>>>>>> convert >> >>>>>> >> >>>>>>>>> the whole mess to SmallArrays and then figure out how >> to >> >>>>>> >> >>>>>>>>> unsafeCoerce my way >> >>>>>> >> >>>>>>>>> to glory, stuffing the #'d references to the other >> arrays >> >>>>>> >> >>>>>>>>> directly into the >> >>>>>> >> >>>>>>>>> SmallArray as slots, removing the limitation we see >> here >> >>>>>> >> >>>>>>>>> by >> >>>>>> >> >>>>>>>>> aping the >> >>>>>> >> >>>>>>>>> MutableArrayArray# s API, but that gets really really >> >>>>>> >> >>>>>>>>> dangerous! >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything >> on >> >>>>>> >> >>>>>>>>> the >> >>>>>> >> >>>>>>>>> altar >> >>>>>> >> >>>>>>>>> of speed here, but I'd like to be able to let the GC >> move >> >>>>>> >> >>>>>>>>> them >> >>>>>> >> >>>>>>>>> and collect >> >>>>>> >> >>>>>>>>> them which rules out simpler Ptr and Addr based >> >>>>>> >> >>>>>>>>> solutions. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> -Edward >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T >> Chakravarty >> >>>>>> >> >>>>>>>>> wrote: >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> That's an interesting idea. >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> Manuel >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> > Edward Kmett : >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> > >> >>>>>> >> >>>>>>>>> > Would it be possible to add unsafe primops to add >> >>>>>> >> >>>>>>>>> > Array# and >> >>>>>> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#?
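The "unsafeCoerce my way to glory" route described above — aping the MutableArrayArray# API on top of SmallMutableArray# — would amount to something like the following editor's sketch. The wrapper names are made up here, and, as the mail warns, nothing stops you reading a slot back at the wrong type, which is a segfault:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts

-- A slot nominally holds a lifted 'Any', but we smuggle another
-- SmallMutableArray# through it. Both are heap pointers, so the
-- representation is fine; the types are simply lies.
writeArraySlot :: SmallMutableArray# s Any -> Int#
               -> SmallMutableArray# s Any
               -> State# s -> State# s
writeArraySlot arr i child s =
  writeSmallArray# arr i (unsafeCoerce# child) s

readArraySlot :: SmallMutableArray# s Any -> Int#
              -> State# s -> (# State# s, SmallMutableArray# s Any #)
readArraySlot arr i s = case readSmallArray# arr i s of
  (# s', a #) -> (# s', unsafeCoerce# a #)
```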
The fact that >> >>>>>> >> >>>>>>>>> > the >> >>>>>> >> >>>>>>>>> > ArrayArray# entries >> >>>>>> >> >>>>>>>>> > are all directly unlifted avoiding a level of >> >>>>>> >> >>>>>>>>> > indirection for >> >>>>>> >> >>>>>>>>> > the containing >> >>>>>> >> >>>>>>>>> > structure is amazing, but I can only currently use >> it >> >>>>>> >> >>>>>>>>> > if my >> >>>>>> >> >>>>>>>>> > leaf level data >> >>>>>> >> >>>>>>>>> > can be 100% unboxed and distributed among >> ByteArray#s. >> >>>>>> >> >>>>>>>>> > It'd be >> >>>>>> >> >>>>>>>>> > nice to be >> >>>>>> >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff >> >>>>>> >> >>>>>>>>> > down at >> >>>>>> >> >>>>>>>>> > the leaves to >> >>>>>> >> >>>>>>>>> > hold lifted contents. >> >>>>>> >> >>>>>>>>> > >> >>>>>> >> >>>>>>>>> > I accept fully that if I name the wrong type when I >> go >> >>>>>> >> >>>>>>>>> > to >> >>>>>> >> >>>>>>>>> > access >> >>>>>> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose >> it'd >> >>>>>> >> >>>>>>>>> > do that >> >>>>>> >> >>>>>>>>> > if i tried to >> >>>>>> >> >>>>>>>>> > use one of the members that held a nested >> ArrayArray# >> >>>>>> >> >>>>>>>>> > as a >> >>>>>> >> >>>>>>>>> > ByteArray# >> >>>>>> >> >>>>>>>>> > anyways, so it isn't like there is a safety story >> >>>>>> >> >>>>>>>>> > preventing >> >>>>>> >> >>>>>>>>> > this. >> >>>>>> >> >>>>>>>>> > >> >>>>>> >> >>>>>>>>> > I've been hunting for ways to try to kill the >> >>>>>> >> >>>>>>>>> > indirection >> >>>>>> >> >>>>>>>>> > problems I get with Haskell and mutable structures, >> and >> >>>>>> >> >>>>>>>>> > I >> >>>>>> >> >>>>>>>>> > could shoehorn a >> >>>>>> >> >>>>>>>>> > number of them into ArrayArrays if this worked. 
>> >>>>>> >> >>>>>>>>> > >> >>>>>> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of >> >>>>>> >> >>>>>>>>> > unnecessary >> >>>>>> >> >>>>>>>>> > indirection compared to c/java and this could reduce >> >>>>>> >> >>>>>>>>> > that pain >> >>>>>> >> >>>>>>>>> > to just 1 >> >>>>>> >> >>>>>>>>> > level of unnecessary indirection. >> >>>>>> >> >>>>>>>>> > >> >>>>>> >> >>>>>>>>> > -Edward >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> > _______________________________________________ >> >>>>>> >> >>>>>>>>> > ghc-devs mailing list >> >>>>>> >> >>>>>>>>> > ghc-devs at haskell.org >> >>>>>> >> >>>>>>>>> > >> >>>>>> >> >>>>>>>>> > >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> >> >>>>>> >> >>>>>>>>> _______________________________________________ >> >>>>>> >> >>>>>>>>> ghc-devs mailing list >> >>>>>> >> >>>>>>>>> ghc-devs at haskell.org >> >>>>>> >> >>>>>>>>> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >>>>>> >> >>>>>>> >> >>>>>> >> >>>>>>> >> >>>>>> >> >>>>> >> >>>>>> >> >>> >> >>>>>> >> >> >> >>>>>> >> > >> >>>>>> >> > >> >>>>>> >> > _______________________________________________ >> >>>>>> >> > ghc-devs mailing list >> >>>>>> >> > ghc-devs at haskell.org >> >>>>>> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >>>>>> >> > >> >>>>>> > >> >>>>>> > >> >>>>> >> >>>>> >> >>>> >> >>>> >> >>>> _______________________________________________ >> >>>> ghc-devs mailing list >> >>>> ghc-devs at haskell.org >> >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >>>> >> >>> >> >> >> > >> > _______________________________________________ >> > ghc-devs mailing list >> > ghc-devs at haskell.org >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From michael at snoyman.com Tue Sep 1 06:53:00 2015 From: michael at snoyman.com (Michael Snoyman) Date: Tue, 1 Sep 2015 09:53:00 +0300 Subject: more releases In-Reply-To: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> References: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> Message-ID: It's definitely an interesting idea. From the Stackage side: I'm happy to provide testing and, even better, support to get some automated Stackage testing tied into the GHC release process. (Why not be more aggressive? We could do some CI against Stackage from the 7.10 branch on a regular basis.) I like the idea of getting bug fixes out to users more frequently, so I'm definitely +1 on the discussion. Let me play devil's advocate though: having a large number of versions of GHC out there can make it difficult for library authors, package curators, and large open source projects, due to variety of what people are using. If we end up in a world where virtually everyone ends up on the latest point release in a short timeframe, the problem is reduced, but most of our current installation methods are not amenable to that. We need to have a serious discussion about how Linux distros, Haskell Platform, minimal installers, and so on would address this shift. (stack would be able to adapt to this easily since it can download new GHCs as needed, but users may not like having 100MB installs on a daily basis ;).) What I would love to see is that bug fixes are regularly backported to the stable GHC release and that within a reasonable timeframe are released, where reasonable is some value we can discuss and come to consensus on. I'll say that at the extremes: I think a week is far too short, and a year is far too long. On Tue, Sep 1, 2015 at 9:45 AM, Richard Eisenberg wrote: > Hi devs, > > An interesting topic came up over dinner tonight: what if GHC made more > releases? 
As an extreme example, we could release a new point version every > time a bug fix gets merged to the stable branch. This may be a terrible > idea. But what's stopping us from doing so? > > The biggest objection I can see is that we would want to make sure that > users' code would work with the new version. Could the Stackage crew help > us with this? If they run their nightly build with a release candidate and > diff against the prior results, we would get a pretty accurate sense of > whether the bugfix is good. If this test succeeds, why not release? Would > it be hard to automate the packaging/posting process? > > The advantage to more releases is that it gets bugfixes in more hands > sooner. What are the disadvantages? > > Richard > > PS: I'm not 100% sold on this idea. But I thought it was interesting > enough to raise a broader discussion. > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hvriedel at gmail.com Tue Sep 1 07:01:55 2015 From: hvriedel at gmail.com (Herbert Valerio Riedel) Date: Tue, 01 Sep 2015 09:01:55 +0200 Subject: more releases In-Reply-To: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> (Richard Eisenberg's message of "Mon, 31 Aug 2015 23:45:40 -0700") References: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> Message-ID: <87si6y1v30.fsf@gmail.com> On 2015-09-01 at 08:45:40 +0200, Richard Eisenberg wrote: > An interesting topic came up over dinner tonight: what if GHC made > more releases? As an extreme example, we could release a new point > version every time a bug fix gets merged to the stable branch. This > may be a terrible idea. But what's stopping us from doing so? > > The biggest objection I can see is that we would want to make sure > that users' code would work with the new version. Could the Stackage > crew help us with this? 
If they run their nightly build with a release > candidate and diff against the prior results, we would get a pretty > accurate sense of whether the bugfix is good. If this test succeeds, > why not release? Would it be hard to automate the packaging/posting > process? > > The advantage to more releases is that it gets bugfixes in more hands > sooner. What are the disadvantages? I'd say mostly organisational overhead which can't be fully automated (afaik, Ben has already automated large parts but not everything can be): - Coordinating with people creating and testing the bindists - Writing release notes & announcement - Coordinating with the HP release process (which requires separate QA) - If bundled core-libraries are affected, coordination overhead with package maintainers (unless GHC HQ owned), verifying version bumps (API diff!) and changelogs have been updated accordingly, uploading to Hackage - Uploading and signing packages to download.haskell.org, and verifying the downloads Austin & Ben probably have more to add to this list That said, doing more stable point releases is certainly doable if the bugs fixed are critical enough. This is mostly a trade-off between time spent on getting GHC HEAD in shape for the next major release (whose release-schedules suffer from time delays anyway) vs. maintaining a stable branch. Cheers, hvr From eir at cis.upenn.edu Tue Sep 1 07:12:21 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Tue, 1 Sep 2015 00:12:21 -0700 Subject: more releases In-Reply-To: <87si6y1v30.fsf@gmail.com> References: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> <87si6y1v30.fsf@gmail.com> Message-ID: On Sep 1, 2015, at 12:01 AM, Herbert Valerio Riedel wrote: > I'd say mostly organisational overhead which can't be fully automated > (afaik, Ben has already automated large parts but not everything can be): > > - Coordinating with people creating and testing the bindists This was the sort of thing I thought could be automated.
I'm picturing a system where Austin/Ben hits a button and everything whirs to life, creating, testing, and posting bindists, with no people involved. > - Writing release notes & announcement Release notes should, theoretically, be updated with the patches. Announcement can be automated. > - Coordinating with the HP release process (which requires separate QA) I'm sure others will have opinions here, but I guess I was thinking that the HP wouldn't be involved. These tiny releases could even be called something like "7.10.2 build 18". The HP would get updated only when we go to 7.10.3. Maybe we even have a binary compatibility requirement between tiny releases -- no interface file changes! Then a user's package library doesn't have to be recompiled when updating. In theory, other than the bugfixes, two people with different "builds" of GHC should have the same experience. > - If bundled core-libraries are affected, coordination overhead with package > maintainers (unless GHC HQ owned), verifying version bumps (API diff!) and > changelogs have been updated accordingly, uploading to Hackage Any library version change would require a more proper release. Do these libraries tend to change during a major release cycle? > - Uploading and signing packages to download.haskell.org, and verifying > the downloads This isn't automated? > > Austin & Ben probably have more to add to this list > I'm sure they do. Again, I'd be fine if the answer from the community is "it's just not what we need". But I wanted to see if there were technical/practical/social reasons why this was or wasn't a good idea. If we do think it's a good idea absent those reasons, then we can work on addressing those concerns. Richard > That said, doing more stable point releases is certainly doable if the > bugs fixed are critical enough. This is mostly a trade-off between time > spent on getting GHC HEAD in shape for the next major release (whose > release-schedules suffer from time delays anyway) vs.
maintaining a > stable branch. > > Cheers, > hvr From simonpj at microsoft.com Tue Sep 1 11:50:05 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Tue, 1 Sep 2015 11:50:05 +0000 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> Message-ID: <107de3fcc21b4ccab7a14cc908cdb110@AM3PR30MB019.064d.mgd.msft.net> OK Tuesday afternoon break! S From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Johan Tibell Sent: 01 September 2015 06:14 To: Ryan Yates Cc: Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates Subject: Re: ArrayArrays Works for me. On Mon, Aug 31, 2015 at 3:50 PM, Ryan Yates > wrote: Any time works for me. Ryan On Mon, Aug 31, 2015 at 6:11 PM, Ryan Newton > wrote: > Dear Edward, Ryan Yates, and other interested parties -- > > So when should we meet up about this? > > May I propose the Tues afternoon break for everyone at ICFP who is > interested in this topic? We can meet out in the coffee area and congregate > around Edward Kmett, who is tall and should be easy to find ;-). > > I think Ryan is going to show us how to use his new primops for combined > array + other fields in one heap object? > > On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett > wrote: >> >> Without a custom primitive it doesn't help much there, you have to store >> the indirection to the mask. >> >> With a custom primitive it should cut the on heap root-to-leaf path of >> everything in the HAMT in half. A shorter HashMap was actually one of the >> motivating factors for me doing this. It is rather astoundingly difficult to >> beat the performance of HashMap, so I had to start cheating pretty badly. 
;) >> >> -Edward >> >> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell > >> wrote: >>> >>> I'd also be interested to chat at ICFP to see if I can use this for my >>> HAMT implementation. >>> >>> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett > wrote: >>>> >>>> Sounds good to me. Right now I'm just hacking up composable accessors >>>> for "typed slots" in a fairly lens-like fashion, and treating the set of >>>> slots I define and the 'new' function I build for the data type as its API, >>>> and build atop that. This could eventually graduate to template-haskell, but >>>> I'm not entirely satisfied with the solution I have. I currently distinguish >>>> between what I'm calling "slots" (things that point directly to another >>>> SmallMutableArrayArray# sans wrapper) and "fields" which point directly to >>>> the usual Haskell data types because unifying the two notions meant that I >>>> couldn't lift some coercions out "far enough" to make them vanish. >>>> >>>> I'll be happy to run through my current working set of issues in person >>>> and -- as things get nailed down further -- in a longer lived medium than in >>>> personal conversations. ;) >>>> >>>> -Edward >>>> >>>> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton > wrote: >>>>> >>>>> I'd also love to meet up at ICFP and discuss this. I think the array >>>>> primops plus a TH layer that lets (ab)use them many times without too much >>>>> marginal cost sounds great. And I'd like to learn how we could be either >>>>> early users of, or help with, this infrastructure. >>>>> >>>>> CC'ing in Ryan Scot and Omer Agacan who may also be interested in >>>>> dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student >>>>> who is currently working on concurrent data structures in Haskell, but will >>>>> not be at ICFP. >>>>> >>>>> >>>>> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates > >>>>> wrote: >>>>>> >>>>>> I completely agree. 
I would love to spend some time during ICFP and >>>>>> friends talking about what it could look like. My small array for STM >>>>>> changes for the RTS can be seen here [1]. It is on a branch somewhere >>>>>> between 7.8 and 7.10 and includes irrelevant STM bits and some >>>>>> confusing naming choices (sorry), but should cover all the details >>>>>> needed to implement it for a non-STM context. The biggest surprise >>>>>> for me was following small array too closely and having a word/byte >>>>>> offset miss-match [2]. >>>>>> >>>>>> [1]: >>>>>> https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut >>>>>> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 >>>>>> >>>>>> Ryan >>>>>> >>>>>> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett > >>>>>> wrote: >>>>>> > I'd love to have that last 10%, but its a lot of work to get there >>>>>> > and more >>>>>> > importantly I don't know quite what it should look like. >>>>>> > >>>>>> > On the other hand, I do have a pretty good idea of how the >>>>>> > primitives above >>>>>> > could be banged out and tested in a long evening, well in time for >>>>>> > 7.12. And >>>>>> > as noted earlier, those remain useful even if a nicer typed version >>>>>> > with an >>>>>> > extra level of indirection to the sizes is built up after. >>>>>> > >>>>>> > The rest sounds like a good graduate student project for someone who >>>>>> > has >>>>>> > graduate students lying around. Maybe somebody at Indiana University >>>>>> > who has >>>>>> > an interest in type theory and parallelism can find us one. =) >>>>>> > >>>>>> > -Edward >>>>>> > >>>>>> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates > >>>>>> > wrote: >>>>>> >> >>>>>> >> I think from my perspective, the motivation for getting the type >>>>>> >> checker involved is primarily bringing this to the level where >>>>>> >> users >>>>>> >> could be expected to build these structures. 
it is reasonable to >>>>>> >> think that there are people who want to use STM (a context with >>>>>> >> mutation already) to implement a straight forward data structure >>>>>> >> that >>>>>> >> avoids extra indirection penalty. There should be some places >>>>>> >> where >>>>>> >> knowing that things are field accesses rather then array indexing >>>>>> >> could be helpful, but I think GHC is good right now about handling >>>>>> >> constant offsets. In my code I don't do any bounds checking as I >>>>>> >> know >>>>>> >> I will only be accessing my arrays with constant indexes. I make >>>>>> >> wrappers for each field access and leave all the unsafe stuff in >>>>>> >> there. When things go wrong though, the compiler is no help. >>>>>> >> Maybe >>>>>> >> template Haskell that generates the appropriate wrappers is the >>>>>> >> right >>>>>> >> direction to go. >>>>>> >> There is another benefit for me when working with these as arrays >>>>>> >> in >>>>>> >> that it is quite simple and direct (given the hoops already jumped >>>>>> >> through) to play with alignment. I can ensure two pointers are >>>>>> >> never >>>>>> >> on the same cache-line by just spacing things out in the array. >>>>>> >> >>>>>> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett > >>>>>> >> wrote: >>>>>> >> > They just segfault at this level. ;) >>>>>> >> > >>>>>> >> > Sent from my iPhone >>>>>> >> > >>>>>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton > >>>>>> >> > wrote: >>>>>> >> > >>>>>> >> > You presumably also save a bounds check on reads by hard-coding >>>>>> >> > the >>>>>> >> > sizes? >>>>>> >> > >>>>>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett > >>>>>> >> > wrote: >>>>>> >> >> >>>>>> >> >> Also there are 4 different "things" here, basically depending on >>>>>> >> >> two >>>>>> >> >> independent questions: >>>>>> >> >> >>>>>> >> >> a.) if you want to shove the sizes into the info table, and >>>>>> >> >> b.) if you want cardmarking. 
>>>>>> >> >> >>>>>> >> >> Versions with/without cardmarking for different sizes can be >>>>>> >> >> done >>>>>> >> >> pretty >>>>>> >> >> easily, but as noted, the infotable variants are pretty >>>>>> >> >> invasive. >>>>>> >> >> >>>>>> >> >> -Edward >>>>>> >> >> >>>>>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett > >>>>>> >> >> wrote: >>>>>> >> >>> >>>>>> >> >>> Well, on the plus side you'd save 16 bytes per object, which >>>>>> >> >>> adds up >>>>>> >> >>> if >>>>>> >> >>> they were small enough and there are enough of them. You get a >>>>>> >> >>> bit >>>>>> >> >>> better >>>>>> >> >>> locality of reference in terms of what fits in the first cache >>>>>> >> >>> line of >>>>>> >> >>> them. >>>>>> >> >>> >>>>>> >> >>> -Edward >>>>>> >> >>> >>>>>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton >>>>>> >> >>> > >>>>>> >> >>> wrote: >>>>>> >> >>>> >>>>>> >> >>>> Yes. And for the short term I can imagine places we will >>>>>> >> >>>> settle with >>>>>> >> >>>> arrays even if it means tracking lengths unnecessarily and >>>>>> >> >>>> unsafeCoercing >>>>>> >> >>>> pointers whose types don't actually match their siblings. >>>>>> >> >>>> >>>>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed >>>>>> >> >>>> sized >>>>>> >> >>>> array >>>>>> >> >>>> objects *other* than using them to fake structs? (Much to >>>>>> >> >>>> derecommend, as >>>>>> >> >>>> you mentioned!) >>>>>> >> >>>> >>>>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett >>>>>> >> >>>> > >>>>>> >> >>>> wrote: >>>>>> >> >>>>> >>>>>> >> >>>>> I think both are useful, but the one you suggest requires a >>>>>> >> >>>>> lot more >>>>>> >> >>>>> plumbing and doesn't subsume all of the usecases of the >>>>>> >> >>>>> other. 
>>>>>> >> >>>>> >>>>>> >> >>>>> -Edward >>>>>> >> >>>>> >>>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton >>>>>> >> >>>>> > >>>>>> >> >>>>> wrote: >>>>>> >> >>>>>> >>>>>> >> >>>>>> So that primitive is an array like thing (Same pointed type, >>>>>> >> >>>>>> unbounded >>>>>> >> >>>>>> length) with extra payload. >>>>>> >> >>>>>> >>>>>> >> >>>>>> I can see how we can do without structs if we have arrays, >>>>>> >> >>>>>> especially >>>>>> >> >>>>>> with the extra payload at front. But wouldn't the general >>>>>> >> >>>>>> solution >>>>>> >> >>>>>> for >>>>>> >> >>>>>> structs be one that that allows new user data type defs for >>>>>> >> >>>>>> # >>>>>> >> >>>>>> types? >>>>>> >> >>>>>> >>>>>> >> >>>>>> >>>>>> >> >>>>>> >>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett >>>>>> >> >>>>>> > >>>>>> >> >>>>>> wrote: >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> Some form of MutableStruct# with a known number of words >>>>>> >> >>>>>>> and a >>>>>> >> >>>>>>> known >>>>>> >> >>>>>>> number of pointers is basically what Ryan Yates was >>>>>> >> >>>>>>> suggesting >>>>>> >> >>>>>>> above, but >>>>>> >> >>>>>>> where the word counts were stored in the objects >>>>>> >> >>>>>>> themselves. >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> Given that it'd have a couple of words for those counts >>>>>> >> >>>>>>> it'd >>>>>> >> >>>>>>> likely >>>>>> >> >>>>>>> want to be something we build in addition to MutVar# rather >>>>>> >> >>>>>>> than a >>>>>> >> >>>>>>> replacement. >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build >>>>>> >> >>>>>>> info >>>>>> >> >>>>>>> tables that knew them, and typechecker support, for >>>>>> >> >>>>>>> instance, it'd >>>>>> >> >>>>>>> get >>>>>> >> >>>>>>> rather invasive. 
>>>>>> >> >>>>>>> >>>>>> >> >>>>>>> Also, a number of things that we can do with the 'sized' >>>>>> >> >>>>>>> versions >>>>>> >> >>>>>>> above, like working with evil unsized c-style arrays >>>>>> >> >>>>>>> directly >>>>>> >> >>>>>>> inline at the >>>>>> >> >>>>>>> end of the structure cease to be possible, so it isn't even >>>>>> >> >>>>>>> a pure >>>>>> >> >>>>>>> win if we >>>>>> >> >>>>>>> did the engineering effort. >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> I think 90% of the needs I have are covered just by adding >>>>>> >> >>>>>>> the one >>>>>> >> >>>>>>> primitive. The last 10% gets pretty invasive. >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> -Edward >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton >>>>>> >> >>>>>>> > >>>>>> >> >>>>>>> wrote: >>>>>> >> >>>>>>>> >>>>>> >> >>>>>>>> I like the possibility of a general solution for mutable >>>>>> >> >>>>>>>> structs >>>>>> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why >>>>>> >> >>>>>>>> it's hard. >>>>>> >> >>>>>>>> >>>>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of >>>>>> >> >>>>>>>> object >>>>>> >> >>>>>>>> identity problems. But what about directly supporting an >>>>>> >> >>>>>>>> extensible set of >>>>>> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even >>>>>> >> >>>>>>>> replacing) >>>>>> >> >>>>>>>> MutVar#? That >>>>>> >> >>>>>>>> may be too much work, but is it problematic otherwise? >>>>>> >> >>>>>>>> >>>>>> >> >>>>>>>> Needless to say, this is also critical if we ever want >>>>>> >> >>>>>>>> best in >>>>>> >> >>>>>>>> class >>>>>> >> >>>>>>>> lockfree mutable structures, just like their Stm and >>>>>> >> >>>>>>>> sequential >>>>>> >> >>>>>>>> counterparts. >>>>>> >> >>>>>>>> >>>>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >>>>>> >> >>>>>>>> > wrote: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a >>>>>> >> >>>>>>>>> short >>>>>> >> >>>>>>>>> article. 
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, >>>>>> >> >>>>>>>>> and >>>>>> >> >>>>>>>>> maybe >>>>>> >> >>>>>>>>> make a ticket for it. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Thanks >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Simon >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >>>>>> >> >>>>>>>>> Sent: 27 August 2015 16:54 >>>>>> >> >>>>>>>>> To: Simon Peyton Jones >>>>>> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >>>>>> >> >>>>>>>>> Subject: Re: ArrayArrays >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified >>>>>> >> >>>>>>>>> invariant. It >>>>>> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or >>>>>> >> >>>>>>>>> ByteArray#'s. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> While those live in #, they are garbage collected >>>>>> >> >>>>>>>>> objects, so >>>>>> >> >>>>>>>>> this >>>>>> >> >>>>>>>>> all lives on the heap. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when >>>>>> >> >>>>>>>>> it has >>>>>> >> >>>>>>>>> to >>>>>> >> >>>>>>>>> deal with nested arrays. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better >>>>>> >> >>>>>>>>> thing. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> The Problem >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> ----------------- >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Consider the scenario where you write a classic >>>>>> >> >>>>>>>>> doubly-linked >>>>>> >> >>>>>>>>> list >>>>>> >> >>>>>>>>> in Haskell. 
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3 >>>>>> >> >>>>>>>>> pointers >>>>>> >> >>>>>>>>> on >>>>>> >> >>>>>>>>> the heap. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) >>>>>> >> >>>>>>>>> ~> >>>>>> >> >>>>>>>>> Maybe >>>>>> >> >>>>>>>>> DLL ~> DLL >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> That is 3 levels of indirection. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with >>>>>> >> >>>>>>>>> -funbox-strict-fields or UNPACK >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL >>>>>> >> >>>>>>>>> and >>>>>> >> >>>>>>>>> worsening our representation. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> but now we're still stuck with a level of indirection >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> This means that every operation we perform on this >>>>>> >> >>>>>>>>> structure >>>>>> >> >>>>>>>>> will >>>>>> >> >>>>>>>>> be about half of the speed of an implementation in most >>>>>> >> >>>>>>>>> other >>>>>> >> >>>>>>>>> languages >>>>>> >> >>>>>>>>> assuming we're memory bound on loading things into cache! 
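[Editor's sketch] The `Nil`-constructor representation described above can be written out and exercised directly. This is a minimal runnable sketch; the helper names (`mkNode`, `nextNode`, `linkNext`) are invented for illustration and are not from any library:

```haskell
import Data.IORef

-- The "worsened" representation: a Nil constructor plus strict IORef
-- fields. Each link traversal now costs one indirection:
--   DLL ~> MutVar# RealWorld DLL ~> DLL
data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil

-- A fresh, unlinked node.
mkNode :: IO DLL
mkNode = DLL <$> newIORef Nil <*> newIORef Nil

-- Following the forward link is a single readIORef.
nextNode :: DLL -> IO DLL
nextNode Nil       = return Nil
nextNode (DLL _ n) = readIORef n

-- Point a's forward link at b.
linkNext :: DLL -> DLL -> IO ()
linkNext Nil _       = return ()
linkNext (DLL _ n) b = writeIORef n b
```

With `-funbox-strict-fields` the two `IORef` boxes are unpacked into the `DLL` constructor, but the `MutVar#` hop per link remains — which is exactly the residual cost the rest of the email is trying to eliminate.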
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Making Progress >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> ---------------------- >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I have been working on a number of data structures where >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> indirection of going from something in * out to an object >>>>>> >> >>>>>>>>> in # >>>>>> >> >>>>>>>>> which >>>>>> >> >>>>>>>>> contains the real pointer to my target and coming back >>>>>> >> >>>>>>>>> effectively doubles >>>>>> >> >>>>>>>>> my runtime. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> MutVar# >>>>>> >> >>>>>>>>> onto the mutable list when we dirty it. There is a well >>>>>> >> >>>>>>>>> defined >>>>>> >> >>>>>>>>> write-barrier. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I could change out the representation to use >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I can just store two pointers in the MutableArray# every >>>>>> >> >>>>>>>>> time, >>>>>> >> >>>>>>>>> but >>>>>> >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the >>>>>> >> >>>>>>>>> amount of >>>>>> >> >>>>>>>>> distinct >>>>>> >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 >>>>>> >> >>>>>>>>> per >>>>>> >> >>>>>>>>> object to 2. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> array >>>>>> >> >>>>>>>>> object and then chase it to the next DLL and chase that >>>>>> >> >>>>>>>>> to the >>>>>> >> >>>>>>>>> next array. 
I >>>>>> >> >>>>>>>>> do get my two pointers together in memory though. I'm >>>>>> >> >>>>>>>>> paying for >>>>>> >> >>>>>>>>> a card >>>>>> >> >>>>>>>>> marking table as well, which I don't particularly need >>>>>> >> >>>>>>>>> with just >>>>>> >> >>>>>>>>> two >>>>>> >> >>>>>>>>> pointers, but we can shed that with the >>>>>> >> >>>>>>>>> "SmallMutableArray#" >>>>>> >> >>>>>>>>> machinery added >>>>>> >> >>>>>>>>> back in 7.10, which is just the old array code a a new >>>>>> >> >>>>>>>>> data >>>>>> >> >>>>>>>>> type, which can >>>>>> >> >>>>>>>>> speed things up a bit when you don't have very big >>>>>> >> >>>>>>>>> arrays: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> But what if I wanted my object itself to live in # and >>>>>> >> >>>>>>>>> have two >>>>>> >> >>>>>>>>> mutable fields and be able to share the sme write >>>>>> >> >>>>>>>>> barrier? >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array >>>>>> >> >>>>>>>>> types. >>>>>> >> >>>>>>>>> What >>>>>> >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> impedence >>>>>> >> >>>>>>>>> mismatch between the imperative world and Haskell, and >>>>>> >> >>>>>>>>> then just >>>>>> >> >>>>>>>>> let the >>>>>> >> >>>>>>>>> ArrayArray#'s hold other arrayarrays. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make be >>>>>> >> >>>>>>>>> a >>>>>> >> >>>>>>>>> special >>>>>> >> >>>>>>>>> MutableArrayArray# I allocate on program startup. 
I can >>>>>> >> >>>>>>>>> even >>>>>> >> >>>>>>>>> abuse pattern >>>>>> >> >>>>>>>>> synonyms. Alternately I can exploit the internals further >>>>>> >> >>>>>>>>> to >>>>>> >> >>>>>>>>> make this >>>>>> >> >>>>>>>>> cheaper. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and >>>>>> >> >>>>>>>>> writeMutableArrayArray# calls to directly access the >>>>>> >> >>>>>>>>> preceding >>>>>> >> >>>>>>>>> and next >>>>>> >> >>>>>>>>> entry in the linked list. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' >>>>>> >> >>>>>>>>> into a >>>>>> >> >>>>>>>>> strict world, and everything there lives in #. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> next :: DLL -> IO DLL >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s >>>>>> >> >>>>>>>>> of >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that >>>>>> >> >>>>>>>>> code to >>>>>> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed >>>>>> >> >>>>>>>>> pretty >>>>>> >> >>>>>>>>> easily when they >>>>>> >> >>>>>>>>> are known strict and you chain operations of this sort! >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Cleaning it Up >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> ------------------ >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Now I have one outermost indirection pointing to an array >>>>>> >> >>>>>>>>> that >>>>>> >> >>>>>>>>> points directly to other arrays. 
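[Editor's sketch] The quoted `next` elides the array and index arguments to `readMutableArrayArray#` (likely lost in transit). Below is one possible self-contained rendering of the same accessor style that compiles on current GHC, using `SmallMutableArray#` plus `unsafeCoerce#` in place of the `ArrayArray#` machinery (which was later subsumed by levity-polymorphic arrays). Every name here — `DLL`, `new`, `next`, `setNext`, `payload`, the 3-slot layout — is invented for illustration, not from Edward's structs library:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO(..))

-- A node is a 3-slot SmallMutableArray#: slot 0 = previous node,
-- slot 1 = next node, slot 2 = an ordinary lifted payload. The slots
-- are typed as Any and unsafeCoerce#d in and out, in the "unsafeCoerce
-- my way to glory" spirit of the thread.
data DLL a = DLL (SmallMutableArray# RealWorld Any)

-- A fresh node whose prev/next slots tie back to itself,
-- playing the role of the self-referential Nil sentinel.
new :: a -> IO (DLL a)
new x = IO $ \s0 -> case newSmallArray# 3# (unsafeCoerce# ()) s0 of
  (# s1, m #) ->
    let d = DLL m in
    case writeSmallArray# m 0# (unsafeCoerce# d) s1 of
      s2 -> case writeSmallArray# m 1# (unsafeCoerce# d) s2 of
        s3 -> case writeSmallArray# m 2# (unsafeCoerce# x) s3 of
          s4 -> (# s4, d #)

-- The accessor pattern from the email, with the array and
-- index arguments spelled out.
next :: DLL a -> IO (DLL a)
next (DLL m) = IO $ \s -> case readSmallArray# m 1# s of
  (# s', n #) -> (# s', unsafeCoerce# n #)

setNext :: DLL a -> DLL a -> IO ()
setNext (DLL m) n = IO $ \s ->
  case writeSmallArray# m 1# (unsafeCoerce# n) s of
    s' -> (# s', () #)

payload :: DLL a -> IO a
payload (DLL m) = IO $ \s -> case readSmallArray# m 2# s of
  (# s', x #) -> (# s', unsafeCoerce# x #)
```

As the email says, GHC happily erases the `DLL` wrapper in strict chains of these operations; the sketch just makes the state-token plumbing explicit.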
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I'm stuck paying for a card marking table per object, but >>>>>> >> >>>>>>>>> I can >>>>>> >> >>>>>>>>> fix >>>>>> >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and >>>>>> >> >>>>>>>>> using a >>>>>> >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me >>>>>> >> >>>>>>>>> store a >>>>>> >> >>>>>>>>> mixture of >>>>>> >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data >>>>>> >> >>>>>>>>> structure. >>>>>> >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> existing >>>>>> >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one >>>>>> >> >>>>>>>>> of the >>>>>> >> >>>>>>>>> arguments it >>>>>> >> >>>>>>>>> takes. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> This is almost ideal, but not quite. I often have fields >>>>>> >> >>>>>>>>> that >>>>>> >> >>>>>>>>> would >>>>>> >> >>>>>>>>> be best left unboxed. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can >>>>>> >> >>>>>>>>> currently >>>>>> >> >>>>>>>>> at >>>>>> >> >>>>>>>>> best point one of the entries of the SmallMutableArray# >>>>>> >> >>>>>>>>> at a >>>>>> >> >>>>>>>>> boxed or at a >>>>>> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the >>>>>> >> >>>>>>>>> int in >>>>>> >> >>>>>>>>> question in >>>>>> >> >>>>>>>>> there. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I >>>>>> >> >>>>>>>>> need to >>>>>> >> >>>>>>>>> store masks and administrivia as I walk down the tree. 
>>>>>> >> >>>>>>>>> Having to >>>>>> >> >>>>>>>>> go off to >>>>>> >> >>>>>>>>> the side costs me the entire win from avoiding the first >>>>>> >> >>>>>>>>> pointer >>>>>> >> >>>>>>>>> chase. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we >>>>>> >> >>>>>>>>> could >>>>>> >> >>>>>>>>> construct that had n words with unsafe access and m >>>>>> >> >>>>>>>>> pointers to >>>>>> >> >>>>>>>>> other heap >>>>>> >> >>>>>>>>> objects, one that could put itself on the mutable list >>>>>> >> >>>>>>>>> when any >>>>>> >> >>>>>>>>> of those >>>>>> >> >>>>>>>>> pointers changed then I could shed this last factor of >>>>>> >> >>>>>>>>> two in >>>>>> >> >>>>>>>>> all >>>>>> >> >>>>>>>>> circumstances. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Prototype >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> ------------- >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Over the last few days I've put together a small >>>>>> >> >>>>>>>>> prototype >>>>>> >> >>>>>>>>> implementation with a few non-trivial imperative data >>>>>> >> >>>>>>>>> structures >>>>>> >> >>>>>>>>> for things >>>>>> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem >>>>>> >> >>>>>>>>> and >>>>>> >> >>>>>>>>> order-maintenance. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> https://github.com/ekmett/structs >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Notable bits: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation >>>>>> >> >>>>>>>>> of >>>>>> >> >>>>>>>>> link-cut >>>>>> >> >>>>>>>>> trees in this style. 
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts >>>>>> >> >>>>>>>>> that >>>>>> >> >>>>>>>>> make >>>>>> >> >>>>>>>>> it go fast. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, >>>>>> >> >>>>>>>>> almost >>>>>> >> >>>>>>>>> all >>>>>> >> >>>>>>>>> the references to the LinkCut or Object data constructor >>>>>> >> >>>>>>>>> get >>>>>> >> >>>>>>>>> optimized away, >>>>>> >> >>>>>>>>> and we're left with beautiful strict code directly >>>>>> >> >>>>>>>>> mutating out >>>>>> >> >>>>>>>>> underlying >>>>>> >> >>>>>>>>> representation. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a >>>>>> >> >>>>>>>>> short >>>>>> >> >>>>>>>>> article. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> -Edward >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones >>>>>> >> >>>>>>>>> > wrote: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Just to say that I have no idea what is going on in this >>>>>> >> >>>>>>>>> thread. >>>>>> >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is >>>>>> >> >>>>>>>>> there a >>>>>> >> >>>>>>>>> ticket? Is >>>>>> >> >>>>>>>>> there a wiki page? >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> If it?s important, an ab-initio wiki page + ticket would >>>>>> >> >>>>>>>>> be a >>>>>> >> >>>>>>>>> good >>>>>> >> >>>>>>>>> thing. 
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Simon >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On >>>>>> >> >>>>>>>>> Behalf >>>>>> >> >>>>>>>>> Of >>>>>> >> >>>>>>>>> Edward Kmett >>>>>> >> >>>>>>>>> Sent: 21 August 2015 05:25 >>>>>> >> >>>>>>>>> To: Manuel M T Chakravarty >>>>>> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs >>>>>> >> >>>>>>>>> Subject: Re: ArrayArrays >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's >>>>>> >> >>>>>>>>> would be >>>>>> >> >>>>>>>>> very handy as well. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Consider right now if I have something like an >>>>>> >> >>>>>>>>> order-maintenance >>>>>> >> >>>>>>>>> structure I have: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) >>>>>> >> >>>>>>>>> {-# >>>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar >>>>>> >> >>>>>>>>> s >>>>>> >> >>>>>>>>> (Upper s)) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) >>>>>> >> >>>>>>>>> {-# >>>>>> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar >>>>>> >> >>>>>>>>> s >>>>>> >> >>>>>>>>> (Lower s)) {-# >>>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> The former contains, logically, a mutable integer and two >>>>>> >> >>>>>>>>> pointers, >>>>>> >> >>>>>>>>> one for forward and one for backwards. The latter is >>>>>> >> >>>>>>>>> basically >>>>>> >> >>>>>>>>> the same >>>>>> >> >>>>>>>>> thing with a mutable reference up pointing at the >>>>>> >> >>>>>>>>> structure >>>>>> >> >>>>>>>>> above. 
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> On the heap this is an object that points to a structure >>>>>> >> >>>>>>>>> for the >>>>>> >> >>>>>>>>> bytearray, and points to another structure for each >>>>>> >> >>>>>>>>> mutvar which >>>>>> >> >>>>>>>>> each point >>>>>> >> >>>>>>>>> to the other 'Upper' structure. So there is a level of >>>>>> >> >>>>>>>>> indirection smeared >>>>>> >> >>>>>>>>> over everything. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward >>>>>> >> >>>>>>>>> link >>>>>> >> >>>>>>>>> from >>>>>> >> >>>>>>>>> the structure below to the structure above. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Converted into ArrayArray#s I'd get >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, >>>>>> >> >>>>>>>>> and >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> next 2 slots pointing to the previous and next previous >>>>>> >> >>>>>>>>> objects, >>>>>> >> >>>>>>>>> represented >>>>>> >> >>>>>>>>> just as their MutableArrayArray#s. I can use >>>>>> >> >>>>>>>>> sameMutableArrayArray# on these >>>>>> >> >>>>>>>>> for object identity, which lets me check for the ends of >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> lists by tying >>>>>> >> >>>>>>>>> things back on themselves. 
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> and below that >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing >>>>>> >> >>>>>>>>> up to >>>>>> >> >>>>>>>>> an >>>>>> >> >>>>>>>>> upper structure. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I can then write a handful of combinators for getting out >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> slots >>>>>> >> >>>>>>>>> in question, while it has gained a level of indirection >>>>>> >> >>>>>>>>> between >>>>>> >> >>>>>>>>> the wrapper >>>>>> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that >>>>>> >> >>>>>>>>> one can >>>>>> >> >>>>>>>>> be basically >>>>>> >> >>>>>>>>> erased by ghc. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Unlike before I don't have several separate objects on >>>>>> >> >>>>>>>>> the heap >>>>>> >> >>>>>>>>> for >>>>>> >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> object itself, >>>>>> >> >>>>>>>>> and the MutableByteArray# that it references to carry >>>>>> >> >>>>>>>>> around the >>>>>> >> >>>>>>>>> mutable >>>>>> >> >>>>>>>>> int. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> The only pain points are >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> 1.) 
the aforementioned limitation that currently prevents >>>>>> >> >>>>>>>>> me >>>>>> >> >>>>>>>>> from >>>>>> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array >>>>>> >> >>>>>>>>> into an >>>>>> >> >>>>>>>>> ArrayArray >>>>>> >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest >>>>>> >> >>>>>>>>> of >>>>>> >> >>>>>>>>> Haskell, >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> and >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us >>>>>> >> >>>>>>>>> avoid the >>>>>> >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 >>>>>> >> >>>>>>>>> pointers >>>>>> >> >>>>>>>>> wide. Card >>>>>> >> >>>>>>>>> marking doesn't help. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Alternately I could just try to do really evil things and >>>>>> >> >>>>>>>>> convert >>>>>> >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to >>>>>> >> >>>>>>>>> unsafeCoerce my way >>>>>> >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays >>>>>> >> >>>>>>>>> directly into the >>>>>> >> >>>>>>>>> SmallArray as slots, removing the limitation we see here >>>>>> >> >>>>>>>>> by >>>>>> >> >>>>>>>>> aping the >>>>>> >> >>>>>>>>> MutableArrayArray# s API, but that gets really really >>>>>> >> >>>>>>>>> dangerous! >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> altar >>>>>> >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move >>>>>> >> >>>>>>>>> them >>>>>> >> >>>>>>>>> and collect >>>>>> >> >>>>>>>>> them which rules out simpler Ptr and Addr based >>>>>> >> >>>>>>>>> solutions. 
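[Editor's sketch] The "mutable integer" that `Upper` carries via its `MutableByteArray#` slot can be sketched in isolation. This is a hedged, self-contained example — `MutInt` and its helpers are invented names, and the `8#` byte size assumes a 64-bit word (real code would use `SIZEOF_HSINT` via CPP):

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO(..))

-- A one-word mutable unboxed Int: the role the MutableByteArray#
-- field plays inside the Upper structure above.
data MutInt = MutInt (MutableByteArray# RealWorld)

newMutInt :: Int -> IO MutInt
newMutInt (I# i) = IO $ \s0 ->
  case newByteArray# 8# s0 of            -- 8 bytes: one 64-bit word
    (# s1, mba #) -> case writeIntArray# mba 0# i s1 of
      s2 -> (# s2, MutInt mba #)

readMutInt :: MutInt -> IO Int
readMutInt (MutInt mba) = IO $ \s0 ->
  case readIntArray# mba 0# s0 of
    (# s1, i #) -> (# s1, I# i #)

writeMutInt :: MutInt -> Int -> IO ()
writeMutInt (MutInt mba) (I# i) = IO $ \s0 ->
  case writeIntArray# mba 0# i s0 of
    s1 -> (# s1, () #)
```

Because the byte array is a garbage-collected heap object, the GC can still move and collect it — which is what rules out the simpler `Ptr`/`Addr` approaches mentioned above.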
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> -Edward >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty >>>>>> >> >>>>>>>>> > wrote: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> That?s an interesting idea. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Manuel >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> > Edward Kmett >: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > Would it be possible to add unsafe primops to add >>>>>> >> >>>>>>>>> > Array# and >>>>>> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that >>>>>> >> >>>>>>>>> > the >>>>>> >> >>>>>>>>> > ArrayArray# entries >>>>>> >> >>>>>>>>> > are all directly unlifted avoiding a level of >>>>>> >> >>>>>>>>> > indirection for >>>>>> >> >>>>>>>>> > the containing >>>>>> >> >>>>>>>>> > structure is amazing, but I can only currently use it >>>>>> >> >>>>>>>>> > if my >>>>>> >> >>>>>>>>> > leaf level data >>>>>> >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. >>>>>> >> >>>>>>>>> > It'd be >>>>>> >> >>>>>>>>> > nice to be >>>>>> >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff >>>>>> >> >>>>>>>>> > down at >>>>>> >> >>>>>>>>> > the leaves to >>>>>> >> >>>>>>>>> > hold lifted contents. >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > I accept fully that if I name the wrong type when I go >>>>>> >> >>>>>>>>> > to >>>>>> >> >>>>>>>>> > access >>>>>> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd >>>>>> >> >>>>>>>>> > do that >>>>>> >> >>>>>>>>> > if i tried to >>>>>> >> >>>>>>>>> > use one of the members that held a nested ArrayArray# >>>>>> >> >>>>>>>>> > as a >>>>>> >> >>>>>>>>> > ByteArray# >>>>>> >> >>>>>>>>> > anyways, so it isn't like there is a safety story >>>>>> >> >>>>>>>>> > preventing >>>>>> >> >>>>>>>>> > this. 
>>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > I've been hunting for ways to try to kill the >>>>>> >> >>>>>>>>> > indirection >>>>>> >> >>>>>>>>> > problems I get with Haskell and mutable structures, and >>>>>> >> >>>>>>>>> > I >>>>>> >> >>>>>>>>> > could shoehorn a >>>>>> >> >>>>>>>>> > number of them into ArrayArrays if this worked. >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of >>>>>> >> >>>>>>>>> > unnecessary >>>>>> >> >>>>>>>>> > indirection compared to c/java and this could reduce >>>>>> >> >>>>>>>>> > that pain >>>>>> >> >>>>>>>>> > to just 1 >>>>>> >> >>>>>>>>> > level of unnecessary indirection. >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > -Edward >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> > _______________________________________________ >>>>>> >> >>>>>>>>> > ghc-devs mailing list >>>>>> >> >>>>>>>>> > ghc-devs at haskell.org >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> _______________________________________________ >>>>>> >> >>>>>>>>> ghc-devs mailing list >>>>>> >> >>>>>>>>> ghc-devs at haskell.org >>>>>> >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> >>>>>> >> >>>>> >>>>>> >> >>> >>>>>> >> >> >>>>>> >> > >>>>>> >> > >>>>>> >> > _______________________________________________ >>>>>> >> > ghc-devs mailing list >>>>>> >> > ghc-devs at haskell.org >>>>>> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>>>> >> > >>>>>> > >>>>>> > >>>>> >>>>> >>>> >>>> >>>> _______________________________________________ >>>> ghc-devs mailing list >>>> ghc-devs at haskell.org >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>> >>> >> > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > 
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From voldermort at hotmail.com Tue Sep 1 11:57:17 2015 From: voldermort at hotmail.com (Harry .) Date: Tue, 1 Sep 2015 11:57:17 +0000 Subject: Planning for the 7.12 release In-Reply-To: References: Message-ID: Proposal: Make Semigroup as a superclass of Monoid https://mail.haskell.org/pipermail/libraries/2015-April/025590.html From hvriedel at gmail.com Tue Sep 1 12:06:56 2015 From: hvriedel at gmail.com (Herbert Valerio Riedel) Date: Tue, 01 Sep 2015 14:06:56 +0200 Subject: Planning for the 7.12 release In-Reply-To: (Harry .'s message of "Tue, 1 Sep 2015 11:57:17 +0000") References: Message-ID: <87613u1gyn.fsf@gmail.com> On 2015-09-01 at 13:57:17 +0200, Harry . wrote: > Proposal: Make Semigroup as a superclass of Monoid > https://mail.haskell.org/pipermail/libraries/2015-April/025590.html The plan is to (at the very least) move Data.Semigroups and Data.List.NonEmpty to base for GHC 7.12 If we have enough time we will also implement compile-warnings in GHC 7.12 to prepare for the next phases, if not they'll follow with the next major release after GHC 7.12 (effectively extending/delaying the migration-plan[1] by one year) [1]: https://mail.haskell.org/pipermail/libraries/2015-March/025413.html From johan.tibell at gmail.com Tue Sep 1 17:23:35 2015 From: johan.tibell at gmail.com (Johan Tibell) Date: Tue, 1 Sep 2015 10:23:35 -0700 Subject: RFC: Unpacking sum types Message-ID: I have a draft design for unpacking sum types that I'd like some feedback on. In particular feedback both on: * the writing and clarity of the proposal and * the proposal itself. https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes -- Johan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From dan.doel at gmail.com Tue Sep 1 18:31:14 2015
From: dan.doel at gmail.com (Dan Doel)
Date: Tue, 1 Sep 2015 14:31:14 -0400
Subject: RFC: Unpacking sum types
In-Reply-To: 
References: 
Message-ID: 

I wonder: are there issues with strict/unpacked fields in the sum type,
with regard to the 'fill in stuff' behavior?

For example:

    data C = C1 !Int | C2 ![Int]

    data D = D1 !Double {-# UNPACK #-} !C

Naively we might think:

    data D' = D1 !Double !Tag !Int ![Int]

But this is obviously not going to work at the Haskell-implemented-level.
Since we're at a lower level, we could just not seq the things from the
opposite constructor, but are there problems that arise from that? Also
of course the !Int will probably also be unpacked, so such prim types
need different handling (fill with 0, I guess).

--

Also, I guess this is orthogonal, but having primitive, unboxed sums
(analogous to unboxed tuples) would be nice as well. Conceivably they
could be used as part of the specification of unpacked sums, since we
can apparently put unboxed tuples in data types now. I'm not certain if
they would cover all cases, though (like the strictness concerns above).

-- Dan

On Tue, Sep 1, 2015 at 1:23 PM, Johan Tibell wrote:
> I have a draft design for unpacking sum types that I'd like some
> feedback on. In particular feedback both on:
>
> * the writing and clarity of the proposal and
> * the proposal itself.
>
> https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes
>
> -- Johan
>
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>

From thomasmiedema at gmail.com Tue Sep 1 18:34:08 2015
From: thomasmiedema at gmail.com (Thomas Miedema)
Date: Tue, 1 Sep 2015 20:34:08 +0200
Subject: Proposal: accept pull requests on GitHub
Message-ID: 

Hello all,

my arguments against Phabricator are here:
https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator.
Some quotes from #ghc to pique your curiosity (there are some 50 more):
* "is arc broken today?"
* "arc is a frickin' mystery."
* "i have a theory that i've managed to create a revision that phab
  can't handle."
* "Diffs just seem to be too expensive to create ... I can't blame
  contributors for not wanting to do this for every atomic change"
* "but seriously, we can't require this for contributing to GHC... the
  entry barrier is already high enough"

GitHub has side-by-side diffs nowadays, and Travis-CI can run
`./validate --fast` comfortably.

*Proposal: accept pull requests from contributors on
https://github.com/ghc/ghc.*

Details:
* use Travis-CI to validate pull requests.
* keep using the Trac issue tracker (contributors are encouraged to put
  a link to their pull-request in the 'Differential Revisions' field).
* keep using the Trac wiki.
* in discussions on GitHub, use https://ghc.haskell.org/ticket/1234 to
  refer to Trac ticket 1234. The shortcut #1234 only works on Trac
  itself.
* keep pushing to git.haskell.org, where the existing Git receive hooks
  can do their job keeping tabs, trailing whitespace and dangling
  submodule references out, notify Trac and send emails. Committers
  close pull-requests manually, just like they do Trac tickets.
* keep running Phabricator for as long as necessary.
* mention that pull requests are accepted on
  https://ghc.haskell.org/trac/ghc/wiki/WorkingConventions/FixingBugs.

My expectation is that the majority of patches will start coming in via
pull requests, the number of contributions will go up, commits will be
smaller, and there will be more of them per pull request (contributors
will be able to put style changes and refactorings into separate
commits, without jumping through a bunch of hoops).

Reviewers will get many more emails. Other arguments against GitHub are
here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub.

I probably missed a few things, so fire away.
Thanks,
Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mail at nh2.me Tue Sep 1 20:42:21 2015
From: mail at nh2.me (Niklas Hambüchen)
Date: Tue, 01 Sep 2015 22:42:21 +0200
Subject: Proposal: accept pull requests on GitHub
In-Reply-To: 
References: 
Message-ID: <55E60DAD.503@nh2.me>

Hi,

I would recommend against moving code reviews to Github. I like it and
use it all the time for my own projects, but for a large project like
GHC, its code reviews are too basic (comments get lost in multi-round
reviews), and its customisation and process enforcement is too weak;
but that has all been mentioned already on the
https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub page you linked.

I do however recommend accepting pull requests via Github.

This is already the case for simple changes: In the past I asked Austin
"can you pull this from my branch on Github called XXX", and it went in
without problems and without me having to use arc locally.

But this process could be more automated:

For Ganeti (cluster manager made by Google, written largely in Haskell)
I built a tool (https://github.com/google/pull-request-mailer) that
listens for pull requests and sends them to the mailing list (Ganeti's
preferred way of accepting patches and doing reviews). We built it
because some people (me included) liked the Github workflow (push
branch, click button) more than `git format-patch`+`git send-email`.
You can see an example at https://github.com/ganeti/ganeti/pull/22. The
tool then replies on Github asking that discussion of the change be
held on the mailing list. That has worked so far. It can also handle
force-pushes when a PR gets updated based on feedback. Writing it and
setting it up only took a few days.

I think it wouldn't be too difficult to do the same for GHC: a small
tool that imports Github PRs into Phabricator.

I don't like the arc user experience.
It's modeled in the same way as ReviewBoard, and just pushing a branch
is easier in my opinion. However, Phabricator is quite good as a review
tool. Its inability to review multiple commits is nasty, but I guess
that'll be fixed at some point. If not, an import tool like the one I
suggest could do the squashing for you. Unfortunately there is
currently no open source review tool that can handle reviewing entire
branches AND multiple revisions of such branches. It's possible to
build them though; some companies have internal review tools that do it
and they work extremely well.

I believe that a simple automated import setup could address many of
the points in https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator.

Niklas

On 01/09/15 20:34, Thomas Miedema wrote:
> Hello all,
>
> my arguments against Phabricator are here:
> https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator.
>
> Some quotes from #ghc to pique your curiosity (there are some 50 more):
> * "is arc broken today?"
> * "arc is a frickin' mystery."
> * "i have a theory that i've managed to create a revision that phab
> can't handle."
> * "Diffs just seem to be too expensive to create ... I can't blame
> contributors for not wanting to do this for every atomic change"
> * "but seriously, we can't require this for contributing to GHC... the
> entry barrier is already high enough"
>
> GitHub has side-by-side diffs nowadays, and
> Travis-CI can run `./validate --fast` comfortably.
>
> *Proposal: accept pull requests from contributors on
> https://github.com/ghc/ghc.*
>
> Details:
> * use Travis-CI to validate pull requests.
> * keep using the Trac issue tracker (contributors are encouraged to put
> a link to their pull-request in the 'Differential Revisions' field).
> * keep using the Trac wiki.
> * in discussions on GitHub, use https://ghc.haskell.org/ticket/1234 to
> refer to Trac ticket 1234. The shortcut #1234 only works on Trac itself.
> * keep pushing to git.haskell.org, where the existing Git receive
> hooks can do their job keeping tabs, trailing whitespace and dangling
> submodule references out, notify Trac and send emails. Committers
> close pull-requests manually, just like they do Trac tickets.
> * keep running Phabricator for as long as necessary.
> * mention that pull requests are accepted on
> https://ghc.haskell.org/trac/ghc/wiki/WorkingConventions/FixingBugs.
>
> My expectation is that the majority of patches will start coming in via
> pull requests, the number of contributions will go up, commits will be
> smaller, and there will be more of them per pull request (contributors
> will be able to put style changes and refactorings into separate
> commits, without jumping through a bunch of hoops).
>
> Reviewers will get many more emails. Other arguments against GitHub are
> here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub.
>
> I probably missed a few things, so fire away.
>
> Thanks,
> Thomas
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>

From johan.tibell at gmail.com Wed Sep 2 01:09:48 2015
From: johan.tibell at gmail.com (Johan Tibell)
Date: Tue, 1 Sep 2015 18:09:48 -0700
Subject: RFC: Unpacking sum types
In-Reply-To: 
References: 
Message-ID: 

After some discussions with SPJ I've now rewritten the proposal in terms
of unboxed sums (which should not suffer from the extra seq problem you
mention above).

On Tue, Sep 1, 2015 at 11:31 AM, Dan Doel wrote:
> I wonder: are there issues with strict/unpacked fields in the sum
> type, with regard to the 'fill in stuff' behavior?
>
> For example:
>
>     data C = C1 !Int | C2 ![Int]
>
>     data D = D1 !Double {-# UNPACK #-} !C
>
> Naively we might think:
>
>     data D' = D1 !Double !Tag !Int ![Int]
>
> But this is obviously not going to work at the
> Haskell-implemented-level.
> Since we're at a lower level, we could just not seq the things from
> the opposite constructor, but are there problems that arise from that?
> Also of course the !Int will probably also be unpacked, so such prim
> types need different handling (fill with 0, I guess).
>
> --
>
> Also, I guess this is orthogonal, but having primitive, unboxed sums
> (analogous to unboxed tuples) would be nice as well. Conceivably they
> could be used as part of the specification of unpacked sums, since we
> can apparently put unboxed tuples in data types now. I'm not certain
> if they would cover all cases, though (like the strictness concerns
> above).
>
> -- Dan
>
> On Tue, Sep 1, 2015 at 1:23 PM, Johan Tibell wrote:
> > I have a draft design for unpacking sum types that I'd like some
> > feedback on. In particular feedback both on:
> >
> > * the writing and clarity of the proposal and
> > * the proposal itself.
> >
> > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes
> >
> > -- Johan
> >
> > _______________________________________________
> > ghc-devs mailing list
> > ghc-devs at haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rrnewton at gmail.com Wed Sep 2 01:44:03 2015
From: rrnewton at gmail.com (Ryan Newton)
Date: Wed, 02 Sep 2015 01:44:03 +0000
Subject: RFC: Unpacking sum types
In-Reply-To: 
References: 
Message-ID: 

Just a small comment about syntax. Why is there an "_n" suffix on the
type constructor? Isn't it syntactically evident how many things are in
the |# .. | .. #| block?

More generally, are the parser changes and the wild new syntax strictly
necessary? Could we instead just have a new keyword, but have it look
like a normal type constructor? For example, the type:

    (Sum# T1 T2 T3)

Where "Sum#" can't be partially applied, and is variable arity.
Likewise, "MkSum#" could be a keyword/syntactic-form: (MkSum# 1 3 expr) case x of MkSum# 1 3 v -> e Here "1" and "3" are part of the syntactic form, not expressions. But it can probably be handled after parsing and doesn't require the "_n_m" business. -Ryan On Tue, Sep 1, 2015 at 6:10 PM Johan Tibell wrote: > After some discussions with SPJ I've now rewritten the proposal in terms > of unboxed sums (which should suffer from the extra seq problem you mention > above). > > On Tue, Sep 1, 2015 at 11:31 AM, Dan Doel wrote: > >> I wonder: are there issues with strict/unpacked fields in the sum >> type, with regard to the 'fill in stuff' behavior? >> >> For example: >> >> data C = C1 !Int | C2 ![Int] >> >> data D = D1 !Double {-# UNPACK #-} !C >> >> Naively we might think: >> >> data D' = D1 !Double !Tag !Int ![Int] >> >> But this is obviously not going to work at the >> Haskell-implemented-level. Since we're at a lower level, we could just >> not seq the things from the opposite constructor, but are there >> problems that arise from that? Also of course the !Int will probably >> also be unpacked, so such prim types need different handling (fill >> with 0, I guess). >> >> -- >> >> Also, I guess this is orthogonal, but having primitive, unboxed sums >> (analogous to unboxed tuples) would be nice as well. Conceivably they >> could be used as part of the specification of unpacked sums, since we >> can apparently put unboxed tuples in data types now. I'm not certain >> if they would cover all cases, though (like the strictness concerns >> above). >> >> -- Dan >> >> >> On Tue, Sep 1, 2015 at 1:23 PM, Johan Tibell >> wrote: >> > I have a draft design for unpacking sum types that I'd like some >> feedback >> > on. In particular feedback both on: >> > >> > * the writing and clarity of the proposal and >> > * the proposal itself. 
>> > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes
>> >
>> > -- Johan
>> >
>> > _______________________________________________
>> > ghc-devs mailing list
>> > ghc-devs at haskell.org
>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mail at joachim-breitner.de Wed Sep 2 02:12:27 2015
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Tue, 01 Sep 2015 19:12:27 -0700
Subject: RFC: Unpacking sum types
In-Reply-To: 
References: 
Message-ID: <1441159947.3393.13.camel@joachim-breitner.de>

Hi,

On Wednesday, 2015-09-02 at 01:44 +0000, Ryan Newton wrote:
> Why is there an "_n" suffix on the type constructor? Isn't it
> syntactically evident how many things are in the |# .. | .. #|
> block?

Correct.

> More generally, are the parser changes and the wild new syntax
> strictly necessary?

If we just add it to Core, to support UNPACK, then there is no parser
involved anyways, and the pretty-printer may do fancy stuff. (Why not
unicode subscript numbers? :-))

But we probably want to provide this also on the Haskell level, just
like unboxed products, right? Then we should have a nice syntax.
Personally, I find (# a | b | c #) visually more pleasing. (The
disadvantage is that this works only for two or more alternatives, but
the one-alternative unboxed union is isomorphic to the one-element
unboxed tuple anyways, isn't it?)

> Likewise, "MkSum#" could be a keyword/syntactic-form:
>
>     (MkSum# 1 3 expr)
>     case x of MkSum# 1 3 v -> e
>
> Here "1" and "3" are part of the syntactic form, not expressions.
> But it can probably be handled after parsing and doesn't require the
> "_n_m" business.

If we expose it on the Haskell level, I find MkSum_1_2# the right thing
to do: It makes it clear that (conceptually) there really is a
constructor of that name, and it is distinct from MkSum_2_2#, and the
user cannot do computation with these indices.
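To make the naming discussion concrete, here is a small model, in today's Haskell, of a two-alternative sum flattened into one constructor plus a tag. It is purely illustrative (the names Sum2, mkSum_1_2 and so on are invented here, and the real proposal works at the Core/heap level rather than with a record like this), but it also shows where the strictness concern from earlier in the thread bites: the per-alternative fields must stay lazy, because the constructor for one alternative fills the other alternative's slots with dummies.

```haskell
-- Illustrative model only: a 2-alternative sum flattened into a single
-- constructor with a tag field plus one field per alternative.
data Sum2 a b = Sum2
  { tag  :: !Int  -- which alternative is present: 1 or 2
  , alt1 :: a     -- meaningful only when tag == 1; must stay lazy
  , alt2 :: b     -- meaningful only when tag == 2; must stay lazy
  }

-- The "MkSum_1_2#"-style constructors: the index pair (alternative,
-- arity) is baked into the name, not computed at runtime.
mkSum_1_2 :: a -> Sum2 a b
mkSum_1_2 x = Sum2 1 x (error "unused slot")  -- dummy is never forced

mkSum_2_2 :: b -> Sum2 a b
mkSum_2_2 y = Sum2 2 (error "unused slot") y

-- Case analysis dispatches on the tag and reads only the live slot.
caseSum2 :: (a -> r) -> (b -> r) -> Sum2 a b -> r
caseSum2 f g s = case tag s of
  1 -> f (alt1 s)
  _ -> g (alt2 s)

main :: IO ()
main = do
  print (caseSum2 (+ 1) length (mkSum_1_2 41      :: Sum2 Int [Int]))  -- 42
  print (caseSum2 (+ 1) length (mkSum_2_2 [1,2,3] :: Sum2 Int [Int]))  -- 3
```

Forcing alt2 of a value built with mkSum_1_2 would hit the dummy, which is why a strict or unpacked field belonging to one alternative needs special handling (for example, filling primitive slots with 0, as suggested earlier in the thread).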
Greetings,
Joachim

--
Joachim "nomeata" Breitner
mail at joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: 

From rrnewton at gmail.com Wed Sep 2 02:22:24 2015
From: rrnewton at gmail.com (Ryan Newton)
Date: Wed, 02 Sep 2015 02:22:24 +0000
Subject: RFC: Unpacking sum types
In-Reply-To: <1441159947.3393.13.camel@joachim-breitner.de>
References: <1441159947.3393.13.camel@joachim-breitner.de>
Message-ID: 

> If we expose it on the Haskell level, I find MkSum_1_2# the right thing
> to do: It makes it clear that (conceptually) there really is a
> constructor of that name, and it is distinct from MkSum_2_2#, and the
> user cannot do computation with these indices.

I don't mind MkSum_1_2#, it avoids the awkwardness of attaching it to a
closing delimiter. But... it does still introduce the idea of cutting up
tokens to get numbers out of them, which is kind of hacky. (There seems
to be a conserved particle of hackiness here that can't be eliminated,
but it doesn't bother me too much.)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mail at joachim-breitner.de Wed Sep 2 05:58:53 2015
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Tue, 01 Sep 2015 22:58:53 -0700
Subject: RFC: Unpacking sum types
In-Reply-To: 
References: 
Message-ID: 

Hi,

just an idea that crossed my mind: Can we do without the worker/wrapper
dance for data constructors if we instead phrase that in terms of
pattern synonyms? Maybe that's a refactoring/code consolidation
opportunity.

Good night,
Joachim

On 1 September 2015 10:23:35 PDT, Johan Tibell wrote:
>I have a draft design for unpacking sum types that I'd like some
>feedback
>on.
In particular feedback both on:
>
> * the writing and clarity of the proposal and
> * the proposal itself.
>
>https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes
>
>-- Johan
>
>------------------------------------------------------------------------
>
>_______________________________________________
>ghc-devs mailing list
>ghc-devs at haskell.org
>http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

From michael at diglumi.com Wed Sep 2 07:47:25 2015
From: michael at diglumi.com (Michael Smith)
Date: Wed, 2 Sep 2015 00:47:25 -0700
Subject: Shared data type for extension flags
Message-ID: 

#10820 on Trac [1] and D1200 on Phabricator [2] discuss adding the
capability to Template Haskell to detect which language extensions are
enabled. Unfortunately, since template-haskell can't depend on ghc (as
ghc depends on template-haskell), it can't simply re-export the
ExtensionFlag type from DynFlags to the user.

There is a second data type encoding the list of possible language
extensions in the Cabal package, in Language.Haskell.Extension [3]. But
template-haskell doesn't already depend on Cabal, and doing so seems
like it would cause difficulties, as the two packages can be upgraded
separately.

So adding this new feature to Template Haskell requires introducing a
*third* data type for language extensions. It also requires enumerating
this full list in two more places, to convert back and forth between
the TH Extension data type and GHC's internal ExtensionFlag data type.

Is there another way here? Can there be one single shared data type for
this somehow?

[1] https://ghc.haskell.org/trac/ghc/ticket/10820
[2] https://phabricator.haskell.org/D1200
[3] https://hackage.haskell.org/package/Cabal-1.22.4.0/docs/Language-Haskell-Extension.html
-------------- next part --------------
An HTML attachment was scrubbed...
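One possible shape for such a single shared enumeration, as a hedged sketch only: no such package exists at the time of writing, and every name below is invented for illustration.

```haskell
-- Hypothetical contents of a small standalone package that ghc, Cabal
-- and template-haskell could all depend on. All names are invented.

-- One nullary constructor per language extension (only a few shown; the
-- real list would cover every extension GHC knows about).
data Extension
  = OverloadedStrings
  | GADTs
  | TemplateHaskell
  | DataKinds
  deriving (Eq, Ord, Show, Read, Enum, Bounded)

-- Parse the name as it appears in a LANGUAGE pragma, so each consumer
-- no longer needs its own string <-> extension table.
readExtension :: String -> Maybe Extension
readExtension s = lookup s [(show e, e) | e <- [minBound .. maxBound]]

main :: IO ()
main = do
  print (readExtension "GADTs")        -- Just GADTs
  print (readExtension "NotARealExt")  -- Nothing
```

GHC, Cabal and template-haskell could then each map this shared type to their internal representations in one place, instead of three packages maintaining three parallel enumerations.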
URL: From matthewtpickering at gmail.com Wed Sep 2 08:00:40 2015 From: matthewtpickering at gmail.com (Matthew Pickering) Date: Wed, 2 Sep 2015 10:00:40 +0200 Subject: Shared data type for extension flags In-Reply-To: References: Message-ID: Surely the easiest way here (including for other tooling - ie haskell-src-exts) is to create a package which just provides this enumeration. GHC, cabal, th, haskell-src-exts and so on then all depend on this package rather than creating their own enumeration. On Wed, Sep 2, 2015 at 9:47 AM, Michael Smith wrote: > #10820 on Trac [1] and D1200 on Phabricator [2] discuss adding the > capababilty > to Template Haskell to detect which language extensions enabled. > Unfortunately, > since template-haskell can't depend on ghc (as ghc depends on > template-haskell), > it can't simply re-export the ExtensionFlag type from DynFlags to the user. > > There is a second data type encoding the list of possible language > extensions in > the Cabal package, in Language.Haskell.Extension [3]. But template-haskell > doesn't already depend on Cabal, and doing so seems like it would cause > difficulties, as the two packages can be upgraded separately. > > So adding this new feature to Template Haskell requires introducing a > *third* > data type for language extensions. It also requires enumerating this full > list > in two more places, to convert back and forth between the TH Extension data > type > and GHC's internal ExtensionFlag data type. > > Is there another way here? Can there be one single shared data type for this > somehow? 
> > [1] https://ghc.haskell.org/trac/ghc/ticket/10820 > [2] https://phabricator.haskell.org/D1200 > [3] > https://hackage.haskell.org/package/Cabal-1.22.4.0/docs/Language-Haskell-Extension.html > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From michael at diglumi.com Wed Sep 2 08:20:30 2015 From: michael at diglumi.com (Michael Smith) Date: Wed, 2 Sep 2015 01:20:30 -0700 Subject: Shared data type for extension flags In-Reply-To: References: Message-ID: That sounds like a good approach. Are there other things that would go nicely in a shared package like this, in addition to the extension data type? On Wed, Sep 2, 2015 at 1:00 AM, Matthew Pickering < matthewtpickering at gmail.com> wrote: > Surely the easiest way here (including for other tooling - ie > haskell-src-exts) is to create a package which just provides this > enumeration. GHC, cabal, th, haskell-src-exts and so on then all > depend on this package rather than creating their own enumeration. > > On Wed, Sep 2, 2015 at 9:47 AM, Michael Smith wrote: > > #10820 on Trac [1] and D1200 on Phabricator [2] discuss adding the > > capababilty > > to Template Haskell to detect which language extensions enabled. > > Unfortunately, > > since template-haskell can't depend on ghc (as ghc depends on > > template-haskell), > > it can't simply re-export the ExtensionFlag type from DynFlags to the > user. > > > > There is a second data type encoding the list of possible language > > extensions in > > the Cabal package, in Language.Haskell.Extension [3]. But > template-haskell > > doesn't already depend on Cabal, and doing so seems like it would cause > > difficulties, as the two packages can be upgraded separately. > > > > So adding this new feature to Template Haskell requires introducing a > > *third* > > data type for language extensions. 
It also requires enumerating this full > > list > > in two more places, to convert back and forth between the TH Extension > data > > type > > and GHC's internal ExtensionFlag data type. > > > > Is there another way here? Can there be one single shared data type for > this > > somehow? > > > > [1] https://ghc.haskell.org/trac/ghc/ticket/10820 > > [2] https://phabricator.haskell.org/D1200 > > [3] > > > https://hackage.haskell.org/package/Cabal-1.22.4.0/docs/Language-Haskell-Extension.html > > > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben at well-typed.com Wed Sep 2 10:43:57 2015 From: ben at well-typed.com (Ben Gamari) Date: Wed, 02 Sep 2015 12:43:57 +0200 Subject: more releases In-Reply-To: References: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> <87si6y1v30.fsf@gmail.com> Message-ID: <87oahlksnm.fsf@smart-cactus.org> Richard Eisenberg writes: > On Sep 1, 2015, at 12:01 AM, Herbert Valerio Riedel wrote: > >> I'd say mostly organisational overhead which can't be fully automated >> (afaik, Ben has already automated large parts but not everything can be): >> >> - Coordinating with people creating and testing the bindists > > This was the sort of thing I thought could be automated. I'm picturing > a system where Austin/Ben hits a button and everything whirs to life, > creating, testing, and posting bindists, with no people involved. > I can nearly do this for Linux with my existing tools. I can do 32- and 64-bit builds for both RedHat and Debian all on a single Debian 8 machine with the tools I developed during the course of the 7.10.2 release [1]. Windows is unfortunately still a challenge. I did the 7.10.2 builds on an EC2 instance and the experience wasn't terribly fun. I would love for this to be further automated but I've not done this yet. 
>> - Writing releases notes & announcment > > Release notes should, theoretically, be updated with the patches. > Announcement can be automated. > If I'm doing my job well the release notes shouldn't be a problem. I've been trying to be meticulous about ensuring that all new features come with acceptable release notes. >> - If bundled core-libraries are affected, coordination overhead with package >> maintainers (unless GHC HQ owned), verifying version bumps (API diff!) and >> changelogs have been updated accordingly, uploading to Hackage > > Any library version change would require a more proper release. Do > these libraries tend to change during a major release cycle? > The core libraries are perhaps the trickiest part of this. Currently the process goes something like this, 1. We branch off a stable GHC release 2. Development continues on `master`, eventually a breaking change is merged to one of the libraries 3. Eventually someone notices and bumps the library's version 4. More breaking changes are merged to the library 5. We branch off for another stable release, right before the release we manually push the libraries to Hackage 6. Repeat from (2) There can potentially be a lot of interface churn between steps 3 and 5. If we did releases in this period we would need to be much more careful about library versioning. I suspect this may end up being quite a bit of work to do properly. Technically we could punt on this problem and just do the same sort of stable/unstable versioning for the libraries that we already do with GHC itself. This would mean, however, that we couldn't upload the libraries to Hackage. >> - Uploading and signing packagees to download.haskell.org, and verifying >> the downloads > > This isn't automated? > It is now (see [2]). This shouldn't be a problem. >> Austin & Ben probably have more to add to this list >> > I'm sure they do. > > Again, I'd be fine if the answer from the community is "it's just not > what we need". 
But I wanted to see if there were
> technical/practical/social reasons why this was or wasn't a good idea.
> If we do think it's a good idea absent those reasons, then we can work
> on addressing those concerns.
>
Technically I think there are no reasons why this isn't feasible with
some investment. Exactly how much investment depends upon what exactly
we want to achieve:

 * How often do we make these releases?
 * Which platforms do we support?
 * How carefully do we version included libraries?

If we focus solely on Linux and punt on the library versioning issue I
would say this wouldn't even be difficult. I could easily set up my
build machine to do a nightly bindist and push it to a server somewhere.
Austin has also mentioned that Harbormaster builds could potentially
produce bindists.

The question is whether users want more rapid releases. Those working on
GHC will use their own builds. Most users want something reasonably
stable (in both the interface sense and the reliability sense) and
therefore I suspect would stick with the releases. This leaves a
relatively small number of potential users; namely those who want to
play around with unreleased features yet aren't willing to do their own
builds.

Cheers,

- Ben

[1] https://github.com/bgamari/ghc-utils
[2] https://github.com/bgamari/ghc-utils/blob/master/rel-eng/upload.sh
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 472 bytes Desc: not available URL: From hvriedel at gmail.com Wed Sep 2 10:49:32 2015 From: hvriedel at gmail.com (Herbert Valerio Riedel) Date: Wed, 02 Sep 2015 12:49:32 +0200 Subject: more releases In-Reply-To: <87oahlksnm.fsf@smart-cactus.org> (Ben Gamari's message of "Wed, 02 Sep 2015 12:43:57 +0200") References: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> <87si6y1v30.fsf@gmail.com> <87oahlksnm.fsf@smart-cactus.org> Message-ID: <87fv2xcczn.fsf@gmail.com> On 2015-09-02 at 12:43:57 +0200, Ben Gamari wrote: [...] > The question is whether users want more rapid releases. Those working on > GHC will use their own builds. Most users want something reasonably > stable (in both the interface sense and the reliability sense) and > therefore I suspect would stick with the releases. This leaves a > relatively small number of potential users; namely those who want to > play around with unreleased features yet aren't willing to do their own > builds. Btw, for those who are willing to use Ubuntu there's already GHC HEAD builds available in my PPA, and I can easily keep creating GHC 7.10.3 snapshots in the same style like I usually do shortly before a stable point-release. From simonpj at microsoft.com Wed Sep 2 14:33:23 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Wed, 2 Sep 2015 14:33:23 +0000 Subject: Shared data type for extension flags In-Reply-To: References: Message-ID: we already have such a shared library, I think: bin-package-db. would that do? Simon From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Michael Smith Sent: 02 September 2015 09:21 To: Matthew Pickering Cc: GHC developers Subject: Re: Shared data type for extension flags That sounds like a good approach. Are there other things that would go nicely in a shared package like this, in addition to the extension data type? 
On Wed, Sep 2, 2015 at 1:00 AM, Matthew Pickering > wrote: Surely the easiest way here (including for other tooling - ie haskell-src-exts) is to create a package which just provides this enumeration. GHC, cabal, th, haskell-src-exts and so on then all depend on this package rather than creating their own enumeration. On Wed, Sep 2, 2015 at 9:47 AM, Michael Smith > wrote: > #10820 on Trac [1] and D1200 on Phabricator [2] discuss adding the > capababilty > to Template Haskell to detect which language extensions enabled. > Unfortunately, > since template-haskell can't depend on ghc (as ghc depends on > template-haskell), > it can't simply re-export the ExtensionFlag type from DynFlags to the user. > > There is a second data type encoding the list of possible language > extensions in > the Cabal package, in Language.Haskell.Extension [3]. But template-haskell > doesn't already depend on Cabal, and doing so seems like it would cause > difficulties, as the two packages can be upgraded separately. > > So adding this new feature to Template Haskell requires introducing a > *third* > data type for language extensions. It also requires enumerating this full > list > in two more places, to convert back and forth between the TH Extension data > type > and GHC's internal ExtensionFlag data type. > > Is there another way here? Can there be one single shared data type for this > somehow? > > [1] https://ghc.haskell.org/trac/ghc/ticket/10820 > [2] https://phabricator.haskell.org/D1200 > [3] > https://hackage.haskell.org/package/Cabal-1.22.4.0/docs/Language-Haskell-Extension.html > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From simonpj at microsoft.com Wed Sep 2 14:43:54 2015
From: simonpj at microsoft.com (Simon Peyton Jones)
Date: Wed, 2 Sep 2015 14:43:54 +0000
Subject: [Haskell] ETA on 7.10.3?
In-Reply-To: 
References: <87vbbtl2qz.fsf@smart-cactus.org>
 <6AEDE614-430C-4A68-9B2E-22B8FC0275FB@gmail.com>
Message-ID: 

Ah, well https://github.com/ku-fpg/hermit/issues/144#issuecomment-128762767
links in turn to https://github.com/ku-fpg/hermit/issues/141, which is a
long thread I can't follow.

Ryan, Andy: if 7.10.2 is unusable for you, for some reason, please make
a ticket to explain why, and ask for 7.10.3.

Simon

From: Haskell [mailto:haskell-bounces at haskell.org] On Behalf Of David Banas
Sent: 02 September 2015 13:19
To: Ben Gamari
Cc: haskell at haskell.org
Subject: Re: [Haskell] ETA on 7.10.3?

Hi Ben,

Thanks for your reply.

My problem is the project I'm currently working on is dependent upon
HERMIT, which doesn't play well with 7.10.2, as per:

https://github.com/ku-fpg/hermit/issues/144#issuecomment-128762767

(The nature of that comment caused me to think that 7.10.3 was in play.)

Thanks,
-db

On Sep 2, 2015, at 12:05 AM, Ben Gamari wrote:

David Banas writes:

Hi,

Does anyone have an ETA for ghc v7.10.3? (I'm trying to decide between
waiting and backing up to 7.8.2, for a particular project.)

Currently there are no plans to do a 7.10.3 release. 7.10.2 does have a
few issues, but none of them are critical regressions, and none of them
appear critical enough to burn maintenance time on.

Of course, we are willing to reevaluate in the event that new issues
arise. What problems with 7.10.2 are you struggling with?

Cheers,

- Ben
-------------- next part --------------
An HTML attachment was scrubbed...
In-Reply-To: References: <87vbbtl2qz.fsf@smart-cactus.org> <6AEDE614-430C-4A68-9B2E-22B8FC0275FB@gmail.com> Message-ID: Sorry, I dropped the ball on creating a ticket. I just did so: https://ghc.haskell.org/trac/ghc/ticket/10829 (As an aside, the original ticket, #10528, had a milestone set as 7.10.3, so I just assumed a 7.10.3 was planned and coming soon.) On Wed, Sep 2, 2015 at 7:43 AM, Simon Peyton Jones wrote: > Ah, well https://github.com/ku-fpg/hermit/issues/144#issuecomment-128762767 > > links in turn to https://github.com/ku-fpg/hermit/issues/141, which is a > long thread I can't follow. > > > > Ryan, Andy: if 7.10.2 is unusable for you, for some reason, please make a > ticket to explain why, and ask for 7.10.3. > > > Simon > > > > From: Haskell [mailto:haskell-bounces at haskell.org] On Behalf Of David Banas > Sent: 02 September 2015 13:19 > To: Ben Gamari > Cc: haskell at haskell.org > Subject: Re: [Haskell] ETA on 7.10.3? > > > > Hi Ben, > > > > Thanks for your reply. > > > > My problem is the project I'm currently working on is dependent upon HERMIT, > which doesn't play well with 7.10.2, as per: > > > > https://github.com/ku-fpg/hermit/issues/144#issuecomment-128762767 > > > > (The nature of that comment caused me to think that 7.10.3 was in play.) > > > > Thanks, > > -db > > > > On Sep 2, 2015, at 12:05 AM, Ben Gamari wrote: > > > > David Banas writes: > > > Hi, > > Does anyone have an ETA for ghc v7.10.3? > (I'm trying to decide between waiting and backing up to 7.8.2, for a > particular project.) > > Currently there are no plans to do a 7.10.3 release. 7.10.2 does have a > few issues, but none of them are critical regressions, and none of them > appear critical enough to burn maintenance time on. > > Of course, we are willing to reevaluate in the event that new issues > arise. What problems with 7.10.2 are you struggling with? 
> > Cheers, > > - Ben > > > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From greg at gregweber.info Wed Sep 2 15:43:02 2015 From: greg at gregweber.info (Greg Weber) Date: Wed, 2 Sep 2015 08:43:02 -0700 Subject: Proposal: accept pull requests on GitHub In-Reply-To: <55E60DAD.503@nh2.me> References: <55E60DAD.503@nh2.me> Message-ID: I like Niklas's suggestion of a middle-ground approach. There are benefits to using phabricator (and arc), but there should be a lowered-bar approach where people can start contributing through github (even though they may be forced to do the code review on phabricator). On Tue, Sep 1, 2015 at 1:42 PM, Niklas Hamb?chen wrote: > Hi, > > I would recommend against moving code reviews to Github. > I like it and use it all the time for my own projects, but for a large > project like GHC, its code reviews are too basic (comments get lost in > multi-round reviews), and its customisation an process enforcement is > too weak; but that has all been mentioned already on the > https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub page you linked. > > I do however recommend accepting pull requests via Github. > > This is already the case for simple changes: In the past I asked Austin > "can you pull this from my branch on Github called XXX", and it went in > without problems and without me having to use arc locally. > > But this process could be more automated: > > For Ganeti (cluster manager made by Google, written largely in Haskell) > I built a tool (https://github.com/google/pull-request-mailer) that > listens for pull requests and sends them to the mailing list (Ganeti's > preferred way of accepting patches and doing reviews). We built it > because some people (me included) liked the Github workflow (push > branch, click button) more than `git format-patch`+`git send-email`. 
You > can see an example at https://github.com/ganeti/ganeti/pull/22. > The tool then replies on Github asking that discussion of the change be > held on the mailing list. That has worked so far. > It can also handle force-pushes when a PR gets updated based on > feedback. Writing it and setting it up only took a few days. > > I think it wouldn't be too difficult to do the same for GHC: A small > tool that imports Github PRs into Phabricator. > > I don't like the arc user experience. It's modeled in the same way as > ReviewBoard, and just pushing a branch is easier in my opinion. > > However, Phabricator is quite good as a review tool. Its inability to > review multiple commits is nasty, but I guess that'll be fixed at some > point. If not, the import tool I suggest could do the squashing for > you. > > Unfortunately there is currently no open source review tool that can > handle reviewing entire branches AND multiple revisions of such > branches. It's possible to build them though, some companies have > internal review tools that do it and they work extremely well. > > I believe that a simple automated import setup could address many of the > points in https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator. > > Niklas > > On 01/09/15 20:34, Thomas Miedema wrote: > > Hello all, > > > > my arguments against Phabricator are here: > > https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator. > > > > Some quotes from #ghc to pique your curiosity (there are some 50 more): > > * "is arc broken today?" > > * "arc is a frickin' mystery." > > * "i have a theory that i've managed to create a revision that phab > > can't handle." > > * "Diffs just seem to be too expensive to create ... I can't blame > > contributors for not wanting to do this for every atomic change" > > * "but seriously, we can't require this for contributing to GHC... 
the > > entry barrier is already high enough" > > > > GitHub has side-by-side diffs > > nowadays, and > > Travis-CI can run `./validate --fast` comfortably > > . > > > > *Proposal: accept pull requests from contributors on > > https://github.com/ghc/ghc.* > > > > Details: > > * use Travis-CI to validate pull requests. > > * keep using the Trac issue tracker (contributors are encouraged to put > > a link to their pull-request in the 'Differential Revisions' field). > > * keep using the Trac wiki. > > * in discussions on GitHub, use https://ghc.haskell.org/ticket/1234 to > > refer to Trac ticket 1234. The shortcut #1234 only works on Trac itself. > > * keep pushing to git.haskell.org , where the > > existing Git receive hooks can do their job keeping tabs, trailing > > whitespace and dangling submodule references out, notify Trac and send > > emails. Committers close pull-requests manually, just like they do Trac > > tickets. > > * keep running Phabricator for as long as necessary. > > * mention that pull requests are accepted on > > https://ghc.haskell.org/trac/ghc/wiki/WorkingConventions/FixingBugs. > > > > My expectation is that the majority of patches will start coming in via > > pull requests, the number of contributions will go up, commits will be > > smaller, and there will be more of them per pull request (contributors > > will be able to put style changes and refactorings into separate > > commits, without jumping through a bunch of hoops). > > > > Reviewers will get many more emails. Other arguments against GitHub are > > here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub. > > > > I probably missed a few things, so fire away. 
> > > > Thanks, > > Thomas > > > > > > > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eir at cis.upenn.edu Wed Sep 2 15:44:15 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Wed, 2 Sep 2015 08:44:15 -0700 Subject: more releases In-Reply-To: <87oahlksnm.fsf@smart-cactus.org> References: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> <87si6y1v30.fsf@gmail.com> <87oahlksnm.fsf@smart-cactus.org> Message-ID: I think some of my idea was misunderstood here: my goal was to have quick releases only from the stable branch. The goal would not be to release the new and shiny, but instead to get bugfixes out to users quicker. The new and shiny (master) would remain as it is now. In other words: more users would be affected by this change than just the vanguard. Richard On Sep 2, 2015, at 3:43 AM, Ben Gamari wrote: > Richard Eisenberg writes: > >> On Sep 1, 2015, at 12:01 AM, Herbert Valerio Riedel wrote: >> >>> I'd say mostly organisational overhead which can't be fully automated >>> (afaik, Ben has already automated large parts but not everything can be): >>> >>> - Coordinating with people creating and testing the bindists >> >> This was the sort of thing I thought could be automated. I'm picturing >> a system where Austin/Ben hits a button and everything whirs to life, >> creating, testing, and posting bindists, with no people involved. >> > I can nearly do this for Linux with my existing tools. I can do 32- and > 64-bit builds for both RedHat and Debian all on a single > Debian 8 machine with the tools I developed during the course of the > 7.10.2 release [1]. 
> > Windows is unfortunately still a challenge. I did the 7.10.2 builds on > an EC2 instance and the experience wasn't terribly fun. I would love for > this to be further automated but I've not done this yet. > >>> - Writing releases notes & announcment >> >> Release notes should, theoretically, be updated with the patches. >> Announcement can be automated. >> > If I'm doing my job well the release notes shouldn't be a problem. I've > been trying to be meticulous about ensuring that all new features come > with acceptable release notes. > >>> - If bundled core-libraries are affected, coordination overhead with package >>> maintainers (unless GHC HQ owned), verifying version bumps (API diff!) and >>> changelogs have been updated accordingly, uploading to Hackage >> >> Any library version change would require a more proper release. Do >> these libraries tend to change during a major release cycle? >> > The core libraries are perhaps the trickiest part of this. Currently the > process goes something like this, > > 1. We branch off a stable GHC release > 2. Development continues on `master`, eventually a breaking change is > merged to one of the libraries > 3. Eventually someone notices and bumps the library's version > 4. More breaking changes are merged to the library > 5. We branch off for another stable release, right before the release > we manually push the libraries to Hackage > 6. Repeat from (2) > > There can potentially be a lot of interface churn between steps 3 and 5. > If we did releases in this period we would need to be much more careful > about library versioning. I suspect this may end up being quite a bit of > work to do properly. > > Technically we could punt on this problem and just do the same sort of > stable/unstable versioning for the libraries that we already do with GHC > itself. This would mean, however, that we couldn't upload the libraries > to Hackage. 
> >>> - Uploading and signing packages to download.haskell.org, and verifying >> the downloads >> >> This isn't automated? >> > It is now (see [2]). This shouldn't be a problem. > >>> Austin & Ben probably have more to add to this list >>> >> I'm sure they do. >> >> Again, I'd be fine if the answer from the community is "it's just not >> what we need". But I wanted to see if there were >> technical/practical/social reasons why this was or wasn't a good idea. >> If we do think it's a good idea absent those reasons, then we can work >> on addressing those concerns. >> > Technically I think there are no reasons why this isn't feasible with > some investment. Exactly how much investment depends upon what > exactly we want to achieve, > > * How often do we make these releases? > * Which platforms do we support? > * How carefully do we version included libraries? > > If we focus solely on Linux and punt on the library versioning issue I > would say this wouldn't even be difficult. I could easily set up my build > machine to do a nightly bindist and push it to a server somewhere. > Austin has also mentioned that Harbormaster builds could potentially > produce bindists. > > The question is whether users want more rapid releases. Those working on > GHC will use their own builds. Most users want something reasonably > stable (in both the interface sense and the reliability sense) and > therefore I suspect would stick with the releases. This leaves a > relatively small number of potential users; namely those who want to > play around with unreleased features yet aren't willing to do their own > builds. 
> > Cheers, > > - Ben > > > [1] https://github.com/bgamari/ghc-utils > [2] https://github.com/bgamari/ghc-utils/blob/master/rel-eng/upload.sh From ben at well-typed.com Wed Sep 2 16:04:33 2015 From: ben at well-typed.com (Ben Gamari) Date: Wed, 02 Sep 2015 18:04:33 +0200 Subject: more releases In-Reply-To: References: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> <87si6y1v30.fsf@gmail.com> <87oahlksnm.fsf@smart-cactus.org> Message-ID: <87si6wkdta.fsf@smart-cactus.org> Richard Eisenberg writes: > I think some of my idea was misunderstood here: my goal was to have > quick releases only from the stable branch. The goal would not be to > release the new and shiny, but instead to get bugfixes out to users > quicker. The new and shiny (master) would remain as it is now. In > other words: more users would be affected by this change than just the > vanguard. > I see. This is something we could certainly do. It would require, however, that we be more pro-active about continuing to merge things to the stable branch after the release. Currently the stable branch is essentially in the same state that it was in for the 7.10.2 release. I've left it this way as it takes time and care to cherry-pick patches to stable. Thus far my policy has been to perform this work lazily until it's clear that we will do another stable release as otherwise the effort may well be wasted. So, even if the steps of building, testing, and uploading the release are streamlined, more frequent releases are still far from free. Whether it's a worthwhile cost I don't know. This is a difficult question to answer without knowing more about how typical users actually acquire GHC. For instance, this effort would have minimal impact on users who get their compiler through their distribution's package manager. On the other hand, if most users download GHC bindists directly from the GHC download page, then perhaps this would be effort well-spent. 
Cheers, - Ben -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 472 bytes Desc: not available URL: From michael at diglumi.com Wed Sep 2 16:26:51 2015 From: michael at diglumi.com (Michael Smith) Date: Wed, 02 Sep 2015 16:26:51 +0000 Subject: Shared data type for extension flags In-Reply-To: References: Message-ID: The package description for that is "The GHC compiler's view of the GHC package database format", and this doesn't really have to do with the package database format. Would it be okay to put this in there anyway? On Wed, Sep 2, 2015, 07:33 Simon Peyton Jones wrote: > we already have such a shared library, I think: bin-package-db. would > that do? > > > > Simon > > > > *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Michael > Smith > *Sent:* 02 September 2015 09:21 > *To:* Matthew Pickering > *Cc:* GHC developers > *Subject:* Re: Shared data type for extension flags > > > > That sounds like a good approach. Are there other things that would go > nicely > in a shared package like this, in addition to the extension data type? > > > > On Wed, Sep 2, 2015 at 1:00 AM, Matthew Pickering < > matthewtpickering at gmail.com> wrote: > > Surely the easiest way here (including for other tooling - ie > haskell-src-exts) is to create a package which just provides this > enumeration. GHC, cabal, th, haskell-src-exts and so on then all > depend on this package rather than creating their own enumeration. > > > On Wed, Sep 2, 2015 at 9:47 AM, Michael Smith wrote: > > #10820 on Trac [1] and D1200 on Phabricator [2] discuss adding the > > capababilty > > to Template Haskell to detect which language extensions enabled. > > Unfortunately, > > since template-haskell can't depend on ghc (as ghc depends on > > template-haskell), > > it can't simply re-export the ExtensionFlag type from DynFlags to the > user. 
> > > > There is a second data type encoding the list of possible language > > extensions in > > the Cabal package, in Language.Haskell.Extension [3]. But > template-haskell > > doesn't already depend on Cabal, and doing so seems like it would cause > > difficulties, as the two packages can be upgraded separately. > > > > So adding this new feature to Template Haskell requires introducing a > > *third* > > data type for language extensions. It also requires enumerating this full > > list > > in two more places, to convert back and forth between the TH Extension > data > > type > > and GHC's internal ExtensionFlag data type. > > > > Is there another way here? Can there be one single shared data type for > this > > somehow? > > > > [1] https://ghc.haskell.org/trac/ghc/ticket/10820 > > [2] https://phabricator.haskell.org/D1200 > > [3] > > > https://hackage.haskell.org/package/Cabal-1.22.4.0/docs/Language-Haskell-Extension.html > > > > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.zimm at gmail.com Wed Sep 2 18:39:33 2015 From: alan.zimm at gmail.com (Alan & Kim Zimmerman) Date: Wed, 2 Sep 2015 20:39:33 +0200 Subject: Shared data type for extension flags In-Reply-To: References: Message-ID: Would this be a feasible approach for harmonising the AST between GHC and TH too? Alan On 2 Sep 2015 09:27, "Michael Smith" wrote: > The package description for that is "The GHC compiler's view of the GHC > package database format", and this doesn't really have to do with the > package database format. Would it be okay to put this in there anyway? > > On Wed, Sep 2, 2015, 07:33 Simon Peyton Jones > wrote: > >> we already have such a shared library, I think: bin-package-db. would >> that do? 
>> >> >> >> Simon >> >> >> >> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Michael >> Smith >> *Sent:* 02 September 2015 09:21 >> *To:* Matthew Pickering >> *Cc:* GHC developers >> *Subject:* Re: Shared data type for extension flags >> >> >> >> That sounds like a good approach. Are there other things that would go >> nicely >> in a shared package like this, in addition to the extension data type? >> >> >> >> On Wed, Sep 2, 2015 at 1:00 AM, Matthew Pickering < >> matthewtpickering at gmail.com> wrote: >> >> Surely the easiest way here (including for other tooling - ie >> haskell-src-exts) is to create a package which just provides this >> enumeration. GHC, cabal, th, haskell-src-exts and so on then all >> depend on this package rather than creating their own enumeration. >> >> >> On Wed, Sep 2, 2015 at 9:47 AM, Michael Smith >> wrote: >> > #10820 on Trac [1] and D1200 on Phabricator [2] discuss adding the >> > capababilty >> > to Template Haskell to detect which language extensions enabled. >> > Unfortunately, >> > since template-haskell can't depend on ghc (as ghc depends on >> > template-haskell), >> > it can't simply re-export the ExtensionFlag type from DynFlags to the >> user. >> > >> > There is a second data type encoding the list of possible language >> > extensions in >> > the Cabal package, in Language.Haskell.Extension [3]. But >> template-haskell >> > doesn't already depend on Cabal, and doing so seems like it would cause >> > difficulties, as the two packages can be upgraded separately. >> > >> > So adding this new feature to Template Haskell requires introducing a >> > *third* >> > data type for language extensions. It also requires enumerating this >> full >> > list >> > in two more places, to convert back and forth between the TH Extension >> data >> > type >> > and GHC's internal ExtensionFlag data type. >> > >> > Is there another way here? Can there be one single shared data type for >> this >> > somehow? 
>> > >> > [1] https://ghc.haskell.org/trac/ghc/ticket/10820 >> > [2] https://phabricator.haskell.org/D1200 >> > [3] >> > >> https://hackage.haskell.org/package/Cabal-1.22.4.0/docs/Language-Haskell-Extension.html >> > >> >> > _______________________________________________ >> > ghc-devs mailing list >> > ghc-devs at haskell.org >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> > >> >> >> > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marlowsd at gmail.com Wed Sep 2 18:51:38 2015 From: marlowsd at gmail.com (Simon Marlow) Date: Wed, 2 Sep 2015 11:51:38 -0700 Subject: Proposal: accept pull requests on GitHub In-Reply-To: References: Message-ID: <55E7453A.90309@gmail.com> On 01/09/2015 11:34, Thomas Miedema wrote: > Hello all, > > my arguments against Phabricator are here: > https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator. Thanks for taking the time to summarize all the issues. Personally, I think github's support for code reviews is too weak to recommend it over Phabricator. The multiple-email problem is a killer all by itself. We can improve the workflow for Phabricator to address some of the issues you raise that are fixable, such as fixing the base revision to use, and ignoring untracked files (these are local settings, I believe). Stacks of commits are hard for reviewers to follow, so making them easier might have a detrimental effect on our processes. It might feel better for the author, but discovering what changed between two branches of multiple commits on github is almost impossible. Instead the recommended workflow seems to be to add more commits, which makes the history harder to read later. I have only had to update my arc once. Is that a big problem? 
Cheers Simon > Some quotes from #ghc to pique your curiosity (there are some 50 more): > * "is arc broken today?" > * "arc is a frickin' mystery." > * "i have a theory that i've managed to create a revision that phab > can't handle." > * "Diffs just seem to be too expensive to create ... I can't blame > contributors for not wanting to do this for every atomic change" > * "but seriously, we can't require this for contributing to GHC... the > entry barrier is already high enough" > > GitHub has side-by-side diffs > nowadays, and > Travis-CI can run `./validate --fast` comfortably > . > > *Proposal: accept pull requests from contributors on > https://github.com/ghc/ghc.* > > Details: > * use Travis-CI to validate pull requests. > * keep using the Trac issue tracker (contributors are encouraged to > put a link to their pull-request in the 'Differential Revisions' field). > * keep using the Trac wiki. > * in discussions on GitHub, use https://ghc.haskell.org/ticket/1234 to > refer to Trac ticket 1234. The shortcut #1234 only works on Trac itself. > * keep pushing to git.haskell.org , where the > existing Git receive hooks can do their job keeping tabs, trailing > whitespace and dangling submodule references out, notify Trac and send > emails. Committers close pull-requests manually, just like they do Trac > tickets. > * keep running Phabricator for as long as necessary. > * mention that pull requests are accepted on > https://ghc.haskell.org/trac/ghc/wiki/WorkingConventions/FixingBugs. > > My expectation is that the majority of patches will start coming in via > pull requests, the number of contributions will go up, commits will be > smaller, and there will be more of them per pull request (contributors > will be able to put style changes and refactorings into separate > commits, without jumping through a bunch of hoops). > > Reviewers will get many more emails. Other arguments against GitHub are > here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub. 
> > I probably missed a few things, so fire away. > > Thanks, > Thomas > > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From tuncer.ayaz at gmail.com Wed Sep 2 19:21:00 2015 From: tuncer.ayaz at gmail.com (Tuncer Ayaz) Date: Wed, 2 Sep 2015 21:21:00 +0200 Subject: Proposal: accept pull requests on GitHub In-Reply-To: <55E7453A.90309@gmail.com> References: <55E7453A.90309@gmail.com> Message-ID: On Wed, Sep 2, 2015 at 8:51 PM, Simon Marlow wrote: > Stacks of commits are hard to reviewers to follow, so making them > easier might have a detrimental effect on our processes. It might > feel better for the author, but discovering what changed between two > branches of multiple commits on github is almost impossible. Instead > the recommended workflow seems to be to add more commits, which > makes the history harder to read later. I've reviewed+merged various big diffs in the form of branches published as pull requests (on and off GitHub), and being able to see each change separately with its own commit message was way easier than one big diff with a summarized message. If Phabricator would use merge commits, reading multi-commit history, especially what commits got merged together (aka what branch was integrated), is easy. Also, bisecting is more precise without collapsed diffs. Therefore, I wouldn't say the single-commit collapsed view is the right choice for all diffs. From michael at diglumi.com Wed Sep 2 19:33:24 2015 From: michael at diglumi.com (Michael Smith) Date: Wed, 02 Sep 2015 19:33:24 +0000 Subject: Shared data type for extension flags In-Reply-To: References: Message-ID: I don't know about the entire AST. GHC's AST contains a lot of complexity that one wouldn't want to expose at the TH level. And the separation allows GHC to change the internal AST around while maintaining a stable interface for packages depending on TH. 
That said, there are some bits that I could see being shared. Fixity and Strict from TH come to mind. On Wed, Sep 2, 2015, 11:39 Alan & Kim Zimmerman wrote: > Would this be a feasible approach for harmonising the AST between GHC and > TH too? > > Alan > On 2 Sep 2015 09:27, "Michael Smith" wrote: > >> The package description for that is "The GHC compiler's view of the GHC >> package database format", and this doesn't really have to do with the >> package database format. Would it be okay to put this in there anyway? >> >> On Wed, Sep 2, 2015, 07:33 Simon Peyton Jones >> wrote: >> >>> we already have such a shared library, I think: bin-package-db. would >>> that do? >>> >>> >>> >>> Simon >>> >>> >>> >>> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Michael >>> Smith >>> *Sent:* 02 September 2015 09:21 >>> *To:* Matthew Pickering >>> *Cc:* GHC developers >>> *Subject:* Re: Shared data type for extension flags >>> >>> >>> >>> That sounds like a good approach. Are there other things that would go >>> nicely >>> in a shared package like this, in addition to the extension data type? >>> >>> >>> >>> On Wed, Sep 2, 2015 at 1:00 AM, Matthew Pickering < >>> matthewtpickering at gmail.com> wrote: >>> >>> Surely the easiest way here (including for other tooling - ie >>> haskell-src-exts) is to create a package which just provides this >>> enumeration. GHC, cabal, th, haskell-src-exts and so on then all >>> depend on this package rather than creating their own enumeration. >>> >>> >>> On Wed, Sep 2, 2015 at 9:47 AM, Michael Smith >>> wrote: >>> > #10820 on Trac [1] and D1200 on Phabricator [2] discuss adding the >>> > capababilty >>> > to Template Haskell to detect which language extensions enabled. >>> > Unfortunately, >>> > since template-haskell can't depend on ghc (as ghc depends on >>> > template-haskell), >>> > it can't simply re-export the ExtensionFlag type from DynFlags to the >>> user. 
>>> > >>> > There is a second data type encoding the list of possible language >>> > extensions in >>> > the Cabal package, in Language.Haskell.Extension [3]. But >>> template-haskell >>> > doesn't already depend on Cabal, and doing so seems like it would cause >>> > difficulties, as the two packages can be upgraded separately. >>> > >>> > So adding this new feature to Template Haskell requires introducing a >>> > *third* >>> > data type for language extensions. It also requires enumerating this >>> full >>> > list >>> > in two more places, to convert back and forth between the TH Extension >>> data >>> > type >>> > and GHC's internal ExtensionFlag data type. >>> > >>> > Is there another way here? Can there be one single shared data type >>> for this >>> > somehow? >>> > >>> > [1] https://ghc.haskell.org/trac/ghc/ticket/10820 >>> > [2] https://phabricator.haskell.org/D1200 >>> > [3] >>> > >>> https://hackage.haskell.org/package/Cabal-1.22.4.0/docs/Language-Haskell-Extension.html >>> > >>> >>> > _______________________________________________ >>> > ghc-devs mailing list >>> > ghc-devs at haskell.org >>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> > >>> >>> >>> >> >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From _deepfire at feelingofgreen.ru Wed Sep 2 20:42:30 2015 From: _deepfire at feelingofgreen.ru (Kosyrev Serge) Date: Wed, 02 Sep 2015 23:42:30 +0300 Subject: Proposal: accept pull requests on GitHub In-Reply-To: <55E7453A.90309@gmail.com> (sfid-20150902_231247_674400_122691D5) (Simon Marlow's message of "Wed, 2 Sep 2015 11:51:38 -0700") References: <55E7453A.90309@gmail.com> Message-ID: <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> Simon Marlow writes: > On 01/09/2015 11:34, Thomas Miedema wrote: >> Hello all, >> >> my arguments against Phabricator are here: >> https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator. > > Thanks for taking the time to summarize all the issues. > > Personally, I think github's support for code reviews is too weak to recommend it > over Phabricator. The multiple-email problem is a killer all by itself. As a wild idea -- did anyone look at /Gitlab/ instead? I didn't look into its review functionality to any meaningful degree, but: - it largely tries to replicate the Github workflow - Gitlab CE is open source - it evolves fairly quickly -- с уважением / respectfully, Косырев Сергей -- "And those who were seen dancing were thought to be insane by those who could not hear the music." -- Friedrich Wilhelm Nietzsche From thomasmiedema at gmail.com Wed Sep 2 21:00:00 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Wed, 2 Sep 2015 23:00:00 +0200 Subject: Testsuite and validate changes Message-ID: All, I made the following changes today: * `make accept` now runs all tests for a single way (instead of all ways) * `make test` now runs all tests for a single way (instead of all ways) * `./validate` now runs all tests for a single way (instead of skipping some tests) * Phabricator now runs all tests for a single way (instead of skipping some tests) You can run `make slowtest` in the root directory, or `make slow` in the testsuite directory, to get the old behavior of `make test` back. 
More information: * https://ghc.haskell.org/trac/ghc/wiki/Building/RunningTests/Running#Speedsettings * https://phabricator.haskell.org/D1178 * Note [validate and testsuite speed] in the toplevel Makefile Thanks, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail at nh2.me Wed Sep 2 21:09:06 2015 From: mail at nh2.me (Niklas Hambüchen) Date: Wed, 02 Sep 2015 23:09:06 +0200 Subject: Proposal: accept pull requests on GitHub In-Reply-To: <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> References: <55E7453A.90309@gmail.com> <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> Message-ID: <55E76572.3050405@nh2.me> On 02/09/15 22:42, Kosyrev Serge wrote: > As a wild idea -- did anyone look at /Gitlab/ instead? Hi, yes. It does not currently have sufficient review functionality (cannot handle multiple revisions easily). On 02/09/15 20:51, Simon Marlow wrote: > It might feel better > for the author, but discovering what changed between two branches of > multiple commits on github is almost impossible. I disagree with the first part of this: When the UI of the review tool is good, it is easy to follow. But there's no open-source implementation of that around. I agree that it is not easy to follow on Github. From rf at rufflewind.com Wed Sep 2 22:42:06 2015 From: rf at rufflewind.com (Phil Ruffwind) Date: Wed, 2 Sep 2015 18:42:06 -0400 Subject: Foreign calls and periodic alarm signals Message-ID: TL;DR: Does 'foreign import safe' silence the periodic alarm signals? I received a report on this rather strange bug in 'directory': https://github.com/haskell/directory/issues/35#issuecomment-136890912 I've concluded based on the dtruss log that it's caused by the timer signal that the GHC runtime emits. Somewhere inside the guts of 'realpath' on Mac OS X, there is a function that does the moral equivalent of: while (statfs64(...) 
&& errno == EINTR); On a slow filesystem like SSHFS, this can cause a permanent hang from the barrage of signals. The reporter found that using 'foreign import safe' mitigates the issue. What I'm mainly curious about is: is this something that the GHC runtime guarantees -- is using 'foreign import safe' assured to turn off the periodic signals for that thread? I tried reading this article [1], which seems to be the only documentation I could find about this, and it didn't really go into much depth about them. (I also couldn't find any info about how frequently they occur, on which threads they occur, or which specific signal it uses.) I'm also concerned whether there are other foreign functions out in the wild that could suffer the same bug, but remain hidden because they normally complete before the next alarm signal. [1]: https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Signals From allbery.b at gmail.com Thu Sep 3 00:10:29 2015 From: allbery.b at gmail.com (Brandon Allbery) Date: Wed, 2 Sep 2015 20:10:29 -0400 Subject: [Haskell-cafe] Foreign calls and periodic alarm signals In-Reply-To: <20150902235620.7FFD7F3936@mail.avvanta.com> References: <20150902235620.7FFD7F3936@mail.avvanta.com> Message-ID: On Wed, Sep 2, 2015 at 7:56 PM, Donn Cave wrote: > Sure are, though I don't know of any that have been identified so > directly as yours. I mean it sounds like you know where and how it's > breaking. Usually we just know something's dying on an interrupt and > then think to try turning off the signal barrage. It's interesting > that you're getting a stall instead, due to an EINTR loop. > network is moderately infamous for (formerly?) using unsafe calls that block.... -- brandon s allbery kf8nh sine nomine associates allbery.b at gmail.com ballbery at sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jan.stolarek at p.lodz.pl Thu Sep 3 03:57:59 2015 From: jan.stolarek at p.lodz.pl (Jan Stolarek) Date: Thu, 3 Sep 2015 05:57:59 +0200 Subject: HEADS UP: interface file format change, full rebuild required Message-ID: <201509030557.59638.jan.stolarek@p.lodz.pl> I just pushed injective type families patch, which changes interface file format. Full rebuild of GHC is required after you pull. Jan From austin at well-typed.com Thu Sep 3 04:41:46 2015 From: austin at well-typed.com (Austin Seipp) Date: Wed, 2 Sep 2015 23:41:46 -0500 Subject: Proposal: accept pull requests on GitHub In-Reply-To: <55E76572.3050405@nh2.me> References: <55E7453A.90309@gmail.com> <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> <55E76572.3050405@nh2.me> Message-ID: (JFYI: I hate to announce my return with a giant novel of negative-nancy-ness about a proposal that just came up. I'm sorry about this!) TL;DR: I'm strongly -1 on this, because I think it introduces a lot of associated costs for everyone, the benefits aren't really clear, and I think it obscures the real core issue about "how do we get more contributors" and how to make that happen. Needless to say, GitHub does not magically solve both of these AFAICS. As is probably already widely known, I'm fairly against GitHub because I think at best its tools are mediocre and inappropriate for GHC - but I also don't think this proposal or the alternatives stemming from it are very good, and that it reduces visibility of the real, core complaints about what is wrong. Some of those problems may be with Phabricator, but it's hard to sort the wheat from the chaff, so to speak. For one, having two code review tools of any form is completely bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to switch, we should just switch. Having to have people decide how to contribute with two tools is as crazy as having two VCSs and just a way of asking people to get *more* confused, and have us answer more questions. 
That's something we need to avoid. For the same reason, I'm also not a fan of 'use third party thing to augment other thing to remove its deficiencies making it OK', because the problem is _it adds surface area_ and other problems in other cases. It is a solution that should be considered a last resort, because it is a logical solution that applies to everything. If we have a bot that moves GH PRs into Phab and then reviews them there, the surface area of what we have to maintain and explain has suddenly exploded: because now instead of 1 thing we have 3 things (GH, Phab, bot) and the 3 interactions between them, for a multiplier of *six* things we have to deal with. And then we use reviewable.io, because GH reviews are terrible, adding a 4th mechanism? It's Rube Goldberg-ian. We can logically 'automate' everything in all ways to make all contributors happy, but there's a real *cognitive* overhead to this and humans don't scale as well as computers do. It is not truly 'automated away' if the cognitive burden is still there. I also find it extremely strange to tell people "By the way, this method in which you've contributed, as was requested by community members, is actually a complete proxy for the real method of contributing, you can find all your imported code here". How is this supposed to make contribution *easier* as opposed to just more confusing? Now you've got the impression you're using "the real thing" when in reality it's shoved off somewhere else to have the nitpicking done. Just using Phabricator would be less complicated, IMO, and much more direct. The same thing goes for reviewable.io. Adding it as a layer over GitHub just makes the surface area larger, and puts less under our control. And is it going to exist in the same form in 2 or 3 years? Will it continue to offer the same tools, the same workflows that we "like", and what happens when we hit a wall?
It's easy to say "probably" or "sure" to all this, until we hit something we dislike and have no possibility of fixing. And once you do all this, BTW, you can 'never go back'. It seems so easy to just say 'submit pull requests' once and nothing else, right? Wrong. Once you commit to that infrastructure, it is *there* and simply taking it out from under the feet of those using it is not only unfortunate, it is *a huge timesink to undo it all*. Which amounts to it never happening. Oh, but you can import everything elsewhere! The problem is you *can't* import everything, but more importantly you can't *import my memories in another way*, so it's a huge blow to contributors to ask them about these mental time sinks, then to forget them all. And as your project grows, this becomes more of a memory as you made a first and last choice to begin with. Phabricator was 'lucky' here because it had the gateway into being the first review tool for us. But that wasn't because it was *better* than GitHub. It was because we were already using it, and it did not interact badly with our other tools or force us to compromise things - so the *cost* was low. The cost is immeasurably higher by default against GitHub because of this, at least to me. That's just how it is sometimes. Keep in mind there is a cost to everything and how you fix it. GitHub is not a simple patch to add a GHC feature. It is a question that fundamentally concerns itself with the future of the project for a long time. The costs must be analyzed more aggressively. Again, Phabricator had 'first child' preferential treatment. That's not something we can undo now. I know this sounds like a lot of ad hoc mumbo jumbo, but please bear with me: we need to identify the *root issue* here to fix it. Otherwise we will pay for the costs of an improper fix for a long time, and we are going to keep having this conversation over, and over again. And we need to weigh in the cost of fixing it, which is why I mention that so much. 
So with all this in mind, you're back to just using GitHub. But again GitHub is quite mediocre at best. So what is the point of all this? It's hinted at here: > the number of contributions will go up, commits will be smaller, and there will be more of them per pull request (contributors will be able to put style changes and refactorings into separate commits, without jumping through a bunch of hoops). The real hint is that "the number of contributions will go up". That's a noble goal and I think it's at the heart of this proposal. Here's the meat of it question: what is the cost of achieving this goal? That is, what amount of work is sufficient to make this goal realizable, and finally - why is GitHub *the best use of our time for achieving this?* That's one aspect of the cost - that it's the best use of the time. I feel like this is fundamentally why I always seem to never 'get' this argument, and I'm sure it's very frustrating on behalf of the people who have talked to me about it and like GitHub. But I feel like I've never gotten a straight answer for GHC. If the goal is actually "make more people contribute", that's pretty broad. I can make that very easy: give everyone who ever submits a patch push access. This is a legitimate way to run large projects that has worked. People will almost certainly be more willing to commit, especially when overhead on patch submission is reduced so much. Why not just do that instead? It's not like we even mandate code review, although we could. You could reasonably trust CI to catch and revert things a lot of the time for people who commit directly to master. We all do it sometimes. I'm being serious about this. I can start doing that tomorrow because the *cost is low*, both now and reasonably speaking into some foreseeable future. It is one of many solutions to the raw heart of the proposal.
GitHub is not a low cost move, but also, it is a *long term cost* because of the technical deficiencies it won't aim to address (merge commits are ugly, branch reviews are weak, ticket/PR namespace overlaps with Trac, etc etc) or that we'll have to work around. That means that if we want GitHub to fix the "give us more contributors" problem, and it has a high cost, it not only has _to fix the problem_, it also has to do that well enough to offset its cost. I don't think it's clear that is the case right now, among a lot of other solutions. I don't think the root issue is "We _need_ GitHub to get more contributors". It sounds like the complaint is more "I don't like how Phabricator works right now". That's an important distinction, because the latter is not only more specific, it's more actionable: - Things like Arcanist can be tracked as a Git submodule. There is little to no pain in this, it's low cost, and it can always be synchronized with Phabricator. This eliminates the "Must clone arcanist" and "need to upgrade arcanist" points. - Similarly when Phabricator sometimes kills a lot of builds, it's because I do an upgrade. That's mostly an error on my part and I can simply schedule upgrades regularly, barring hotfixes or somesuch. That should basically eliminate these. The other build issues are from picking the wrong base commit from the revision, I think, which I believe should be fixable upstream (I need to get a solid example of one that isn't a mega ultra patch.) - If Harbormaster is not building dependent patches as mentioned in WhyNotPhabricator, that is a bug, and I have not been aware of it. Please make me aware of it so I can file bugs! I seriously don't look at _every_ patch, I need to know this. That could have probably been fixed ASAP otherwise. - We can get rid of the awkwardness of squashes etc by using Phabricator's "immutable" history, although it introduces merge commits. 
Whether this is acceptable is up for debate (I dislike merge commits, but could live with it). - I do not understand point #3, about answering questions. Here's the reality: every single one of those cases is *almost always an error*. That's not a joke. Forgetting to commit a file, amending changes in the working tree, and specifying a reviewer are all total errors as it stands today. Why is this a minus? It catches a useful class of 'interaction bugs'. If it's because sometimes Phabricator yells about build artifacts in the tree, those should be .gitignore'd. If it's because you have to 'git stash' sometimes, this is fairly trivial IMO. Finally, specifying reviewers IS inconvenient, but currently needed. We could easily assign a '#reviewers' tag that would add default reviewers. - In the future, Phabricator will hopefully be able to automatically assign the right reviewers to every single incoming patch, based on the source file paths in the tree, using the Owners tool. Technically, we could do that today if we wanted, it's just a little more effort to add more Herald rules. This will be far, far more robust than anything GitHub can offer, and eliminates point #3. - Styling, linting etc errors being included, because reviews are hard to create: This is tangential IMO. We need to just bite the bullet on this and settle on some lint and coding styles, and apply them to the tree uniformly. The reality is *nobody ever does style changes on their own*, and they are always accompanied by a diff, and they always have to redo the work of pulling them out, Phab or not. Literally 99% of the time we ask for this, it happens this way. Perhaps instead we should just eliminate this class of work by just running linters over all of the source code at once, and being happy with it. Doing this in fact has other benefits: like `arc lint` will always _correctly_ report when linting errors are violated. And we can reject patches that violate them, because they will always be accurate.
- As for some of the quotes, some of them are funny, but the real message lies in the context. :) In particular, there have been several cases (such as the DWARF work) where the idea was "write 30 commits and put them on Phabricator". News flash: *this is bad*, no matter whether you're using Phabricator or not, because it makes reviewing the whole thing immensely difficult from a reviewer perspective. The point here is that we can clear this up by being more communicative about what we expect of authors of large patches, and communicating your intent ASAP so we can get patches in as fast as possible. Writing a patch is the easiest part of the work. And more: - Clean up the documentation, it's a mess. It feels nice that everything has clear, lucid explanations on the wiki, but the wiki is ridiculously massive and we have a tendency for 'link creep' where we spread things out. The contributors docs could probably stand to be streamlined. We would have to do this anyway, moving to GitHub or not. - Improve the homepage, directly linking to this aforementioned page. - Make it clear what we expect of contributors. I feel like a lot of this could be explained by having a 5 minute drive-by guide for patches, and then a longer 10-minute guide about A) How to style things, B) How to format your patches if you're going to contribute regularly, C) Why it is this way, and D) finally links to all the other things you need to know. People going into Phabricator expecting it to behave like GitHub is a problem (more a cultural problem IMO but that's another story), and if this can't be directly fixed, the best thing to do is make it clear why it isn't.
It does not account for "The GitHub factor" of people contributing "just because it's on GitHub", but again, that value has to outweigh the other costs. I'm not seriously convinced it does. I know it's work to fix these things. But GitHub doesn't really magically make a lot of our needs go away, and it's not going to magically fix things like style or lint errors, the fact Travis-CI is still pretty insufficient for us in the long term (and Harbormaster is faster, on our own hardware, too), or that it will cause needlessly higher amounts of spam through Trac and GitHub itself. I don't think settling on it as - what seems to be - a first resort, is a really good idea. On Wed, Sep 2, 2015 at 4:09 PM, Niklas Hamb?chen wrote: > On 02/09/15 22:42, Kosyrev Serge wrote: >> As a wild idea -- did anyone look at /Gitlab/ instead? > > Hi, yes. It does not currently have a sufficient review functionality > (cannot handle multiple revisions easily). > > On 02/09/15 20:51, Simon Marlow wrote: >> It might feel better >> for the author, but discovering what changed between two branches of >> multiple commits on github is almost impossible. > > I disagree with the first part of this: When the UI of the review tool > is good, it is easy to follow. But there's no open-source implementation > of that around. > > I agree that it is not easy to follow on Github. > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -- Regards, Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/ From austin at well-typed.com Thu Sep 3 04:46:10 2015 From: austin at well-typed.com (Austin Seipp) Date: Wed, 2 Sep 2015 23:46:10 -0500 Subject: HEADS UP: interface file format change, full rebuild required In-Reply-To: <201509030557.59638.jan.stolarek@p.lodz.pl> References: <201509030557.59638.jan.stolarek@p.lodz.pl> Message-ID: A long time coming. Congratulations! 
On Wed, Sep 2, 2015 at 10:57 PM, Jan Stolarek wrote: > I just pushed injective type families patch, which changes interface file format. Full rebuild of > GHC is required after you pull. > > Jan > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -- Regards, Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/ From michael at diglumi.com Thu Sep 3 05:03:31 2015 From: michael at diglumi.com (Michael Smith) Date: Wed, 2 Sep 2015 22:03:31 -0700 Subject: Proposal: accept pull requests on GitHub In-Reply-To: References: <55E7453A.90309@gmail.com> <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> <55E76572.3050405@nh2.me> Message-ID: On Wed, Sep 2, 2015 at 9:41 PM, Austin Seipp wrote: > - Make it clear what we expect of contributors. I feel like a lot of > this could be explained by having a 5 minute drive-by guide for > patches, and then a longer 10-minute guide about A) How to style > things, B) How to format your patches if you're going to contribute > regularly, C) Why it is this way, and D) finally links to all the > other things you need to know. People going into Phabricator expecting > it to behave like GitHub is a problem (more a cultural problem IMO but > that's another story), and if this can't be directly fixed, the best > thing to do is make it clear why it isn't. > This is tangential to the issue of the code review system, and I don't want to derail the discussion here, but if you're talking about a drive-by guide for patches, I'd add E) straightforward instructions on how to get GHC building *fast* for development. A potential contributor won't even reach the patch submission stage if they can't get the build system set up properly, and the current documentation here is spread out and somewhat intimidating for a newcomer. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eir at cis.upenn.edu Thu Sep 3 06:27:16 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Wed, 2 Sep 2015 23:27:16 -0700 Subject: addTopDecls restrictions Message-ID: <77F076EA-745A-4340-8F3F-92030A93E81A@cis.upenn.edu> Hi Geoff, The TH addTopDecls function is restricted to only a few kinds of declarations (functions, mostly). This set has been expanded in #10486 (https://ghc.haskell.org/trac/ghc/ticket/10486). Do you remember why the set of allowed declarations is restricted? It looks to me like any declaration would be OK. Thanks! Richard From joehillen at gmail.com Thu Sep 3 07:18:03 2015 From: joehillen at gmail.com (Joe Hillenbrand) Date: Thu, 3 Sep 2015 00:18:03 -0700 Subject: Proposal: accept pull requests on GitHub In-Reply-To: <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> References: <55E7453A.90309@gmail.com> <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> Message-ID: > As a wild idea -- did anyone look at /Gitlab/ instead? My personal experience with Gitlab at a previous job is that it is extremely unstable. I'd say even more unstable than trac and phabricator. It's especially bad when dealing with long files. From michael at diglumi.com Thu Sep 3 07:22:11 2015 From: michael at diglumi.com (Michael Smith) Date: Thu, 3 Sep 2015 00:22:11 -0700 Subject: A process for reporting security-sensitive issues Message-ID: I feel there should be some process for reporting security-sensitive issues in GHC -- for example, #9562 and #10826 in Trac. Perhaps something like the SensitiveTicketsPlugin [3] could be used? [1] https://ghc.haskell.org/trac/ghc/ticket/9562 [2] https://ghc.haskell.org/trac/ghc/ticket/10826 [3] https://trac-hacks.org/wiki/SensitiveTicketsPlugin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thomasmiedema at gmail.com Thu Sep 3 09:53:40 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Thu, 3 Sep 2015 11:53:40 +0200 Subject: Proposal: accept pull requests on GitHub Message-ID: > > The real hint is that "the number of contributions will go up". That's > a noble goal and I think it's at the heart of this proposal. > It's not. What's at the heart of my proposal is that `arc` sucks. Most of those quotes I posted are from regular contributors (here's another one: "arcanist kinda makes stuff even more confusing than Git by itself"). Newcomers will give it their best shot, thinking it's just another thing they need to learn, thinking it's their fault for running into problems, thinking they'll get the hang of it eventually. Except they won't, or at least I haven't, after using it for over a year. Maybe the fundamental problem with Phabricator is that it doesn't understand Git well, and the problems I posted on https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator are just symptoms of it. I'm having trouble putting this into words though (something about branches and submodules). Perhaps someone else can? In my opinion it is a waste of our time trying to improve `arc` (it is 34000 lines of PHP btw + another 70000 LOC for libphutil), when `pull requests` are an obvious alternative that most of the Haskell community already uses. When you're going to require contributors to use a non-standard tool to get patches to your code review system, it better just work. `arc` is clearly failing us here, and I'm saying enough is enough. I need to think about your other points. Thank you for the thorough reply. > Here's the meat of it question: what is the cost of achieving this > goal? That is, what amount of work is sufficient to make this goal > realizable, and finally - why is GitHub *the best use of our time for > achieving this?* That's one aspect of the cost - that it's the best > use of the time.
> [snip]
> > On Wed, Sep 2, 2015 at 4:09 PM, Niklas Hambüchen wrote: > > On 02/09/15 22:42, Kosyrev Serge wrote: > >> As a wild idea -- did anyone look at /Gitlab/ instead? > > > > Hi, yes. It does not currently have a sufficient review functionality > > (cannot handle multiple revisions easily). > > > > On 02/09/15 20:51, Simon Marlow wrote: > >> It might feel better > >> for the author, but discovering what changed between two branches of > >> multiple commits on github is almost impossible. > > > > I disagree with the first part of this: When the UI of the review tool > > is good, it is easy to follow. But there's no open-source implementation > > of that around. > > > > I agree that it is not easy to follow on Github. > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > > > > -- > Regards, > > Austin Seipp, Haskell Consultant > Well-Typed LLP, http://www.well-typed.com/ > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tab at snarc.org Thu Sep 3 10:43:55 2015 From: tab at snarc.org (Vincent Hanquez) Date: Thu, 3 Sep 2015 11:43:55 +0100 Subject: Proposal: accept pull requests on GitHub In-Reply-To: References: Message-ID: <55E8246B.8040108@snarc.org> On 03/09/2015 10:53, Thomas Miedema wrote: > > The real hint is that "the number of contributions will go up". That's > a noble goal and I think it's at the heart of this proposal. > > > When you're going to require contributors to use a non-standard tool > to get patches to your code review system, it better just work. `arc` > is clearly failing us here, and I'm saying enough is enough.
Not only this, but there's (probably) lots of small/janitorial contributions that do not need the full power of phabricator or any sophisticated code review. Not accepting github PRs and forcing everyone to go through an uncommon tool (however formidable), is quite likely to turn those contributions away IMHO. -- Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomasmiedema at gmail.com Thu Sep 3 11:48:31 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Thu, 3 Sep 2015 13:48:31 +0200 Subject: Proposal: accept pull requests on GitHub In-Reply-To: <55E8246B.8040108@snarc.org> References: <55E8246B.8040108@snarc.org> Message-ID: On Thu, Sep 3, 2015 at 12:43 PM, Vincent Hanquez wrote: > there's (probably) lots of small/janitorial contributions that do not need > the full power of phabricator or any sophisticated code review. > Austin's point, and I agree, is that we shouldn't optimize the system for those contributions. Cleanup, documentation and other small patches are very much welcomed, and they usually get merged within a few days. To make a truly better GHC though, we very much depend on expert contributors, say to implement and review Backpack or DWARF-based backtraces. My point is that `arc` is hurting these expert contributors as much, if not more than everyone else. To get more expert contributors you need more newcomers, but don't optimize the system only for the newcomers. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tuncer.ayaz at gmail.com Thu Sep 3 12:29:00 2015 From: tuncer.ayaz at gmail.com (Tuncer Ayaz) Date: Thu, 3 Sep 2015 14:29:00 +0200 Subject: Proposal: accept pull requests on GitHub In-Reply-To: References: <55E7453A.90309@gmail.com> <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> <55E76572.3050405@nh2.me> Message-ID: On Thu, Sep 3, 2015 at 6:41 AM, Austin Seipp wrote: > (JFYI: I hate to announce my return with a giant novel of > negative-nancy-ness about a proposal that just came up. I'm sorry > about this!) > > TL;DR: I'm strongly -1 on this, because I think it introduces a lot > of associated costs for everyone, the benefits aren't really clear, > and I think it obscures the real core issue about "how do we get > more contributors" and how to make that happen. Needless to say, > GitHub does not magically solve both of these AFAICS. Let me start off by saying I'm not arguing for GitHub or anything else to replace Phabricator. I'm merely trying to understand the problems with merge commits and patch sets. > - We can get rid of the awkwardness of squashes etc by using > Phabricator's "immutable" history, although it introduces merge > commits. Whether this is acceptable is up to debate (I dislike merge > commits, but could live with it). I'm genuinely curious about the need to avoid merge commits. I do avoid merge-master-to-topic-branch commits in submitted diffs, but unless you always only merge a single cumulative commit for each diff, merge commits are very useful for vcs history. > - As for some of the quotes, some of them are funny, but the real > message lies in the context. :) In particular, there have been > several cases (such as the DWARF work) where the idea was "write 30 > commits and put them on Phabricator". News flash: *this is bad*, no > matter whether you're using Phabricator or not, because it makes > reviewing the whole thing immensely difficult from a reviewer > perspective. 
The point here is that we can clear this up by being > more communicative about what we expect of authors of large patches, > and communicating your intent ASAP so we can get patches in as fast > as possible. Writing a patch is the easiest part of the work. I would also like to understand why reviewing a single commit is easier than the steps (commits) that led to the whole diff. Maybe I review stuff differently, but, as I wrote yesterday, I've always found it easier to follow the changes when it's split into proper commits. And instead of "big patch" I should have written "non-trivial patch". A 100-line unified diff can be equally hard to follow as a 1000-line diff, unless each diff hunk is accompanied with code comments. But comments don't always make sense in the code, and often enough it's best to keep it in the commit message only. Hence the need for splitting the work, and ideally committing as you work on it, with a final cleanup of rearranging commits into a proper set of commits. I'm repeating myself, but git-bisect is much more precise with relevant changes split up as they happened. From rpglover64 at gmail.com Thu Sep 3 13:59:55 2015 From: rpglover64 at gmail.com (Alex Rozenshteyn) Date: Thu, 3 Sep 2015 09:59:55 -0400 Subject: more releases In-Reply-To: <87si6wkdta.fsf@smart-cactus.org> References: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> <87si6y1v30.fsf@gmail.com> <87oahlksnm.fsf@smart-cactus.org> <87si6wkdta.fsf@smart-cactus.org> Message-ID: I have the impression (no data to back it up, though) that no small number of users download bindists (because most OS packages are out of date: Debian Unstable is still on 7.8.4, as is Ubuntu Wily; Arch is on 7.10.1). On Wed, Sep 2, 2015 at 12:04 PM, Ben Gamari wrote: > Richard Eisenberg writes: > > > I think some of my idea was misunderstood here: my goal was to have > > quick releases only from the stable branch. 
The goal would not be to > > release the new and shiny, but instead to get bugfixes out to users > > quicker. The new and shiny (master) would remain as it is now. In > > other words: more users would be affected by this change than just the > > vanguard. > > > I see. This is something we could certainly do. > > It would require, however, that we be more pro-active about > continuing to merge things to the stable branch after the release. > Currently the stable branch is essentially in the same state that it was > in for the 7.10.2 release. I've left it this way as it takes time and > care to cherry-pick patches to stable. Thus far my policy has been to > perform this work lazily until it's clear that we will do > another stable release as otherwise the effort may well be wasted. > > So, even if the steps of building, testing, and uploading the release > are streamlined, more frequent releases are still far from free. Whether > it's a worthwhile cost I don't know. > > This is a difficult question to answer without knowing more about how > typical users actually acquire GHC. For instance, this effort would > have minimal impact on users who get their compiler through their > distribution's package manager. On the other hand, if most users > download GHC bindists directly from the GHC download page, then perhaps > this would be effort well-spent. > > Cheers, > > - Ben > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From simonpj at microsoft.com Thu Sep 3 16:08:23 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Thu, 3 Sep 2015 16:08:23 +0000 Subject: D1182: Implement improved error messages for ambiguous type variables (#10733) In-Reply-To: <20150903061043.11268.51958@phabricator.haskell.org> References: <20150903061043.11268.51958@phabricator.haskell.org> Message-ID: <6bac15f299b2494187fdc47167cae02d@DB4PR30MB030.064d.mgd.msft.net> Edward | Jan's injective type families commit is causing tcfail220 to fail, but | that's unrelated to this ticket. This is true. I told Jan to commit anyway because tcfail220 is a "hsig" test, and a) I know that hsigs are in flux (although I am not clear about how) b) I don't understand them enough to fix. So I hope it's ok to have broken this. Jan and I can certainly help when you want to fix it. Meanwhile would you mark it as expect-broken. (Although I am not sure that it's worth opening a fresh ticket for it.) thanks Simon | -----Original Message----- | From: noreply at phabricator.haskell.org | [mailto:noreply at phabricator.haskell.org] | Sent: 03 September 2015 07:11 | To: Simon Peyton Jones | Subject: [Differential] [Commented On] D1182: Implement improved error | messages for ambiguous type variables (#10733) | | KaneTW added a comment. | | Jan's injective type families commit is causing tcfail220 to fail, but | that's unrelated to this ticket. | | | REPOSITORY | rGHC Glasgow Haskell Compiler | | REVISION DETAIL | https://phabricator.haskell.org/D1182 | | EMAIL PREFERENCES | https://phabricator.haskell.org/settings/panel/emailpreferences/ | | To: KaneTW, simonpj, bgamari, austin | Cc: goldfire, simonpj, thomie From ezyang at mit.edu Thu Sep 3 16:13:29 2015 From: ezyang at mit.edu (Edward Z. 
Yang) Date: Thu, 03 Sep 2015 09:13:29 -0700 Subject: Fwd: RE: D1182: Implement improved error messages for ambiguous type variables (#10733) Message-ID: <1441296802-sup-4146@sabre> It's certainly true that hsig is in flux, but it doesn't seem like injective type families should have broken this test. I'll take a look. Edward Excerpts from Simon Peyton Jones's message of 2015-09-03 09:08:23 -0700: > Edward > > | Jan's injective type families commit is causing tcfail220 to fail, but > | that's unrelated to this ticket. > > This is true. I told Jan to commit anyway because tcfail220 is a "hsig" test, and > a) I know that hsigs are in flux (although I am not clear about how) > b) I don't understand them enough to fix. > > So I hope it's ok to have broken this. Jan and I can certainly help when you want to fix it. > > Meanwhile would you mark it as expect-broken. (Although I am not sure that it's worth opening a fresh ticket for it.) > > thanks > > Simon > > | -----Original Message----- > | From: noreply at phabricator.haskell.org > | [mailto:noreply at phabricator.haskell.org] > | Sent: 03 September 2015 07:11 > | To: Simon Peyton Jones > | Subject: [Differential] [Commented On] D1182: Implement improved error > | messages for ambiguous type variables (#10733) > | > | KaneTW added a comment. > | > | Jan's injective type families commit is causing tcfail220 to fail, but > | that's unrelated to this ticket. 
> | > | > | REPOSITORY > | rGHC Glasgow Haskell Compiler > | > | REVISION DETAIL > | https://phabricator.haskell.org/D1182 > | > | EMAIL PREFERENCES > | https://phabricator.haskell.org/settings/panel/emailpreferences/ > | > | To: KaneTW, simonpj, bgamari, austin > | Cc: goldfire, simonpj, thomie --- End forwarded message --- From thomasmiedema at gmail.com Thu Sep 3 16:17:35 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Thu, 3 Sep 2015 18:17:35 +0200 Subject: D1182: Implement improved error messages for ambiguous type variables (#10733) In-Reply-To: <1441296802-sup-4146@sabre> References: <1441296802-sup-4146@sabre> Message-ID: The bug is triggered by Maybe now being a wired-in type. See https://phabricator.haskell.org/D1208 for a workaround. On Thu, Sep 3, 2015 at 6:13 PM, Edward Z. Yang wrote: > It's certainly true that hsig is in flux, but it doesn't seem like > injective type families should have broken this test. I'll take a look. > > Edward > > Excerpts from Simon Peyton Jones's message of 2015-09-03 09:08:23 -0700: > > Edward > > > > | Jan's injective type families commit is causing tcfail220 to fail, but > > | that's unrelated to this ticket. > > > > This is true. I told Jan to commit anyway because tcfail220 is a "hsig" > test, and > > a) I know that hsigs are in flux (although I am not clear about how) > > b) I don't understand them enough to fix. > > > > So I hope it's ok to have broken this. Jan and I can certainly help > when you want to fix it. > > > > Meanwhile would you mark it as expect-broken. (Although I am not sure > that it's worth opening a fresh ticket for it.) 
> > > > thanks > > > > Simon > > > > | -----Original Message----- > > | From: noreply at phabricator.haskell.org > > | [mailto:noreply at phabricator.haskell.org] > > | Sent: 03 September 2015 07:11 > > | To: Simon Peyton Jones > > | Subject: [Differential] [Commented On] D1182: Implement improved error > > | messages for ambiguous type variables (#10733) > > | > > | KaneTW added a comment. > > | > > | Jan's injective type families commit is causing tcfail220 to fail, but > > | that's unrelated to this ticket. > > | > > | > > | REPOSITORY > > | rGHC Glasgow Haskell Compiler > > | > > | REVISION DETAIL > > | https://phabricator.haskell.org/D1182 > > | > > | EMAIL PREFERENCES > > | https://phabricator.haskell.org/settings/panel/emailpreferences/ > > | > > | To: KaneTW, simonpj, bgamari, austin > > | Cc: goldfire, simonpj, thomie > --- End forwarded message --- > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hvriedel at gmail.com Thu Sep 3 16:41:02 2015 From: hvriedel at gmail.com (Herbert Valerio Riedel) Date: Thu, 03 Sep 2015 18:41:02 +0200 Subject: Shared data type for extension flags In-Reply-To: (Matthew Pickering's message of "Wed, 2 Sep 2015 10:00:40 +0200") References: Message-ID: <8737yvbgm9.fsf@gmail.com> On 2015-09-02 at 10:00:40 +0200, Matthew Pickering wrote: > Surely the easiest way here (including for other tooling - ie > haskell-src-exts) is to create a package which just provides this > enumeration. GHC, cabal, th, haskell-src-exts and so on then all > depend on this package rather than creating their own enumeration. 
I'm not sure this is such a good idea having a package many packages depend on if `ghc` is one of them, as this forces every install-plan which ends up involving the ghc package to be pinned to the very same version the `ghc` package was compiled against. This is a general problem affecting packages `ghc` depends upon (and as a side-note starting with GHC 7.10, we were finally able to cut the package-dependency between `ghc` and `Cabal`) Also, Cabal is not GHC specific, and contains a list of known extensions (`KnownExtension`) across multiple Haskell compilers https://github.com/haskell/cabal/blob/master/Cabal/Language/Haskell/Extension.hs and I assume the extension enumeration needed for GHC would be tailored to GHC's need and omit extensions not relevant to GHC, as well as include experimental/internal ones not suited for consumption by Cabal. From rwbarton at gmail.com Thu Sep 3 16:51:03 2015 From: rwbarton at gmail.com (Reid Barton) Date: Thu, 3 Sep 2015 12:51:03 -0400 Subject: Shared data type for extension flags In-Reply-To: <8737yvbgm9.fsf@gmail.com> References: <8737yvbgm9.fsf@gmail.com> Message-ID: On Thu, Sep 3, 2015 at 12:41 PM, Herbert Valerio Riedel wrote: > On 2015-09-02 at 10:00:40 +0200, Matthew Pickering wrote: > > Surely the easiest way here (including for other tooling - ie > > haskell-src-exts) is to create a package which just provides this > > enumeration. GHC, cabal, th, haskell-src-exts and so on then all > > depend on this package rather than creating their own enumeration. > > I'm not sure this is such a good idea having a package many packages > depend on if `ghc` is one of them, as this forces every install-plan > which ends up involving the ghc package to be pinned to the very same > version the `ghc` package was compiled against. 
> > This is a general problem affecting packages `ghc` depends upon (and as > a side-note starting with GHC 7.10, we were finally able to cut the > package-dependency between `ghc` and `Cabal`) > Surely this argument does not apply to a package created to hold data types that would otherwise live in the template-haskell or ghc packages. Regards, Reid Barton -------------- next part -------------- An HTML attachment was scrubbed... URL: From ezyang at mit.edu Thu Sep 3 17:51:43 2015 From: ezyang at mit.edu (Edward Z Yang) Date: Thu, 3 Sep 2015 17:51:43 +0000 Subject: Using GHC API to compile Haskell file In-Reply-To: References: <1440368677-sup-472@sabre>, Message-ID: Hello Neil, Sorry about the delay; I hadn't gotten around to seeing if I could reproduce it. Here is a working copy of the program which appears to work with GHC 7.10.2 on 64-bit Windows: module Main where import GHC import GHC.Paths ( libdir ) import DynFlags import SysTools main = do defaultErrorHandler defaultFatalMessager defaultFlushOut $ do runGhc (Just libdir) $ do dflags <- getSessionDynFlags setSessionDynFlags (gopt_set dflags Opt_Static) target <- guessTarget "Test.hs" Nothing setTargets [target] load LoadAllTargets Here is how I tested it: stack ghc -- -package ghc -package ghc-paths --make Main.hs (after stack installing ghc-paths) Did you mean the error occurred when you did set Opt_Static? I can't reproduce your specific error in that case either. Cheers, Edward Sent from Windows Mail From: Neil Mitchell Sent: Monday, August 24, 2015 12:42 AM To: Edward Z Yang Cc: ghc-devs at haskell.org Thanks Edward, that fixed the issue with GHC 7.8.3. 
While trying to replicate with 7.10.2 to submit a bug report, I got a different error, even with your fix included: C:\Users\NDMIT_~1\AppData\Local\Temp\ghc2428_1\ghc_4.o:ghc_3.c:(.text+0x55): undefined reference to `ZCMain_main_closure' Doing another diff of the command lines, I see ghc --make includes "Test.o" on the Link line, but the API doesn't. Thanks, Neil On Mon, Aug 24, 2015 at 12:00 AM, Edward Z. Yang wrote: > The problem is that the default code is trying to build a dynamically > linked executable, but the Windows distributions don't come with dlls > by default. > > Why doesn't the GHC API code pick this up? Based on snooping > ghc/Main.hs, it's probably because you need to call parseDynamicFlags* > which will call updateWays which will turn off -dynamic-too if the > platform doesn't support it. > > GHC bug? Absolutely! Please file a ticket. > > Edward > > Excerpts from Neil Mitchell's message of 2015-08-23 05:43:28 -0700: >> Hi, >> >> Is this the right place for GHC API queries? If not, is there anywhere better? >> >> I want to compile a Haskell module, much like `ghc --make` or `ghc -c` >> does. 
The sample code on the Haskell wiki >> (https://wiki.haskell.org/GHC/As_a_library#A_Simple_Example), >> StackOverflow (http://stackoverflow.com/a/5631338/160673) and in GHC >> API slides (http://sneezy.cs.nott.ac.uk/fplunch/weblog/wp-content/uploads/2008/12/ghc-api-slidesnotes.pdf) >> says: >> >> import GHC >> import GHC.Paths ( libdir ) >> import DynFlags >> >> main = >> defaultErrorHandler defaultFatalMessager defaultFlushOut $ do >> runGhc (Just libdir) $ do >> dflags <- getSessionDynFlags >> setSessionDynFlags dflags >> target <- guessTarget "Test.hs" Nothing >> setTargets [target] >> load LoadAllTargets >> >> However, given a `Test.hs` file with the contents `main = print 1`, I >> get the error: >> >> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >> cannot find -lHSbase-4.7.0.1-ghc7.8.3 >> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >> cannot find -lHSinteger-gmp-0.5.1.0-ghc7.8.3 >> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >> cannot find -lHSghc-prim-0.3.1.0-ghc7.8.3 >> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >> cannot find -lHSrts-ghc7.8.3 >> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >> cannot find -lffi-6 >> collect2: ld returned 1 exit status >> >> Has the recipe changed? >> >> By turning up the verbosity, I was able to compare the command line >> passed to the linker. The failing GHC API call contains: >> >> "-lHSbase-4.7.0.1-ghc7.8.3" "-lHSinteger-gmp-0.5.1.0-ghc7.8.3" >> "-lHSghc-prim-0.3.1.0-ghc7.8.3" "-lHSrts-ghc7.8.3" "-lffi-6" >> >> While the succeeding ghc --make contains: >> >> "-lHSbase-4.7.0.1" "-lHSinteger-gmp-0.5.1.0" >> "-lHSghc-prim-0.3.1.0" "-lHSrts" "-lCffi-6" >> >> Should I be getting DynFlags differently to influence those link variables? >> >> Thanks, Neil -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From marlowsd at gmail.com Thu Sep 3 19:02:58 2015 From: marlowsd at gmail.com (Simon Marlow) Date: Thu, 3 Sep 2015 12:02:58 -0700 Subject: D1182: Implement improved error messages for ambiguous type variables (#10733) In-Reply-To: <6bac15f299b2494187fdc47167cae02d@DB4PR30MB030.064d.mgd.msft.net> References: <20150903061043.11268.51958@phabricator.haskell.org> <6bac15f299b2494187fdc47167cae02d@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <55E89962.4020304@gmail.com> On 03/09/2015 09:08, Simon Peyton Jones wrote: > Edward > > | Jan's injective type families commit is causing tcfail220 to fail, but > | that's unrelated to this ticket. > > This is true. I told Jan to commit anyway because tcfail220 is a "hsig" test, and > a) I know that hsigs are in flux (although I am not clear about how) > b) I don't understand them enough to fix. > > So I hope it's ok to have broken this. Jan and I can certainly help when you want to fix it. In general we shouldn't commit anything that breaks validate, because this causes problems for other developers. The right thing to do would be to mark it expect_broken before committing. Cheers Simon > > Meanwhile would you mark it as expect-broken. (Although I am not sure that it's worth opening a fresh ticket for it.) > > thanks > > Simon > > > | -----Original Message----- > | From: noreply at phabricator.haskell.org > | [mailto:noreply at phabricator.haskell.org] > | Sent: 03 September 2015 07:11 > | To: Simon Peyton Jones > | Subject: [Differential] [Commented On] D1182: Implement improved error > | messages for ambiguous type variables (#10733) > | > | KaneTW added a comment. > | > | Jan's injective type families commit is causing tcfail220 to fail, but > | that's unrelated to this ticket. 
> | > | > | REPOSITORY > | rGHC Glasgow Haskell Compiler > | > | REVISION DETAIL > | https://phabricator.haskell.org/D1182 > | > | EMAIL PREFERENCES > | https://phabricator.haskell.org/settings/panel/emailpreferences/ > | > | To: KaneTW, simonpj, bgamari, austin > | Cc: goldfire, simonpj, thomie > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From jan.stolarek at p.lodz.pl Thu Sep 3 19:57:38 2015 From: jan.stolarek at p.lodz.pl (Jan Stolarek) Date: Thu, 3 Sep 2015 21:57:38 +0200 Subject: D1182: Implement improved error messages for ambiguous type variables (#10733) In-Reply-To: <55E89962.4020304@gmail.com> References: <6bac15f299b2494187fdc47167cae02d@DB4PR30MB030.064d.mgd.msft.net> <55E89962.4020304@gmail.com> Message-ID: <201509032157.38426.jan.stolarek@p.lodz.pl> > In general we shouldn't commit anything that breaks validate, because > this causes problems for other developers. The right thing to do would > be to mark it expect_broken before committing. Sorry for that. I was actually thinking about marking the test as expect_broken, but then the problem would be completely hidden. I wanted to discuss a possible solution with Simon and Edward first but it looks like Thomas already found a workaround. Jan From ezyang at mit.edu Thu Sep 3 21:12:31 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Thu, 03 Sep 2015 14:12:31 -0700 Subject: D1182: Implement improved error messages for ambiguous type variables (#10733) In-Reply-To: References: <1441296802-sup-4146@sabre> Message-ID: <1441314734-sup-6168@sabre> Thanks Thomas, I think this workaround is fine. Excerpts from Thomas Miedema's message of 2015-09-03 09:17:35 -0700: > The bug is triggered by Maybe now being a wired-in type. See > https://phabricator.haskell.org/D1208 for a workaround. > > On Thu, Sep 3, 2015 at 6:13 PM, Edward Z. 
Yang wrote: > > > It's certainly true that hsig is in flux, but it doesn't seem like > > injective type families should have broken this test. I'll take a look. > > > > Edward > > > > Excerpts from Simon Peyton Jones's message of 2015-09-03 09:08:23 -0700: > > > Edward > > > > > > | Jan's injective type families commit is causing tcfail220 to fail, but > > > | that's unrelated to this ticket. > > > > > > This is true. I told Jan to commit anyway because tcfail220 is a "hsig" > > test, and > > > a) I know that hsigs are in flux (although I am not clear about how) > > > b) I don't understand them enough to fix. > > > > > > So I hope it's ok to have broken this. Jan and I can certainly help > > when you want to fix it. > > > > > > Meanwhile would you mark it as expect-broken. (Although I am not sure > > that it's worth opening a fresh ticket for it.) > > > > > > thanks > > > > > > Simon > > > > > > | -----Original Message----- > > > | From: noreply at phabricator.haskell.org > > > | [mailto:noreply at phabricator.haskell.org] > > > | Sent: 03 September 2015 07:11 > > > | To: Simon Peyton Jones > > > | Subject: [Differential] [Commented On] D1182: Implement improved error > > > | messages for ambiguous type variables (#10733) > > > | > > > | KaneTW added a comment. > > > | > > > | Jan's injective type families commit is causing tcfail220 to fail, but > > > | that's unrelated to this ticket. 
> > > | > > > | > > > | REPOSITORY > > > | rGHC Glasgow Haskell Compiler > > > | > > > | REVISION DETAIL > > > | https://phabricator.haskell.org/D1182 > > > | > > > | EMAIL PREFERENCES > > > | https://phabricator.haskell.org/settings/panel/emailpreferences/ > > > | > > > | To: KaneTW, simonpj, bgamari, austin > > > | Cc: goldfire, simonpj, thomie > > --- End forwarded message --- > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > From marlowsd at gmail.com Thu Sep 3 22:12:07 2015 From: marlowsd at gmail.com (Simon Marlow) Date: Thu, 3 Sep 2015 15:12:07 -0700 Subject: Foreign calls and periodic alarm signals In-Reply-To: References: Message-ID: <55E8C5B7.9030309@gmail.com> On 02/09/2015 15:42, Phil Ruffwind wrote: > TL;DR: Does 'foreign import safe' silence the periodic alarm signals? No it doesn't. Perhaps the fact that a safe FFI call may create another worker thread means that the timer signal has gone to the other thread and didn't interrupt the thread making the statfs64() call. There's pthread_sigmask() that could help, but it's pretty difficult to do this in a consistent way because we'd have to pthread_sigmask() every thread that runs Haskell code, including calls from outside. I'm not sure yet what the right solution is, but a good start would be to open a ticket. Cheers Simon > I received a report on this rather strange bug in 'directory': > > https://github.com/haskell/directory/issues/35#issuecomment-136890912 > > I've concluded based on the dtruss log that it's caused by the timer > signal that the GHC runtime emits. Somewhere inside the guts of > 'realpath' on Mac OS X, there is a function that does the moral > equivalent of: > > while (statfs64(...) && errno == EINTR); > > On a slow filesystem like SSHFS, this can cause a permanent hang from > the barrage of signals. 
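The masking idea mentioned above can be sketched from the Haskell side with System.Posix.Signals. This is only a sketch of the mechanism, with sigALRM standing in for the RTS timer signal; it is not what the RTS actually does, and a real fix would have to mask the signal on every thread that can enter the foreign call:

```haskell
module Main where

import System.Posix.Signals

main :: IO ()
main = do
  let timerish = addSignal sigALRM emptySignalSet
  -- Block the signal for this thread (sigprocmask/pthread_sigmask underneath).
  blockSignals timerish
  -- Normally this would interrupt a blocking syscall with EINTR (or, with the
  -- default disposition, terminate the process); while blocked it is merely
  -- queued as pending.
  raiseSignal sigALRM
  pending <- getPendingSignals
  print (inSignalSet sigALRM pending)
  -- A statfs64()-style call placed here would run without EINTR interruptions.
```

Run under the non-threaded runtime this prints True: the raised signal stays pending rather than interrupting anything, which is exactly the property the retry loop quoted above needs.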
> > The reporter found that using 'foreign import safe' mitigates the > issue. What I'm curious mainly is that: is something that the GHC > runtime guarantees -- is using 'foreign import safe' assured to turn > off the periodic signals for that thread? > > I tried reading this article [1], which seems to be the only > documentation I could find about this, and it didn't really go into > much depth about them. (I also couldn't find any info about how > frequently they occur, on which threads they occur, or which specific > signal it uses.) > > I'm also concerned whether there are other foreign functions out in > the wild that could suffer the same bug, but remain hidden because > they normally complete before the next alarm signal. > > [1]: https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Signals > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From _deepfire at feelingofgreen.ru Thu Sep 3 22:31:53 2015 From: _deepfire at feelingofgreen.ru (Kosyrev Serge) Date: Fri, 04 Sep 2015 01:31:53 +0300 Subject: UNS: Re: Proposal: accept pull requests on GitHub In-Reply-To: (sfid-20150903_113912_535316_3EC6A2E7) (Joe Hillenbrand's message of "Thu, 3 Sep 2015 00:18:03 -0700") References: <55E7453A.90309@gmail.com> <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> Message-ID: <877fo7w2w6.fsf@andromedae.feelingofgreen.ru> Joe Hillenbrand writes: >> As a wild idea -- did anyone look at /Gitlab/ instead? > > My personal experience with Gitlab at a previous job is that it is > extremely unstable. I'd say even more unstable than trac and > phabricator. It's especially bad when dealing with long files. Curiously, for the nearly three years that we've been dealing with it, I couldn't have pointed at a single instability (or even just a bug), despite using a moderately loaded instance of Gitlab. 
Also, not being a huge enterprise yet, Gitlab folks /might/ potentially be more responsive to feature requests from a prominent open-source project. -- С уважением / respectfully, Косырев Сергей From thomasmiedema at gmail.com Fri Sep 4 00:15:44 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Fri, 4 Sep 2015 02:15:44 +0200 Subject: [Diffusion] [Committed] rGHCbe0ce8718ea4: Fix for crash in setnumcapabilities001 Message-ID: Simon, for what it's worth, I sporadically (< once per month) see this test timing out on Phabricator. Latest occurrence: https://phabricator.haskell.org/harbormaster/build/5904/?l=0 Thomas On Fri, Jun 26, 2015 at 10:32 AM, simonmar (Simon Marlow) < noreply at phabricator.haskell.org> wrote: > simonmar committed rGHCbe0ce8718ea4: Fix for crash in > setnumcapabilities001 (authored by simonmar). > > Fix for crash in setnumcapabilities001 > > getNewNursery() was unconditionally incrementing next_nursery, which > is normally fine but it broke an assumption in > storageAddCapabilities(). This manifested as an occasional crash in > the setnumcapabilities001 test. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From adam at well-typed.com Fri Sep 4 05:50:12 2015 From: adam at well-typed.com (Adam Gundry) Date: Fri, 04 Sep 2015 06:50:12 +0100 Subject: A process for reporting security-sensitive issues In-Reply-To: References: Message-ID: <55E93114.6050800@well-typed.com> On 03/09/15 08:22, Michael Smith wrote: > I feel there should be some process for reporting security-sensitive issues > in GHC -- for example, #9562 and #10826 in Trac. Perhaps something like the > SensitiveTicketsPlugin [3] could be used? > > [1] https://ghc.haskell.org/trac/ghc/ticket/9562 > [2] https://ghc.haskell.org/trac/ghc/ticket/10826 > [3] https://trac-hacks.org/wiki/SensitiveTicketsPlugin Thanks for raising this. 
While I see where you are coming from, I'm going to argue against it, because I think it creates a false impression of the security guarantees GHC provides. Such a process may give the impression that there are people directly tasked with handling such security bugs, which is not currently the case. I think it is unreasonable for the security of a system to depend on GHC having no type soundness bugs, particularly since GHC is actively used for developing experimental type system features. #9562 has been open for a year and we don't have a good solution. Relatedly, I think the Safe Haskell documentation should prominently warn about the existence of #9562 and the possibility of other type soundness bugs, like it does for compilation safety issues. What do others think? Adam -- Adam Gundry, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/ From spam at scientician.net Fri Sep 4 05:55:52 2015 From: spam at scientician.net (Bardur Arantsson) Date: Fri, 4 Sep 2015 07:55:52 +0200 Subject: Proposal: accept pull requests on GitHub In-Reply-To: References: <55E7453A.90309@gmail.com> <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> Message-ID: On 09/03/2015 09:18 AM, Joe Hillenbrand wrote: >> As a wild idea -- did anyone look at /Gitlab/ instead? > > My personal experience with Gitlab at a previous job is that it is > extremely unstable. I'd say even more unstable than trac and > phabricator. It's especially bad when dealing with long files. > If we're talking alternative systems, then I can personally recommend Gerrit (https://www.gerritcodereview.com/) which, while it *looks* pretty basic, it works really well with the general Git workflow. For example, it tracks commits in individual reviews, but tracks dependencies between those commits. So when e.g. 
you push a new series of commits implementing a feature, all those reviews just get a new "version" and you can diff between different versions of each individual commit -- this often cuts down drastically on how much you have to re-review when a new version is submitted. You can also specify auto-merge when a review gets +2 (or +1, or whatever), including rebase-before-merge-and-ff instead of having merge commits which just clutter the history needlessly. You can set up various rules using a predicate-based rules engine, for example about a review needing two approvals and/or always needing approval from an (external) build system, etc. The only setup it needs is a git hook... which it will tell you exactly how to install with a single command when you push your first review. (It's some scp command, I seem to recall.) Caveat: I haven't tried using it on Windows. Regards, From rf at rufflewind.com Fri Sep 4 07:52:33 2015 From: rf at rufflewind.com (Phil Ruffwind) Date: Fri, 4 Sep 2015 03:52:33 -0400 Subject: Foreign calls and periodic alarm signals In-Reply-To: <55E8C5B7.9030309@gmail.com> References: <55E8C5B7.9030309@gmail.com> Message-ID: > a good start would be to open a ticket. Okay, done: https://ghc.haskell.org/trac/ghc/ticket/10840 From ezyang at mit.edu Fri Sep 4 08:03:45 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Fri, 04 Sep 2015 01:03:45 -0700 Subject: Unlifted data types Message-ID: <1441353701-sup-9422@sabre> Hello friends, After many discussions and beers at ICFP, I've written up my current best understanding of the unlifted data types proposal: https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, Edward Kmett, and any others who I may have forgotten for crystallizing this proposal.
Cheers, Edward From ndmitchell at gmail.com Fri Sep 4 12:40:19 2015 From: ndmitchell at gmail.com (Neil Mitchell) Date: Fri, 4 Sep 2015 13:40:19 +0100 Subject: Using GHC API to compile Haskell file In-Reply-To: References: <1440368677-sup-472@sabre> Message-ID: > Sorry about the delay; I hadn't gotten around to seeing if I could reproduce > it. Here is a working copy of the program which appears to work with GHC > 7.10.2 on 64-bit Windows: Thanks, that does indeed solve the first bit. To try and make it a bit clearer what I'm after, I've put the stuff in a git repo: https://github.com/ndmitchell/ghc-process/blob/master/Main.hs Looking at Main.hs, there are three modes, Process (run ghc.exe 3 times), APIMake (the code you sent me), and APISingle (attempt to replicate the 3 ghc.exe invocations through the GHC API). The first two work perfectly, following Edward's tweaks. The final one fails at linking. So I have two questions: 1) Is there any way to do the two compilations sharing some cached state, e.g. loaded packages/.hi files, so each compilation goes faster. 2) Is there any way to do the link alone through the GHC API. Thanks, Neil From eric at seidel.io Fri Sep 4 15:29:59 2015 From: eric at seidel.io (Eric Seidel) Date: Fri, 04 Sep 2015 08:29:59 -0700 Subject: Unlifted data types In-Reply-To: <1441353701-sup-9422@sabre> References: <1441353701-sup-9422@sabre> Message-ID: <1441380599.3893947.374883985.0FBB1F3A@webmail.messagingengine.com> You mention NFData in the motivation but then say that !Maybe !Int is not allowed. This leads me to wonder what the semantics of foo :: !Maybe Int -> !Maybe Int foo x = x bar = foo (Just undefined) are. Based on the FAQ it sounds like foo would *not* force the undefined, is that correct? Also, there's a clear connection between these UnliftedTypes and BangPatterns, but as I understand it the ! is essentially a new type constructor.
So while foo1 :: !Int -> !Int foo1 x = x and foo2 :: Int -> Int foo2 !x = x have the same runtime behavior, they have different types, so you can't pass a regular Int to foo1. Is that desirable? Eric On Fri, Sep 4, 2015, at 01:03, Edward Z. Yang wrote: > Hello friends, > > After many discussions and beers at ICFP, I've written up my current > best understanding of the unlifted data types proposal: > > https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes > > Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, > Edward Kmett, and any others who I may have forgotten for crystallizing > this proposal. > > Cheers, > Edward > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From ezyang at mit.edu Fri Sep 4 15:43:48 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Fri, 04 Sep 2015 08:43:48 -0700 Subject: Unlifted data types In-Reply-To: <1441380599.3893947.374883985.0FBB1F3A@webmail.messagingengine.com> References: <1441353701-sup-9422@sabre> <1441380599.3893947.374883985.0FBB1F3A@webmail.messagingengine.com> Message-ID: <1441381088-sup-172@sabre> Excerpts from Eric Seidel's message of 2015-09-04 08:29:59 -0700: > You mention NFData in the motivation but then say that !Maybe !Int is > not allowed. This leads me to wonder what the semantics of > > foo :: !Maybe Int -> !Maybe Int > foo x = x > > bar = foo (Just undefined) > > are. Based on the FAQ it sounds like foo would *not* force the > undefined, is that correct? Yes. So maybe NFData is a *bad* example! > Also, there's a clear connection between these UnliftedTypes and > BangPatterns, but as I understand it the ! is essentially a new type > constructor. So while > > foo1 :: !Int -> !Int > foo1 x = x > > and > > foo2 :: Int -> Int > foo2 !x = x > > have the same runtime behavior, they have different types, so you can't > pass a regular Int to foo1. Is that desirable? Yes. 
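The runtime-behavior half of this comparison can be checked in a released GHC; the `!Int` argument types themselves are proposal syntax and do not compile today. A minimal sketch, assuming only the BangPatterns extension: a banged parameter is forced to WHNF on entry, while a lazy parameter lets `undefined` pass through untouched (`diverges` is a hypothetical helper for the demonstration, not part of any proposal):

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Exception (SomeException, evaluate, try)

-- Lazy version: the argument is never demanded, so undefined slips through.
fooLazy :: Int -> Int
fooLazy _ = 0

-- Bang-pattern version: the argument is forced to WHNF on entry.
fooStrict :: Int -> Int
fooStrict !_ = 0

-- True if forcing the expression raises an exception.
diverges :: Int -> IO Bool
diverges e = do
  r <- try (evaluate e) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)

main :: IO ()
main = do
  b1 <- diverges (fooLazy undefined)   -- argument never forced
  b2 <- diverges (fooStrict undefined) -- !_ forces undefined
  print (b1, b2)  -- prints (False,True)
```

Under the proposal the difference would instead be visible in the types, so the mismatch Eric asks about is real: `foo1` could not accept a lazily built `Int` without an explicit conversion.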
Actually, you have a good point that we'd like to have functions 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we can't generate 'Coercible' instances for these types unless Coercible becomes polykinded. Perhaps we can make a new type class, or just magic polymorphic functions. Edward From ezyang at mit.edu Fri Sep 4 15:45:37 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Fri, 04 Sep 2015 08:45:37 -0700 Subject: Unlifted data types In-Reply-To: <1441381088-sup-172@sabre> References: <1441353701-sup-9422@sabre> <1441380599.3893947.374883985.0FBB1F3A@webmail.messagingengine.com> <1441381088-sup-172@sabre> Message-ID: <1441381504-sup-5051@sabre> Excerpts from Edward Z. Yang's message of 2015-09-04 08:43:48 -0700: > Yes. Actually, you have a good point that we'd like to have functions > 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we > can't generate 'Coercible' instances for these types unless Coercible becomes > polykinded. Perhaps we can make a new type class, or just magic > polymorphic functions. Michael Greenberg points out on Twitter that suspend must be a special form, just like lambda abstraction. Edward From eric at seidel.io Fri Sep 4 16:06:15 2015 From: eric at seidel.io (Eric Seidel) Date: Fri, 04 Sep 2015 09:06:15 -0700 Subject: Unlifted data types In-Reply-To: <1441381088-sup-172@sabre> References: <1441353701-sup-9422@sabre> <1441380599.3893947.374883985.0FBB1F3A@webmail.messagingengine.com> <1441381088-sup-172@sabre> Message-ID: <1441382775.352880.374932065.34A2C130@webmail.messagingengine.com> Another good example would be foo :: ![Int] -> ![Int] Does this force just the first constructor or the whole spine? My guess would be the latter. On Fri, Sep 4, 2015, at 08:43, Edward Z. Yang wrote: > Excerpts from Eric Seidel's message of 2015-09-04 08:29:59 -0700: > > You mention NFData in the motivation but then say that !Maybe !Int is > > not allowed. 
This leads me to wonder what the semantics of > > > > foo :: !Maybe Int -> !Maybe Int > > foo x = x > > > > bar = foo (Just undefined) > > > > are. Based on the FAQ it sounds like foo would *not* force the > > undefined, is that correct? > > Yes. So maybe NFData is a *bad* example! > > > Also, there's a clear connection between these UnliftedTypes and > > BangPatterns, but as I understand it the ! is essentially a new type > > constructor. So while > > > > foo1 :: !Int -> !Int > > foo1 x = x > > > > and > > > > foo2 :: Int -> Int > > foo2 !x = x > > > > have the same runtime behavior, they have different types, so you can't > > pass a regular Int to foo1. Is that desirable? > > Yes. Actually, you have a good point that we'd like to have functions > 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we > can't generate 'Coercible' instances for these types unless Coercible > becomes > polykinded. Perhaps we can make a new type class, or just magic > polymorphic functions. > > Edward From dan.doel at gmail.com Fri Sep 4 16:57:42 2015 From: dan.doel at gmail.com (Dan Doel) Date: Fri, 4 Sep 2015 12:57:42 -0400 Subject: Unlifted data types In-Reply-To: <1441353701-sup-9422@sabre> References: <1441353701-sup-9422@sabre> Message-ID: All your examples are non-recursive types. So, if I have: data Nat = Zero | Suc Nat what is !Nat? Does it just have the outer-most part unlifted? Is the intention to make the !a in data type declarations first-class, so that when we say: data Nat = Zero | Suc !Nat the !Nat part is now an entity in itself, and it is, for this declaration, the set of naturals, whereas Nat is the flat domain? On Fri, Sep 4, 2015 at 4:03 AM, Edward Z. 
Yang wrote: > Hello friends, > > After many discussions and beers at ICFP, I've written up my current > best understanding of the unlifted data types proposal: > > https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes > > Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, > Edward Kmett, and any others who I may have forgotten for crystallizing > this proposal. > > Cheers, > Edward > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From ezyang at mit.edu Fri Sep 4 18:12:39 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Fri, 04 Sep 2015 11:12:39 -0700 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> Message-ID: <1441390306-sup-6240@sabre> Excerpts from Dan Doel's message of 2015-09-04 09:57:42 -0700: > All your examples are non-recursive types. So, if I have: > > data Nat = Zero | Suc Nat > > what is !Nat? Does it just have the outer-most part unlifted? Just the outermost part. > Is the intention to make the !a in data type declarations first-class, > so that when we say: > > data Nat = Zero | Suc !Nat > > the !Nat part is now an entity in itself, and it is, for this > declaration, the set of naturals, whereas Nat is the flat domain? No, in fact, there is a semantic difference between this and strict fields (which Paul pointed out to me.) There's now an updated proposal on the Trac which partially solves this problem. Edward From ezyang at mit.edu Fri Sep 4 18:14:50 2015 From: ezyang at mit.edu (Edward Z. 
Yang) Date: Fri, 04 Sep 2015 11:14:50 -0700 Subject: Unlifted data types In-Reply-To: <1441382775.352880.374932065.34A2C130@webmail.messagingengine.com> References: <1441353701-sup-9422@sabre> <1441380599.3893947.374883985.0FBB1F3A@webmail.messagingengine.com> <1441381088-sup-172@sabre> <1441382775.352880.374932065.34A2C130@webmail.messagingengine.com> Message-ID: <1441390373-sup-5413@sabre> Hello Eric, You can't tell; the head not withstanding, `[a]` is still a lazy list, so you would need to look at the function body to see if any extra forcing goes on. `Force` does not induce `seq`ing: it is an obligation for the call-site. (Added it to the FAQ). Edward Excerpts from Eric Seidel's message of 2015-09-04 09:06:15 -0700: > Another good example would be > > foo :: ![Int] -> ![Int] > > Does this force just the first constructor or the whole spine? My guess > would be the latter. > > On Fri, Sep 4, 2015, at 08:43, Edward Z. Yang wrote: > > Excerpts from Eric Seidel's message of 2015-09-04 08:29:59 -0700: > > > You mention NFData in the motivation but then say that !Maybe !Int is > > > not allowed. This leads me to wonder what the semantics of > > > > > > foo :: !Maybe Int -> !Maybe Int > > > foo x = x > > > > > > bar = foo (Just undefined) > > > > > > are. Based on the FAQ it sounds like foo would *not* force the > > > undefined, is that correct? > > > > Yes. So maybe NFData is a *bad* example! > > > > > Also, there's a clear connection between these UnliftedTypes and > > > BangPatterns, but as I understand it the ! is essentially a new type > > > constructor. So while > > > > > > foo1 :: !Int -> !Int > > > foo1 x = x > > > > > > and > > > > > > foo2 :: Int -> Int > > > foo2 !x = x > > > > > > have the same runtime behavior, they have different types, so you can't > > > pass a regular Int to foo1. Is that desirable? > > > > Yes. Actually, you have a good point that we'd like to have functions > > 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. 
Unfortunately, we > > can't generate 'Coercible' instances for these types unless Coercible > > becomes > > polykinded. Perhaps we can make a new type class, or just magic > > polymorphic functions. > > > > Edward From dan.doel at gmail.com Fri Sep 4 20:09:26 2015 From: dan.doel at gmail.com (Dan Doel) Date: Fri, 4 Sep 2015 16:09:26 -0400 Subject: Unlifted data types In-Reply-To: <1441390306-sup-6240@sabre> References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> Message-ID: Okay. That answers another question I had, which was whether MutVar# and such would go in the new kind. So now we have partial, extended natural numbers: data PNat :: * where PZero :: PNat PSuc :: PNat -> PNat A flat domain of natural numbers: data FNat :: * where FZero :: FNat FSuc :: !FNat -> FNat And two sets of natural numbers: Force FNat :: Unlifted data UNat :: Unlifted where UZero :: UNat USuc :: UNat -> UNat And really perhaps two flat domains (and three sets), if you use Force instead of !, which would differ on who ensures the evaluation. That's kind of a lot of incompatible definitions of essentially the same thing (PNat being the significantly different thing). I was kind of more enthused about first class !a. For instance, if you think about the opening quote by Bob Harper, he's basically wrong. The flat domain FNat is the natural numbers (existing in an overall lazy language), and has the reasoning properties he wants to teach students about with very little complication. It'd be satisfying to recognize that unlifting the outer-most part gets you exactly there, with whatever performance characteristics that implies. Or to get rid of ! and use Unlifted definitions instead. Maybe backwards compatibility mandates the duplication, but it'd be nice if some synthesis could be reached. ---- It'd also be good to think about/specify how this is going to interact with unpacked/unboxed sums. On Fri, Sep 4, 2015 at 2:12 PM, Edward Z. 
Yang wrote: > Excerpts from Dan Doel's message of 2015-09-04 09:57:42 -0700: >> All your examples are non-recursive types. So, if I have: >> >> data Nat = Zero | Suc Nat >> >> what is !Nat? Does it just have the outer-most part unlifted? > > Just the outermost part. > >> Is the intention to make the !a in data type declarations first-class, >> so that when we say: >> >> data Nat = Zero | Suc !Nat >> >> the !Nat part is now an entity in itself, and it is, for this >> declaration, the set of naturals, whereas Nat is the flat domain? > > No, in fact, there is a semantic difference between this and strict > fields (which Paul pointed out to me.) There's now an updated proposal > on the Trac which partially solves this problem. > > Edward From ezyang at mit.edu Fri Sep 4 21:23:33 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Fri, 04 Sep 2015 14:23:33 -0700 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> Message-ID: <1441400654-sup-1647@sabre> Excerpts from Dan Doel's message of 2015-09-04 13:09:26 -0700: > Okay. That answers another question I had, which was whether MutVar# > and such would go in the new kind. > > So now we have partial, extended natural numbers: > > data PNat :: * where > PZero :: PNat > PSuc :: PNat -> PNat > > A flat domain of natural numbers: > > data FNat :: * where > FZero :: FNat > FSuc :: !FNat -> FNat > > And two sets of natural numbers: > > Force FNat :: Unlifted > > data UNat :: Unlifted where > UZero :: UNat > USuc :: UNat -> UNat > > And really perhaps two flat domains (and three sets), if you use Force > instead of !, which would differ on who ensures the evaluation. That's > kind of a lot of incompatible definitions of essentially the same > thing (PNat being the significantly different thing). > > I was kind of more enthused about first class !a. For instance, if you > think about the opening quote by Bob Harper, he's basically wrong. 
The > flat domain FNat is the natural numbers (existing in an overall lazy > language), and has the reasoning properties he wants to teach students > about with very little complication. It'd be satisfying to recognize > that unlifting the outer-most part gets you exactly there, with > whatever performance characteristics that implies. Or to get rid of ! > and use Unlifted definitions instead. > > Maybe backwards compatibility mandates the duplication, but it'd be > nice if some synthesis could be reached. I would certainly agree that in terms of the data that is representable, there is not much difference; but there is a lot of difference for the client between Force and a strict field. If I write: let x = undefined y = Strict x in True No error occurs with: data Strict = Strict !a But an error occurs with: data Strict = Strict (Force a) One possibility for how to reconcile the difference for BC is to posit that there are just two different constructors: Strict :: a -> Strict a Strict! :: Force a -> Strict a But this kind of special handling is a bit bothersome. Consider: data SPair a b = SPair (!a, !b) The constructor has what type? Probably SPair :: (Force a, Force b) -> SPair a and not: SPair :: (a, b) -> SPair a > It'd also be good to think about/specify how this is going to interact > with unpacked/unboxed sums. I don't think it interacts any differently than with unpacked/unboxed products today. Edward From eir at cis.upenn.edu Fri Sep 4 21:26:45 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Fri, 4 Sep 2015 14:26:45 -0700 Subject: A process for reporting security-sensitive issues In-Reply-To: <55E93114.6050800@well-typed.com> References: <55E93114.6050800@well-typed.com> Message-ID: I agree with Adam. I've been a little worried about users relying on Safe Haskell, despite #9562. Advertising that Safe Haskell is just a "best effort" (for a rather high bar for "best") but not a guarantee would be nice. 
Richard On Sep 3, 2015, at 10:50 PM, Adam Gundry wrote: > On 03/09/15 08:22, Michael Smith wrote: >> I feel there should be some process for reporting security-sensitive issues >> in GHC -- for example, #9562 and #10826 in Trac. Perhaps something like the >> SensitiveTicketsPlugin [3] could be used? >> >> [1] https://ghc.haskell.org/trac/ghc/ticket/9562 >> [2] https://ghc.haskell.org/trac/ghc/ticket/10826 >> [3] https://trac-hacks.org/wiki/SensitiveTicketsPlugin > > Thanks for raising this. While I see where you are coming from, I'm > going to argue against it, because I think it creates a false impression > of the security guarantees GHC provides. Such a process may give the > impression that there are people directly tasked with handling such > security bugs, which is not currently the case. > > I think it is unreasonable for the security of a system to depend on GHC > having no type soundness bugs, particularly since GHC is actively used > for developing experimental type system features. #9562 has been open > for a year and we don't have a good solution. > > Relatedly, I think the Safe Haskell documentation should prominently > warn about the existence of #9562 and the possibility of other type > soundness bugs, like it does for compilation safety issues. > > What do others think? > > Adam > > > -- > Adam Gundry, Haskell Consultant > Well-Typed LLP, http://www.well-typed.com/ > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From roma at ro-che.info Fri Sep 4 21:41:38 2015 From: roma at ro-che.info (Roman Cheplyaka) Date: Sat, 5 Sep 2015 00:41:38 +0300 Subject: Unlifted data types In-Reply-To: <1441400654-sup-1647@sabre> References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> <1441400654-sup-1647@sabre> Message-ID: <55EA1012.2070708@ro-che.info> On 05/09/15 00:23, Edward Z. 
Yang wrote: > I would certainly agree that in terms of the data that is representable, > there is not much difference; but there is a lot of difference for the > client between Force and a strict field. If I write: > > let x = undefined > y = Strict x > in True > > No error occurs with: > > data Strict = Strict !a > > But an error occurs with: > > data Strict = Strict (Force a) At what point does the error occur here? When evaluating True? What about the following two expressions? const False (let x = undefined y = Strict x in True) let x = undefined y = const False (Strict x) in True Roman -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From roma at ro-che.info Fri Sep 4 21:43:19 2015 From: roma at ro-che.info (Roman Cheplyaka) Date: Sat, 5 Sep 2015 00:43:19 +0300 Subject: Unlifted data types In-Reply-To: <55EA1012.2070708@ro-che.info> References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> <1441400654-sup-1647@sabre> <55EA1012.2070708@ro-che.info> Message-ID: <55EA1077.9010705@ro-che.info> On 05/09/15 00:41, Roman Cheplyaka wrote: > On 05/09/15 00:23, Edward Z. Yang wrote: >> I would certainly agree that in terms of the data that is representable, >> there is not much difference; but there is a lot of difference for the >> client between Force and a strict field. If I write: >> >> let x = undefined >> y = Strict x >> in True >> >> No error occurs with: >> >> data Strict = Strict !a >> >> But an error occurs with: >> >> data Strict = Strict (Force a) > > At what point does the error occur here? When evaluating True? > > What about the following two expressions? > > const False > (let x = undefined > y = Strict x > in True) > > let x = undefined > y = const False (Strict x) > in True On second thought, the second one shouldn't even compile because of the kind error, right?
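The strict-field half of Roman's question is observable in current GHC (the `Force a` variant exists only in the proposal). A sketch: with `data Strict a = Strict !a`, the field is evaluated only when the constructor application itself is forced to WHNF, so `const False (let y = Strict undefined in True)` produces a value with no error — `forcedError` below is an illustrative helper, not library code:

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception (SomeException, evaluate, try)

data Strict a = Strict !a

-- True if forcing the value to WHNF raises an exception.
forcedError :: forall a. a -> IO Bool
forcedError v = do
  r <- try (evaluate v) :: IO (Either SomeException a)
  return (either (const True) (const False) r)

main :: IO ()
main = do
  -- The Strict thunk is never demanded here, so its field is never forced.
  print (const False (let y = Strict (undefined :: Int) in True))  -- prints False
  -- Forcing the constructor application to WHNF does force the strict field.
  err <- forcedError (Strict (undefined :: Int))
  print err  -- prints True
```

So with a strict field the error, when it occurs at all, occurs at the point where the constructor application is demanded, never when an unrelated value like `True` is evaluated.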
-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From dan.doel at gmail.com Fri Sep 4 21:48:49 2015 From: dan.doel at gmail.com (Dan Doel) Date: Fri, 4 Sep 2015 17:48:49 -0400 Subject: Unlifted data types In-Reply-To: <1441400654-sup-1647@sabre> References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> <1441400654-sup-1647@sabre> Message-ID: On Fri, Sep 4, 2015 at 5:23 PM, Edward Z. Yang wrote: > But this kind of special handling is a bit bothersome. Consider: > > data SPair a b = SPair (!a, !b) > > The constructor has what type? Probably > > SPair :: (Force a, Force b) -> SPair a > > and not: > > SPair :: (a, b) -> SPair a I don't really understand what this example is showing. I don't think SPair is a legal declaration in any scenario. - In current Haskell it's illegal; you can only put ! directly on fields - If !a :: Unlifted, then (,) (!a) is a kind error (same with Force a) > I don't think it interacts any differently than with unpacked/unboxed > products today. I meant like: If T :: Unlifted, then am I allowed to do: data U = MkU {-# UNPACK #-} T ... and what are its semantics? If T is a sum, presumably it's related to the unpacked sums proposal from a couple days ago. Does stuff from this proposal make that proposal simpler? Should they reference things in one another? Will there be optimizations that turn: data E a b :: Unlifted where L :: a -> E a b R :: b -> E a b into |# a , b #| (or whatever the agreed upon syntax is)? Presumably yes. 
-- Dan From dan.doel at gmail.com Fri Sep 4 21:56:03 2015 From: dan.doel at gmail.com (Dan Doel) Date: Fri, 4 Sep 2015 17:56:03 -0400 Subject: Unlifted data types In-Reply-To: <55EA1012.2070708@ro-che.info> References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> <1441400654-sup-1647@sabre> <55EA1012.2070708@ro-che.info> Message-ID: If x :: t, and t :: Unlifted, then let x = e in e' has a value that depends on evaluating e regardless of its use in e' (or other things in the let, if they exist). It would be like writing let !x = e in e' today. -- Dan On Fri, Sep 4, 2015 at 5:41 PM, Roman Cheplyaka wrote: > On 05/09/15 00:23, Edward Z. Yang wrote: >> I would certainly agree that in terms of the data that is representable, >> there is not much difference; but there is a lot of difference for the >> client between Force and a strict field. If I write: >> >> let x = undefined >> y = Strict x >> in True >> >> No error occurs with: >> >> data Strict = Strict !a >> >> But an error occurs with: >> >> data Strict = Strict (Force a) > > At what point does the error occur here? When evaluating True? > > What about the following two expressions? > > const False > (let x = undefined > y = Strict x > in True) > > let x = undefined > y = const False (Strict x) > in True > > Roman > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From mike at izbicki.me Fri Sep 4 23:39:24 2015 From: mike at izbicki.me (Mike Izbicki) Date: Fri, 4 Sep 2015 16:39:24 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: I'm still having trouble creating Core code that can extract superclass dictionaries from a given dictionary. I suspect the problem is that I don't actually understand what the Core code to do this is supposed to look like. 
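For orientation, the superclass chain being built by hand in Core here corresponds to constraint solving GHC already performs automatically at the source level: a `Floating a` context makes `(+)` available because `Num` is reached through the `Fractional` superclass — the same `$p1Floating`/`$p1Fractional` selector chain discussed in this thread. A minimal source-level check:

```haskell
-- test1 needs (+) from Num, but only Floating is in scope; GHC solves
-- Num via the chain Floating => Fractional => Num, which is what the
-- hand-written selector applications in Core have to reproduce.
test1 :: Floating a => a -> a
test1 x1 = x1 + x1

main :: IO ()
main = print (test1 (2.0 :: Double))  -- prints 4.0
```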
I keep getting the errors mentioned above when I try what I think should work. Can anyone help me figure this out? Is there any chance this is a bug in how GHC parses Core? On Tue, Aug 25, 2015 at 9:24 PM, Mike Izbicki wrote: > The purpose of the plugin is to automatically improve the numerical > stability of Haskell code. It is supposed to identify numeric > expressions, then use Herbie (https://github.com/uwplse/herbie) to > generate a numerically stable version, then rewrite the numerically > stable version back into the code. The first two steps were really > easy. It's the last step of inserting back into the code that I'm > having tons of trouble with. Core is a lot more complicated than I > thought :) > > I'm not sure what you mean by the CoreExpr representation? Here's the > output of the pretty printer you gave: > App (App (App (App (Var Id{+,r2T,ForAllTy TyVar{a} (FunTy (TyConApp > Num [TyVarTy TyVar{a}]) (FunTy (TyVarTy TyVar{a}) (FunTy (TyVarTy > TyVar{a}) (TyVarTy TyVar{a})))),VanillaId,Info{0,SpecInfo [] > ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma > {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = > Nothing, inl_act = AlwaysActive, inl_rule = > FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD > {strd = Lazy, absd = Use Many Used},0}}) (Type (TyVarTy TyVar{a}))) > (App (Var Id{$p1Fractional,rh3,ForAllTy TyVar{a} (FunTy (TyConApp > Fractional [TyVarTy TyVar{a}]) (TyConApp Num [TyVarTy > TyVar{a}])),ClassOpId ,Info{1,SpecInfo [BuiltinRule {ru_name = > "Class op $p1Fractional", ru_fn = $p1Fractional, ru_nargs = 2, ru_try > = }] ,NoUnfolding,NoCafRefs,NoOneShotInfo,InlinePragma > {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = > Nothing, inl_act = AlwaysActive, inl_rule = > FunLike},NoOccInfo,StrictSig (DmdType [JD {strd = Str (SProd > [Str HeadStr,Lazy,Lazy,Lazy]), absd = Use Many (UProd [Use Many > Used,Abs,Abs,Abs])}] (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many > Used},0}}) (App (Var 
Id{$p1Floating,rh2,ForAllTy TyVar{a} (FunTy > (TyConApp Floating [TyVarTy TyVar{a}]) (TyConApp Fractional [TyVarTy > TyVar{a}])),ClassOpId ,Info{1,SpecInfo [BuiltinRule {ru_name = > "Class op $p1Floating", ru_fn = $p1Floating, ru_nargs = 2, ru_try = > }] ,NoUnfolding,NoCafRefs,NoOneShotInfo,InlinePragma > {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = > Nothing, inl_act = AlwaysActive, inl_rule = > FunLike},NoOccInfo,StrictSig (DmdType [JD {strd = Str (SProd > [Str HeadStr,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy]), > absd = Use Many (UProd [Use Many > Used,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs])}] > (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many Used},0}}) (Var > Id{$dFloating,aBM,TyConApp Floating [TyVarTy > TyVar{a}],VanillaId,Info{0,SpecInfo [] > ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma > {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = > Nothing, inl_act = AlwaysActive, inl_rule = > FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD > {strd = Lazy, absd = Use Many Used},0}})))) (Var Id{x1,anU,TyVarTy > TyVar{a},VanillaId,Info{0,SpecInfo [] > ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma > {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = > Nothing, inl_act = AlwaysActive, inl_rule = > FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD > {strd = Lazy, absd = Use Many Used},0}})) (Var Id{x1,anU,TyVarTy > TyVar{a},VanillaId,Info{0,SpecInfo [] > ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma > {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = > Nothing, inl_act = AlwaysActive, inl_rule = > FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD > {strd = Lazy, absd = Use Many Used},0}}) > > You can find my pretty printer (and all the other code for the plugin) > at: https://github.com/mikeizbicki/herbie-haskell/blob/master/src/Herbie.hs#L627 > > The function getDictMap 
> (https://github.com/mikeizbicki/herbie-haskell/blob/master/src/Herbie.hs#L171) > is where I'm constructing the dictionaries that are getting inserted > back into the Core. > > On Tue, Aug 25, 2015 at 7:17 PM, ?mer Sinan A?acan wrote: >> It seems like in your App syntax you're having a non-function in function >> position. You can see this by looking at what failing function >> (splitFunTy_maybe) is doing: >> >> splitFunTy_maybe :: Type -> Maybe (Type, Type) >> -- ^ Attempts to extract the argument and result types from a type >> ... (definition is not important) ... >> >> Then it's used like this at the error site: >> >> (arg_ty, res_ty) = expectJust "cpeBody:collect_args" $ >> splitFunTy_maybe fun_ty >> >> In your case this function is returning Nothing and then exceptJust is >> signalling the panic. >> >> Your code looked correct to me, I don't see any problems with that. Maybe you're >> using something wrong as selectors. Could you paste CoreExpr representation of >> your program? >> >> It may also be the case that the panic is caused by something else, maybe your >> syntax is invalidating some assumptions/invariants in GHC but it's not >> immediately checked etc. Working at the Core level is frustrating at times. >> >> Can I ask what kind of plugin are you working on? >> >> (Btw, how did you generate this representation of AST? Did you write it >> manually? If you have a pretty-printer, would you mind sharing it?) >> >> 2015-08-25 18:50 GMT-04:00 Mike Izbicki : >>> Thanks ?mer! >>> >>> I'm able to get dictionaries for the superclasses of a class now, but >>> I get an error whenever I try to get a dictionary for a >>> super-superclass. 
Here's the Haskell expression I'm working with: >>> >>> test1 :: Floating a => a -> a >>> test1 x1 = x1+x1 >>> >>> The original core is: >>> >>> + @ a $dNum_aJu x1 x1 >>> >>> But my plugin is replacing it with the core: >>> >>> + @ a ($p1Fractional ($p1Floating $dFloating_aJq)) x1 x1 >>> >>> The only difference is the way I'm getting the Num dictionary. The >>> corresponding AST (annotated with variable names and types) is: >>> >>> App >>> (App >>> (App >>> (App >>> (Var +::forall a. Num a => a -> a -> a) >>> (Type a) >>> ) >>> (App >>> (Var $p1Fractional::forall a. Fractional a => Num a) >>> (App >>> (Var $p1Floating::forall a. Floating a => Fractional a) >>> (Var $dFloating_aJq::Floating a) >>> ) >>> ) >>> ) >>> (Var x1::'a') >>> ) >>> (Var x1::'a') >>> >>> When I insert, GHC gives the following error: >>> >>> ghc: panic! (the 'impossible' happened) >>> (GHC version 7.10.1 for x86_64-unknown-linux): >>> expectJust cpeBody:collect_args >>> >>> What am I doing wrong with extracting these super-superclass >>> dictionaries? I've looked up the code for cpeBody in GHC, but I can't >>> figure out what it's trying to do, so I'm not sure why it's failing on >>> my core. >>> >>> On Mon, Aug 24, 2015 at 7:10 PM, ?mer Sinan A?acan wrote: >>>> Mike, here's a piece of code that may be helpful to you: >>>> >>>> https://github.com/osa1/sc-plugin/blob/master/src/Supercompilation/Show.hs >>>> >>>> Copy this module to your plugin, it doesn't have any dependencies other than >>>> ghc itself. When your plugin is initialized, update `dynFlags_ref` with your >>>> DynFlags as first thing to do. Then use Show instance to print AST directly. >>>> >>>> Horrible hack, but very useful for learning purposes. In fact, I don't know how >>>> else we can learn what Core is generated for a given code, and reverse-engineer >>>> to figure out details. >>>> >>>> Hope it helps. 
>>>> >>>> 2015-08-24 21:59 GMT-04:00 ?mer Sinan A?acan : >>>>>> Lets say I'm running the plugin on a function with signature `Floating a => a >>>>>> -> a`, then the plugin has access to the `Floating` dictionary for the type. >>>>>> But if I want to add two numbers together, I need the `Num` dictionary. I >>>>>> know I should have access to `Num` since it's a superclass of `Floating`. >>>>>> How can I get access to these superclass dictionaries? >>>>> >>>>> I don't have a working code for this but this should get you started: >>>>> >>>>> let ord_dictionary :: Id = ... >>>>> ord_class :: Class = ... >>>>> in >>>>> mkApps (Var (head (classSCSels ord_class))) [Var ord_dictionary] >>>>> >>>>> I don't know how to get Class for Ord. I do `head` here because in the case of >>>>> Ord we only have one superclass so `classSCSels` should have one Id. Then I >>>>> apply ord_dictionary to this selector and it should return dictionary for Eq. >>>>> >>>>> I assumed you already have ord_dictionary, it should be passed to your function >>>>> already if you had `(Ord a) => ` in your function. >>>>> >>>>> >>>>> Now I realized you asked for getting Num from Floating. I think you should >>>>> follow a similar path except you need two applications, first to get Fractional >>>>> from Floating and second to get Num from Fractional: >>>>> >>>>> mkApps (Var (head (classSCSels fractional_class))) >>>>> [mkApps (Var (head (classSCSels floating_class))) >>>>> [Var floating_dictionary]] >>>>> >>>>> Return value should be a Num dictionary. From dan.doel at gmail.com Sat Sep 5 01:21:29 2015 From: dan.doel at gmail.com (Dan Doel) Date: Fri, 4 Sep 2015 21:21:29 -0400 Subject: Unlifted data types In-Reply-To: <1441400654-sup-1647@sabre> References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> <1441400654-sup-1647@sabre> Message-ID: Here are some additional thoughts. 
If we examine an analogue of some of your examples: data MutVar a = MV (MutVar# RealWorld a) main = do let mv# = undefined let mv = MV mv# putStrLn "Okay." The above is illegal. Instead we _must_ write: let !mv# = undefined which signals that evaluation is occurring. So it is impossible to accidentally go from: main = do let mv = MV undefined putStrLn "Okay." which prints "Okay.", to something that throws an exception, without having a pretty good indication that you're doing so. I would guess this is desirable, so perhaps it should be mandated for Unlifted as well. ---- However, the above point confuses me with respect to another example. The proposal says that: data Id :: * -> Unlifted where Id :: a -> Id a could/should be compiled with no overhead over `a`, like a newtype. However, if Unlifted things have operational semantics like #, what does the following do: let x :: Id a !x = Id undefined The ! should evaluate to the Id constructor, but we're not representing it, so it actually doesn't evaluate anything? But: let x :: Id a !x = undefined throws an exception? Whereas for newtypes, both throw exceptions with a !x definition, or don't with an x definition? Is it actually possible to make Id behave this way without any representational overhead? I'm a little skeptical. I think that only Force (and Box) might be able to have no representational overhead. -- Dan On Fri, Sep 4, 2015 at 5:23 PM, Edward Z. Yang wrote: > Excerpts from Dan Doel's message of 2015-09-04 13:09:26 -0700: >> Okay. That answers another question I had, which was whether MutVar# >> and such would go in the new kind. 
>> >> So now we have partial, extended natural numbers: >> >> data PNat :: * where >> PZero :: PNat >> PSuc :: PNat -> PNat >> >> A flat domain of natural numbers: >> >> data FNat :: * where >> FZero :: FNat >> FSuc :: !FNat -> FNat >> >> And two sets of natural numbers: >> >> Force FNat :: Unlifted >> >> data UNat :: Unlifted where >> UZero :: UNat >> USuc :: UNat -> UNat >> >> And really perhaps two flat domains (and three sets), if you use Force >> instead of !, which would differ on who ensures the evaluation. That's >> kind of a lot of incompatible definitions of essentially the same >> thing (PNat being the significantly different thing). >> >> I was kind of more enthused about first class !a. For instance, if you >> think about the opening quote by Bob Harper, he's basically wrong. The >> flat domain FNat is the natural numbers (existing in an overall lazy >> language), and has the reasoning properties he wants to teach students >> about with very little complication. It'd be satisfying to recognize >> that unlifting the outer-most part gets you exactly there, with >> whatever performance characteristics that implies. Or to get rid of ! >> and use Unlifted definitions instead. >> >> Maybe backwards compatibility mandates the duplication, but it'd be >> nice if some synthesis could be reached. > > I would certainly agree that in terms of the data that is representable, > there is not much difference; but there is a lot of difference for the > client between Force and a strict field. If I write: > > let x = undefined > y = Strict x > in True > > No error occurs with: > > data Strict = Strict !a > > But an error occurs with: > > data Strict = Strict (Force a) > > One possibility for how to reconcile the difference for BC is to posit > that there are just two different constructors: > > Strict :: a -> Strict a > Strict! :: Force a -> Strict a > > But this kind of special handling is a bit bothersome. 
Consider: > > data SPair a b = SPair (!a, !b) > > The constructor has what type? Probably > > SPair :: (Force a, Force b) -> SPair a > > and not: > > SPair :: (a, b) -> SPair a > >> It'd also be good to think about/specify how this is going to interact >> with unpacked/unboxed sums. > > I don't think it interacts any differently than with unpacked/unboxed > products today. > > Edward From ezyang at mit.edu Sat Sep 5 03:38:27 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Fri, 04 Sep 2015 20:38:27 -0700 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> <1441400654-sup-1647@sabre> Message-ID: <1441423737-sup-9277@sabre> Excerpts from Dan Doel's message of 2015-09-04 14:48:49 -0700: > I don't really understand what this example is showing. I don't think > SPair is a legal > declaration in any scenario. > > - In current Haskell it's illegal; you can only put ! directly on fields > - If !a :: Unlifted, then (,) (!a) is a kind error (same with Force a) This is true. Perhaps it should be possible to define data types which are levity polymorphic, so SPair can kind as * -> * -> *, Unlifted -> Unlifted -> *, etc. > > I don't think it interacts any differently than with unpacked/unboxed > > products today. > > I meant like: > > If T :: Unlifted, then am I allowed to do: > > data U = MkU {-# UNPACK #-} T ... > > and what are its semantics? If T is a sum, presumably it's related to > the unpacked > sums proposal from a couple days ago. Does stuff from this proposal > make that proposal > simpler? Should they reference things in one another? Ah, this is a good question. I think you can just directly UNPACK unlifted types, without a strict bang pattern. I've added a note to the proposal. > Will there be optimizations that turn: > > data E a b :: Unlifted where > L :: a -> E a b > R :: b -> E a b > > into |# a , b #| (or whatever the agreed upon syntax is)? Presumably yes. 
Yes, it should follow the same rules as
https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes#Unpacking

Edward

From omeragacan at gmail.com  Sat Sep 5 04:16:34 2015
From: omeragacan at gmail.com (Ömer Sinan Ağacan)
Date: Sat, 5 Sep 2015 00:16:34 -0400
Subject: question about GHC API on GHC plugin
In-Reply-To:
References: <1439014742-sup-2126@sabre>
Message-ID:

Hi Mike,

I'll try to hack an example for you some time tomorrow (I'm returning from
ICFP and have some long flights ahead of me).

But in the meantime, here's some working Core code, generated by GHC:

    f_rjH :: forall a_alz. Ord a_alz => a_alz -> Bool
    f_rjH =
      \ (@ a_aCH) ($dOrd_aCI :: Ord a_aCH) (eta_B1 :: a_aCH) ->
        == @ a_aCH (GHC.Classes.$p1Ord @ a_aCH $dOrd_aCI) eta_B1 eta_B1

You can clearly see here how the Eq dictionary is selected from the Ord
dictionary ($dOrd_aCI in the example); it's just an application of the
selector to a type and a dictionary, that's all.

This is generated from this code:

    {-# NOINLINE f #-}
    f :: Ord a => a -> Bool
    f x = x == x

Compile it with this:

    ghc --make -fforce-recomp -O0 -ddump-simpl -ddump-to-file Main.hs -dsuppress-idinfo

> Can anyone help me figure this out? Is there any chance this is a bug in how
> GHC parses Core?

This seems unlikely, because GHC doesn't have a Core parser and there's no
Core parsing going on here; you're parsing your Code in the form of an AST
(CoreExpr, CoreProgram etc., defined in CoreSyn.hs). Did you mean something
else and am I misunderstanding?

2015-09-04 19:39 GMT-04:00 Mike Izbicki :
> I'm still having trouble creating Core code that can extract
> superclass dictionaries from a given dictionary. I suspect the
> problem is that I don't actually understand what the Core code to do
> this is supposed to look like. I keep getting the errors mentioned
> above when I try what I think should work.
>
> Can anyone help me figure this out? Is there any chance this is a bug
> in how GHC parses Core?
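[Editorial example] Ömer's `f` example above is easy to reproduce end to end. The source below is the program he compiles; the Ord constraint alone licenses (==) because Eq is Ord's sole superclass, and in the dumped Core the Eq dictionary is obtained by applying GHC.Classes.$p1Ord to the Ord dictionary:

```haskell
-- Eq is the (single) superclass of Ord, so (==) is usable under an Ord
-- constraint; in Core this becomes an application of the $p1Ord selector
-- to the Ord dictionary, as shown in the dump above.
{-# NOINLINE f #-}
f :: Ord a => a -> Bool
f x = x == x

main :: IO ()
main = print (f (42 :: Int))  -- prints True
```

Dumping with `ghc -O0 -ddump-simpl -dsuppress-idinfo` (the command given above) makes the `$p1Ord` application visible.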
> > On Tue, Aug 25, 2015 at 9:24 PM, Mike Izbicki wrote: >> The purpose of the plugin is to automatically improve the numerical >> stability of Haskell code. It is supposed to identify numeric >> expressions, then use Herbie (https://github.com/uwplse/herbie) to >> generate a numerically stable version, then rewrite the numerically >> stable version back into the code. The first two steps were really >> easy. It's the last step of inserting back into the code that I'm >> having tons of trouble with. Core is a lot more complicated than I >> thought :) >> >> I'm not sure what you mean by the CoreExpr representation? Here's the >> output of the pretty printer you gave: >> App (App (App (App (Var Id{+,r2T,ForAllTy TyVar{a} (FunTy (TyConApp >> Num [TyVarTy TyVar{a}]) (FunTy (TyVarTy TyVar{a}) (FunTy (TyVarTy >> TyVar{a}) (TyVarTy TyVar{a})))),VanillaId,Info{0,SpecInfo [] >> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >> Nothing, inl_act = AlwaysActive, inl_rule = >> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >> {strd = Lazy, absd = Use Many Used},0}}) (Type (TyVarTy TyVar{a}))) >> (App (Var Id{$p1Fractional,rh3,ForAllTy TyVar{a} (FunTy (TyConApp >> Fractional [TyVarTy TyVar{a}]) (TyConApp Num [TyVarTy >> TyVar{a}])),ClassOpId ,Info{1,SpecInfo [BuiltinRule {ru_name = >> "Class op $p1Fractional", ru_fn = $p1Fractional, ru_nargs = 2, ru_try >> = }] ,NoUnfolding,NoCafRefs,NoOneShotInfo,InlinePragma >> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >> Nothing, inl_act = AlwaysActive, inl_rule = >> FunLike},NoOccInfo,StrictSig (DmdType [JD {strd = Str (SProd >> [Str HeadStr,Lazy,Lazy,Lazy]), absd = Use Many (UProd [Use Many >> Used,Abs,Abs,Abs])}] (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many >> Used},0}}) (App (Var Id{$p1Floating,rh2,ForAllTy TyVar{a} (FunTy >> (TyConApp Floating [TyVarTy TyVar{a}]) (TyConApp Fractional [TyVarTy >> TyVar{a}])),ClassOpId 
,Info{1,SpecInfo [BuiltinRule {ru_name = >> "Class op $p1Floating", ru_fn = $p1Floating, ru_nargs = 2, ru_try = >> }] ,NoUnfolding,NoCafRefs,NoOneShotInfo,InlinePragma >> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >> Nothing, inl_act = AlwaysActive, inl_rule = >> FunLike},NoOccInfo,StrictSig (DmdType [JD {strd = Str (SProd >> [Str HeadStr,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy]), >> absd = Use Many (UProd [Use Many >> Used,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs])}] >> (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many Used},0}}) (Var >> Id{$dFloating,aBM,TyConApp Floating [TyVarTy >> TyVar{a}],VanillaId,Info{0,SpecInfo [] >> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >> Nothing, inl_act = AlwaysActive, inl_rule = >> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >> {strd = Lazy, absd = Use Many Used},0}})))) (Var Id{x1,anU,TyVarTy >> TyVar{a},VanillaId,Info{0,SpecInfo [] >> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >> Nothing, inl_act = AlwaysActive, inl_rule = >> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >> {strd = Lazy, absd = Use Many Used},0}})) (Var Id{x1,anU,TyVarTy >> TyVar{a},VanillaId,Info{0,SpecInfo [] >> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >> Nothing, inl_act = AlwaysActive, inl_rule = >> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >> {strd = Lazy, absd = Use Many Used},0}}) >> >> You can find my pretty printer (and all the other code for the plugin) >> at: https://github.com/mikeizbicki/herbie-haskell/blob/master/src/Herbie.hs#L627 >> >> The function getDictMap >> (https://github.com/mikeizbicki/herbie-haskell/blob/master/src/Herbie.hs#L171) >> is where I'm 
constructing the dictionaries that are getting inserted >> back into the Core. >> >> On Tue, Aug 25, 2015 at 7:17 PM, ?mer Sinan A?acan wrote: >>> It seems like in your App syntax you're having a non-function in function >>> position. You can see this by looking at what failing function >>> (splitFunTy_maybe) is doing: >>> >>> splitFunTy_maybe :: Type -> Maybe (Type, Type) >>> -- ^ Attempts to extract the argument and result types from a type >>> ... (definition is not important) ... >>> >>> Then it's used like this at the error site: >>> >>> (arg_ty, res_ty) = expectJust "cpeBody:collect_args" $ >>> splitFunTy_maybe fun_ty >>> >>> In your case this function is returning Nothing and then exceptJust is >>> signalling the panic. >>> >>> Your code looked correct to me, I don't see any problems with that. Maybe you're >>> using something wrong as selectors. Could you paste CoreExpr representation of >>> your program? >>> >>> It may also be the case that the panic is caused by something else, maybe your >>> syntax is invalidating some assumptions/invariants in GHC but it's not >>> immediately checked etc. Working at the Core level is frustrating at times. >>> >>> Can I ask what kind of plugin are you working on? >>> >>> (Btw, how did you generate this representation of AST? Did you write it >>> manually? If you have a pretty-printer, would you mind sharing it?) >>> >>> 2015-08-25 18:50 GMT-04:00 Mike Izbicki : >>>> Thanks ?mer! >>>> >>>> I'm able to get dictionaries for the superclasses of a class now, but >>>> I get an error whenever I try to get a dictionary for a >>>> super-superclass. Here's the Haskell expression I'm working with: >>>> >>>> test1 :: Floating a => a -> a >>>> test1 x1 = x1+x1 >>>> >>>> The original core is: >>>> >>>> + @ a $dNum_aJu x1 x1 >>>> >>>> But my plugin is replacing it with the core: >>>> >>>> + @ a ($p1Fractional ($p1Floating $dFloating_aJq)) x1 x1 >>>> >>>> The only difference is the way I'm getting the Num dictionary. 
The >>>> corresponding AST (annotated with variable names and types) is: >>>> >>>> App >>>> (App >>>> (App >>>> (App >>>> (Var +::forall a. Num a => a -> a -> a) >>>> (Type a) >>>> ) >>>> (App >>>> (Var $p1Fractional::forall a. Fractional a => Num a) >>>> (App >>>> (Var $p1Floating::forall a. Floating a => Fractional a) >>>> (Var $dFloating_aJq::Floating a) >>>> ) >>>> ) >>>> ) >>>> (Var x1::'a') >>>> ) >>>> (Var x1::'a') >>>> >>>> When I insert, GHC gives the following error: >>>> >>>> ghc: panic! (the 'impossible' happened) >>>> (GHC version 7.10.1 for x86_64-unknown-linux): >>>> expectJust cpeBody:collect_args >>>> >>>> What am I doing wrong with extracting these super-superclass >>>> dictionaries? I've looked up the code for cpeBody in GHC, but I can't >>>> figure out what it's trying to do, so I'm not sure why it's failing on >>>> my core. >>>> >>>> On Mon, Aug 24, 2015 at 7:10 PM, ?mer Sinan A?acan wrote: >>>>> Mike, here's a piece of code that may be helpful to you: >>>>> >>>>> https://github.com/osa1/sc-plugin/blob/master/src/Supercompilation/Show.hs >>>>> >>>>> Copy this module to your plugin, it doesn't have any dependencies other than >>>>> ghc itself. When your plugin is initialized, update `dynFlags_ref` with your >>>>> DynFlags as first thing to do. Then use Show instance to print AST directly. >>>>> >>>>> Horrible hack, but very useful for learning purposes. In fact, I don't know how >>>>> else we can learn what Core is generated for a given code, and reverse-engineer >>>>> to figure out details. >>>>> >>>>> Hope it helps. >>>>> >>>>> 2015-08-24 21:59 GMT-04:00 ?mer Sinan A?acan : >>>>>>> Lets say I'm running the plugin on a function with signature `Floating a => a >>>>>>> -> a`, then the plugin has access to the `Floating` dictionary for the type. >>>>>>> But if I want to add two numbers together, I need the `Num` dictionary. I >>>>>>> know I should have access to `Num` since it's a superclass of `Floating`. 
>>>>>>> How can I get access to these superclass dictionaries? >>>>>> >>>>>> I don't have a working code for this but this should get you started: >>>>>> >>>>>> let ord_dictionary :: Id = ... >>>>>> ord_class :: Class = ... >>>>>> in >>>>>> mkApps (Var (head (classSCSels ord_class))) [Var ord_dictionary] >>>>>> >>>>>> I don't know how to get Class for Ord. I do `head` here because in the case of >>>>>> Ord we only have one superclass so `classSCSels` should have one Id. Then I >>>>>> apply ord_dictionary to this selector and it should return dictionary for Eq. >>>>>> >>>>>> I assumed you already have ord_dictionary, it should be passed to your function >>>>>> already if you had `(Ord a) => ` in your function. >>>>>> >>>>>> >>>>>> Now I realized you asked for getting Num from Floating. I think you should >>>>>> follow a similar path except you need two applications, first to get Fractional >>>>>> from Floating and second to get Num from Fractional: >>>>>> >>>>>> mkApps (Var (head (classSCSels fractional_class))) >>>>>> [mkApps (Var (head (classSCSels floating_class))) >>>>>> [Var floating_dictionary]] >>>>>> >>>>>> Return value should be a Num dictionary. > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From omeragacan at gmail.com Sat Sep 5 04:18:51 2015 From: omeragacan at gmail.com (=?UTF-8?Q?=C3=96mer_Sinan_A=C4=9Facan?=) Date: Sat, 5 Sep 2015 00:18:51 -0400 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: Typo: "You're parsing your code" I mean "You're passing your code" 2015-09-05 0:16 GMT-04:00 ?mer Sinan A?acan : > Hi Mike, > > I'll try to hack an example for you some time tomorrow(I'm returning from ICFP > and have some long flights ahead of me). > > But in the meantime, here's a working Core code, generated by GHC: > > f_rjH :: forall a_alz. 
Ord a_alz => a_alz -> Bool > f_rjH = > \ (@ a_aCH) ($dOrd_aCI :: Ord a_aCH) (eta_B1 :: a_aCH) -> > == @ a_aCH (GHC.Classes.$p1Ord @ a_aCH $dOrd_aCI) eta_B1 eta_B1 > > You can clearly see here how Eq dictionary is selected from Ord > dicitonary($dOrd_aCI in the example), it's just an application of selector to > type and dictionary, that's all. > > This is generated from this code: > > {-# NOINLINE f #-} > f :: Ord a => a -> Bool > f x = x == x > > Compile it with this: > > ghc --make -fforce-recomp -O0 -ddump-simpl -ddump-to-file Main.hs > -dsuppress-idinfo > >> Can anyone help me figure this out? Is there any chance this is a bug in how >> GHC parses Core? > > This seems unlikely, because GHC doesn't have a Core parser and there's no Core > parsing going on here, you're parsing your Code in the form of AST(CoreExpr, > CoreProgram etc. defined in CoreSyn.hs). Did you mean something else and am I > misunderstanding? > > 2015-09-04 19:39 GMT-04:00 Mike Izbicki : >> I'm still having trouble creating Core code that can extract >> superclass dictionaries from a given dictionary. I suspect the >> problem is that I don't actually understand what the Core code to do >> this is supposed to look like. I keep getting the errors mentioned >> above when I try what I think should work. >> >> Can anyone help me figure this out? Is there any chance this is a bug >> in how GHC parses Core? >> >> On Tue, Aug 25, 2015 at 9:24 PM, Mike Izbicki wrote: >>> The purpose of the plugin is to automatically improve the numerical >>> stability of Haskell code. It is supposed to identify numeric >>> expressions, then use Herbie (https://github.com/uwplse/herbie) to >>> generate a numerically stable version, then rewrite the numerically >>> stable version back into the code. The first two steps were really >>> easy. It's the last step of inserting back into the code that I'm >>> having tons of trouble with. 
Core is a lot more complicated than I >>> thought :) >>> >>> I'm not sure what you mean by the CoreExpr representation? Here's the >>> output of the pretty printer you gave: >>> App (App (App (App (Var Id{+,r2T,ForAllTy TyVar{a} (FunTy (TyConApp >>> Num [TyVarTy TyVar{a}]) (FunTy (TyVarTy TyVar{a}) (FunTy (TyVarTy >>> TyVar{a}) (TyVarTy TyVar{a})))),VanillaId,Info{0,SpecInfo [] >>> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>> Nothing, inl_act = AlwaysActive, inl_rule = >>> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >>> {strd = Lazy, absd = Use Many Used},0}}) (Type (TyVarTy TyVar{a}))) >>> (App (Var Id{$p1Fractional,rh3,ForAllTy TyVar{a} (FunTy (TyConApp >>> Fractional [TyVarTy TyVar{a}]) (TyConApp Num [TyVarTy >>> TyVar{a}])),ClassOpId ,Info{1,SpecInfo [BuiltinRule {ru_name = >>> "Class op $p1Fractional", ru_fn = $p1Fractional, ru_nargs = 2, ru_try >>> = }] ,NoUnfolding,NoCafRefs,NoOneShotInfo,InlinePragma >>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>> Nothing, inl_act = AlwaysActive, inl_rule = >>> FunLike},NoOccInfo,StrictSig (DmdType [JD {strd = Str (SProd >>> [Str HeadStr,Lazy,Lazy,Lazy]), absd = Use Many (UProd [Use Many >>> Used,Abs,Abs,Abs])}] (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many >>> Used},0}}) (App (Var Id{$p1Floating,rh2,ForAllTy TyVar{a} (FunTy >>> (TyConApp Floating [TyVarTy TyVar{a}]) (TyConApp Fractional [TyVarTy >>> TyVar{a}])),ClassOpId ,Info{1,SpecInfo [BuiltinRule {ru_name = >>> "Class op $p1Floating", ru_fn = $p1Floating, ru_nargs = 2, ru_try = >>> }] ,NoUnfolding,NoCafRefs,NoOneShotInfo,InlinePragma >>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>> Nothing, inl_act = AlwaysActive, inl_rule = >>> FunLike},NoOccInfo,StrictSig (DmdType [JD {strd = Str (SProd >>> [Str HeadStr,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy]), >>> absd = Use Many (UProd 
[Use Many >>> Used,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs])}] >>> (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many Used},0}}) (Var >>> Id{$dFloating,aBM,TyConApp Floating [TyVarTy >>> TyVar{a}],VanillaId,Info{0,SpecInfo [] >>> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>> Nothing, inl_act = AlwaysActive, inl_rule = >>> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >>> {strd = Lazy, absd = Use Many Used},0}})))) (Var Id{x1,anU,TyVarTy >>> TyVar{a},VanillaId,Info{0,SpecInfo [] >>> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>> Nothing, inl_act = AlwaysActive, inl_rule = >>> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >>> {strd = Lazy, absd = Use Many Used},0}})) (Var Id{x1,anU,TyVarTy >>> TyVar{a},VanillaId,Info{0,SpecInfo [] >>> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>> Nothing, inl_act = AlwaysActive, inl_rule = >>> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >>> {strd = Lazy, absd = Use Many Used},0}}) >>> >>> You can find my pretty printer (and all the other code for the plugin) >>> at: https://github.com/mikeizbicki/herbie-haskell/blob/master/src/Herbie.hs#L627 >>> >>> The function getDictMap >>> (https://github.com/mikeizbicki/herbie-haskell/blob/master/src/Herbie.hs#L171) >>> is where I'm constructing the dictionaries that are getting inserted >>> back into the Core. >>> >>> On Tue, Aug 25, 2015 at 7:17 PM, ?mer Sinan A?acan wrote: >>>> It seems like in your App syntax you're having a non-function in function >>>> position. 
You can see this by looking at what failing function >>>> (splitFunTy_maybe) is doing: >>>> >>>> splitFunTy_maybe :: Type -> Maybe (Type, Type) >>>> -- ^ Attempts to extract the argument and result types from a type >>>> ... (definition is not important) ... >>>> >>>> Then it's used like this at the error site: >>>> >>>> (arg_ty, res_ty) = expectJust "cpeBody:collect_args" $ >>>> splitFunTy_maybe fun_ty >>>> >>>> In your case this function is returning Nothing and then exceptJust is >>>> signalling the panic. >>>> >>>> Your code looked correct to me, I don't see any problems with that. Maybe you're >>>> using something wrong as selectors. Could you paste CoreExpr representation of >>>> your program? >>>> >>>> It may also be the case that the panic is caused by something else, maybe your >>>> syntax is invalidating some assumptions/invariants in GHC but it's not >>>> immediately checked etc. Working at the Core level is frustrating at times. >>>> >>>> Can I ask what kind of plugin are you working on? >>>> >>>> (Btw, how did you generate this representation of AST? Did you write it >>>> manually? If you have a pretty-printer, would you mind sharing it?) >>>> >>>> 2015-08-25 18:50 GMT-04:00 Mike Izbicki : >>>>> Thanks ?mer! >>>>> >>>>> I'm able to get dictionaries for the superclasses of a class now, but >>>>> I get an error whenever I try to get a dictionary for a >>>>> super-superclass. Here's the Haskell expression I'm working with: >>>>> >>>>> test1 :: Floating a => a -> a >>>>> test1 x1 = x1+x1 >>>>> >>>>> The original core is: >>>>> >>>>> + @ a $dNum_aJu x1 x1 >>>>> >>>>> But my plugin is replacing it with the core: >>>>> >>>>> + @ a ($p1Fractional ($p1Floating $dFloating_aJq)) x1 x1 >>>>> >>>>> The only difference is the way I'm getting the Num dictionary. The >>>>> corresponding AST (annotated with variable names and types) is: >>>>> >>>>> App >>>>> (App >>>>> (App >>>>> (App >>>>> (Var +::forall a. 
Num a => a -> a -> a) >>>>> (Type a) >>>>> ) >>>>> (App >>>>> (Var $p1Fractional::forall a. Fractional a => Num a) >>>>> (App >>>>> (Var $p1Floating::forall a. Floating a => Fractional a) >>>>> (Var $dFloating_aJq::Floating a) >>>>> ) >>>>> ) >>>>> ) >>>>> (Var x1::'a') >>>>> ) >>>>> (Var x1::'a') >>>>> >>>>> When I insert, GHC gives the following error: >>>>> >>>>> ghc: panic! (the 'impossible' happened) >>>>> (GHC version 7.10.1 for x86_64-unknown-linux): >>>>> expectJust cpeBody:collect_args >>>>> >>>>> What am I doing wrong with extracting these super-superclass >>>>> dictionaries? I've looked up the code for cpeBody in GHC, but I can't >>>>> figure out what it's trying to do, so I'm not sure why it's failing on >>>>> my core. >>>>> >>>>> On Mon, Aug 24, 2015 at 7:10 PM, ?mer Sinan A?acan wrote: >>>>>> Mike, here's a piece of code that may be helpful to you: >>>>>> >>>>>> https://github.com/osa1/sc-plugin/blob/master/src/Supercompilation/Show.hs >>>>>> >>>>>> Copy this module to your plugin, it doesn't have any dependencies other than >>>>>> ghc itself. When your plugin is initialized, update `dynFlags_ref` with your >>>>>> DynFlags as first thing to do. Then use Show instance to print AST directly. >>>>>> >>>>>> Horrible hack, but very useful for learning purposes. In fact, I don't know how >>>>>> else we can learn what Core is generated for a given code, and reverse-engineer >>>>>> to figure out details. >>>>>> >>>>>> Hope it helps. >>>>>> >>>>>> 2015-08-24 21:59 GMT-04:00 ?mer Sinan A?acan : >>>>>>>> Lets say I'm running the plugin on a function with signature `Floating a => a >>>>>>>> -> a`, then the plugin has access to the `Floating` dictionary for the type. >>>>>>>> But if I want to add two numbers together, I need the `Num` dictionary. I >>>>>>>> know I should have access to `Num` since it's a superclass of `Floating`. >>>>>>>> How can I get access to these superclass dictionaries? 
>>>>>>> >>>>>>> I don't have a working code for this but this should get you started: >>>>>>> >>>>>>> let ord_dictionary :: Id = ... >>>>>>> ord_class :: Class = ... >>>>>>> in >>>>>>> mkApps (Var (head (classSCSels ord_class))) [Var ord_dictionary] >>>>>>> >>>>>>> I don't know how to get Class for Ord. I do `head` here because in the case of >>>>>>> Ord we only have one superclass so `classSCSels` should have one Id. Then I >>>>>>> apply ord_dictionary to this selector and it should return dictionary for Eq. >>>>>>> >>>>>>> I assumed you already have ord_dictionary, it should be passed to your function >>>>>>> already if you had `(Ord a) => ` in your function. >>>>>>> >>>>>>> >>>>>>> Now I realized you asked for getting Num from Floating. I think you should >>>>>>> follow a similar path except you need two applications, first to get Fractional >>>>>>> from Floating and second to get Num from Fractional: >>>>>>> >>>>>>> mkApps (Var (head (classSCSels fractional_class))) >>>>>>> [mkApps (Var (head (classSCSels floating_class))) >>>>>>> [Var floating_dictionary]] >>>>>>> >>>>>>> Return value should be a Num dictionary. >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From ezyang at mit.edu Sat Sep 5 07:06:26 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Sat, 05 Sep 2015 00:06:26 -0700 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> <1441400654-sup-1647@sabre> Message-ID: <1441436053-sup-5590@sabre> Excerpts from Dan Doel's message of 2015-09-04 18:21:29 -0700: > Here are some additional thoughts. > > If we examine an analogue of some of your examples: > > data MutVar a = MV (MutVar# RealWorld a) > > main = do > let mv# = undefined > let mv = MV mv# > putStrLn "Okay." > > The above is illegal. 
Instead we _must_ write: This doesn't typecheck, but for a different reason: undefined :: a where a :: *, so you can't match up the kinds. error is actually extremely special in this case: it lives in OpenKind and matches both * and #. But let's suppose that we s/undefined/error "foo"/... > let !mv# = undefined > > which signals that evaluation is occurring. Also not true. Because error "foo" is inferred to have kind #, the bang pattern happens implicitly. > So it is impossible to > accidentally go from: > > main = do > let mv = MV undefined > putStrLn "Okay." > > which prints "Okay.", to something that throws an exception, without > having a pretty good indication that you're doing so. I would guess > this is desirable, so perhaps it should be mandated for Unlifted as > well. Nope, if you just float the error call out of MV, you will go from "Okay." to an exception. Notice that *data constructors* are what are used to induce suspension. This is why we don't have a 'suspend' special form; instead, 'Box' is used directly. > However, the above point confuses me with respect to another example. > The proposal says that: > > data Id :: * -> Unlifted where > Id :: a -> Id a > > could/should be compiled with no overhead over `a`, like a newtype. > However, if Unlifted things have operational semantics like #, what > does the following do: > > let x :: Id a > !x = Id undefined > > The ! should evaluate to the Id constructor, but we're not > representing it, so it actually doesn't evaluate anything? But: That's correct. Id is a box containing a lifted value. The box is unlifted, but the inner value can be lifted. > let x :: Id a > !x = undefined > > throws an exception? Yes, exactly. > Whereas for newtypes, both throw exceptions with > a !x definition, or don't with an x definition? Also correct. They key thing is to distinguish error in kind * and error in kind #. 
You can make a table:

                     | Id (error "foo")      | error "foo"       |
---------------------+-----------------------+-------------------+
newtype Id :: * -> * | error "foo" :: *      | error "foo" :: *  |
data Id :: * -> #    | Id (error "foo" :: *) | error "foo" :: #  |

> Is it actually > possible to make Id behave this way without any representational > overhead? Yes. The reason is that an error "foo" :: # *immediately fails* (rather than attempt to allocate an Id). So the outer level of error doesn't ever need to be represented on the heap, so we can just represent the inner liftedness. Here's another way of looking at it: error in kind # is not a bottom at all. It's just a way of bailing immediately. HOWEVER... > I'm a little skeptical. I think that only Force (and Box) might be > able to have no representational overhead. It seems like it might be easier to explain if just Force and Box get optimized, and we don't bother with others; I only really care about those two operators being optimized. Edward From andrew.gibiansky at gmail.com Sat Sep 5 08:39:02 2015 From: andrew.gibiansky at gmail.com (Andrew Gibiansky) Date: Sat, 5 Sep 2015 01:39:02 -0700 Subject: Proposal: Argument Do Message-ID: Trac: https://ghc.haskell.org/trac/ghc/ticket/10843 I would like the following to be valid Haskell code: main = when True do putStrLn "Hello!" Instead of requiring a dollar sign before the "do". This would parse as main = when True (do putStrLn "Hello!") Has this been tried before? It seems fairly simple -- is there some complexity I'm missing? I've always been confused as to why the parser requires `$` there, and I've heard a lot of others ask about this as well. Perhaps we could fix that? PS. Regardless of whether this goes anywhere, it was fun to learn how to hack on GHC. It was surprisingly easy; I wrote up my experience here . The GHC wiki is outstanding; pretty much every intro question about ghc development I had was answered on a fairly easy-to-find wiki page.
(Except for some stuff related to generating documentation and docbook, but whatever.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomasmiedema at gmail.com Sat Sep 5 09:32:41 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Sat, 5 Sep 2015 11:32:41 +0200 Subject: Proposal: Argument Do In-Reply-To: References: Message-ID: Hi Andrew, thank you for the write-up. There are some good hints in there for how to make the documentation better. If you had used `BuildFlavour = stage2`, as the Newcomers page suggests, you'd have had some less trouble. I'll go and edit the HowtomakeGHCbuildquickly section, because it is outdated. > From the Newcomers page, it's not quite clear exactly how to make it only build Stage 2, even though it suggests doing so. The newcomers page says: - ## edit build.mk to remove the comment marker # on the line stage=2 - To speed up the development cycle, the final edit of build.mk makes sure that only the stage-2 compiler will be rebuilt after this (see here about stages). Maybe you missed the comment about editing build.mk? Can you make suggestions on how to make this clearer? I added some whitespace, but I'm not sure that's enough. Thanks, Thomas > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomasmiedema at gmail.com Sat Sep 5 09:43:37 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Sat, 5 Sep 2015 11:43:37 +0200 Subject: Proposal: Argument Do In-Reply-To: References: Message-ID: > > If you had used `BuildFlavour = stage2` as the Newcomers page suggests, > you'd have had some less trouble. > That should say `BuildFlavour = devel2`. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From andrew.gibiansky at gmail.com Sat Sep 5 14:14:56 2015 From: andrew.gibiansky at gmail.com (Andrew Gibiansky) Date: Sat, 5 Sep 2015 07:14:56 -0700 Subject: Proposal: Argument Do In-Reply-To: References: Message-ID: Thomas, Thanks for cleaning stuff up on the Newcomers page and others. I think all the things that were somewhat confusing before are now much clearer and less vague. -- Andrew On Sat, Sep 5, 2015 at 2:43 AM, Thomas Miedema wrote: > If you had used `BuildFlavour = stage2` as the Newcomers page suggests, >> you'd have had some less trouble. >> > > That should say `BuildFlavour = devel2`. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dan.doel at gmail.com Sat Sep 5 17:35:44 2015 From: dan.doel at gmail.com (Dan Doel) Date: Sat, 5 Sep 2015 13:35:44 -0400 Subject: Unlifted data types In-Reply-To: <1441436053-sup-5590@sabre> References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> <1441400654-sup-1647@sabre> <1441436053-sup-5590@sabre> Message-ID: On Sat, Sep 5, 2015 at 3:06 AM, Edward Z. Yang wrote: >> If we examine an analogue of some of your examples: >> >> data MutVar a = MV (MutVar# RealWorld a) >> >> main = do >> let mv# = undefined >> let mv = MV mv# >> putStrLn "Okay." >> >> The above is illegal. Instead we _must_ write: > > This doesn't typecheck, but for a different reason: undefined :: a > where a :: *, so you can't match up the kinds. > > error is actually extremely special in this case: it lives in OpenKind > and matches both * and #. But let's suppose that we > s/undefined/error "foo"/... > >> let !mv# = undefined >> >> which signals that evaluation is occurring. > > Also not true. Because error "foo" is inferred to have kind #, the bang > pattern happens implicitly. I tried with `error` first, and it worked exactly the way I described. But I guess it's a type inference weirdness. 
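The lifted-versus-unboxed evaluation contrast being debated here can be reproduced in miniature with ordinary `Int#` (a sketch against a recent GHC, independent of the proposed `Unlifted` kind): a lifted `let` merely suspends its right-hand side in a thunk, while an unboxed binding has nowhere to put a thunk and is evaluated as soon as it is reached.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts

main :: IO ()
main = do
  -- Lifted binding: the bottoming RHS is captured in a thunk and
  -- never demanded, so this line is harmless.
  let mv = undefined :: Int
  putStrLn "Okay."
  -- Unboxed binding: Int# values are not thunked, so the RHS is
  -- evaluated eagerly, right here.
  let i# = 3# +# 4#
  print (I# i#)
```

Compiled and run, this prints "Okay." followed by 7; a bottom on the right-hand side of the unboxed binding would fail at the binding itself, which is the eager behaviour the thread wants `Unlifted` bindings to share.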
If I annotate mv# with MutVar# it will work, whereas otherwise it will be inferred that mv# :: a where a :: *, instead of #. Whereas !x is a pattern which requires monomorphism of x, and so it figures out mv# :: MutVar# .... Kind of an odd corner case where breaking cycles causes things _not_ to type check, due to open kinds not being first class. I thought I remembered that at some point it was decided that `let` bindings of unboxed things should be required to have bangs on the bindings, to indicate the evaluation order. Maybe I'm thinking of something else (was it that it was originally required and we got rid of it?). > Nope, if you just float the error call out of MV, you will go from > "Okay." to an exception. Notice that *data constructors* are what are > used to induce suspension. This is why we don't have a 'suspend' > special form; instead, 'Box' is used directly. I know that it's the floating that makes a difference, not the bang pattern. The point would be to make the syntax require the bang pattern to give a visual indication of when it happens, and make it illegal to look like you're doing a normal let that doesn't change the value (although having it actually be a bang pattern would be bad, because it'd restrict polymorphism of the definition). Also, the constructor isn't exactly relevant, so much as whether the unlifted error occurs inside the definition of a lifted thing. For instance, we can go from: let mv = MutVar undefined to: let mv = let mv# :: MutVar# RealWorld a ; mv# = undefined in MutVar mv# and the result is the same, because it is the definition of mv that is lazy. Constructors in complex expressions---and all subexpressions for that matter---just get compiled this way. E.G. let f :: MutVar# RealWorld a -> MutVar a f mv# = f mv# in flip const (f undefined) $ putStrLn "okay" No constructors involved, but no error. >> Is it actually >> possible to make Id behave this way without any representational >> overhead? > > Yes. 
The reason is that an error "foo" :: # *immediately fails* (rather > than attempt to allocate an Id). So the outer level of error doesn't > ever need to be represented on the heap, so we can just represent the > inner liftedness. Okay. So, there isn't representational overhead, but there is overhead, where you call a function or something (which will just return its argument), whereas newtype constructors end up not having any cost whatsoever? -- Dan From singpolyma at singpolyma.net Sat Sep 5 20:06:53 2015 From: singpolyma at singpolyma.net (Stephen Paul Weber) Date: Sat, 5 Sep 2015 20:06:53 +0000 Subject: more releases In-Reply-To: References: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> Message-ID: <20150905200653.GC7303@singpolyma.net> >having a large number of versions of GHC out there can make it difficult >for library authors, package curators, and large open source projects, due >to variety of what people are using. For point releases, if we do it right, this *should* not happen, since the changes *should* be backwards-compatible and so testing against the oldest release on the current major version *should* mean all subsequent point releases work as well. IMHO, any violation of this assumption *should* be considered a (serious) bug. From hvr at gnu.org Sun Sep 6 14:06:00 2015 From: hvr at gnu.org (Herbert Valerio Riedel) Date: Sun, 06 Sep 2015 16:06:00 +0200 Subject: Arcanist "lite" Haskell reimplementation (was: Proposal: accept pull requests on GitHub) In-Reply-To: (Thomas Miedema's message of "Thu, 3 Sep 2015 11:53:40 +0200") References: Message-ID: <87si6rprqv.fsf@gnu.org> On 2015-09-03 at 11:53:40 +0200, Thomas Miedema wrote: [...] > In my opinion it is a waste of our time trying to improve `arc` (it is > 34000 lines of PHP btw + another 70000 LOC for libphutil), when `pull > requests` are an obvious alternative that most of the Haskell community > already uses. [...]
I went ahead wasting some time and hacked up `arc-lite` for fun: https://github.com/haskell-infra/arc-lite It's currently at 407 Haskell SLOCs according to sloccount(1), and emulates the `arc` CLI as a drop-in replacement. As a proof-of-concept I've implemented the 3 simple operations:
- `arc install-certificate`
- `arc list`
- `arc call-conduit`
If we wasted even more time, this could result in:
- Simplify installation of Arcanist for GHC contributors via Hackage (i.e. just `cabal install arc-lite`)
- Implement a simple `arc diff`-like operation for submitting patches to Phabricator
- Implement convenience operations tailored to GHC development
- Teach arc-lite to behave more Git-idiomatically
- Make `arc-lite` automatically manage multi-commit code-reviews by splitting them up and submitting them as multiple inter-dependent code-revisions
- ...
Any comments? Cheers, hvr
--8<---------------cut here---------------start------------->8---
arc-list - Arcanist "lite" (CLI tool for Phabricator)

Usage: arc-lite [--verbose] [--conduit-token TOKEN] [--conduit-uri URI] COMMAND

Available options:
  -h,--help              Show this help text
  --verbose              Whether to be verbose
  --conduit-token TOKEN  Ignore configured credentials and use an explicit API token instead
  --conduit-uri URI      Ignore configured Conduit URI and use an explicit one instead

Available commands:
  list                   List your open Differential revisions
  call-conduit           Perform raw Conduit method call
  install-certificate    Installs Conduit credentials into your ~/.arcrc for the given install of Phabricator
--8<---------------cut here---------------end--------------->8---
From dan.doel at gmail.com Sun Sep 6 20:56:35 2015 From: dan.doel at gmail.com (Dan Doel) Date: Sun, 6 Sep 2015 16:56:35 -0400 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> <1441400654-sup-1647@sabre> <1441436053-sup-5590@sabre> Message-ID: On Sat, Sep 5, 2015 at 1:35 PM, Dan Doel wrote: > Also, the constructor isn't exactly
relevant, so much as whether the > unlifted error occurs inside the definition of a lifted thing. So, in light of this, `Box` is not necessary to define `suspend`. We can simply write: suspend :: Force a -> a suspend (Force x) = x and the fact that `a` has kind * means that `suspend undefined` only throws an exception if you inspect it. `Box` as currently defined (not the previous GADT definition) is novel in that it allows you to suspend unlifted types that weren't derived from `Force`. And it would probably be useful to have coercions between `Box (Force a)` and `a`, and `Force (Box u)` and `u`. But (I think) it is not necessary for mediating between `Force a` and `a`. -- Dan From simonpj at microsoft.com Mon Sep 7 08:17:12 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 08:17:12 +0000 Subject: Thanks to Reid and Thomas Message-ID: <4ae0e9fa716745f8b741e0a877ff6611@DB4PR30MB030.064d.mgd.msft.net> Thomas, Reid, As I get back from ICFP, I'd like to take the opportunity to thank you for the huge amount of work that you two personally have put into GHC recently. Your interventions are always thoughtful, supportive, and on target. GHC is a huge project, and lots of people contribute to it. I am truly grateful to all of them. But you two have been particularly active in the last year and I wanted to say thank you. Onward and upward, Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From william.knop.nospam at gmail.com Mon Sep 7 09:31:19 2015 From: william.knop.nospam at gmail.com (William Knop) Date: Mon, 7 Sep 2015 05:31:19 -0400 Subject: Thanks to Reid and Thomas In-Reply-To: <4ae0e9fa716745f8b741e0a877ff6611@DB4PR30MB030.064d.mgd.msft.net> References: <4ae0e9fa716745f8b741e0a877ff6611@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <0699E066-8024-43E3-8451-D0990B11EA4C@gmail.com>
Those who are dedicated to getting things done on a day to day basis-- you have done a great service for us all and I can't properly express my appreciation. Making GHC sensible to the the rest of us is so important. Those who presented have enlightened and excited. I especially look forward to the confluence of automated static complexity analysis and super compilation, as well as the ideas surrounding "levity" in dependent type theory. I idly wonder about how the ideas from homotopy type theory WRT cubical sets might fit in. Truly interesting stuff. Cheers and thank you for your hard work, Will > On Sep 7, 2015, at 4:17 AM, Simon Peyton Jones wrote: > > Thomas, Reid, > > As I get back from ICFP, I?d like to take the opportunity to thank you for huge amount of work that you two personally have put into GHC recently. Your interventions are always thoughtful, supportive, and on target. > > GHC is a huge project, and lots of people contribute to it. I am truly grateful to all of them. But you two have been particularly active in the last year and I wanted to say thank you. > > Onward and upward, > > Simon > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail at joachim-breitner.de Mon Sep 7 11:02:02 2015 From: mail at joachim-breitner.de (Joachim Breitner) Date: Mon, 07 Sep 2015 13:02:02 +0200 Subject: RFC: Unpacking sum types In-Reply-To: References: Message-ID: <1441623722.1570.25.camel@joachim-breitner.de> Hi, Am Dienstag, den 01.09.2015, 10:23 -0700 schrieb Johan Tibell: > I have a draft design for unpacking sum types that I'd like some > feedback on. In particular feedback both on: > > * the writing and clarity of the proposal and > * the proposal itself. 
> > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes The current proposed layout for a data D a = D a {-# UNPACK #-} !(Maybe a) would be [D's pointer] [a] [tag (0 or 1)] [Just's a] So the representation of D foo (Just bar) is [D_info] [&foo] [1] [&bar] and of D foo Nothing is [D_info] [&foo] [0] [&dummy] where dummy is something that makes the GC happy. But assuming this dummy object is something that is never a valid heap object of its own, then this should be sufficient to distinguish the two cases, and we could actually have that the representation of D foo (Just bar) is [D_info] [&foo] [&bar] and of D foo Nothing is [D_info] [&foo] [&dummy] and a case analysis on D would compare the pointer in the third word with the well-known address of dummy to determine if we have Nothing or Just. This saves one word. If we generate a number of such static dummy objects, we can generalize this tag-field avoiding trick to other data types than Maybe. It seems that it is worth doing that if * the number of constructors is no more than the number of static dummy objects, and * there is one constructor which has more pointer fields than all other constructors. Also, this trick cannot be applied repeatedly: If we have data D = D {-# UNPACK #-} !(Maybe a) | D'Nothing data E = E {-# UNPACK #-} !(D a) then it cannot be applied when unpacking D into E. (Or maybe it can, but care has to be taken that D's Nothing is represented by a different dummy object than Maybe's Nothing.) Anyways, this is an optimization that can be implemented once unboxed sum types are finished and working reliably. Greetings, Joachim -- Joachim "nomeata" Breitner mail at joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From simonpj at microsoft.com Mon Sep 7 11:56:07 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 11:56:07 +0000 Subject: Thanks to Reid and Thomas In-Reply-To: <4ae0e9fa716745f8b741e0a877ff6611@DB4PR30MB030.064d.mgd.msft.net> References: <4ae0e9fa716745f8b741e0a877ff6611@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <729f3b24078b4732b2a03e521755560f@DB4PR30MB030.064d.mgd.msft.net> PS: auto-complete failed me. I meant Reid Barton, not Reinhard Wilhelm, of course :-). Sorry Reid. Simon From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Simon Peyton Jones Sent: 07 September 2015 09:17 To: Thomas Miedema; Reinhard Wilhelm Cc: ghc-devs at haskell.org Subject: Thanks to Reid and Thomas Thomas, Reid, As I get back from ICFP, I'd like to take the opportunity to thank you for the huge amount of work that you two personally have put into GHC recently. Your interventions are always thoughtful, supportive, and on target. GHC is a huge project, and lots of people contribute to it. I am truly grateful to all of them. But you two have been particularly active in the last year and I wanted to say thank you. Onward and upward, Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomasmiedema at gmail.com Mon Sep 7 12:33:54 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Mon, 7 Sep 2015 14:33:54 +0200 Subject: Thanks to Reid and Thomas In-Reply-To: <4ae0e9fa716745f8b741e0a877ff6611@DB4PR30MB030.064d.mgd.msft.net> References: <4ae0e9fa716745f8b741e0a877ff6611@DB4PR30MB030.064d.mgd.msft.net> Message-ID: On Mon, Sep 7, 2015 at 10:17 AM, Simon Peyton Jones wrote: > Thomas, Reid, > > As I get back from ICFP, I'd like to take the opportunity to thank you for > the huge amount of work that you two personally have put into GHC recently.
> Your interventions are always thoughtful, supportive, and on target. > > > GHC is a huge project, and lots of people contribute to it. I am truly > grateful to all of them. But you two have been particularly active in the > last year and I wanted to say thank you. > Thank you for the kind words. -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Mon Sep 7 13:47:01 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 13:47:01 +0000 Subject: Proposal: accept pull requests on GitHub In-Reply-To: References: <55E7453A.90309@gmail.com> <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> <55E76572.3050405@nh2.me> Message-ID: <1469c7be53ed4f0dab3872de9fe5ad54@DB4PR30MB030.064d.mgd.msft.net> I am very much at the ignorant end of this debate: I'll just use whatever I'm told to use. But I do resonate with this observation from Austin: | For one, having two code review tools of any form is completely | bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to | switch, we should just switch. Having to have people decide how to | contribute with two tools is as crazy as having two VCSs and just a | way of asking people to get *more* confused, and have us answer more | questions. That's something we need to avoid. As a code contributor and reviewer, this is awkward. As a contributor, how do I choose? As a reviewer I'm presumably forced to learn both tools. But I'll go with the flow... I do not have a well-informed opinion about the tradeoffs. (I'm tempted naively to ask: is there an automated way to go from a GitHub PR to a Phab ticket? Then we could convert the former (if someone wants to submit that way) into the latter.) 
Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of | Austin Seipp | Sent: 03 September 2015 05:42 | To: Niklas Hambüchen | Cc: Simon Marlow; ghc-devs at haskell.org | Subject: Re: Proposal: accept pull requests on GitHub | | (JFYI: I hate to announce my return with a giant novel of negative- | nancy-ness about a proposal that just came up. I'm sorry about this!) | | TL;DR: I'm strongly -1 on this, because I think it introduces a lot of | associated costs for everyone, the benefits aren't really clear, and I | think it obscures the real core issue about "how do we get more | contributors" and how to make that happen. Needless to say, GitHub | does not magically solve both of these AFAICS. | | As is probably already widely known, I'm fairly against GitHub because | I think at best its tools are mediocre and inappropriate for GHC - but | I also don't think this proposal or the alternatives stemming from it | are very good, and that it reduces visibility of the real, core | complaints about what is wrong. Some of those problems may be with | Phabricator, but it's hard to sort the wheat from the chaff, so to | speak. | | For one, having two code review tools of any form is completely | bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to | switch, we should just switch. Having to have people decide how to | contribute with two tools is as crazy as having two VCSs and just a | way of asking people to get *more* confused, and have us answer more | questions. That's something we need to avoid. | | For the same reason, I'm also not a fan of 'use third party thing to | augment other thing to remove its deficiencies making it OK', because | the problem is _it adds surface area_ and other problems in other | cases. It is a solution that should be considered a last resort, | because it is a logical solution that applies to everything.
If we | have a bot that moves GH PRs into Phab and then review them there, the | surface area of what we have to maintain and explain has suddenly | exploded: because now instead of 1 thing we have 3 things (GH, Phab, | bot) and the 3 interactions between them, for a multiplier of *six* | things we have to deal with. And then we use reviewable.io, because GH | reviews are terrible, adding a 4th mechanism? It's Rube Goldberg-ian. | We can logically 'automate' everything in all ways to make all | contributors happy, but there's a real *cognitive* overhead to this | and humans don't scale as well as computers do. It is not truly | 'automated away' if the cognitive burden is still there. | | I also find it extremely strange to tell people "By the way, this | method in which you've contributed, as was requested by community | members, is actually a complete proxy for the real method of | contributing, you can find all your imported code here". How is this | supposed to make contribution *easier* as opposed to just more | confusing? Now you've got the impression you're using "the real thing" | when in reality it's shoved off somewhere else to have the nitpicking | done. Just using Phabricator would be less complicated, IMO, and much | more direct. | | The same thing goes for reviewable.io. Adding it as a layer over | GitHub just makes the surface area larger, and puts less under our | control. And is it going to exist in the same form in 2 or 3 years? | Will it continue to offer the same tools, the same workflows that we | "like", and what happens when we hit a wall? It's easy to say | "probably" or "sure" to all this, until we hit something we dislike | and have no possibility of fixing. | | And once you do all this, BTW, you can 'never go back'. It seems so | easy to just say 'submit pull requests' once and nothing else, right? |
Once you commit to that infrastructure, it is *there* and | simply taking it out from under the feet of those using it is not only | unfortunate, it is *a huge timesink to undo it all*. Which amounts to | it never happening. Oh, but you can import everything elsewhere! The | problem is you *can't* import everything, but more importantly you | can't *import my memories in another way*, so it's a huge blow to | contributors to ask them about these mental time sinks, then to forget | them all. And as your project grows, this becomes more of a memory as | you made a first and last choice to begin with. | | Phabricator was 'lucky' here because it had the gateway into being the | first review tool for us. But that wasn't because it was *better* than | GitHub. It was because we were already using it, and it did not | interact badly with our other tools or force us to compromise things - | so the *cost* was low. The cost is immeasurably higher by default | against GitHub because of this, at least to me. That's just how it is | sometimes. | | Keep in mind there is a cost to everything and how you fix it. GitHub | is not a simple patch to add a GHC feature. It is a question that | fundamentally concerns itself with the future of the project for a | long time. The costs must be analyzed more aggressively. Again, | Phabricator had 'first child' preferential treatment. That's not | something we can undo now. | | I know this sounds like a lot of ad hoc mumbo jumbo, but please bear | with me: we need to identify the *root issue* here to fix it. | Otherwise we will pay for the costs of an improper fix for a long | time, and we are going to keep having this conversation over, and over | again. And we need to weigh in the cost of fixing it, which is why I | mention that so much. | | So with all this in mind, you're back to just using GitHub. But again | GitHub is quite mediocre at best. So what is the point of all this? 
| It's hinted at here: | | > the number of contributions will go up, commits will be smaller, and | there will be more of them per pull request (contributors will be able | to put style changes and refactorings into separate commits, without | jumping through a bunch of hoops). | | The real hint is that "the number of contributions will go up". That's | a noble goal and I think it's at the heart of this proposal. | | Here's the meat of the question: what is the cost of achieving this | goal? That is, what amount of work is sufficient to make this goal | realizable, and finally - why is GitHub *the best use of our time for | achieving this?* That's one aspect of the cost - that it's the best | use of the time. I feel like this is fundamentally why I always seem | to never 'get' this argument, and I'm sure it's very frustrating on | behalf of the people who have talked to me about it and like GitHub. | But I feel like I've never gotten a straight answer for GHC. | | If the goal is actually "make more people contribute", that's pretty | broad. I can make that very easy: give everyone who ever submits a | patch push access. This is a legitimate way to run large projects that | has worked. People will almost certainly be more willing to commit, | especially when overhead on patch submission is reduced so much. Why | not just do that instead? It's not like we even mandate code review, | although we could. You could reasonably trust CI to catch and revert | things a lot of the time for people who commit directly to master. We | all do it sometimes. | | I'm being serious about this. I can start doing that tomorrow because | the *cost is low*, both now and reasonably speaking into some | foreseeable future. It is one of many solutions to the raw heart of the | proposal.
GitHub is not a low cost move, but also, it is a *long term | cost* because of the technical deficiencies it won't aim to address | (merge commits are ugly, branch reviews are weak, ticket/PR namespace | overlaps with Trac, etc etc) or that we'll have to work around. | | That means that if we want GitHub to fix the "give us more | contributors" problem, and it has a high cost, it not only has _to fix | the problem_, it also has to do that well enough to offset its cost. I | don't think it's clear that is the case right now, among a lot of | other solutions. | | I don't think the root issue is "We _need_ GitHub to get more | contributors". It sounds like the complaint is more "I don't like how | Phabricator works right now". That's an important distinction, because | the latter is not only more specific, it's more actionable: | | - Things like Arcanist can be tracked as a Git submodule. There is | little to no pain in this, it's low cost, and it can always be | synchronized with Phabricator. This eliminates the "Must clone | arcanist" and "need to upgrade arcanist" points. | | - Similarly when Phabricator sometimes kills a lot of builds, it's | because I do an upgrade. That's mostly an error on my part and I can | simply schedule upgrades regularly, barring hotfixes or somesuch. That | should basically eliminate these. The other build issues are from | picking the wrong base commit from the revision, I think, which I | believe should be fixable upstream (I need to get a solid example of | one that isn't a mega ultra patch.) | | - If Harbormaster is not building dependent patches as mentioned in | WhyNotPhabricator, that is a bug, and I have not been aware of it. | Please make me aware of it so I can file bugs! I seriously don't look | at _every_ patch, I need to know this. That could have probably been | fixed ASAP otherwise. | | - We can get rid of the awkwardness of squashes etc by using | Phabricator's "immutable" history, although it introduces merge | commits. 
Whether this is acceptable is up to debate (I dislike merge | commits, but could live with it). | | - I do not understand point #3, about answering questions. Here's | the reality: every single one of those cases is *almost always an | error*. That's not a joke. Forgetting to commit a file, amending | changes in the working tree, and specifying a reviewer are all total | errors as it stands today. Why is this a minus? It catches a useful | class of 'interaction bugs'. If it's because sometimes Phabricator | yells about build arifacts in the tree, those should be .gitignore'd. | If it's because you have to 'git stash' sometimes, this is fairly | trivial IMO. Finally, specifying reviewers IS inconvenient, but | currently needed. We could easily assign a '#reviewers' tag that would | add default reviewers. | - In the future, Phabricator will hopefully be able to | automatically assign the right reviewers to every single incoming | patch, based on the source file paths in the tree, using the Owners | tool. Technically, we could do that today if we wanted, it's just a | little more effort to add more Herald rules. This will be far, far | more robust than anything GitHub can offer, and eliminates point #3. | | - Styling, linting etc errors being included, because reviews are | hard to create: This is tangential IMO. We need to just bite the | bullet on this and settle on some lint and coding styles, and apply | them to the tree uniformly. The reality is *nobody ever does style | changes on their own*, and they are always accompanied by a diff, and | they always have to redo the work of pulling them out, Phab or not. | Literally 99% of the time we ask for this, it happens this way. | Perhaps instead we should just eliminate this class of work by just | running linters over all of the source code at once, and being happy | with it. | | Doing this in fact has other benefits: like `arc lint` will always | _correctly_ report when linting errors are violated. 
And we can reject | patches that violate them, because they will always be accurate. | | - As for some of the quotes, some of them are funny, but the real | message lies in the context. :) In particular, there have been several | cases (such as the DWARF work) where the idea was "write 30 commits | and put them on Phabricator". News flash: *this is bad*, no matter | whether you're using Phabricator or not, because it makes reviewing | the whole thing immensely difficult from a reviewer perspective. The | point here is that we can clear this up by being more communicative | about what we expect of authors of large patches, and communicating | your intent ASAP so we can get patches in as fast as possible. Writing | a patch is the easiest part of the work. | | And more: | | - Clean up the documentation, it's a mess. It feels nice that | everything has clear, lucid explanations on the wiki, but the wiki is | ridiculously massive and we have a tendency for 'link creep' where we | spread things out. The contributors docs could probably stand to be | streamlined. We would have to do this anyway, moving to GitHub or not. | | - Improve the homepage, directly linking to this aforementioned | page. | | - Make it clear what we expect of contributors. I feel like a lot of | this could be explained by having a 5-minute drive-by guide for | patches, and then a longer 10-minute guide about A) How to style | things, B) How to format your patches if you're going to contribute | regularly, C) Why it is this way, and D) finally links to all the | other things you need to know. People going into Phabricator expecting | it to behave like GitHub is a problem (more a cultural problem IMO but | that's another story), and if this can't be directly fixed, the best | thing to do is make it clear why it isn't. | | Those are just some of the things OTTOMH, but this email is already | way too long.
This is what I mean though: fixing most of these is | going to have *seriously smaller cost* than moving to GitHub. It does | not account for "The GitHub factor" of people contributing "just | because it's on GitHub", but again, that value has to outweigh the | other costs. I'm not seriously convinced it does. | | I know it's work to fix these things. But GitHub doesn't really | magically make a lot of our needs go away, and it's not going to | magically fix things like style or lint errors, the fact Travis-CI is | still pretty insufficient for us in the long term (and Harbormaster is | faster, on our own hardware, too), or that it will cause needlessly | higher amounts of spam through Trac and GitHub itself. I don't think | settling on it as - what seems to be - a first resort is a really | good idea. | | | On Wed, Sep 2, 2015 at 4:09 PM, Niklas Hambüchen wrote: | > On 02/09/15 22:42, Kosyrev Serge wrote: | >> As a wild idea -- did anyone look at /Gitlab/ instead? | > | > Hi, yes. It does not currently have a sufficient review | functionality | > (cannot handle multiple revisions easily). | > | > On 02/09/15 20:51, Simon Marlow wrote: | >> It might feel better | >> for the author, but discovering what changed between two branches | of | >> multiple commits on github is almost impossible. | > | > I disagree with the first part of this: When the UI of the review | tool | > is good, it is easy to follow. But there's no open-source | > implementation of that around. | > | > I agree that it is not easy to follow on Github.
| > _______________________________________________ | > ghc-devs mailing list | > ghc-devs at haskell.org | > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs | > | | | | -- | Regards, | | Austin Seipp, Haskell Consultant | Well-Typed LLP, http://www.well-typed.com/ | _______________________________________________ | ghc-devs mailing list | ghc-devs at haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From simonpj at microsoft.com Mon Sep 7 13:59:54 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 13:59:54 +0000 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> Message-ID: <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> It was fun to meet and discuss this. Did someone volunteer to write a wiki page that describes the proposed design? And, I earnestly hope, also describes the menagerie of currently available array types and primops so that users can have some chance of picking the right one?! Thanks Simon From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Ryan Newton Sent: 31 August 2015 23:11 To: Edward Kmett; Johan Tibell Cc: Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates Subject: Re: ArrayArrays Dear Edward, Ryan Yates, and other interested parties -- So when should we meet up about this? May I propose the Tues afternoon break for everyone at ICFP who is interested in this topic? We can meet out in the coffee area and congregate around Edward Kmett, who is tall and should be easy to find ;-). I think Ryan is going to show us how to use his new primops for combined array + other fields in one heap object? On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett > wrote: Without a custom primitive it doesn't help much there, you have to store the indirection to the mask. 
With a custom primitive it should cut the on-heap root-to-leaf path of everything in the HAMT in half. A shorter HashMap was actually one of the motivating factors for me doing this. It is rather astoundingly difficult to beat the performance of HashMap, so I had to start cheating pretty badly. ;) -Edward On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell > wrote: I'd also be interested to chat at ICFP to see if I can use this for my HAMT implementation. On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett > wrote: Sounds good to me. Right now I'm just hacking up composable accessors for "typed slots" in a fairly lens-like fashion, and treating the set of slots I define and the 'new' function I build for the data type as its API, and build atop that. This could eventually graduate to template-haskell, but I'm not entirely satisfied with the solution I have. I currently distinguish between what I'm calling "slots" (things that point directly to another SmallMutableArrayArray# sans wrapper) and "fields" which point directly to the usual Haskell data types because unifying the two notions meant that I couldn't lift some coercions out "far enough" to make them vanish. I'll be happy to run through my current working set of issues in person and -- as things get nailed down further -- in a longer lived medium than in personal conversations. ;) -Edward On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton > wrote: I'd also love to meet up at ICFP and discuss this. I think the array primops plus a TH layer that lets us (ab)use them many times without too much marginal cost sounds great. And I'd like to learn how we could be either early users of, or help with, this infrastructure. CC'ing in Ryan Scott and Omer Agacan who may also be interested in dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student who is currently working on concurrent data structures in Haskell, but will not be at ICFP. On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates > wrote: I completely agree.
I would love to spend some time during ICFP and friends talking about what it could look like. My small array for STM changes for the RTS can be seen here [1]. It is on a branch somewhere between 7.8 and 7.10 and includes irrelevant STM bits and some confusing naming choices (sorry), but should cover all the details needed to implement it for a non-STM context. The biggest surprise for me was following small array too closely and having a word/byte offset mismatch [2]. [1]: https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 Ryan On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett > wrote: > I'd love to have that last 10%, but it's a lot of work to get there and more > importantly I don't know quite what it should look like. > > On the other hand, I do have a pretty good idea of how the primitives above > could be banged out and tested in a long evening, well in time for 7.12. And > as noted earlier, those remain useful even if a nicer typed version with an > extra level of indirection to the sizes is built up after. > > The rest sounds like a good graduate student project for someone who has > graduate students lying around. Maybe somebody at Indiana University who has > an interest in type theory and parallelism can find us one. =) > > -Edward > > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates > wrote: >> >> I think from my perspective, the motivation for getting the type >> checker involved is primarily bringing this to the level where users >> could be expected to build these structures. It is reasonable to >> think that there are people who want to use STM (a context with >> mutation already) to implement a straightforward data structure that >> avoids the extra indirection penalty. There should be some places where >> knowing that things are field accesses rather than array indexing >> could be helpful, but I think GHC is good right now about handling >> constant offsets.
In my code I don't do any bounds checking as I know >> I will only be accessing my arrays with constant indexes. I make >> wrappers for each field access and leave all the unsafe stuff in >> there. When things go wrong though, the compiler is no help. Maybe >> template Haskell that generates the appropriate wrappers is the right >> direction to go. >> There is another benefit for me when working with these as arrays in >> that it is quite simple and direct (given the hoops already jumped >> through) to play with alignment. I can ensure two pointers are never >> on the same cache-line by just spacing things out in the array. >> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett > wrote: >> > They just segfault at this level. ;) >> > >> > Sent from my iPhone >> > >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton > wrote: >> > >> > You presumably also save a bounds check on reads by hard-coding the >> > sizes? >> > >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett > wrote: >> >> >> >> Also there are 4 different "things" here, basically depending on two >> >> independent questions: >> >> >> >> a.) if you want to shove the sizes into the info table, and >> >> b.) if you want cardmarking. >> >> >> >> Versions with/without cardmarking for different sizes can be done >> >> pretty >> >> easily, but as noted, the infotable variants are pretty invasive. >> >> >> >> -Edward >> >> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett > wrote: >> >>> >> >>> Well, on the plus side you'd save 16 bytes per object, which adds up >> >>> if >> >>> they were small enough and there are enough of them. You get a bit >> >>> better >> >>> locality of reference in terms of what fits in the first cache line of >> >>> them. >> >>> >> >>> -Edward >> >>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton > >> >>> wrote: >> >>>> >> >>>> Yes. 
And for the short term I can imagine places we will settle with >> >>>> arrays even if it means tracking lengths unnecessarily and >> >>>> unsafeCoercing >> >>>> pointers whose types don't actually match their siblings. >> >>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed sized >> >>>> array >> >>>> objects *other* than using them to fake structs? (Much to >> >>>> derecommend, as >> >>>> you mentioned!) >> >>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett > >> >>>> wrote: >> >>>>> >> >>>>> I think both are useful, but the one you suggest requires a lot more >> >>>>> plumbing and doesn't subsume all of the usecases of the other. >> >>>>> >> >>>>> -Edward >> >>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton > >> >>>>> wrote: >> >>>>>> >> >>>>>> So that primitive is an array like thing (Same pointed type, >> >>>>>> unbounded >> >>>>>> length) with extra payload. >> >>>>>> >> >>>>>> I can see how we can do without structs if we have arrays, >> >>>>>> especially >> >>>>>> with the extra payload at front. But wouldn't the general solution >> >>>>>> for >> >>>>>> structs be one that that allows new user data type defs for # >> >>>>>> types? >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett > >> >>>>>> wrote: >> >>>>>>> >> >>>>>>> Some form of MutableStruct# with a known number of words and a >> >>>>>>> known >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting >> >>>>>>> above, but >> >>>>>>> where the word counts were stored in the objects themselves. >> >>>>>>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd >> >>>>>>> likely >> >>>>>>> want to be something we build in addition to MutVar# rather than a >> >>>>>>> replacement. >> >>>>>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build info >> >>>>>>> tables that knew them, and typechecker support, for instance, it'd >> >>>>>>> get >> >>>>>>> rather invasive. 
>> >>>>>>> >> >>>>>>> Also, a number of things that we can do with the 'sized' versions >> >>>>>>> above, like working with evil unsized c-style arrays directly >> >>>>>>> inline at the >> >>>>>>> end of the structure cease to be possible, so it isn't even a pure >> >>>>>>> win if we >> >>>>>>> did the engineering effort. >> >>>>>>> >> >>>>>>> I think 90% of the needs I have are covered just by adding the one >> >>>>>>> primitive. The last 10% gets pretty invasive. >> >>>>>>> >> >>>>>>> -Edward >> >>>>>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton > >> >>>>>>> wrote: >> >>>>>>>> >> >>>>>>>> I like the possibility of a general solution for mutable structs >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's hard. >> >>>>>>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of object >> >>>>>>>> identity problems. But what about directly supporting an >> >>>>>>>> extensible set of >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) >> >>>>>>>> MutVar#? That >> >>>>>>>> may be too much work, but is it problematic otherwise? >> >>>>>>>> >> >>>>>>>> Needless to say, this is also critical if we ever want best in >> >>>>>>>> class >> >>>>>>>> lockfree mutable structures, just like their Stm and sequential >> >>>>>>>> counterparts. >> >>>>>>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >> >>>>>>>> > wrote: >> >>>>>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a short >> >>>>>>>>> article. >> >>>>>>>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and >> >>>>>>>>> maybe >> >>>>>>>>> make a ticket for it. 
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Thanks >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Simon >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >> >>>>>>>>> Sent: 27 August 2015 16:54 >> >>>>>>>>> To: Simon Peyton Jones >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >> >>>>>>>>> Subject: Re: ArrayArrays >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> While those live in #, they are garbage collected objects, so >> >>>>>>>>> this >> >>>>>>>>> all lives on the heap. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when it has >> >>>>>>>>> to >> >>>>>>>>> deal with nested arrays. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better thing. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> The Problem >> >>>>>>>>> >> >>>>>>>>> ----------------- >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Consider the scenario where you write a classic doubly-linked >> >>>>>>>>> list >> >>>>>>>>> in Haskell. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers >> >>>>>>>>> on >> >>>>>>>>> the heap. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> >> >>>>>>>>> Maybe >> >>>>>>>>> DLL ~> DLL >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> That is 3 levels of indirection.
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with >> >>>>>>>>> -funbox-strict-fields or UNPACK >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and >> >>>>>>>>> worsening our representation. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> but now we're still stuck with a level of indirection >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> This means that every operation we perform on this structure >> >>>>>>>>> will >> >>>>>>>>> be about half of the speed of an implementation in most other >> >>>>>>>>> languages >> >>>>>>>>> assuming we're memory bound on loading things into cache! >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Making Progress >> >>>>>>>>> >> >>>>>>>>> ---------------------- >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I have been working on a number of data structures where the >> >>>>>>>>> indirection of going from something in * out to an object in # >> >>>>>>>>> which >> >>>>>>>>> contains the real pointer to my target and coming back >> >>>>>>>>> effectively doubles >> >>>>>>>>> my runtime. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the >> >>>>>>>>> MutVar# >> >>>>>>>>> onto the mutable list when we dirty it. There is a well defined >> >>>>>>>>> write-barrier. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I could change out the representation to use >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I can just store two pointers in the MutableArray# every time, >> >>>>>>>>> but >> >>>>>>>>> this doesn't help _much_ directly. 
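The indirection counts being discussed are easy to make concrete in ordinary boxed Haskell. The sketch below writes out the `Nil`-constructor variant of the list; the helper names `newNode` and `next` are illustrative choices of mine, not taken from any library mentioned in the thread:

```haskell
import Data.IORef

-- The 'Nil'-constructor variant from the message above: each node
-- carries prev/next IORefs, so following a link still costs two heap
-- hops (DLL ~> MutVar# RealWorld DLL ~> DLL).
data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil

-- Allocate a singleton node whose prev/next links tie back to itself,
-- so the ends of the list can be detected by self-reference.
newNode :: IO DLL
newNode = do
  p <- newIORef Nil
  n <- newIORef Nil
  let node = DLL p n
  writeIORef p node
  writeIORef n node
  return node

-- One step forward: a readIORef, i.e. one MutVar# dereference.
next :: DLL -> IO DLL
next Nil       = return Nil
next (DLL _ n) = readIORef n
```

Every `next` here is the pointer chase the rest of the thread is trying to shorten by moving the node itself into an unlifted array object.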
It has reduced the amount of >> >>>>>>>>> distinct >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per >> >>>>>>>>> object to 2. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to the >> >>>>>>>>> array >> >>>>>>>>> object and then chase it to the next DLL and chase that to the >> >>>>>>>>> next array. I >> >>>>>>>>> do get my two pointers together in memory though. I'm paying for >> >>>>>>>>> a card >> >>>>>>>>> marking table as well, which I don't particularly need with just >> >>>>>>>>> two >> >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" >> >>>>>>>>> machinery added >> >>>>>>>>> back in 7.10, which is just the old array code as a new data >> >>>>>>>>> type, which can >> >>>>>>>>> speed things up a bit when you don't have very big arrays: >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> But what if I wanted my object itself to live in # and have two >> >>>>>>>>> mutable fields and be able to share the same write barrier? >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types. >> >>>>>>>>> What >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with the >> >>>>>>>>> impedance >> >>>>>>>>> mismatch between the imperative world and Haskell, and then just >> >>>>>>>>> let the >> >>>>>>>>> ArrayArray#'s hold other arrayarrays. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a >> >>>>>>>>> special >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even >> >>>>>>>>> abuse pattern >> >>>>>>>>> synonyms. Alternately I can exploit the internals further to >> >>>>>>>>> make this >> >>>>>>>>> cheaper.
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and >> >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding >> >>>>>>>>> and next >> >>>>>>>>> entry in the linked list. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a >> >>>>>>>>> strict world, and everything there lives in #. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> next :: DLL -> IO DLL >> >>>>>>>>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of >> >>>>>>>>> >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty >> >>>>>>>>> easily when they >> >>>>>>>>> are known strict and you chain operations of this sort! >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Cleaning it Up >> >>>>>>>>> >> >>>>>>>>> ------------------ >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Now I have one outermost indirection pointing to an array that >> >>>>>>>>> points directly to other arrays. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I'm stuck paying for a card marking table per object, but I can >> >>>>>>>>> fix >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and using a >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a >> >>>>>>>>> mixture of >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data structure. >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the >> >>>>>>>>> existing >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one of the >> >>>>>>>>> arguments it >> >>>>>>>>> takes. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> This is almost ideal, but not quite. I often have fields that >> >>>>>>>>> would >> >>>>>>>>> be best left unboxed. 
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can currently >> >>>>>>>>> at >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at a >> >>>>>>>>> boxed or at a >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the int in >> >>>>>>>>> question in >> >>>>>>>>> there. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to >> >>>>>>>>> store masks and administrivia as I walk down the tree. Having to >> >>>>>>>>> go off to >> >>>>>>>>> the side costs me the entire win from avoiding the first pointer >> >>>>>>>>> chase. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could >> >>>>>>>>> construct that had n words with unsafe access and m pointers to >> >>>>>>>>> other heap >> >>>>>>>>> objects, one that could put itself on the mutable list when any >> >>>>>>>>> of those >> >>>>>>>>> pointers changed then I could shed this last factor of two in >> >>>>>>>>> all >> >>>>>>>>> circumstances. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Prototype >> >>>>>>>>> >> >>>>>>>>> ------------- >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Over the last few days I've put together a small prototype >> >>>>>>>>> implementation with a few non-trivial imperative data structures >> >>>>>>>>> for things >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and >> >>>>>>>>> order-maintenance. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> https://github.com/ekmett/structs >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Notable bits: >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of >> >>>>>>>>> link-cut >> >>>>>>>>> trees in this style. 
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that >> >>>>>>>>> make >> >>>>>>>>> it go fast. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost >> >>>>>>>>> all >> >>>>>>>>> the references to the LinkCut or Object data constructor get >> >>>>>>>>> optimized away, >> >>>>>>>>> and we're left with beautiful strict code directly mutating our >> >>>>>>>>> underlying >> >>>>>>>>> representation. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a short >> >>>>>>>>> article. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> -Edward >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones >> >>>>>>>>> > wrote: >> >>>>>>>>> >> >>>>>>>>> Just to say that I have no idea what is going on in this thread. >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is there a >> >>>>>>>>> ticket? Is >> >>>>>>>>> there a wiki page? >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be a >> >>>>>>>>> good >> >>>>>>>>> thing. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Simon >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf >> >>>>>>>>> Of >> >>>>>>>>> Edward Kmett >> >>>>>>>>> Sent: 21 August 2015 05:25 >> >>>>>>>>> To: Manuel M T Chakravarty >> >>>>>>>>> Cc: Simon Marlow; ghc-devs >> >>>>>>>>> Subject: Re: ArrayArrays >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be >> >>>>>>>>> very handy as well.
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Consider right now if I have something like an order-maintenance >> >>>>>>>>> structure I have: >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s >> >>>>>>>>> (Upper s)) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s >> >>>>>>>>> (Lower s)) {-# >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> The former contains, logically, a mutable integer and two >> >>>>>>>>> pointers, >> >>>>>>>>> one for forward and one for backwards. The latter is basically >> >>>>>>>>> the same >> >>>>>>>>> thing with a mutable reference up pointing at the structure >> >>>>>>>>> above. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> On the heap this is an object that points to a structure for the >> >>>>>>>>> bytearray, and points to another structure for each mutvar which >> >>>>>>>>> each point >> >>>>>>>>> to the other 'Upper' structure. So there is a level of >> >>>>>>>>> indirection smeared >> >>>>>>>>> over everything. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward link >> >>>>>>>>> from >> >>>>>>>>> the structure below to the structure above. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Converted into ArrayArray#s I'd get >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and >> >>>>>>>>> the >> >>>>>>>>> next 2 slots pointing to the previous and next previous objects, >> >>>>>>>>> represented >> >>>>>>>>> just as their MutableArrayArray#s. 
I can use >> >>>>>>>>> sameMutableArrayArray# on these >> >>>>>>>>> for object identity, which lets me check for the ends of the >> >>>>>>>>> lists by tying >> >>>>>>>>> things back on themselves. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> and below that >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing up to >> >>>>>>>>> an >> >>>>>>>>> upper structure. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I can then write a handful of combinators for getting out the >> >>>>>>>>> slots >> >>>>>>>>> in question, while it has gained a level of indirection between >> >>>>>>>>> the wrapper >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one can >> >>>>>>>>> be basically >> >>>>>>>>> erased by ghc. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Unlike before I don't have several separate objects on the heap >> >>>>>>>>> for >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for the >> >>>>>>>>> object itself, >> >>>>>>>>> and the MutableByteArray# that it references to carry around the >> >>>>>>>>> mutable >> >>>>>>>>> int. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> The only pain points are >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> 1.) the aforementioned limitation that currently prevents me >> >>>>>>>>> from >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array into an >> >>>>>>>>> ArrayArray >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest of >> >>>>>>>>> Haskell, >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> and >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 pointers >> >>>>>>>>> wide. Card >> >>>>>>>>> marking doesn't help. 
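The "handful of combinators" for getting slots out might look like the sketch below. The slot layout (slot 0 holding the MutableByteArray#, slots 1 and 2 the prev/next neighbours) and all function names are assumptions of mine for illustration, not the layout from Edward's actual code; the primop names readMutableArrayArrayArray# and sameMutableArrayArray# are the real ones exposed by GHC.Prim in this era:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO(..))

-- One boxed wrapper to bridge * and #; everything it reaches lives in #.
data Upper = Upper (MutableArrayArray# RealWorld)

-- Read a neighbouring Upper out of a slot. Assumed layout:
-- slot 0 = MutableByteArray# payload, slot 1 = prev, slot 2 = next.
slotU :: Int# -> Upper -> IO Upper
slotU i (Upper m) = IO $ \s ->
  case readMutableArrayArrayArray# m i s of
    (# s', n #) -> (# s', Upper n #)

prevU, nextU :: Upper -> IO Upper
prevU = slotU 1#
nextU = slotU 2#

-- Object identity via sameMutableArrayArray#, used to detect the
-- self-tied ends of the list.
sameU :: Upper -> Upper -> Bool
sameU (Upper a) (Upper b) = isTrue# (sameMutableArrayArray# a b)
```

With -O the `Upper` wrappers are the part GHC can typically erase when the code is strict, which is the "erased by ghc" point made above.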
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Alternately I could just try to do really evil things and >> >>>>>>>>> convert >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to >> >>>>>>>>> unsafeCoerce my way >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays >> >>>>>>>>> directly into the >> >>>>>>>>> SmallArray as slots, removing the limitation we see here by >> >>>>>>>>> aping the >> >>>>>>>>> MutableArrayArray# s API, but that gets really really dangerous! >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the >> >>>>>>>>> altar >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move them >> >>>>>>>>> and collect >> >>>>>>>>> them which rules out simpler Ptr and Addr based solutions. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> -Edward >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty >> >>>>>>>>> > wrote: >> >>>>>>>>> >> >>>>>>>>> That's an interesting idea. >> >>>>>>>>> >> >>>>>>>>> Manuel >> >>>>>>>>> >> >>>>>>>>> > Edward Kmett >: >> >>>>>>>>> >> >>>>>>>>> > >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# and >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the >> >>>>>>>>> > ArrayArray# entries >> >>>>>>>>> > are all directly unlifted avoiding a level of indirection for >> >>>>>>>>> > the containing >> >>>>>>>>> > structure is amazing, but I can only currently use it if my >> >>>>>>>>> > leaf level data >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. It'd be >> >>>>>>>>> > nice to be >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down at >> >>>>>>>>> > the leaves to >> >>>>>>>>> > hold lifted contents.
>> >>>>>>>>> > >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to >> >>>>>>>>> > access >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do that >> >>>>>>>>> > if i tried to >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a >> >>>>>>>>> > ByteArray# >> >>>>>>>>> > anyways, so it isn't like there is a safety story preventing >> >>>>>>>>> > this. >> >>>>>>>>> > >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I >> >>>>>>>>> > could shoehorn a >> >>>>>>>>> > number of them into ArrayArrays if this worked. >> >>>>>>>>> > >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary >> >>>>>>>>> > indirection compared to c/java and this could reduce that pain >> >>>>>>>>> > to just 1 >> >>>>>>>>> > level of unnecessary indirection. >> >>>>>>>>> > >> >>>>>>>>> > -Edward >> >>>>>>>>> >> >>>>>>>>> > _______________________________________________ >> >>>>>>>>> > ghc-devs mailing list >> >>>>>>>>> > ghc-devs at haskell.org >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> _______________________________________________ >> >>>>>>>>> ghc-devs mailing list >> >>>>>>>>> ghc-devs at haskell.org >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >>>>>>> >> >>>>>>> >> >>>>> >> >>> >> >> >> > >> > >> > _______________________________________________ >> > ghc-devs mailing list >> > ghc-devs at haskell.org >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> > > > _______________________________________________ ghc-devs mailing list ghc-devs at haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From simonpj at microsoft.com Mon Sep 7 14:35:50 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 14:35:50 +0000 Subject: Unpacking sum types In-Reply-To: References: Message-ID: Good start. I have updated the page to separate the source-language design (what the programmer sees) from the implementation. And I have included boxed sums as well - it would be deeply strange not to do so. Looks good to me! Simon From: Johan Tibell [mailto:johan.tibell at gmail.com] Sent: 01 September 2015 18:24 To: Simon Peyton Jones; Simon Marlow; Ryan Newton Cc: ghc-devs at haskell.org Subject: RFC: Unpacking sum types I have a draft design for unpacking sum types that I'd like some feedback on. In particular feedback both on: * the writing and clarity of the proposal and * the proposal itself. https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes -- Johan -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Mon Sep 7 14:57:10 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 14:57:10 +0000 Subject: more releases In-Reply-To: <87si6wkdta.fsf@smart-cactus.org> References: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> <87si6y1v30.fsf@gmail.com> <87oahlksnm.fsf@smart-cactus.org> <87si6wkdta.fsf@smart-cactus.org> Message-ID: <09dfe23cd20746c88beb0cfd308ef8f6@DB4PR30MB030.064d.mgd.msft.net> Merging and releasing a fix to the stable branch always carries a cost: it might break something else. There is a real cost to merging, which is why we've followed the lazy strategy that Ben describes. Still, even given the lazy strategy we could perfectly well put out minor releases more proactively; e.g. fix one bug (or a little batch) and release. Provided we could reduce the per-release costs.
Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Ben | Gamari | Sent: 02 September 2015 17:05 | To: Richard Eisenberg | Cc: GHC developers | Subject: Re: more releases | | Richard Eisenberg writes: | | > I think some of my idea was misunderstood here: my goal was to have | > quick releases only from the stable branch. The goal would not be to | > release the new and shiny, but instead to get bugfixes out to users | > quicker. The new and shiny (master) would remain as it is now. In | > other words: more users would be affected by this change than just | the | > vanguard. | > | I see. This is something we could certainly do. | | It would require, however, that we be more pro-active about continuing | to merge things to the stable branch after the release. | Currently the stable branch is essentially in the same state that it | was in for the 7.10.2 release. I've left it this way as it takes time | and care to cherry-pick patches to stable. Thus far my policy has been | to perform this work lazily until it's clear that we will do another | stable release as otherwise the effort may well be wasted. | | So, even if the steps of building, testing, and uploading the release | are streamlined more frequent releases are still far from free. | Whether it's a worthwhile cost I don't know. | | This is a difficult question to answer without knowing more about how | typical users actually acquire GHC. For instance, this effort would | have minimal impact on users who get their compiler through their | distribution's package manager. On the other hand, if most users | download GHC bindists directly from the GHC download page, then | perhaps this would be effort well-spent.
| | Cheers, | | - Ben From ryan.gl.scott at gmail.com Mon Sep 7 15:26:01 2015 From: ryan.gl.scott at gmail.com (Ryan Scott) Date: Mon, 7 Sep 2015 11:26:01 -0400 Subject: Proposal: Automatic derivation of Lift Message-ID: There is a Lift typeclass defined in template-haskell [1] which, when a data type is an instance, permits it to be directly used in a TH quotation, like so data Example = Example instance Lift Example where lift Example = conE (mkNameG_d "" "" "Example") e :: Example e = [| Example |] Making Lift instances for most data types is straightforward and mechanical, so the proposal is to allow automatic derivation of Lift via a -XDeriveLift extension: data Example = Example deriving Lift This is actually a pretty old proposal [2], dating back to 2007. I wanted to have this feature for my needs, so I submitted a proof-of-concept at the GHC Trac issue page [3]. The question now is: do we really want to bake this feature into GHC? Since not many people opined on the Trac page, I wanted to submit this here for wider visibility and to have a discussion. Here are some arguments I have heard against this feature (please tell me if I am misrepresenting your opinion): * We already have a th-lift package [4] on Hackage which allows derivation of Lift via Template Haskell functions. In addition, if you're using Lift, chances are you're also using the -XTemplateHaskell extension in the first place, so th-lift should be suitable. * The same functionality could be added via GHC generics (as of GHC 7.12/8.0, which adds the ability to reify a datatype's package name [5]), if -XTemplateHaskell can't be used. * Adding another -XDerive- extension places a burden on GHC devs to maintain it in the future in response to further Template Haskell changes. Here are my (opinionated) responses to each of these: * th-lift isn't as fully-featured as a -XDerive- extension at the moment, since it can't do sophisticated type inference [6] or derive for data families.
This is something that could be addressed with a patch to th-lift, though. * GHC generics wouldn't be enough to handle unlifted types like Int#, Char#, or Double# (which other -XDerive- extensions do). * This is a subjective measurement, but in terms of the amount of code I had to add, -XDeriveLift was substantially simpler than other -XDerive extensions, because there are fewer weird corner cases. Plus, I'd volunteer to maintain it :) Simon PJ wanted to know if other Template Haskell programmers would find -XDeriveLift useful. Would you be able to use it? Would you like to see a solution other than putting it into GHC? I'd love to hear feedback so we can bring some closure to this 8-year-old feature request. Ryan S. ----- [1] http://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH-Syntax.html#t:Lift [2] https://mail.haskell.org/pipermail/template-haskell/2007-October/000635.html [3] https://ghc.haskell.org/trac/ghc/ticket/1830 [4] http://hackage.haskell.org/package/th-lift [5] https://ghc.haskell.org/trac/ghc/ticket/10030 [6] https://ghc.haskell.org/trac/ghc/ticket/1830#comment:11 From spam at scientician.net Mon Sep 7 16:05:56 2015 From: spam at scientician.net (Bardur Arantsson) Date: Mon, 7 Sep 2015 18:05:56 +0200 Subject: more releases In-Reply-To: <09dfe23cd20746c88beb0cfd308ef8f6@DB4PR30MB030.064d.mgd.msft.net> References: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> <87si6y1v30.fsf@gmail.com> <87oahlksnm.fsf@smart-cactus.org> <87si6wkdta.fsf@smart-cactus.org> <09dfe23cd20746c88beb0cfd308ef8f6@DB4PR30MB030.064d.mgd.msft.net> Message-ID: On 09/07/2015 04:57 PM, Simon Peyton Jones wrote: > Merging and releasing a fix to the stable branch always carries a cost: > it might break something else. There is a real cost to merging, which > is why we've followed the lazy strategy that Ben describes. > A valid point, but the upside is that it's a very fast operation to revert if a release is "bad"... 
and get that updated release into the wild. Regards, From dan.doel at gmail.com Mon Sep 7 17:53:11 2015 From: dan.doel at gmail.com (Dan Doel) Date: Mon, 7 Sep 2015 13:53:11 -0400 Subject: Unpacking sum types In-Reply-To: References: Message-ID: Are we okay with stealing some operator sections for this? E.G. (x ||). I think the boxed sums larger than 2 choices are all technically overlapping with sections. On Mon, Sep 7, 2015 at 10:35 AM, Simon Peyton Jones wrote: > Good start. > > > > I have updated the page to separate the source-language design (what the > programmer sees) from the implementation. > > > > And I have included boxed sums as well ? it would be deeply strange not to > do so. > > > > Looks good to me! > > > > Simon > > > > From: Johan Tibell [mailto:johan.tibell at gmail.com] > Sent: 01 September 2015 18:24 > To: Simon Peyton Jones; Simon Marlow; Ryan Newton > Cc: ghc-devs at haskell.org > Subject: RFC: Unpacking sum types > > > > I have a draft design for unpacking sum types that I'd like some feedback > on. In particular feedback both on: > > > > * the writing and clarity of the proposal and > > * the proposal itself. > > > > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes > > > > -- Johan > > > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From ezyang at mit.edu Mon Sep 7 17:57:43 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Mon, 07 Sep 2015 10:57:43 -0700 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> <1441400654-sup-1647@sabre> <1441436053-sup-5590@sabre> Message-ID: <1441648640-sup-9581@sabre> Yes, I think you are right. I've restructured the spec so that 'Box' is an optional extension. 
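The `Force`/`suspend` behaviour being discussed (Dan's definition appears in the quoted excerpt that follows) can be modelled in today's Haskell by using an ordinary lifted datatype with a strict field in place of the proposed unlifted `Force`. This is only a sketch of the semantics under debate, not the proposal itself; in the real proposal `Force a` would live at an unlifted kind:

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Lifted stand-in for the proposed unlifted 'Force': the strict field
-- guarantees that a fully-constructed 'Force' holds an evaluated value.
data Force a = Force !a

-- Because the result kind is lifted, 'suspend undefined' is harmless
-- until the result is actually inspected.
suspend :: Force a -> a
suspend (Force x) = x

main :: IO ()
main = do
  print (suspend (Force (2 + 3 :: Int)))   -- prints 5
  let bad = suspend undefined :: Int        -- no exception yet: binding is lazy
  r <- try (evaluate bad) :: IO (Either SomeException Int)
  putStrLn (either (const "throws only on inspection")
                   (const "evaluated") r)
```

This matches Dan's observation that `suspend undefined` only throws an exception when the result is inspected, since the pattern match on `Force` is deferred until the lazy binding is forced.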
Excerpts from Dan Doel's message of 2015-09-06 13:56:35 -0700: > On Sat, Sep 5, 2015 at 1:35 PM, Dan Doel wrote: > > Also, the constructor isn't exactly relevant, so much as whether the > > unlifted error occurs inside the definition of a lifted thing. > > So, in light of this, `Box` is not necessary to define `suspend`. We > can simply write: > > suspend :: Force a -> a > suspend (Force x) = x > > and the fact that `a` has kind * means that `suspend undefined` only > throws an exception if you inspect it. > > `Box` as currently defined (not the previous GADT definition) is novel > in that it allows you to suspend unlifted types that weren't derived > from `Force`. And it would probably be useful to have coercions > between `Box (Force a)` and `a`, and `Force (Box u)` and `u`. But (I > think) it is not necessary for mediating between `Force a` and `a`. > > -- Dan From ezyang at mit.edu Mon Sep 7 18:13:29 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Mon, 07 Sep 2015 11:13:29 -0700 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <1441390306-sup-6240@sabre> <1441400654-sup-1647@sabre> <1441436053-sup-5590@sabre> Message-ID: <1441648807-sup-660@sabre> Excerpts from Dan Doel's message of 2015-09-05 10:35:44 -0700: > I tried with `error` first, and it worked exactly the way I described. > > But I guess it's a type inference weirdness. If I annotate mv# with > MutVar# it will work, whereas otherwise it will be inferred that mv# > :: a where a :: *, instead of #. Whereas !x is a pattern which > requires monomorphism of x, and so it figures out mv# :: MutVar# .... > Kind of an odd corner case where breaking cycles causes things _not_ > to type check, due to open kinds not being first class. > > I thought I remembered that at some point it was decided that `let` > bindings of unboxed things should be required to have bangs on the > bindings, to indicate the evaluation order. 
Maybe I'm thinking of > something else (was it that it was originally required and we got rid > of it?). Ah yes, I added an explicit type signature, which is why I didn't see your problem. As for requiring bang, I think probably you are thinking of: commit 831a35dd00faff195cf938659c2dd736192b865f Author: Ian Lynagh Date: Fri Apr 24 12:47:54 2009 +0000 Require a bang pattern when unlifted types are where/let bound; #3182 For now we only get a warning, rather than an error, because the alex and happy templates don't follow the new rules yet. But Simon eventually made it be less chatty: commit 67157c5c25c8044b54419470b5e8cc677be060c3 Author: simonpj at microsoft.com Date: Tue Nov 16 17:18:43 2010 +0000 Warn a bit less often about unlifted bindings. Warn when (a) a pattern bindings binds unlifted values (b) it has no top-level bang (c) the RHS has a *lifted* type Clause (c) is new, argued for by Simon M Eg x# = 4# + 4# -- No warning (# a,b #) = blah -- No warning I# x = blah -- Warning Since in our cases the RHS is not lifted, no warning occurs. > > Nope, if you just float the error call out of MV, you will go from > > "Okay." to an exception. Notice that *data constructors* are what are > > used to induce suspension. This is why we don't have a 'suspend' > > special form; instead, 'Box' is used directly. > > I know that it's the floating that makes a difference, not the bang > pattern. The point would be to make the syntax require the bang > pattern to give a visual indication of when it happens, and make it > illegal to look like you're doing a normal let that doesn't change the > value (although having it actually be a bang pattern would be bad, > because it'd restrict polymorphism of the definition). I think this is a reasonable thing to ask for. I also think, with the commit set above, this very discussion happened in 2010, and was resolved in favor of not warning in this case for unboxed types. 
Maybe the situation is different with unlifted data types; it's hard for me to tell. > Also, the constructor isn't exactly relevant, so much as whether the > unlifted error occurs inside the definition of a lifted thing. For > instance, we can go from: > > let mv = MutVar undefined > > to: > > let mv = let mv# :: MutVar# RealWorld a ; mv# = undefined in MutVar mv# > > and the result is the same, because it is the definition of mv that is > lazy. Constructors in complex expressions---and all subexpressions for > that matter---just get compiled this way. E.G. > > let f :: MutVar# RealWorld a -> MutVar a > f mv# = f mv# > in flip const (f undefined) $ putStrLn "okay" > > No constructors involved, but no error. Yes, you are right. I incorrectly surmised that a suspension function would have to be special form, but in fact, it does not need to be. > Okay. So, there isn't representational overhead, but there is > overhead, where you call a function or something (which will just > return its argument), whereas newtype constructors end up not having > any cost whatsoever? You might hope that it can get inlined away. But yes, a coercion would be best. Edward From jmcf125 at openmailbox.org Mon Sep 7 18:18:11 2015 From: jmcf125 at openmailbox.org (jmcf125 at openmailbox.org) Date: Mon, 7 Sep 2015 19:18:11 +0100 Subject: Cannot have GHC in ARMv6 architecture Message-ID: <20150907181811.GA1668@jmcf125-Acer-Arch.home> Hi, I have tried to have GHC in my Raspberry Pi, got stuck in the issue 7754 (https://ghc.haskell.org/trac/ghc/ticket/7754), since I didn't know where to pass options to terminfo's configure file, although I did copy the headers from my Raspberry Pi. I've been using HUGS ever since, as Arch Linux doesn't have GHC for ARMv6, and deb2targz would not work. I'm aware I have a phase 0 compiler installed, need to build a phase 1 compiler, and use that to cross-compile GHC itself (I wasn't the 1st time I tried, couldn't find as much information as now). 
I've read the following pages: https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/RaspberryPi https://ghc.haskell.org/trac/ghc/wiki/Building/CrossCompiling https://ghc.haskell.org/trac/ghc/wiki/CrossCompilation along with quite a few bug reports, and questions on Stack Overflow that seemed related but really aren't, or that are already exposed in the tickets mentioned below. Below, /home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo/include-curses and /home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo/lib-curses are directories to which I copied any headers and libraries, respectively, from my Raspberry Pi. I'm not sure which libraries I am supposed to point configure at. These are the headers-libraries combinations I tried: $ ./configure --target=arm-linux-gnueabihf --with-curses-includes=/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo/include-curses --with-curses-libraries=/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo/lib-curses && make -j5 (...) checking for unistd.h... yes checking ncurses.h usability... no checking ncurses.h presence... no checking for ncurses.h... no checking curses.h usability... no checking curses.h presence... no checking for curses.h... no configure: error: in `/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo': configure: error: curses headers could not be found, so this package cannot be built See `config.log' for more details libraries/terminfo/ghc.mk:4: recipe for target 'libraries/terminfo/dist-install/package-data.mk' failed make[1]: *** [libraries/terminfo/dist-install/package-data.mk] Error 1 Makefile:71: recipe for target 'all' failed make: *** [all] Error 2 (https://ghc.haskell.org/trac/ghc/ticket/7754) $ ./configure --target=arm-linux-gnueabihf --with-curses-includes=/usr/include --with-curses-libraries=/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo/lib-curses && make -j5 (...) checking for unistd.h... yes checking ncurses.h usability... yes checking ncurses.h presence... 
yes checking for ncurses.h... yes checking for setupterm in -ltinfo... no checking for setupterm in -lncursesw... no checking for setupterm in -lncurses... no checking for setupterm in -lcurses... no configure: error: in `/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo': configure: error: curses library not found, so this package cannot be built See `config.log' for more details libraries/terminfo/ghc.mk:4: recipe for target 'libraries/terminfo/dist-install/package-data.mk' failed make[1]: *** [libraries/terminfo/dist-install/package-data.mk] Error 1 Makefile:71: recipe for target 'all' failed make: *** [all] Error 2 (https://ghc.haskell.org/trac/ghc/ticket/7281) $ ./configure --target=arm-linux-gnueabihf --with-curses-includes=/usr/include --with-curses-libraries=/usr/lib && make -j5 $ ./configure --target=arm-linux-gnueabihf --with-curses-includes=/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo/include-curses --with-curses-libraries=/usr/lib && make -j5 (...) Configuring terminfo-0.4.0.1... configure: WARNING: unrecognized options: --with-compiler, --with-gcc checking for arm-unknown-linux-gnueabihf-gcc... /home/jmcf125/ghc-raspberry-pi/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin/arm-linux-gnueabihf-gcc checking whether the C compiler works... no configure: error: in `/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo': configure: error: C compiler cannot create executables See `config.log' for more details libraries/terminfo/ghc.mk:4: recipe for target 'libraries/terminfo/dist-install/package-data.mk' failed make[1]: *** [libraries/terminfo/dist-install/package-data.mk] Error 77 Makefile:71: recipe for target 'all' failed make: *** [all] Error 2 Concerning the last 2, it'd be odd if they worked, how could, say, the GCC cross-compiler for ARM use x86_64 libraries? I tried them anyway, since the 1st 2 errors didn't seem to make sense... 
Also, options --includedir and --oldincludedir seem to have no effect (always get issue 7754), and it doesn't matter if I build registarised or not, the results are the same (registarised, I'm using LLVM 3.6.2-3). I'm sorry if I'm wasting your time with what to you might seem such a simple thing, and not real development on GHC, but I don't know where else to turn to. I'm still learning Haskell, and have never had to compile compilers before. Thank you in advance, Jo?o Miguel From ezyang at mit.edu Mon Sep 7 18:30:16 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Mon, 07 Sep 2015 11:30:16 -0700 Subject: Using GHC API to compile Haskell file In-Reply-To: References: <1440368677-sup-472@sabre> Message-ID: <1441649731-sup-8699@sabre> Hello Neil, It looks like my second message got eaten. Let's try again. > 1) Is there any way to do the two compilations sharing some cached > state, e.g. loaded packages/.hi files, so each compilation goes > faster. You can, using withTempSession in the GhcMonad. The external package state will be preserved across calls here, but things put in the HPT will get thrown out. > 2) Is there any way to do the link alone through the GHC API. I am confused by your code. There are two ways you can do linking: 1. Explicitly specify all of the objects to link together. This works even if the source files aren't available. 2. Run ghc --make. This does dependency analysis to figure out what objects to link together, but since everything is already compiled, it just links. Your code seems to be trying to do (1) and (2) simultaneously (you set the mode to OneShot, but then you call load which calls into GhcMake). If you want to use (1), stop calling load and call 'oneShot' instead. If you want to use (2), just reuse your working --make code. (BTW, how did I figure this all out? By looking at ghc/Main.hs). 
Cheers, Edward From tomberek at gmail.com Mon Sep 7 18:36:22 2015 From: tomberek at gmail.com (Thomas Bereknyei) Date: Mon, 7 Sep 2015 14:36:22 -0400 Subject: Proposal: Automatic derivation of Lift In-Reply-To: References: Message-ID: Yes, I would find DeriveLift useful and a pleasant improvement to the Template Haskell ecosystem. I am relatively new to TH and was wondering about a few things (if this hijacks the thread we can start a new one); Other quotations, [m| for 'Q Match' would be helpful to define collections of matches that can be combined and manipulated. One can use Q (Pat,Body,[Decl]) but you lose the ability for the Body to refer to a variable bound in the Pat. One can use Q Exp for just a Lambda, but you cant just combine lambdas to create a Match expression without some machinery. Promotion of a Pat to an Exp. A subset of Pat can create an expression such that \ $pat -> $(promote pat) is id. Tom There is a Lift typeclass defined in template-haskell [1] which, when a data type is an instance, permits it to be directly used in a TH quotation, like so data Example = Example instance Lift Example where lift Example = conE (mkNameG_d "" "" "Example") e :: Example e = [| Example |] Making Lift instances for most data types is straightforward and mechanical, so the proposal is to allow automatic derivation of Lift via a -XDeriveLift extension: data Example = Example deriving Lift This is actually a pretty a pretty old proposal [2], dating back to 2007. I wanted to have this feature for my needs, so I submitted a proof-of-concept at the GHC Trac issue page [3]. The question now is: do we really want to bake this feature into GHC? Since not many people opined on the Trac page, I wanted to submit this here for wider visibility and to have a discussion. 
Here are some arguments I have heard against this feature (please tell me if I am misrepresenting your opinion): * We already have a th-lift package [4] on Hackage which allows derivation of Lift via Template Haskell functions. In addition, if you're using Lift, chances are you're also using the -XTemplateHaskell extension in the first place, so th-lift should be suitable. * The same functionality could be added via GHC generics (as of GHC 7.12/8.0, which adds the ability to reify a datatype's package name [5]), if -XTemplateHaskell can't be used. * Adding another -XDerive- extension places a burden on GHC devs to maintain it in the future in response to further Template Haskell changes. Here are my (opinionated) responses to each of these: * th-lift isn't as fully-featured as a -XDerive- extension at the moment, since it can't do sophisticated type inference [6] or derive for data families. This is something that could be addressed with a patch to th-lift, though. * GHC generics wouldn't be enough to handle unlifted types like Int#, Char#, or Double# (which other -XDerive- extensions do). * This is a subjective measurement, but in terms of the amount of code I had to add, -XDeriveLift was substantially simpler than other -XDerive extensions, because there are fewer weird corner cases. Plus, I'd volunteer to maintain it :) Simon PJ wanted to know if other Template Haskell programmers would find -XDeriveLift useful. Would you be able to use it? Would you like to see a solution other than putting it into GHC? I'd love to hear feedback so we can bring some closure to this 8-year-old feature request. Ryan S. 
----- [1] http://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH-Syntax.html#t:Lift [2] https://mail.haskell.org/pipermail/template-haskell/2007-October/000635.html [3] https://ghc.haskell.org/trac/ghc/ticket/1830 [4] http://hackage.haskell.org/package/th-lift [5] https://ghc.haskell.org/trac/ghc/ticket/10030 [6] https://ghc.haskell.org/trac/ghc/ticket/1830#comment:11 _______________________________________________ ghc-devs mailing list ghc-devs at haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthewtpickering at gmail.com Mon Sep 7 19:10:34 2015 From: matthewtpickering at gmail.com (Matthew Pickering) Date: Mon, 7 Sep 2015 21:10:34 +0200 Subject: Proposal: Automatic derivation of Lift In-Reply-To: References: Message-ID: Continuing my support of the generics route. Is there a fundamental reason why it couldn't handle unlifted types? Given their relative paucity, it seems like a fair compromise to generically define lift instances for all normal data types but require TH for unlifted types. This approach seems much smoother from a maintenance perspective. On Mon, Sep 7, 2015 at 5:26 PM, Ryan Scott wrote: > There is a Lift typeclass defined in template-haskell [1] which, when > a data type is an instance, permits it to be directly used in a TH > quotation, like so > > data Example = Example > > instance Lift Example where > lift Example = conE (mkNameG_d "" "" "Example") > > e :: Example > e = [| Example |] > > Making Lift instances for most data types is straightforward and > mechanical, so the proposal is to allow automatic derivation of Lift > via a -XDeriveLift extension: > > data Example = Example deriving Lift > > This is actually a pretty a pretty old proposal [2], dating back to > 2007. I wanted to have this feature for my needs, so I submitted a > proof-of-concept at the GHC Trac issue page [3]. 
> > The question now is: do we really want to bake this feature into GHC? > Since not many people opined on the Trac page, I wanted to submit this > here for wider visibility and to have a discussion. > > Here are some arguments I have heard against this feature (please tell > me if I am misrepresenting your opinion): > > * We already have a th-lift package [4] on Hackage which allows > derivation of Lift via Template Haskell functions. In addition, if > you're using Lift, chances are you're also using the -XTemplateHaskell > extension in the first place, so th-lift should be suitable. > * The same functionality could be added via GHC generics (as of GHC > 7.12/8.0, which adds the ability to reify a datatype's package name > [5]), if -XTemplateHaskell can't be used. > * Adding another -XDerive- extension places a burden on GHC devs to > maintain it in the future in response to further Template Haskell > changes. > > Here are my (opinionated) responses to each of these: > > * th-lift isn't as fully-featured as a -XDerive- extension at the > moment, since it can't do sophisticated type inference [6] or derive > for data families. This is something that could be addressed with a > patch to th-lift, though. > * GHC generics wouldn't be enough to handle unlifted types like Int#, > Char#, or Double# (which other -XDerive- extensions do). > * This is a subjective measurement, but in terms of the amount of code > I had to add, -XDeriveLift was substantially simpler than other > -XDerive extensions, because there are fewer weird corner cases. Plus, > I'd volunteer to maintain it :) > > Simon PJ wanted to know if other Template Haskell programmers would > find -XDeriveLift useful. Would you be able to use it? Would you like > to see a solution other than putting it into GHC? I'd love to hear > feedback so we can bring some closure to this 8-year-old feature > request. > > Ryan S. 
> > ----- > [1] http://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH-Syntax.html#t:Lift > [2] https://mail.haskell.org/pipermail/template-haskell/2007-October/000635.html > [3] https://ghc.haskell.org/trac/ghc/ticket/1830 > [4] http://hackage.haskell.org/package/th-lift > [5] https://ghc.haskell.org/trac/ghc/ticket/10030 > [6] https://ghc.haskell.org/trac/ghc/ticket/1830#comment:11 > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From simonpj at microsoft.com Mon Sep 7 19:25:57 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 19:25:57 +0000 Subject: Unpacking sum types In-Reply-To: References: Message-ID: <9eb2c9041f6142ce947a4b323c0b2bff@DB4PR30MB030.064d.mgd.msft.net> | Are we okay with stealing some operator sections for this? E.G. (x | ||). I think the boxed sums larger than 2 choices are all technically | overlapping with sections. I hadn't thought of that. I suppose that in distfix notation we could require spaces (x | |) since vertical bar by itself isn't an operator. But then (_||) x might feel more compact. Also a section (x ||) isn't valid in a pattern, so we would not need to require spaces there. But my gut feel is: yes, with AnonymousSums we should just steal the syntax. It won't hurt existing code (since it won't use AnonymousSums), and if you *are* using AnonymousSums then the distfix notation is probably more valuable than the sections for an operator you probably aren't using. I've updated the wiki page Simon | -----Original Message----- | From: Dan Doel [mailto:dan.doel at gmail.com] | Sent: 07 September 2015 18:53 | To: Simon Peyton Jones | Cc: Johan Tibell; Simon Marlow; Ryan Newton; ghc-devs at haskell.org | Subject: Re: Unpacking sum types | | Are we okay with stealing some operator sections for this? E.G. (x | ||). 
I think the boxed sums larger than 2 choices are all technically | overlapping with sections. | | On Mon, Sep 7, 2015 at 10:35 AM, Simon Peyton Jones | wrote: | > Good start. | > | > | > | > I have updated the page to separate the source-language design (what | the | > programmer sees) from the implementation. | > | > | > | > And I have included boxed sums as well - it would be deeply strange not | to | > do so. | > | > | > | > Looks good to me! | > | > | > | > Simon | > | > | > | > From: Johan Tibell [mailto:johan.tibell at gmail.com] | > Sent: 01 September 2015 18:24 | > To: Simon Peyton Jones; Simon Marlow; Ryan Newton | > Cc: ghc-devs at haskell.org | > Subject: RFC: Unpacking sum types | > | > | > | > I have a draft design for unpacking sum types that I'd like some | feedback | > on. In particular feedback both on: | > | > | > | > * the writing and clarity of the proposal and | > | > * the proposal itself. | > | > | > | > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes | > | > | > | > -- Johan | > | > | > | > | > _______________________________________________ | > ghc-devs mailing list | > ghc-devs at haskell.org | > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs | > From karel.gardas at centrum.cz Mon Sep 7 19:37:21 2015 From: karel.gardas at centrum.cz (Karel Gardas) Date: Mon, 07 Sep 2015 21:37:21 +0200 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: <20150907181811.GA1668@jmcf125-Acer-Arch.home> References: <20150907181811.GA1668@jmcf125-Acer-Arch.home> Message-ID: <55EDE771.9010404@centrum.cz> Hi, I think sysroot option may help you.
I wrote something about it for ARMv8 in the past here: https://ghcarm.wordpress.com/2014/01/18/unregisterised-ghc-head-build-for-arm64-platform/ Cheers, Karel On 09/ 7/15 08:18 PM, jmcf125 at openmailbox.org wrote: > Hi, > > I have tried to have GHC in my Raspberry Pi, got stuck in the issue 7754 > (https://ghc.haskell.org/trac/ghc/ticket/7754), since I didn't know > where to pass options to terminfo's configure file, although I did copy > the headers from my Raspberry Pi. I've been using HUGS ever since, as > Arch Linux doesn't have GHC for ARMv6, and deb2targz would not work. > > I'm aware I have a phase 0 compiler installed, need to build a phase 1 > compiler, and use that to cross-compile GHC itself (I wasn't the 1st > time I tried, couldn't find as much information as now). > > I've read the following pages: > https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/RaspberryPi > https://ghc.haskell.org/trac/ghc/wiki/Building/CrossCompiling > https://ghc.haskell.org/trac/ghc/wiki/CrossCompilation > along with quite a few bug reports, and questions on Stack Overflow that > seemed related but really aren't, or that are already exposed in the > tickets mentioned below. > > Below, > /home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo/include-curses and > /home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo/lib-curses are > directories to which I copied any headers and libraries, respectively, > from my Raspberry Pi. > > I'm not sure which libraries I am supposed to point configure at. These > are the headers-libraries combinations I tried: > > $ ./configure --target=arm-linux-gnueabihf --with-curses-includes=/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo/include-curses --with-curses-libraries=/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo/lib-curses && make -j5 > (...) > checking for unistd.h... yes > checking ncurses.h usability... no > checking ncurses.h presence... no > checking for ncurses.h... no > checking curses.h usability... 
no > checking curses.h presence... no > checking for curses.h... no > configure: error: in `/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo': > configure: error: curses headers could not be found, so this package cannot be built > See `config.log' for more details > libraries/terminfo/ghc.mk:4: recipe for target 'libraries/terminfo/dist-install/package-data.mk' failed > make[1]: *** [libraries/terminfo/dist-install/package-data.mk] Error 1 > Makefile:71: recipe for target 'all' failed > make: *** [all] Error 2 > (https://ghc.haskell.org/trac/ghc/ticket/7754) > > $ ./configure --target=arm-linux-gnueabihf --with-curses-includes=/usr/include --with-curses-libraries=/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo/lib-curses && make -j5 > (...) > checking for unistd.h... yes > checking ncurses.h usability... yes > checking ncurses.h presence... yes > checking for ncurses.h... yes > checking for setupterm in -ltinfo... no > checking for setupterm in -lncursesw... no > checking for setupterm in -lncurses... no > checking for setupterm in -lcurses... no > configure: error: in `/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo': > configure: error: curses library not found, so this package cannot be built > See `config.log' for more details > libraries/terminfo/ghc.mk:4: recipe for target 'libraries/terminfo/dist-install/package-data.mk' failed > make[1]: *** [libraries/terminfo/dist-install/package-data.mk] Error 1 > Makefile:71: recipe for target 'all' failed > make: *** [all] Error 2 > (https://ghc.haskell.org/trac/ghc/ticket/7281) > > $ ./configure --target=arm-linux-gnueabihf --with-curses-includes=/usr/include --with-curses-libraries=/usr/lib && make -j5 > > $ ./configure --target=arm-linux-gnueabihf --with-curses-includes=/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo/include-curses --with-curses-libraries=/usr/lib && make -j5 > (...) > Configuring terminfo-0.4.0.1... 
> configure: WARNING: unrecognized options: --with-compiler, --with-gcc > checking for arm-unknown-linux-gnueabihf-gcc... > /home/jmcf125/ghc-raspberry-pi/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin/arm-linux-gnueabihf-gcc > checking whether the C compiler works... no > configure: error: in `/home/jmcf125/ghc-raspberry-pi/ghc/libraries/terminfo': > configure: error: C compiler cannot create executables > See `config.log' for more details > libraries/terminfo/ghc.mk:4: recipe for target 'libraries/terminfo/dist-install/package-data.mk' failed > make[1]: *** [libraries/terminfo/dist-install/package-data.mk] Error 77 > Makefile:71: recipe for target 'all' failed > make: *** [all] Error 2 > > Concerning the last 2, it'd be odd if they worked: how could, say, the > GCC cross-compiler for ARM use x86_64 libraries? I tried them anyway, > since the 1st 2 errors didn't seem to make sense... > > Also, options --includedir and --oldincludedir seem to have no effect > (always get issue 7754), and it doesn't matter if I build registerised > or not, the results are the same (registerised, I'm using LLVM 3.6.2-3). > > I'm sorry if I'm wasting your time with what to you might seem such a > simple thing, and not real development on GHC, but I don't know where > else to turn to. I'm still learning Haskell, and have never had to > compile compilers before.
> > Thank you in advance, > João Miguel > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From simonpj at microsoft.com Mon Sep 7 19:41:05 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 19:41:05 +0000 Subject: Unlifted data types In-Reply-To: <1441381504-sup-5051@sabre> References: <1441353701-sup-9422@sabre> <1441380599.3893947.374883985.0FBB1F3A@webmail.messagingengine.com> <1441381088-sup-172@sabre> <1441381504-sup-5051@sabre> Message-ID: <191e443a7b3049dab4a2384779c1dfda@DB4PR30MB030.064d.mgd.msft.net> | Michael Greenberg points out on Twitter that suspend must be a special | form, just like lambda abstraction. This isn't reflected on the wiki. Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Edward | Z. Yang | Sent: 04 September 2015 16:46 | To: Eric Seidel; ghc-devs | Subject: Re: Unlifted data types | | Excerpts from Edward Z. Yang's message of 2015-09-04 08:43:48 -0700: | > Yes. Actually, you have a good point that we'd like to have functions | > 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we | > can't generate 'Coercible' instances for these types unless Coercible | becomes | > polykinded. Perhaps we can make a new type class, or just magic | > polymorphic functions. | | Michael Greenberg points out on Twitter that suspend must be a special | form, just like lambda abstraction.
| | Edward | _______________________________________________ | ghc-devs mailing list | ghc-devs at haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From simonpj at microsoft.com Mon Sep 7 20:00:04 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 20:00:04 +0000 Subject: Unlifted data types In-Reply-To: <1441353701-sup-9422@sabre> References: <1441353701-sup-9422@sabre> Message-ID: <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> | After many discussions and beers at ICFP, I've written up my current | best understanding of the unlifted data types proposal: | | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes Too many beers! I like some, but not all, of this. There are several distinct things being mixed up. (1) First, a proposal to allow a data type to be declared to be unlifted. On its own, this is a pretty simple proposal: * Data types are always boxed, and so would unlifted data types * But an unlifted data type does not include bottom, and operationally is always represented by a pointer to the value. Just like Array#. * ALL the evaluation rules are IDENTICAL to those for other unlifted types such as Int# and Array#. Lets are strict, and cannot be recursive, function arguments are evaluated before the call. Literally nothing new here. * The code generator can generate more efficient case expressions, because the pointer always points to a value, never to a thunk or (I believe) an indirection. I think there are some special cases in GC and the RTS to ensure that this invariant holds. And that's it. Syntax: I'd suggest something more prominent than an Unlifted return kind, such as data unlifted T a = L a | R a but I would not die for this. I would really like to see this articulated as a stand-alone proposal. It makes sense by itself, and is really pretty simple. (2) Second, we cannot expect levity polymorphism. 
Consider map f (x:xs) = f x : map f xs Is the (f x) a thunk or is it evaluated strictly? Unless you are going to clone the code for map (which levity polymorphism is there to avoid), we can't answer "it depends on the type of (f x)". So, no, I think levity polymorphism is out. So I vote against splitting # into two: plain will do just fine. (3) Third, the stuff about Force and suspend. Provided you do no more than write library code that uses the above new features I'm fine. But there seems to be lots of stuff that dances around the hope that (Force a) is represented the same way as 'a'. I don't know how to make this fly. Is there a coercion in FC? If so then (a ~R Force a). And that seems very doubtful since we must do some evaluation. I got lost in all the traffic about it. (4) Fourth, you don't mention a related suggestion, namely to allow newtype T = MkT Int# with T getting kind #. I see no difficulty here. We do have (T ~R Int#). It's just a useful way of wrapping a newtype around an unlifted type. My suggestion: let's nail down (1), including a boxed version of Force and suspend as plain library code, if you want, and perhaps (4); and only THEN tackle the trickiness of unboxing Force. Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Edward | Z. Yang | Sent: 04 September 2015 09:04 | To: ghc-devs | Subject: Unlifted data types | | Hello friends, | | After many discussions and beers at ICFP, I've written up my current | best understanding of the unlifted data types proposal: | | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes | | Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, | Edward Kmett, and any others who I may have forgotten for crystallizing | this proposal.
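[Editor's note: the evaluation rules in Simon's point (1) above -- "lets are strict" -- can be previewed today with BangPatterns. A strict binding of a diverging value fails at bind time, exactly as a let of an unlifted type would have to. This is only an analogy on lifted Int, not the proposal itself.]

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Exception (SomeException, evaluate, try)

lazyBind :: Int
lazyBind = let x = error "boom" in 42     -- x never demanded: harmless

strictBind :: Int
strictBind = let !x = error "boom" in 42  -- forced at the binding: throws

main :: IO ()
main = do
  r1 <- try (evaluate lazyBind)   :: IO (Either SomeException Int)
  r2 <- try (evaluate strictBind) :: IO (Either SomeException Int)
  putStrLn (either (const "threw") show r1)  -- prints 42
  putStrLn (either (const "threw") show r2)  -- prints threw
```

For an unlifted data type every binding would behave like strictBind, which is why recursive lets are ruled out and why map cannot be levity polymorphic: the two bindings above compile to genuinely different code.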
| | Cheers, | Edward | _______________________________________________ | ghc-devs mailing list | ghc-devs at haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From ryan.gl.scott at gmail.com Mon Sep 7 20:02:45 2015 From: ryan.gl.scott at gmail.com (Ryan Scott) Date: Mon, 7 Sep 2015 16:02:45 -0400 Subject: Proposal: Automatic derivation of Lift In-Reply-To: References: Message-ID: Unlifted types can't be used polymorphically or in instance declarations, so this makes it impossible to do something like instance Generic Int# or store an Int# in one branch of a (:*:), preventing generics from doing anything in #-land. (unless someone has found a way to hack around this). I would be okay with implementing a generics-based approach, but we'd have to add a caveat that it will only work out-of-the-box on GHC 8.0 or later, due to TH's need to look up package information. (We could give users the ability to specify a package name manually as a workaround.) If this were added, where would be the best place to put it? th-lift? generic-deriving? template-haskell? A new package (lift-generics)? Ryan S. On Mon, Sep 7, 2015 at 3:10 PM, Matthew Pickering wrote: > Continuing my support of the generics route. Is there a fundamental > reason why it couldn't handle unlifted types? Given their relative > paucity, it seems like a fair compromise to generically define lift > instances for all normal data types but require TH for unlifted types. > This approach seems much smoother from a maintenance perspective. 
> > On Mon, Sep 7, 2015 at 5:26 PM, Ryan Scott wrote: >> There is a Lift typeclass defined in template-haskell [1] which, when >> a data type is an instance, permits it to be directly used in a TH >> quotation, like so >> >> data Example = Example >> >> instance Lift Example where >> lift Example = conE (mkNameG_d "" "" "Example") >> >> e :: Example >> e = [| Example |] >> >> Making Lift instances for most data types is straightforward and >> mechanical, so the proposal is to allow automatic derivation of Lift >> via a -XDeriveLift extension: >> >> data Example = Example deriving Lift >> >> This is actually a pretty a pretty old proposal [2], dating back to >> 2007. I wanted to have this feature for my needs, so I submitted a >> proof-of-concept at the GHC Trac issue page [3]. >> >> The question now is: do we really want to bake this feature into GHC? >> Since not many people opined on the Trac page, I wanted to submit this >> here for wider visibility and to have a discussion. >> >> Here are some arguments I have heard against this feature (please tell >> me if I am misrepresenting your opinion): >> >> * We already have a th-lift package [4] on Hackage which allows >> derivation of Lift via Template Haskell functions. In addition, if >> you're using Lift, chances are you're also using the -XTemplateHaskell >> extension in the first place, so th-lift should be suitable. >> * The same functionality could be added via GHC generics (as of GHC >> 7.12/8.0, which adds the ability to reify a datatype's package name >> [5]), if -XTemplateHaskell can't be used. >> * Adding another -XDerive- extension places a burden on GHC devs to >> maintain it in the future in response to further Template Haskell >> changes. >> >> Here are my (opinionated) responses to each of these: >> >> * th-lift isn't as fully-featured as a -XDerive- extension at the >> moment, since it can't do sophisticated type inference [6] or derive >> for data families. 
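[Editor's note: as a concrete reference for this thread, here is roughly the mechanical instance that -XDeriveLift (or th-lift) produces for a small type, written by hand. It uses mkName rather than the fully-qualified mkNameG_d from the proposal text, and the Wrap constructor is illustrative; this is a sketch, not the extension's exact output.]

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH (appE, conE, mkName)
import Language.Haskell.TH.Syntax (Lift (..), runQ)

data Example = Example | Wrap Int

-- The boilerplate -XDeriveLift would generate (hand-written sketch):
instance Lift Example where
  lift Example  = conE (mkName "Example")
  lift (Wrap n) = appE (conE (mkName "Wrap")) (lift n)

main :: IO ()
main = do
  -- Run the Q computation in IO to inspect the generated syntax tree
  -- that a splice like [| Wrap 3 |] would receive.
  e <- runQ (lift (Wrap 3))
  print e
```

Each constructor case is the same shape -- rebuild the constructor application, lifting every field -- which is what makes the derivation mechanical and a good fit for a -XDerive- extension.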
This is something that could be addressed with a >> patch to th-lift, though. >> * GHC generics wouldn't be enough to handle unlifted types like Int#, >> Char#, or Double# (which other -XDerive- extensions do). >> * This is a subjective measurement, but in terms of the amount of code >> I had to add, -XDeriveLift was substantially simpler than other >> -XDerive extensions, because there are fewer weird corner cases. Plus, >> I'd volunteer to maintain it :) >> >> Simon PJ wanted to know if other Template Haskell programmers would >> find -XDeriveLift useful. Would you be able to use it? Would you like >> to see a solution other than putting it into GHC? I'd love to hear >> feedback so we can bring some closure to this 8-year-old feature >> request. >> >> Ryan S. >> >> ----- >> [1] http://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH-Syntax.html#t:Lift >> [2] https://mail.haskell.org/pipermail/template-haskell/2007-October/000635.html >> [3] https://ghc.haskell.org/trac/ghc/ticket/1830 >> [4] http://hackage.haskell.org/package/th-lift >> [5] https://ghc.haskell.org/trac/ghc/ticket/10030 >> [6] https://ghc.haskell.org/trac/ghc/ticket/1830#comment:11 >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From ekmett at gmail.com Mon Sep 7 20:13:59 2015 From: ekmett at gmail.com (Edward Kmett) Date: Mon, 7 Sep 2015 16:13:59 -0400 Subject: ArrayArrays In-Reply-To: <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> Message-ID: I volunteered to write something up with the caveat that it would take me a while after the conference ended to get time to do so. 
I'll see what I can do. -Edward On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones wrote: > It was fun to meet and discuss this. > > > > Did someone volunteer to write a wiki page that describes the proposed > design? And, I earnestly hope, also describes the menagerie of currently > available array types and primops so that users can have some chance of > picking the right one?! > > > > Thanks > > > > Simon > > > > *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Ryan > Newton > *Sent:* 31 August 2015 23:11 > *To:* Edward Kmett; Johan Tibell > *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; > Ryan Scott; Ryan Yates > *Subject:* Re: ArrayArrays > > > > Dear Edward, Ryan Yates, and other interested parties -- > > > > So when should we meet up about this? > > > > May I propose the Tues afternoon break for everyone at ICFP who is > interested in this topic? We can meet out in the coffee area and > congregate around Edward Kmett, who is tall and should be easy to find ;-). > > > > I think Ryan is going to show us how to use his new primops for combined > array + other fields in one heap object? > > > > On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett wrote: > > Without a custom primitive it doesn't help much there, you have to store > the indirection to the mask. > > > > With a custom primitive it should cut the on heap root-to-leaf path of > everything in the HAMT in half. A shorter HashMap was actually one of the > motivating factors for me doing this. It is rather astoundingly difficult > to beat the performance of HashMap, so I had to start cheating pretty > badly. ;) > > > > -Edward > > > > On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell > wrote: > > I'd also be interested to chat at ICFP to see if I can use this for my > HAMT implementation. > > > > On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett wrote: > > Sounds good to me. 
Right now I'm just hacking up composable accessors for > "typed slots" in a fairly lens-like fashion, and treating the set of slots > I define and the 'new' function I build for the data type as its API, and > build atop that. This could eventually graduate to template-haskell, but > I'm not entirely satisfied with the solution I have. I currently > distinguish between what I'm calling "slots" (things that point directly to > another SmallMutableArrayArray# sans wrapper) and "fields" which point > directly to the usual Haskell data types because unifying the two notions > meant that I couldn't lift some coercions out "far enough" to make them > vanish. > > > > I'll be happy to run through my current working set of issues in person > and -- as things get nailed down further -- in a longer lived medium than > in personal conversations. ;) > > > > -Edward > > > > On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton wrote: > > I'd also love to meet up at ICFP and discuss this. I think the array > primops plus a TH layer that lets (ab)use them many times without too much > marginal cost sounds great. And I'd like to learn how we could be either > early users of, or help with, this infrastructure. > > > > CC'ing in Ryan Scot and Omer Agacan who may also be interested in dropping > in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student who is > currently working on concurrent data structures in Haskell, but will not be > at ICFP. > > > > > > On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote: > > I completely agree. I would love to spend some time during ICFP and > friends talking about what it could look like. My small array for STM > changes for the RTS can be seen here [1]. It is on a branch somewhere > between 7.8 and 7.10 and includes irrelevant STM bits and some > confusing naming choices (sorry), but should cover all the details > needed to implement it for a non-STM context. 
The biggest surprise > for me was following small array too closely and having a word/byte > offset miss-match [2]. > > [1]: > https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut > [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 > > Ryan > > > On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett wrote: > > I'd love to have that last 10%, but its a lot of work to get there and > more > > importantly I don't know quite what it should look like. > > > > On the other hand, I do have a pretty good idea of how the primitives > above > > could be banged out and tested in a long evening, well in time for 7.12. > And > > as noted earlier, those remain useful even if a nicer typed version with > an > > extra level of indirection to the sizes is built up after. > > > > The rest sounds like a good graduate student project for someone who has > > graduate students lying around. Maybe somebody at Indiana University who > has > > an interest in type theory and parallelism can find us one. =) > > > > -Edward > > > > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates wrote: > >> > >> I think from my perspective, the motivation for getting the type > >> checker involved is primarily bringing this to the level where users > >> could be expected to build these structures. it is reasonable to > >> think that there are people who want to use STM (a context with > >> mutation already) to implement a straight forward data structure that > >> avoids extra indirection penalty. There should be some places where > >> knowing that things are field accesses rather then array indexing > >> could be helpful, but I think GHC is good right now about handling > >> constant offsets. In my code I don't do any bounds checking as I know > >> I will only be accessing my arrays with constant indexes. I make > >> wrappers for each field access and leave all the unsafe stuff in > >> there. When things go wrong though, the compiler is no help. 
Maybe > >> template Haskell that generates the appropriate wrappers is the right > >> direction to go. > >> There is another benefit for me when working with these as arrays in > >> that it is quite simple and direct (given the hoops already jumped > >> through) to play with alignment. I can ensure two pointers are never > >> on the same cache-line by just spacing things out in the array. > >> > >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett wrote: > >> > They just segfault at this level. ;) > >> > > >> > Sent from my iPhone > >> > > >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton wrote: > >> > > >> > You presumably also save a bounds check on reads by hard-coding the > >> > sizes? > >> > > >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett > wrote: > >> >> > >> >> Also there are 4 different "things" here, basically depending on two > >> >> independent questions: > >> >> > >> >> a.) if you want to shove the sizes into the info table, and > >> >> b.) if you want cardmarking. > >> >> > >> >> Versions with/without cardmarking for different sizes can be done > >> >> pretty > >> >> easily, but as noted, the infotable variants are pretty invasive. > >> >> > >> >> -Edward > >> >> > >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett > wrote: > >> >>> > >> >>> Well, on the plus side you'd save 16 bytes per object, which adds up > >> >>> if > >> >>> they were small enough and there are enough of them. You get a bit > >> >>> better > >> >>> locality of reference in terms of what fits in the first cache line > of > >> >>> them. > >> >>> > >> >>> -Edward > >> >>> > >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton > >> >>> wrote: > >> >>>> > >> >>>> Yes. And for the short term I can imagine places we will settle > with > >> >>>> arrays even if it means tracking lengths unnecessarily and > >> >>>> unsafeCoercing > >> >>>> pointers whose types don't actually match their siblings. 
> >> >>>> > >> >>>> Is there anything to recommend the hacks mentioned for fixed sized > >> >>>> array > >> >>>> objects *other* than using them to fake structs? (Much to > >> >>>> derecommend, as > >> >>>> you mentioned!) > >> >>>> > >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett > >> >>>> wrote: > >> >>>>> > >> >>>>> I think both are useful, but the one you suggest requires a lot > more > >> >>>>> plumbing and doesn't subsume all of the usecases of the other. > >> >>>>> > >> >>>>> -Edward > >> >>>>> > >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton > >> >>>>> wrote: > >> >>>>>> > >> >>>>>> So that primitive is an array like thing (Same pointed type, > >> >>>>>> unbounded > >> >>>>>> length) with extra payload. > >> >>>>>> > >> >>>>>> I can see how we can do without structs if we have arrays, > >> >>>>>> especially > >> >>>>>> with the extra payload at front. But wouldn't the general > solution > >> >>>>>> for > >> >>>>>> structs be one that that allows new user data type defs for # > >> >>>>>> types? > >> >>>>>> > >> >>>>>> > >> >>>>>> > >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett > >> >>>>>> wrote: > >> >>>>>>> > >> >>>>>>> Some form of MutableStruct# with a known number of words and a > >> >>>>>>> known > >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting > >> >>>>>>> above, but > >> >>>>>>> where the word counts were stored in the objects themselves. > >> >>>>>>> > >> >>>>>>> Given that it'd have a couple of words for those counts it'd > >> >>>>>>> likely > >> >>>>>>> want to be something we build in addition to MutVar# rather > than a > >> >>>>>>> replacement. > >> >>>>>>> > >> >>>>>>> On the other hand, if we had to fix those numbers and build info > >> >>>>>>> tables that knew them, and typechecker support, for instance, > it'd > >> >>>>>>> get > >> >>>>>>> rather invasive. 
> >> >>>>>>> > >> >>>>>>> Also, a number of things that we can do with the 'sized' > versions > >> >>>>>>> above, like working with evil unsized c-style arrays directly > >> >>>>>>> inline at the > >> >>>>>>> end of the structure cease to be possible, so it isn't even a > pure > >> >>>>>>> win if we > >> >>>>>>> did the engineering effort. > >> >>>>>>> > >> >>>>>>> I think 90% of the needs I have are covered just by adding the > one > >> >>>>>>> primitive. The last 10% gets pretty invasive. > >> >>>>>>> > >> >>>>>>> -Edward > >> >>>>>>> > >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton < > rrnewton at gmail.com> > >> >>>>>>> wrote: > >> >>>>>>>> > >> >>>>>>>> I like the possibility of a general solution for mutable > structs > >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's > hard. > >> >>>>>>>> > >> >>>>>>>> So, we can't unpack MutVar into constructors because of object > >> >>>>>>>> identity problems. But what about directly supporting an > >> >>>>>>>> extensible set of > >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) > >> >>>>>>>> MutVar#? That > >> >>>>>>>> may be too much work, but is it problematic otherwise? > >> >>>>>>>> > >> >>>>>>>> Needless to say, this is also critical if we ever want best in > >> >>>>>>>> class > >> >>>>>>>> lockfree mutable structures, just like their Stm and sequential > >> >>>>>>>> counterparts. > >> >>>>>>>> > >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones > >> >>>>>>>> wrote: > >> >>>>>>>>> > >> >>>>>>>>> At the very least I'll take this email and turn it into a > short > >> >>>>>>>>> article. > >> >>>>>>>>> > >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and > >> >>>>>>>>> maybe > >> >>>>>>>>> make a ticket for it. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Thanks > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Simon > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] > >> >>>>>>>>> Sent: 27 August 2015 16:54 > >> >>>>>>>>> To: Simon Peyton Jones > >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs > >> >>>>>>>>> Subject: Re: ArrayArrays > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It > >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or > ByteArray#'s. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> While those live in #, they are garbage collected objects, so > >> >>>>>>>>> this > >> >>>>>>>>> all lives on the heap. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> They were added to make some of the DPH stuff fast when it has > >> >>>>>>>>> to > >> >>>>>>>>> deal with nested arrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm currently abusing them as a placeholder for a better > thing. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The Problem > >> >>>>>>>>> > >> >>>>>>>>> ----------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Consider the scenario where you write a classic doubly-linked > >> >>>>>>>>> list > >> >>>>>>>>> in Haskell. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers > >> >>>>>>>>> on > >> >>>>>>>>> the heap. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> > >> >>>>>>>>> Maybe > >> >>>>>>>>> DLL ~> DLL > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> That is 3 levels of indirection. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We can trim one by simply unpacking the IORef with > >> >>>>>>>>> -funbox-strict-fields or UNPACK > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and > >> >>>>>>>>> worsening our representation. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> but now we're still stuck with a level of indirection > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> This means that every operation we perform on this structure > >> >>>>>>>>> will > >> >>>>>>>>> be about half of the speed of an implementation in most other > >> >>>>>>>>> languages > >> >>>>>>>>> assuming we're memory bound on loading things into cache! > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Making Progress > >> >>>>>>>>> > >> >>>>>>>>> ---------------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I have been working on a number of data structures where the > >> >>>>>>>>> indirection of going from something in * out to an object in # > >> >>>>>>>>> which > >> >>>>>>>>> contains the real pointer to my target and coming back > >> >>>>>>>>> effectively doubles > >> >>>>>>>>> my runtime. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the > >> >>>>>>>>> MutVar# > >> >>>>>>>>> onto the mutable list when we dirty it. There is a well > defined > >> >>>>>>>>> write-barrier. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I could change out the representation to use > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I can just store two pointers in the MutableArray# every time, > >> >>>>>>>>> but > >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the amount > of > >> >>>>>>>>> distinct > >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per > >> >>>>>>>>> object to 2. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I still have to go out to the heap from my DLL and get to the > >> >>>>>>>>> array > >> >>>>>>>>> object and then chase it to the next DLL and chase that to the > >> >>>>>>>>> next array. I > >> >>>>>>>>> do get my two pointers together in memory though. I'm paying > for > >> >>>>>>>>> a card > >> >>>>>>>>> marking table as well, which I don't particularly need with > just > >> >>>>>>>>> two > >> >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" > >> >>>>>>>>> machinery added > >> >>>>>>>>> back in 7.10, which is just the old array code a a new data > >> >>>>>>>>> type, which can > >> >>>>>>>>> speed things up a bit when you don't have very big arrays: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> But what if I wanted my object itself to live in # and have > two > >> >>>>>>>>> mutable fields and be able to share the sme write barrier? > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types. 
> >> >>>>>>>>> What > >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with the > >> >>>>>>>>> impedence > >> >>>>>>>>> mismatch between the imperative world and Haskell, and then > just > >> >>>>>>>>> let the > >> >>>>>>>>> ArrayArray#'s hold other arrayarrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a > >> >>>>>>>>> special > >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even > >> >>>>>>>>> abuse pattern > >> >>>>>>>>> synonyms. Alternately I can exploit the internals further to > >> >>>>>>>>> make this > >> >>>>>>>>> cheaper. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Then I can use the readMutableArrayArray# and > >> >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding > >> >>>>>>>>> and next > >> >>>>>>>>> entry in the linked list. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' > into a > >> >>>>>>>>> strict world, and everything there lives in #. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> next :: DLL -> IO DLL > >> >>>>>>>>> > >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of > >> >>>>>>>>> > >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code > to > >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty > >> >>>>>>>>> easily when they > >> >>>>>>>>> are known strict and you chain operations of this sort! 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Cleaning it Up > >> >>>>>>>>> > >> >>>>>>>>> ------------------ > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Now I have one outermost indirection pointing to an array that > >> >>>>>>>>> points directly to other arrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm stuck paying for a card marking table per object, but I > can > >> >>>>>>>>> fix > >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and using > a > >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a > >> >>>>>>>>> mixture of > >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data > structure. > >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the > >> >>>>>>>>> existing > >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one of the > >> >>>>>>>>> arguments it > >> >>>>>>>>> takes. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> This is almost ideal, but not quite. I often have fields that > >> >>>>>>>>> would > >> >>>>>>>>> be best left unboxed. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> was able to unpack the Int, but we lost that. We can currently > >> >>>>>>>>> at > >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at a > >> >>>>>>>>> boxed or at a > >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the int > in > >> >>>>>>>>> question in > >> >>>>>>>>> there. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to > >> >>>>>>>>> store masks and administrivia as I walk down the tree. Having > to > >> >>>>>>>>> go off to > >> >>>>>>>>> the side costs me the entire win from avoiding the first > pointer > >> >>>>>>>>> chase. 
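The mask bookkeeping a HAMT walk needs is small but has to live somewhere: a bitmap records which child slots are occupied, and a population count compresses a slot number into an index into the densely packed child array. A base-only sketch of that indexing arithmetic (function names are mine, not from any particular HAMT implementation):

```haskell
-- HAMT-style sparse indexing: a bitmap marks occupied child slots; the
-- index of a child in the packed array is the number of set bits below
-- its slot position.
import Data.Bits (bit, popCount, testBit, (.&.))
import Data.Word (Word64)

-- Is the child for slot i present?
present :: Word64 -> Int -> Bool
present bitmap i = testBit bitmap i

-- Index of slot i's child within the packed child array.
sparseIndex :: Word64 -> Int -> Int
sparseIndex bitmap i = popCount (bitmap .&. (bit i - 1))

main :: IO ()
main = do
  let bitmap = bit 1 + bit 4 + bit 9 :: Word64  -- children at slots 1, 4, 9
  print (present bitmap 4, present bitmap 5)
  print (map (sparseIndex bitmap) [1, 4, 9])
```

Keeping this bitmap inline with the child pointers, rather than behind a separate indirection, is exactly the win being discussed.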
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could > >> >>>>>>>>> construct that had n words with unsafe access and m pointers > to > >> >>>>>>>>> other heap > >> >>>>>>>>> objects, one that could put itself on the mutable list when > any > >> >>>>>>>>> of those > >> >>>>>>>>> pointers changed then I could shed this last factor of two in > >> >>>>>>>>> all > >> >>>>>>>>> circumstances. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Prototype > >> >>>>>>>>> > >> >>>>>>>>> ------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Over the last few days I've put together a small prototype > >> >>>>>>>>> implementation with a few non-trivial imperative data > structures > >> >>>>>>>>> for things > >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and > >> >>>>>>>>> order-maintenance. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> https://github.com/ekmett/structs > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Notable bits: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of > >> >>>>>>>>> link-cut > >> >>>>>>>>> trees in this style. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that > >> >>>>>>>>> make > >> >>>>>>>>> it go fast. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost > >> >>>>>>>>> all > >> >>>>>>>>> the references to the LinkCut or Object data constructor get > >> >>>>>>>>> optimized away, > >> >>>>>>>>> and we're left with beautiful strict code directly mutating > our > >> >>>>>>>>> underlying > >> >>>>>>>>> representation. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> At the very least I'll take this email and turn it into a > short > >> >>>>>>>>> article. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> -Edward > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones > >> >>>>>>>>> wrote: > >> >>>>>>>>> > >> >>>>>>>>> Just to say that I have no idea what is going on in this > thread. > >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is there a > >> >>>>>>>>> ticket? Is > >> >>>>>>>>> there a wiki page? > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be a > >> >>>>>>>>> good > >> >>>>>>>>> thing. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Simon > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On > Behalf > >> >>>>>>>>> Of > >> >>>>>>>>> Edward Kmett > >> >>>>>>>>> Sent: 21 August 2015 05:25 > >> >>>>>>>>> To: Manuel M T Chakravarty > >> >>>>>>>>> Cc: Simon Marlow; ghc-devs > >> >>>>>>>>> Subject: Re: ArrayArrays > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would > be > >> >>>>>>>>> very handy as well. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Consider right now if I have something like an > order-maintenance > >> >>>>>>>>> structure I have: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# > >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s > >> >>>>>>>>> (Upper s)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# > >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s > >> >>>>>>>>> (Lower s)) {-# > >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The former contains, logically, a mutable integer and two > >> >>>>>>>>> pointers, > >> >>>>>>>>> one for forward and one for backwards. The latter is basically > >> >>>>>>>>> the same > >> >>>>>>>>> thing with a mutable reference up pointing at the structure > >> >>>>>>>>> above. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On the heap this is an object that points to a structure for > the > >> >>>>>>>>> bytearray, and points to another structure for each mutvar > which > >> >>>>>>>>> each point > >> >>>>>>>>> to the other 'Upper' structure. So there is a level of > >> >>>>>>>>> indirection smeared > >> >>>>>>>>> over everything. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> So this is a pair of doubly linked lists with an upward link > >> >>>>>>>>> from > >> >>>>>>>>> the structure below to the structure above. 
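The boxed representation being described can be written down directly with base alone, with IORef standing in for MutVar and an IORef Int standing in for the MutableByteArray holding the label; this is the layout whose "smeared" indirections the ArrayArray# encoding removes. The self-tied prev/next links also show the sentinel trick of marking list ends by pointing a node back at itself. All names here are illustrative:

```haskell
-- Boxed analogue of the order-maintenance records from the thread.
import Data.IORef

data Upper = Upper { uLabel :: IORef Int, uPrev, uNext :: IORef Upper }
data Lower = Lower { lUp :: IORef Upper, lLabel :: IORef Int
                   , lPrev, lNext :: IORef Lower }  -- shown for shape only

-- A one-element circular upper list: prev and next tied back to the node
-- itself, so "end of list" is detected by pointer identity.
newUpper :: Int -> IO Upper
newUpper n = do
  lbl <- newIORef n
  p   <- newIORef (error "uninitialized prev")  -- never forced
  nx  <- newIORef (error "uninitialized next")  -- never forced
  let u = Upper lbl p nx
  writeIORef p u
  writeIORef nx u
  return u

main :: IO ()
main = do
  u  <- newUpper 42
  u' <- readIORef (uNext u)          -- following next lands back on u
  print =<< readIORef (uLabel u')
  print (uLabel u == uLabel u')      -- object identity via IORef equality
```

Each `IORef` field here is a separate MutVar# heap object; the ArrayArray# version collapses all of them into one array plus one byte array.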
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Converted into ArrayArray#s I'd get > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and > >> >>>>>>>>> the > >> >>>>>>>>> next 2 slots pointing to the previous and next > objects, > >> >>>>>>>>> represented > >> >>>>>>>>> just as their MutableArrayArray#s. I can use > >> >>>>>>>>> sameMutableArrayArray# on these > >> >>>>>>>>> for object identity, which lets me check for the ends of the > >> >>>>>>>>> lists by tying > >> >>>>>>>>> things back on themselves. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> and below that > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing up > to > >> >>>>>>>>> an > >> >>>>>>>>> upper structure. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I can then write a handful of combinators for getting out the > >> >>>>>>>>> slots > >> >>>>>>>>> in question, and while it has gained a level of indirection > between > >> >>>>>>>>> the wrapper > >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one can > >> >>>>>>>>> be basically > >> >>>>>>>>> erased by ghc. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Unlike before I don't have several separate objects on the > heap > >> >>>>>>>>> for > >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for the > >> >>>>>>>>> object itself, > >> >>>>>>>>> and the MutableByteArray# that it references to carry around > the > >> >>>>>>>>> mutable > >> >>>>>>>>> int. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The only pain points are > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> 1.) 
the aforementioned limitation that currently prevents me > >> >>>>>>>>> from > >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array into > an > >> >>>>>>>>> ArrayArray > >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest of > >> >>>>>>>>> Haskell, > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> and > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid > the > >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 > pointers > >> >>>>>>>>> wide. Card > >> >>>>>>>>> marking doesn't help. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Alternately I could just try to do really evil things and > >> >>>>>>>>> convert > >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to > >> >>>>>>>>> unsafeCoerce my way > >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays > >> >>>>>>>>> directly into the > >> >>>>>>>>> SmallArray as slots, removing the limitation we see here by > >> >>>>>>>>> aping the > >> >>>>>>>>> MutableArrayArray# s API, but that gets really really > dangerous! > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the > >> >>>>>>>>> altar > >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move them > >> >>>>>>>>> and collect > >> >>>>>>>>> them which rules out simpler Ptr and Addr based solutions. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> -Edward > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty > >> >>>>>>>>> wrote: > >> >>>>>>>>> > >> >>>>>>>>> That's an interesting idea. > >> >>>>>>>>> > >> >>>>>>>>> Manuel > >> >>>>>>>>> > >> >>>>>>>>> > Edward Kmett : > >> >>>>>>>>> > >> >>>>>>>>> > > >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# and > >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? 
The fact that the > >> >>>>>>>>> > ArrayArray# entries > >> >>>>>>>>> > are all directly unlifted avoiding a level of indirection > for > >> >>>>>>>>> > the containing > >> >>>>>>>>> > structure is amazing, but I can only currently use it if my > >> >>>>>>>>> > leaf level data > >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. It'd > be > >> >>>>>>>>> > nice to be > >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down at > >> >>>>>>>>> > the leaves to > >> >>>>>>>>> > hold lifted contents. > >> >>>>>>>>> > > >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to > >> >>>>>>>>> > access > >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do > that > >> >>>>>>>>> > if i tried to > >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a > >> >>>>>>>>> > ByteArray# > >> >>>>>>>>> > anyways, so it isn't like there is a safety story preventing > >> >>>>>>>>> > this. > >> >>>>>>>>> > > >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection > >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I > >> >>>>>>>>> > could shoehorn a > >> >>>>>>>>> > number of them into ArrayArrays if this worked. > >> >>>>>>>>> > > >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary > >> >>>>>>>>> > indirection compared to c/java and this could reduce that > pain > >> >>>>>>>>> > to just 1 > >> >>>>>>>>> > level of unnecessary indirection. 
> >> >>>>>>>>> > > > >> >>>>>>>>> > -Edward > >> >>>>>>>>> > >> >>>>>>>>> > _______________________________________________ > >> >>>>>>>>> > ghc-devs mailing list > >> >>>>>>>>> > ghc-devs at haskell.org > >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> _______________________________________________ > >> >>>>>>>>> ghc-devs mailing list > >> >>>>>>>>> ghc-devs at haskell.org > >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> >>>>>>> > >> >>>>>>> > >> >>>>> > >> >>> > >> >> > >> > > >> > > >> > _______________________________________________ > >> > ghc-devs mailing list > >> > ghc-devs at haskell.org > >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> > > > > > > > > > > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekmett at gmail.com Mon Sep 7 20:16:57 2015 From: ekmett at gmail.com (Edward Kmett) Date: Mon, 7 Sep 2015 16:16:57 -0400 Subject: ArrayArrays In-Reply-To: <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> Message-ID: I had a brief discussion with Richard during the Haskell Symposium about how we might be able to let parametricity help a bit in reducing the space of necessary primops to a slightly more manageable level. Notably, it'd be interesting to explore the ability to allow parametricity over the portion of # that is just a gcptr. We could do this if the levity polymorphism machinery was tweaked a bit. 
You could envision the ability to abstract over things in both * and the subset of # that are represented by a gcptr, then modifying the existing array primitives to be parametric in that choice of levity for their argument so long as it was of a "heap object" levity. This could make the menagerie of ways to pack {Small}{Mutable}Array{Array}# references into a {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the need for folks to descend into the use of the more evil structure primitives we're talking about, and letting us keep a few more principles around us. Then in the cases like `atomicModifyMutVar#` where it needs to actually be in * rather than just a gcptr, due to the constructed field selectors it introduces on the heap then we could keep the existing less polymorphic type. -Edward On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones wrote: > It was fun to meet and discuss this. > > > > Did someone volunteer to write a wiki page that describes the proposed > design? And, I earnestly hope, also describes the menagerie of currently > available array types and primops so that users can have some chance of > picking the right one?! > > > > Thanks > > > > Simon > > > > *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Ryan > Newton > *Sent:* 31 August 2015 23:11 > *To:* Edward Kmett; Johan Tibell > *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; > Ryan Scott; Ryan Yates > *Subject:* Re: ArrayArrays > > > > Dear Edward, Ryan Yates, and other interested parties -- > > > > So when should we meet up about this? > > > > May I propose the Tues afternoon break for everyone at ICFP who is > interested in this topic? We can meet out in the coffee area and > congregate around Edward Kmett, who is tall and should be easy to find ;-). > > > > I think Ryan is going to show us how to use his new primops for combined > array + other fields in one heap object? 
> > > > On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett wrote: > > Without a custom primitive it doesn't help much there, you have to store > the indirection to the mask. > > > > With a custom primitive it should cut the on heap root-to-leaf path of > everything in the HAMT in half. A shorter HashMap was actually one of the > motivating factors for me doing this. It is rather astoundingly difficult > to beat the performance of HashMap, so I had to start cheating pretty > badly. ;) > > > > -Edward > > > > On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell > wrote: > > I'd also be interested to chat at ICFP to see if I can use this for my > HAMT implementation. > > > > On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett wrote: > > Sounds good to me. Right now I'm just hacking up composable accessors for > "typed slots" in a fairly lens-like fashion, and treating the set of slots > I define and the 'new' function I build for the data type as its API, and > build atop that. This could eventually graduate to template-haskell, but > I'm not entirely satisfied with the solution I have. I currently > distinguish between what I'm calling "slots" (things that point directly to > another SmallMutableArrayArray# sans wrapper) and "fields" which point > directly to the usual Haskell data types because unifying the two notions > meant that I couldn't lift some coercions out "far enough" to make them > vanish. > > > > I'll be happy to run through my current working set of issues in person > and -- as things get nailed down further -- in a longer lived medium than > in personal conversations. ;) > > > > -Edward > > > > On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton wrote: > > I'd also love to meet up at ICFP and discuss this. I think the array > primops plus a TH layer that lets (ab)use them many times without too much > marginal cost sounds great. And I'd like to learn how we could be either > early users of, or help with, this infrastructure. 
> > > > CC'ing in Ryan Scot and Omer Agacan who may also be interested in dropping > in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student who is > currently working on concurrent data structures in Haskell, but will not be > at ICFP. > > > > > > On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote: > > I completely agree. I would love to spend some time during ICFP and > friends talking about what it could look like. My small array for STM > changes for the RTS can be seen here [1]. It is on a branch somewhere > between 7.8 and 7.10 and includes irrelevant STM bits and some > confusing naming choices (sorry), but should cover all the details > needed to implement it for a non-STM context. The biggest surprise > for me was following small array too closely and having a word/byte > offset miss-match [2]. > > [1]: > https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut > [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 > > Ryan > > > On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett wrote: > > I'd love to have that last 10%, but its a lot of work to get there and > more > > importantly I don't know quite what it should look like. > > > > On the other hand, I do have a pretty good idea of how the primitives > above > > could be banged out and tested in a long evening, well in time for 7.12. > And > > as noted earlier, those remain useful even if a nicer typed version with > an > > extra level of indirection to the sizes is built up after. > > > > The rest sounds like a good graduate student project for someone who has > > graduate students lying around. Maybe somebody at Indiana University who > has > > an interest in type theory and parallelism can find us one. =) > > > > -Edward > > > > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates wrote: > >> > >> I think from my perspective, the motivation for getting the type > >> checker involved is primarily bringing this to the level where users > >> could be expected to build these structures. 
it is reasonable to > >> think that there are people who want to use STM (a context with > >> mutation already) to implement a straight forward data structure that > >> avoids extra indirection penalty. There should be some places where > >> knowing that things are field accesses rather then array indexing > >> could be helpful, but I think GHC is good right now about handling > >> constant offsets. In my code I don't do any bounds checking as I know > >> I will only be accessing my arrays with constant indexes. I make > >> wrappers for each field access and leave all the unsafe stuff in > >> there. When things go wrong though, the compiler is no help. Maybe > >> template Haskell that generates the appropriate wrappers is the right > >> direction to go. > >> There is another benefit for me when working with these as arrays in > >> that it is quite simple and direct (given the hoops already jumped > >> through) to play with alignment. I can ensure two pointers are never > >> on the same cache-line by just spacing things out in the array. > >> > >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett wrote: > >> > They just segfault at this level. ;) > >> > > >> > Sent from my iPhone > >> > > >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton wrote: > >> > > >> > You presumably also save a bounds check on reads by hard-coding the > >> > sizes? > >> > > >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett > wrote: > >> >> > >> >> Also there are 4 different "things" here, basically depending on two > >> >> independent questions: > >> >> > >> >> a.) if you want to shove the sizes into the info table, and > >> >> b.) if you want cardmarking. > >> >> > >> >> Versions with/without cardmarking for different sizes can be done > >> >> pretty > >> >> easily, but as noted, the infotable variants are pretty invasive. 
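The wrapper style Ryan describes, where each logical field of a node is a constant index into an array and every access goes through a named wrapper with no bounds check, can be sketched with the `array` package's unsafe operations. The field names and indices below are invented for illustration; the point is that all the unsafety is confined to these wrappers:

```haskell
-- Field-access wrappers over a fixed-size unboxed array "node".
-- Constant indices mean GHC sees constant offsets and no bounds checks.
import Data.Array.Base (unsafeRead, unsafeWrite)
import Data.Array.IO (IOUArray, newArray)

type Node = IOUArray Int Int

-- Illustrative field layout; spacing fields further apart is also how
-- one can keep two hot words off the same cache line.
keyIx, leftIx, rightIx :: Int
keyIx = 0; leftIx = 1; rightIx = 2

newNode :: IO Node
newNode = newArray (0, 2) 0

getKey :: Node -> IO Int
getKey n = unsafeRead n keyIx      -- no bounds check: index is a constant

setKey :: Node -> Int -> IO ()
setKey n = unsafeWrite n keyIx

main :: IO ()
main = do
  n <- newNode
  setKey n 7
  print =<< getKey n
```

As noted in the thread, when one of these wrappers names the wrong index the compiler is no help, which is the motivation for generating them with Template Haskell instead.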
> >> >> > >> >> -Edward > >> >> > >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett > wrote: > >> >>> > >> >>> Well, on the plus side you'd save 16 bytes per object, which adds up > >> >>> if > >> >>> they were small enough and there are enough of them. You get a bit > >> >>> better > >> >>> locality of reference in terms of what fits in the first cache line > of > >> >>> them. > >> >>> > >> >>> -Edward > >> >>> > >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton > >> >>> wrote: > >> >>>> > >> >>>> Yes. And for the short term I can imagine places we will settle > with > >> >>>> arrays even if it means tracking lengths unnecessarily and > >> >>>> unsafeCoercing > >> >>>> pointers whose types don't actually match their siblings. > >> >>>> > >> >>>> Is there anything to recommend the hacks mentioned for fixed sized > >> >>>> array > >> >>>> objects *other* than using them to fake structs? (Much to > >> >>>> derecommend, as > >> >>>> you mentioned!) > >> >>>> > >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett > >> >>>> wrote: > >> >>>>> > >> >>>>> I think both are useful, but the one you suggest requires a lot > more > >> >>>>> plumbing and doesn't subsume all of the usecases of the other. > >> >>>>> > >> >>>>> -Edward > >> >>>>> > >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton > >> >>>>> wrote: > >> >>>>>> > >> >>>>>> So that primitive is an array like thing (Same pointed type, > >> >>>>>> unbounded > >> >>>>>> length) with extra payload. > >> >>>>>> > >> >>>>>> I can see how we can do without structs if we have arrays, > >> >>>>>> especially > >> >>>>>> with the extra payload at front. But wouldn't the general > solution > >> >>>>>> for > >> >>>>>> structs be one that that allows new user data type defs for # > >> >>>>>> types? 
> >> >>>>>> > >> >>>>>> > >> >>>>>> > >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett > >> >>>>>> wrote: > >> >>>>>>> > >> >>>>>>> Some form of MutableStruct# with a known number of words and a > >> >>>>>>> known > >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting > >> >>>>>>> above, but > >> >>>>>>> where the word counts were stored in the objects themselves. > >> >>>>>>> > >> >>>>>>> Given that it'd have a couple of words for those counts it'd > >> >>>>>>> likely > >> >>>>>>> want to be something we build in addition to MutVar# rather > than a > >> >>>>>>> replacement. > >> >>>>>>> > >> >>>>>>> On the other hand, if we had to fix those numbers and build info > >> >>>>>>> tables that knew them, and typechecker support, for instance, > it'd > >> >>>>>>> get > >> >>>>>>> rather invasive. > >> >>>>>>> > >> >>>>>>> Also, a number of things that we can do with the 'sized' > versions > >> >>>>>>> above, like working with evil unsized c-style arrays directly > >> >>>>>>> inline at the > >> >>>>>>> end of the structure cease to be possible, so it isn't even a > pure > >> >>>>>>> win if we > >> >>>>>>> did the engineering effort. > >> >>>>>>> > >> >>>>>>> I think 90% of the needs I have are covered just by adding the > one > >> >>>>>>> primitive. The last 10% gets pretty invasive. > >> >>>>>>> > >> >>>>>>> -Edward > >> >>>>>>> > >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton < > rrnewton at gmail.com> > >> >>>>>>> wrote: > >> >>>>>>>> > >> >>>>>>>> I like the possibility of a general solution for mutable > structs > >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's > hard. > >> >>>>>>>> > >> >>>>>>>> So, we can't unpack MutVar into constructors because of object > >> >>>>>>>> identity problems. But what about directly supporting an > >> >>>>>>>> extensible set of > >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) > >> >>>>>>>> MutVar#? 
That > >> >>>>>>>> may be too much work, but is it problematic otherwise? > >> >>>>>>>> > >> >>>>>>>> Needless to say, this is also critical if we ever want best in > >> >>>>>>>> class > >> >>>>>>>> lockfree mutable structures, just like their Stm and sequential > >> >>>>>>>> counterparts. > >> >>>>>>>> > >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones > >> >>>>>>>> wrote: > >> >>>>>>>>> > >> >>>>>>>>> At the very least I'll take this email and turn it into a > short > >> >>>>>>>>> article. > >> >>>>>>>>> > >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and > >> >>>>>>>>> maybe > >> >>>>>>>>> make a ticket for it. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Thanks > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Simon > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] > >> >>>>>>>>> Sent: 27 August 2015 16:54 > >> >>>>>>>>> To: Simon Peyton Jones > >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs > >> >>>>>>>>> Subject: Re: ArrayArrays > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It > >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or > ByteArray#'s. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> While those live in #, they are garbage collected objects, so > >> >>>>>>>>> this > >> >>>>>>>>> all lives on the heap. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> They were added to make some of the DPH stuff fast when it has > >> >>>>>>>>> to > >> >>>>>>>>> deal with nested arrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm currently abusing them as a placeholder for a better > thing. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The Problem > >> >>>>>>>>> > >> >>>>>>>>> ----------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Consider the scenario where you write a classic doubly-linked > >> >>>>>>>>> list > >> >>>>>>>>> in Haskell. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers > >> >>>>>>>>> on > >> >>>>>>>>> the heap. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> > >> >>>>>>>>> Maybe > >> >>>>>>>>> DLL ~> DLL > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> That is 3 levels of indirection. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We can trim one by simply unpacking the IORef with > >> >>>>>>>>> -funbox-strict-fields or UNPACK > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and > >> >>>>>>>>> worsening our representation. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> but now we're still stuck with a level of indirection > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> This means that every operation we perform on this structure > >> >>>>>>>>> will > >> >>>>>>>>> be about half of the speed of an implementation in most other > >> >>>>>>>>> languages > >> >>>>>>>>> assuming we're memory bound on loading things into cache! 
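Both boxed representations from this passage are runnable as-is with base; the first pays three indirections per hop, the second UNPACKs the IORef and trades `Maybe` for a `Nil` constructor, leaving one MutVar# indirection per hop. A minimal sketch (helper names are mine):

```haskell
import Data.IORef

-- Naive version: DLL ~> IORef ~> MutVar# ~> Maybe ~> DLL per hop.
-- Shown only for comparison; unused below.
data DLL0 = DLL0 (IORef (Maybe DLL0)) (IORef (Maybe DLL0))

-- Improved version: the IORefs unpack to MutVar#s, and Nil replaces Maybe,
-- so each hop is DLL ~> MutVar# ~> DLL.
data DLL = DLL {-# UNPACK #-} !(IORef DLL) {-# UNPACK #-} !(IORef DLL)
         | Nil

newNodeD :: IO DLL
newNodeD = DLL <$> newIORef Nil <*> newIORef Nil

nextD :: DLL -> IO DLL
nextD Nil         = return Nil
nextD (DLL _ nxt) = readIORef nxt

main :: IO ()
main = do
  DLL _ an <- newNodeD      -- node a; an is its 'next' reference
  b <- newNodeD
  writeIORef an b           -- link a -> b
  n  <- readIORef an
  n2 <- nextD n             -- b's next is still Nil
  putStrLn (case n2 of Nil -> "end"; DLL _ _ -> "node")
```

Even the improved version is the "factor of two" baseline that the ArrayArray# encoding then beats.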
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Making Progress > >> >>>>>>>>> > >> >>>>>>>>> ---------------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I have been working on a number of data structures where the > >> >>>>>>>>> indirection of going from something in * out to an object in # > >> >>>>>>>>> which > >> >>>>>>>>> contains the real pointer to my target and coming back > >> >>>>>>>>> effectively doubles > >> >>>>>>>>> my runtime. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the > >> >>>>>>>>> MutVar# > >> >>>>>>>>> onto the mutable list when we dirty it. There is a well > defined > >> >>>>>>>>> write-barrier. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I could change out the representation to use > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I can just store two pointers in the MutableArray# every time, > >> >>>>>>>>> but > >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the amount > of > >> >>>>>>>>> distinct > >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per > >> >>>>>>>>> object to 2. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I still have to go out to the heap from my DLL and get to the > >> >>>>>>>>> array > >> >>>>>>>>> object and then chase it to the next DLL and chase that to the > >> >>>>>>>>> next array. I > >> >>>>>>>>> do get my two pointers together in memory though. 
> >> >>>>>>>>> I'm paying > for > >> >>>>>>>>> a card > >> >>>>>>>>> marking table as well, which I don't particularly need with > just > >> >>>>>>>>> two > >> >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" > >> >>>>>>>>> machinery added > >> >>>>>>>>> back in 7.10, which is just the old array code as a new data > >> >>>>>>>>> type, which can > >> >>>>>>>>> speed things up a bit when you don't have very big arrays: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> But what if I wanted my object itself to live in # and have > two > >> >>>>>>>>> mutable fields and be able to share the same write barrier? > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types. > >> >>>>>>>>> What > >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with the > >> >>>>>>>>> impedance > >> >>>>>>>>> mismatch between the imperative world and Haskell, and then > just > >> >>>>>>>>> let the > >> >>>>>>>>> ArrayArray#'s hold other arrayarrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a > >> >>>>>>>>> special > >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even > >> >>>>>>>>> abuse pattern > >> >>>>>>>>> synonyms. Alternately I can exploit the internals further to > >> >>>>>>>>> make this > >> >>>>>>>>> cheaper. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Then I can use the readMutableArrayArray# and > >> >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding > >> >>>>>>>>> and next > >> >>>>>>>>> entry in the linked list. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' > into a > >> >>>>>>>>> strict world, and everything there lives in #. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> next :: DLL -> IO DLL > >> >>>>>>>>> > >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArrayArray# m 1# s of > >> >>>>>>>>> > >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code > to > >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty > >> >>>>>>>>> easily when they > >> >>>>>>>>> are known strict and you chain operations of this sort! > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Cleaning it Up > >> >>>>>>>>> > >> >>>>>>>>> ------------------ > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Now I have one outermost indirection pointing to an array that > >> >>>>>>>>> points directly to other arrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm stuck paying for a card marking table per object, but I > can > >> >>>>>>>>> fix > >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and using > a > >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a > >> >>>>>>>>> mixture of > >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data > structure. > >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the > >> >>>>>>>>> existing > >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one of the > >> >>>>>>>>> arguments it > >> >>>>>>>>> takes. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> This is almost ideal, but not quite. I often have fields that > >> >>>>>>>>> would > >> >>>>>>>>> be best left unboxed.
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> was able to unpack the Int, but we lost that. We can currently > >> >>>>>>>>> at > >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at a > >> >>>>>>>>> boxed or at a > >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the int > in > >> >>>>>>>>> question in > >> >>>>>>>>> there. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to > >> >>>>>>>>> store masks and administrivia as I walk down the tree. Having > to > >> >>>>>>>>> go off to > >> >>>>>>>>> the side costs me the entire win from avoiding the first > pointer > >> >>>>>>>>> chase. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could > >> >>>>>>>>> construct that had n words with unsafe access and m pointers > to > >> >>>>>>>>> other heap > >> >>>>>>>>> objects, one that could put itself on the mutable list when > any > >> >>>>>>>>> of those > >> >>>>>>>>> pointers changed then I could shed this last factor of two in > >> >>>>>>>>> all > >> >>>>>>>>> circumstances. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Prototype > >> >>>>>>>>> > >> >>>>>>>>> ------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Over the last few days I've put together a small prototype > >> >>>>>>>>> implementation with a few non-trivial imperative data > structures > >> >>>>>>>>> for things > >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and > >> >>>>>>>>> order-maintenance. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> https://github.com/ekmett/structs > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Notable bits: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of > >> >>>>>>>>> link-cut > >> >>>>>>>>> trees in this style. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that > >> >>>>>>>>> make > >> >>>>>>>>> it go fast. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost > >> >>>>>>>>> all > >> >>>>>>>>> the references to the LinkCut or Object data constructor get > >> >>>>>>>>> optimized away, > >> >>>>>>>>> and we're left with beautiful strict code directly mutating > our > >> >>>>>>>>> underlying > >> >>>>>>>>> representation. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> At the very least I'll take this email and turn it into a > short > >> >>>>>>>>> article. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> -Edward > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones > >> >>>>>>>>> wrote: > >> >>>>>>>>> > >> >>>>>>>>> Just to say that I have no idea what is going on in this > thread. > >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is there a > >> >>>>>>>>> ticket? Is > >> >>>>>>>>> there a wiki page? > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be a > >> >>>>>>>>> good > >> >>>>>>>>> thing.
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Simon > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On > Behalf > >> >>>>>>>>> Of > >> >>>>>>>>> Edward Kmett > >> >>>>>>>>> Sent: 21 August 2015 05:25 > >> >>>>>>>>> To: Manuel M T Chakravarty > >> >>>>>>>>> Cc: Simon Marlow; ghc-devs > >> >>>>>>>>> Subject: Re: ArrayArrays > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would > be > >> >>>>>>>>> very handy as well. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Consider right now if I have something like an > order-maintenance > >> >>>>>>>>> structure I have: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# > >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s > >> >>>>>>>>> (Upper s)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# > >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s > >> >>>>>>>>> (Lower s)) {-# > >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The former contains, logically, a mutable integer and two > >> >>>>>>>>> pointers, > >> >>>>>>>>> one for forward and one for backwards. The latter is basically > >> >>>>>>>>> the same > >> >>>>>>>>> thing with a mutable reference up pointing at the structure > >> >>>>>>>>> above. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On the heap this is an object that points to a structure for > the > >> >>>>>>>>> bytearray, and points to another structure for each mutvar > which > >> >>>>>>>>> each point > >> >>>>>>>>> to the other 'Upper' structure. So there is a level of > >> >>>>>>>>> indirection smeared > >> >>>>>>>>> over everything. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> So this is a pair of doubly linked lists with an upward link > >> >>>>>>>>> from > >> >>>>>>>>> the structure below to the structure above. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Converted into ArrayArray#s I'd get > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and > >> >>>>>>>>> the > >> >>>>>>>>> next 2 slots pointing to the previous and next > objects, > >> >>>>>>>>> represented > >> >>>>>>>>> just as their MutableArrayArray#s. I can use > >> >>>>>>>>> sameMutableArrayArray# on these > >> >>>>>>>>> for object identity, which lets me check for the ends of the > >> >>>>>>>>> lists by tying > >> >>>>>>>>> things back on themselves. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> and below that > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> is similar, with an extra MutableArrayArray# slot pointing up > to > >> >>>>>>>>> an > >> >>>>>>>>> upper structure. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I can then write a handful of combinators for getting out the > >> >>>>>>>>> slots > >> >>>>>>>>> in question, and while this gains a level of indirection > between > >> >>>>>>>>> the wrapper > >> >>>>>>>>> that puts it in * and the MutableArrayArray# in #, that > >> >>>>>>>>> indirection can basically be > >> >>>>>>>>> erased by GHC. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Unlike before I don't have several separate objects on the > heap > >> >>>>>>>>> for > >> >>>>>>>>> each thing. I only have 2 now.
The MutableArrayArray# for the > >> >>>>>>>>> object itself, > >> >>>>>>>>> and the MutableByteArray# that it references to carry around > the > >> >>>>>>>>> mutable > >> >>>>>>>>> int. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The only pain points are > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> 1.) the aforementioned limitation that currently prevents me > >> >>>>>>>>> from > >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array into > an > >> >>>>>>>>> ArrayArray > >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest of > >> >>>>>>>>> Haskell, > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> and > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid > the > >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 > pointers > >> >>>>>>>>> wide. Card > >> >>>>>>>>> marking doesn't help. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Alternately I could just try to do really evil things and > >> >>>>>>>>> convert > >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to > >> >>>>>>>>> unsafeCoerce my way > >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays > >> >>>>>>>>> directly into the > >> >>>>>>>>> SmallArray as slots, removing the limitation we see here by > >> >>>>>>>>> aping the > >> >>>>>>>>> MutableArrayArray# s API, but that gets really really > dangerous! > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the > >> >>>>>>>>> altar > >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move them > >> >>>>>>>>> and collect > >> >>>>>>>>> them which rules out simpler Ptr and Addr based solutions. 
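The "really evil" SmallArray# route mentioned there can be spelled out. This is only a sketch under stated assumptions: the Node type and slot conventions are invented here, and the unsafeCoerce# between an unlifted SmallMutableArray# and a lifted slot is defensible only because both are GC-managed heap pointers (essentially the trick the structs package ended up using):

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO(..))

-- One boxed wrapper to get out of #; each slot of the SmallMutableArray#
-- secretly holds another SmallMutableArray#, coerced into a lifted slot.
data Node = Node (SmallMutableArray# RealWorld Any)

newNode :: Int -> IO Node
newNode (I# n) = IO $ \s -> case newSmallArray# n (unsafeCoerce# ()) s of
  (# s', m #) -> (# s', Node m #)

-- Store the child's array itself in the slot, shedding its Node wrapper,
-- so a traversal never touches an intermediate boxed constructor.
setSlot :: Node -> Int -> Node -> IO ()
setSlot (Node m) (I# i) (Node c) = IO $ \s ->
  (# writeSmallArray# m i (unsafeCoerce# c) s, () #)

getSlot :: Node -> Int -> IO Node
getSlot (Node m) (I# i) = IO $ \s -> case readSmallArray# m i s of
  (# s', a #) -> (# s', Node (unsafeCoerce# a) #)

-- Pointer identity on the underlying arrays.
sameNode :: Node -> Node -> Bool
sameNode (Node a) (Node b) = isTrue# (sameSmallMutableArray# a b)
```

The GC is still free to move and collect these arrays, which is exactly what rules out the simpler Ptr and Addr solutions; the cost is that a wrongly numbered slot is silently reinterpreted rather than rejected by the type checker.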
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> -Edward > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty > >> >>>>>>>>> wrote: > >> >>>>>>>>> > >> >>>>>>>>> That?s an interesting idea. > >> >>>>>>>>> > >> >>>>>>>>> Manuel > >> >>>>>>>>> > >> >>>>>>>>> > Edward Kmett : > >> >>>>>>>>> > >> >>>>>>>>> > > >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# and > >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the > >> >>>>>>>>> > ArrayArray# entries > >> >>>>>>>>> > are all directly unlifted avoiding a level of indirection > for > >> >>>>>>>>> > the containing > >> >>>>>>>>> > structure is amazing, but I can only currently use it if my > >> >>>>>>>>> > leaf level data > >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. It'd > be > >> >>>>>>>>> > nice to be > >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down at > >> >>>>>>>>> > the leaves to > >> >>>>>>>>> > hold lifted contents. > >> >>>>>>>>> > > >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to > >> >>>>>>>>> > access > >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do > that > >> >>>>>>>>> > if i tried to > >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a > >> >>>>>>>>> > ByteArray# > >> >>>>>>>>> > anyways, so it isn't like there is a safety story preventing > >> >>>>>>>>> > this. > >> >>>>>>>>> > > >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection > >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I > >> >>>>>>>>> > could shoehorn a > >> >>>>>>>>> > number of them into ArrayArrays if this worked. 
> >> >>>>>>>>> > > >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary > >> >>>>>>>>> > indirection compared to c/java and this could reduce that > pain > >> >>>>>>>>> > to just 1 > >> >>>>>>>>> > level of unnecessary indirection. > >> >>>>>>>>> > > >> >>>>>>>>> > -Edward > >> >>>>>>>>> > >> >>>>>>>>> > _______________________________________________ > >> >>>>>>>>> > ghc-devs mailing list > >> >>>>>>>>> > ghc-devs at haskell.org > >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> _______________________________________________ > >> >>>>>>>>> ghc-devs mailing list > >> >>>>>>>>> ghc-devs at haskell.org > >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> >>>>>>> > >> >>>>>>> > >> >>>>> > >> >>> > >> >> > >> > > >> > > >> > _______________________________________________ > >> > ghc-devs mailing list > >> > ghc-devs at haskell.org > >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> > > > > > > > > > > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > > > > From dan.doel at gmail.com Mon Sep 7 20:19:59 2015 From: dan.doel at gmail.com (Dan Doel) Date: Mon, 7 Sep 2015 16:19:59 -0400 Subject: Unlifted data types In-Reply-To: <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> Message-ID: On Mon, Sep 7, 2015 at 4:00 PM, Simon Peyton Jones wrote: > (2) Second, we cannot expect levity polymorphism. Consider > map f (x:xs) = f x : map f xs > Is the (f x) a thunk or is it evaluated strictly?
Unless you are going to clone the code for map (which levity polymorphism is there to avoid), we can't answer "it depends on the type of (f x)". So, no, I think levity polymorphism is out. > > So I vote against splitting # into two: plain will do just fine. I don't understand how that last bit follows from the previous stuff (or, I don't understand the sentence). Splitting # into two kinds is useful even if functions can't be levity polymorphic. # contains a bunch of types that aren't represented uniformly. Int# might be 32 bits while Double# is 64, etc. But Unlifted would contain only types that are uniformly represented as pointers, so you could write functions that are polymorphic over types of kind Unlifted. This is not true for Unboxed/# (unless we implement C++ style polymorphism-as-code-generation). ---- Also, with regard to the previous mail, it's not true that `suspend` has to be a special form. All expressions with types of kind * are 'special forms' in the necessary sense. -- Dan From mail at joachim-breitner.de Mon Sep 7 20:21:14 2015 From: mail at joachim-breitner.de (Joachim Breitner) Date: Mon, 07 Sep 2015 22:21:14 +0200 Subject: AnonymousSums data con syntax In-Reply-To: <9eb2c9041f6142ce947a4b323c0b2bff@DB4PR30MB030.064d.mgd.msft.net> References: <9eb2c9041f6142ce947a4b323c0b2bff@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <1441657274.28403.7.camel@joachim-breitner.de> Hi, Am Montag, den 07.09.2015, 19:25 +0000 schrieb Simon Peyton Jones: > > Are we okay with stealing some operator sections for this? E.G. (x > > > > ). I think the boxed sums larger than 2 choices are all technically overlapping with sections. > > I hadn't thought of that. I suppose that in distfix notation we > could require spaces > (x | |) > since vertical bar by itself isn't an operator. But then (_||) x > might feel more compact. > > Also a section (x ||) isn't valid in a pattern, so we would not need > to require spaces there. 
> > But my gut feel is: yes, with AnonymousSums we should just steal the > syntax. It won't hurt existing code (since it won't use > AnonymousSums), and if you *are* using AnonymousSums then the distfix > notation is probably more valuable than the sections for an operator > you probably aren't using. I wonder if this syntax for constructors is really that great. Yes, there is a similarity with the type constructor (which is nice), but for the data constructor, do we really want a unary encoding and have our users count bars? I believe the user (and also us, having to read core) would be better served by some syntax that involves plain numbers. Given that of is already a keyword, how about something involving "3 of 4"? For example (Put# True in 3 of 5) :: (# a | b | Bool | d | e #) and case sum of (Put# x in 1 of 3) -> ... (Put# x in 2 of 3) -> ... (Put# x in 3 of 3) -> ... (If "as" were a keyword, (Put# x as 2 of 3) would sound even better.) I don't find this particular choice very great, but something with numbers rather than ASCII art seems to make more sense here. Is there something even better? Greetings, Joachim -- Joachim "nomeata" Breitner mail at joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org From mike at izbicki.me Mon Sep 7 20:26:53 2015 From: mike at izbicki.me (Mike Izbicki) Date: Mon, 7 Sep 2015 13:26:53 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: I have another question :) This one relates to Andrew Farmer's answer a while back on how to build dictionaries given a Concrete type.
Everything I have works when I use my own numeric hierarchy, but when I use the Prelude's numeric hierarchy, GHC can't find the `Num Float` instance (or any other builtin instance). I created the following function (based on HERMIT's buildDictionary function) to build my dictionaries (for GHC 7.10.1): -- | Given a function name and concrete type, get the needed dictionary. getDictConcrete :: ModGuts -> String -> Type -> CoreM (Maybe (Expr CoreBndr)) getDictConcrete guts opstr t = trace ("getDictConcrete "++opstr) $ do hscenv <- getHscEnv dflags <- getDynFlags eps <- liftIO $ hscEPS hscenv let (opname,ParentIs classname) = getNameParent guts opstr classType = mkTyConTy $ case lookupNameEnv (eps_PTE eps) classname of Just (ATyCon t) -> t Just (AnId _) -> error "loopupNameEnv AnId" Just (AConLike _) -> error "loopupNameEnv AConLike" Just (ACoAxiom _) -> error "loopupNameEnv ACoAxiom" Nothing -> error "getNameParent gutsEnv Nothing" dictType = mkAppTy classType t dictVar = mkGlobalVar VanillaId (mkSystemName (mkUnique 'z' 1337) (mkVarOcc $ "magicDictionaryName")) dictType vanillaIdInfo bnds <- runTcM guts $ do loc <- getCtLoc $ GivenOrigin UnkSkol let nonC = mkNonCanonical $ CtWanted { ctev_pred = dictType , ctev_evar = dictVar , ctev_loc = loc } wCs = mkSimpleWC [nonC] (x, evBinds) <- solveWantedsTcM wCs bnds <- initDsTc $ dsEvBinds evBinds liftIO $ do putStrLn $ "dictType="++showSDoc dflags (ppr dictType) putStrLn $ "dictVar="++showSDoc dflags (ppr dictVar) putStrLn $ "nonC="++showSDoc dflags (ppr nonC) putStrLn $ "wCs="++showSDoc dflags (ppr wCs) putStrLn $ "bnds="++showSDoc dflags (ppr bnds) putStrLn $ "x="++showSDoc dflags (ppr x) return bnds case bnds of [NonRec _ dict] -> return $ Just dict otherwise -> return Nothing When I use my own numeric class hierarchy, this works great! But when I use the Prelude numeric hierarchy, this doesn't work for some reason. 
In particular, if I pass `+` as the operation I want a dictionary for on the type `Float`, then the function returns `Nothing` with the following output: getDictConcrete + dictType=Num Float dictVar=magicDictionaryName_zlz nonC=[W] magicDictionaryName_zlz :: Num Float (CNonCanonical) wCs=WC {wc_simple = [W] magicDictionaryName_zlz :: Num Float (CNonCanonical)} bnds=[] x=WC {wc_simple = [W] magicDictionaryName_zlz :: Num Float (CNonCanonical)} If I change the `solveWantedTcMs` function to `simplifyInteractive`, then GHC panics with the following message: Top level: No instance for (GHC.Num.Num GHC.Types.Float) arising from UnkSkol Why doesn't the TcM monad know about the `Num Float` instance? On Fri, Sep 4, 2015 at 9:18 PM, ?mer Sinan A?acan wrote: > Typo: "You're parsing your code" I mean "You're passing your code" > > 2015-09-05 0:16 GMT-04:00 ?mer Sinan A?acan : >> Hi Mike, >> >> I'll try to hack an example for you some time tomorrow(I'm returning from ICFP >> and have some long flights ahead of me). >> >> But in the meantime, here's a working Core code, generated by GHC: >> >> f_rjH :: forall a_alz. Ord a_alz => a_alz -> Bool >> f_rjH = >> \ (@ a_aCH) ($dOrd_aCI :: Ord a_aCH) (eta_B1 :: a_aCH) -> >> == @ a_aCH (GHC.Classes.$p1Ord @ a_aCH $dOrd_aCI) eta_B1 eta_B1 >> >> You can clearly see here how Eq dictionary is selected from Ord >> dicitonary($dOrd_aCI in the example), it's just an application of selector to >> type and dictionary, that's all. >> >> This is generated from this code: >> >> {-# NOINLINE f #-} >> f :: Ord a => a -> Bool >> f x = x == x >> >> Compile it with this: >> >> ghc --make -fforce-recomp -O0 -ddump-simpl -ddump-to-file Main.hs >> -dsuppress-idinfo >> >>> Can anyone help me figure this out? Is there any chance this is a bug in how >>> GHC parses Core? >> >> This seems unlikely, because GHC doesn't have a Core parser and there's no Core >> parsing going on here, you're parsing your Code in the form of AST(CoreExpr, >> CoreProgram etc. 
defined in CoreSyn.hs). Did you mean something else and am I >> misunderstanding? >> >> 2015-09-04 19:39 GMT-04:00 Mike Izbicki : >>> I'm still having trouble creating Core code that can extract >>> superclass dictionaries from a given dictionary. I suspect the >>> problem is that I don't actually understand what the Core code to do >>> this is supposed to look like. I keep getting the errors mentioned >>> above when I try what I think should work. >>> >>> Can anyone help me figure this out? Is there any chance this is a bug >>> in how GHC parses Core? >>> >>> On Tue, Aug 25, 2015 at 9:24 PM, Mike Izbicki wrote: >>>> The purpose of the plugin is to automatically improve the numerical >>>> stability of Haskell code. It is supposed to identify numeric >>>> expressions, then use Herbie (https://github.com/uwplse/herbie) to >>>> generate a numerically stable version, then rewrite the numerically >>>> stable version back into the code. The first two steps were really >>>> easy. It's the last step of inserting back into the code that I'm >>>> having tons of trouble with. Core is a lot more complicated than I >>>> thought :) >>>> >>>> I'm not sure what you mean by the CoreExpr representation? 
Here's the >>>> output of the pretty printer you gave: >>>> App (App (App (App (Var Id{+,r2T,ForAllTy TyVar{a} (FunTy (TyConApp >>>> Num [TyVarTy TyVar{a}]) (FunTy (TyVarTy TyVar{a}) (FunTy (TyVarTy >>>> TyVar{a}) (TyVarTy TyVar{a})))),VanillaId,Info{0,SpecInfo [] >>>> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >>>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>>> Nothing, inl_act = AlwaysActive, inl_rule = >>>> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >>>> {strd = Lazy, absd = Use Many Used},0}}) (Type (TyVarTy TyVar{a}))) >>>> (App (Var Id{$p1Fractional,rh3,ForAllTy TyVar{a} (FunTy (TyConApp >>>> Fractional [TyVarTy TyVar{a}]) (TyConApp Num [TyVarTy >>>> TyVar{a}])),ClassOpId ,Info{1,SpecInfo [BuiltinRule {ru_name = >>>> "Class op $p1Fractional", ru_fn = $p1Fractional, ru_nargs = 2, ru_try >>>> = }] ,NoUnfolding,NoCafRefs,NoOneShotInfo,InlinePragma >>>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>>> Nothing, inl_act = AlwaysActive, inl_rule = >>>> FunLike},NoOccInfo,StrictSig (DmdType [JD {strd = Str (SProd >>>> [Str HeadStr,Lazy,Lazy,Lazy]), absd = Use Many (UProd [Use Many >>>> Used,Abs,Abs,Abs])}] (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many >>>> Used},0}}) (App (Var Id{$p1Floating,rh2,ForAllTy TyVar{a} (FunTy >>>> (TyConApp Floating [TyVarTy TyVar{a}]) (TyConApp Fractional [TyVarTy >>>> TyVar{a}])),ClassOpId ,Info{1,SpecInfo [BuiltinRule {ru_name = >>>> "Class op $p1Floating", ru_fn = $p1Floating, ru_nargs = 2, ru_try = >>>> }] ,NoUnfolding,NoCafRefs,NoOneShotInfo,InlinePragma >>>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>>> Nothing, inl_act = AlwaysActive, inl_rule = >>>> FunLike},NoOccInfo,StrictSig (DmdType [JD {strd = Str (SProd >>>> [Str HeadStr,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy]), >>>> absd = Use Many (UProd [Use Many >>>> 
Used,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs])}] >>>> (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many Used},0}}) (Var >>>> Id{$dFloating,aBM,TyConApp Floating [TyVarTy >>>> TyVar{a}],VanillaId,Info{0,SpecInfo [] >>>> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >>>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>>> Nothing, inl_act = AlwaysActive, inl_rule = >>>> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >>>> {strd = Lazy, absd = Use Many Used},0}})))) (Var Id{x1,anU,TyVarTy >>>> TyVar{a},VanillaId,Info{0,SpecInfo [] >>>> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >>>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>>> Nothing, inl_act = AlwaysActive, inl_rule = >>>> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >>>> {strd = Lazy, absd = Use Many Used},0}})) (Var Id{x1,anU,TyVarTy >>>> TyVar{a},VanillaId,Info{0,SpecInfo [] >>>> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >>>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>>> Nothing, inl_act = AlwaysActive, inl_rule = >>>> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >>>> {strd = Lazy, absd = Use Many Used},0}}) >>>> >>>> You can find my pretty printer (and all the other code for the plugin) >>>> at: https://github.com/mikeizbicki/herbie-haskell/blob/master/src/Herbie.hs#L627 >>>> >>>> The function getDictMap >>>> (https://github.com/mikeizbicki/herbie-haskell/blob/master/src/Herbie.hs#L171) >>>> is where I'm constructing the dictionaries that are getting inserted >>>> back into the Core. >>>> >>>> On Tue, Aug 25, 2015 at 7:17 PM, ?mer Sinan A?acan wrote: >>>>> It seems like in your App syntax you're having a non-function in function >>>>> position. 
You can see this by looking at what failing function >>>>> (splitFunTy_maybe) is doing: >>>>> >>>>> splitFunTy_maybe :: Type -> Maybe (Type, Type) >>>>> -- ^ Attempts to extract the argument and result types from a type >>>>> ... (definition is not important) ... >>>>> >>>>> Then it's used like this at the error site: >>>>> >>>>> (arg_ty, res_ty) = expectJust "cpeBody:collect_args" $ >>>>> splitFunTy_maybe fun_ty >>>>> >>>>> In your case this function is returning Nothing and then exceptJust is >>>>> signalling the panic. >>>>> >>>>> Your code looked correct to me, I don't see any problems with that. Maybe you're >>>>> using something wrong as selectors. Could you paste CoreExpr representation of >>>>> your program? >>>>> >>>>> It may also be the case that the panic is caused by something else, maybe your >>>>> syntax is invalidating some assumptions/invariants in GHC but it's not >>>>> immediately checked etc. Working at the Core level is frustrating at times. >>>>> >>>>> Can I ask what kind of plugin are you working on? >>>>> >>>>> (Btw, how did you generate this representation of AST? Did you write it >>>>> manually? If you have a pretty-printer, would you mind sharing it?) >>>>> >>>>> 2015-08-25 18:50 GMT-04:00 Mike Izbicki : >>>>>> Thanks ?mer! >>>>>> >>>>>> I'm able to get dictionaries for the superclasses of a class now, but >>>>>> I get an error whenever I try to get a dictionary for a >>>>>> super-superclass. Here's the Haskell expression I'm working with: >>>>>> >>>>>> test1 :: Floating a => a -> a >>>>>> test1 x1 = x1+x1 >>>>>> >>>>>> The original core is: >>>>>> >>>>>> + @ a $dNum_aJu x1 x1 >>>>>> >>>>>> But my plugin is replacing it with the core: >>>>>> >>>>>> + @ a ($p1Fractional ($p1Floating $dFloating_aJq)) x1 x1 >>>>>> >>>>>> The only difference is the way I'm getting the Num dictionary. The >>>>>> corresponding AST (annotated with variable names and types) is: >>>>>> >>>>>> App >>>>>> (App >>>>>> (App >>>>>> (App >>>>>> (Var +::forall a. 
Num a => a -> a -> a) >>>>>> (Type a) >>>>>> ) >>>>>> (App >>>>>> (Var $p1Fractional::forall a. Fractional a => Num a) >>>>>> (App >>>>>> (Var $p1Floating::forall a. Floating a => Fractional a) >>>>>> (Var $dFloating_aJq::Floating a) >>>>>> ) >>>>>> ) >>>>>> ) >>>>>> (Var x1::'a') >>>>>> ) >>>>>> (Var x1::'a') >>>>>> >>>>>> When I insert, GHC gives the following error: >>>>>> >>>>>> ghc: panic! (the 'impossible' happened) >>>>>> (GHC version 7.10.1 for x86_64-unknown-linux): >>>>>> expectJust cpeBody:collect_args >>>>>> >>>>>> What am I doing wrong with extracting these super-superclass >>>>>> dictionaries? I've looked up the code for cpeBody in GHC, but I can't >>>>>> figure out what it's trying to do, so I'm not sure why it's failing on >>>>>> my core. >>>>>> >>>>>> On Mon, Aug 24, 2015 at 7:10 PM, ?mer Sinan A?acan wrote: >>>>>>> Mike, here's a piece of code that may be helpful to you: >>>>>>> >>>>>>> https://github.com/osa1/sc-plugin/blob/master/src/Supercompilation/Show.hs >>>>>>> >>>>>>> Copy this module to your plugin, it doesn't have any dependencies other than >>>>>>> ghc itself. When your plugin is initialized, update `dynFlags_ref` with your >>>>>>> DynFlags as first thing to do. Then use Show instance to print AST directly. >>>>>>> >>>>>>> Horrible hack, but very useful for learning purposes. In fact, I don't know how >>>>>>> else we can learn what Core is generated for a given code, and reverse-engineer >>>>>>> to figure out details. >>>>>>> >>>>>>> Hope it helps. >>>>>>> >>>>>>> 2015-08-24 21:59 GMT-04:00 ?mer Sinan A?acan : >>>>>>>>> Lets say I'm running the plugin on a function with signature `Floating a => a >>>>>>>>> -> a`, then the plugin has access to the `Floating` dictionary for the type. >>>>>>>>> But if I want to add two numbers together, I need the `Num` dictionary. I >>>>>>>>> know I should have access to `Num` since it's a superclass of `Floating`. >>>>>>>>> How can I get access to these superclass dictionaries? 
>>>>>>>> >>>>>>>> I don't have a working code for this but this should get you started: >>>>>>>> >>>>>>>> let ord_dictionary :: Id = ... >>>>>>>> ord_class :: Class = ... >>>>>>>> in >>>>>>>> mkApps (Var (head (classSCSels ord_class))) [Var ord_dictionary] >>>>>>>> >>>>>>>> I don't know how to get Class for Ord. I do `head` here because in the case of >>>>>>>> Ord we only have one superclass so `classSCSels` should have one Id. Then I >>>>>>>> apply ord_dictionary to this selector and it should return dictionary for Eq. >>>>>>>> >>>>>>>> I assumed you already have ord_dictionary, it should be passed to your function >>>>>>>> already if you had `(Ord a) => ` in your function. >>>>>>>> >>>>>>>> >>>>>>>> Now I realized you asked for getting Num from Floating. I think you should >>>>>>>> follow a similar path except you need two applications, first to get Fractional >>>>>>>> from Floating and second to get Num from Fractional: >>>>>>>> >>>>>>>> mkApps (Var (head (classSCSels fractional_class))) >>>>>>>> [mkApps (Var (head (classSCSels floating_class))) >>>>>>>> [Var floating_dictionary]] >>>>>>>> >>>>>>>> Return value should be a Num dictionary. >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs at haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From rrnewton at gmail.com Mon Sep 7 20:27:43 2015 From: rrnewton at gmail.com (Ryan Newton) Date: Mon, 7 Sep 2015 16:27:43 -0400 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> Message-ID: Ah, incidentally that introduces an interesting difference between atomicModify and CAS. CAS should be able to work on mutable locations in that subset of # that are represented by a gcptr, whereas Edward pointed out that atomicModify cannot. 
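GHC already ships the boxed-slot version of the operation being discussed: casArray# (and its casSmallArray# sibling) performs a compare-and-swap on a MutableArray# slot, comparing the expected old value by pointer identity. A sketch of that shape, wrapped for IO — the Arr wrapper and helper names here are made up for illustration:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO(..))

-- Boxed wrapper so the unlifted array can be returned from IO actions.
data Arr a = Arr (MutableArray# RealWorld a)

newArr :: Int -> a -> IO (Arr a)
newArr (I# n) x = IO $ \s -> case newArray# n x s of
  (# s', m #) -> (# s', Arr m #)

-- casArray# compares the expected old value against the slot by pointer
-- identity; the Int# result is 0# on a successful swap, and the final
-- component is the value now observed in the slot.
casSlot :: Arr a -> Int -> a -> a -> IO (Bool, a)
casSlot (Arr m) (I# i) old new = IO $ \s ->
  case casArray# m i old new s of
    (# s', status, seen #) -> (# s', (isTrue# (status ==# 0#), seen) #)
```

Because the comparison is by pointer, nullary constructors (which are shared static closures) are the easiest values to CAS reliably; arbitrary thunks may fail spuriously if they get duplicated.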
(Indeed, to use lock-free algorithms with these new unboxed mutable structures we'll need CAS on the slots.) On Mon, Sep 7, 2015 at 4:16 PM, Edward Kmett wrote: > I had a brief discussion with Richard during the Haskell Symposium about > how we might be able to let parametricity help a bit in reducing the space > of necessarily primops to a slightly more manageable level. > > Notably, it'd be interesting to explore the ability to allow parametricity > over the portion of # that is just a gcptr. > > We could do this if the levity polymorphism machinery was tweaked a bit. > You could envision the ability to abstract over things in both * and the > subset of # that are represented by a gcptr, then modifying the existing > array primitives to be parametric in that choice of levity for their > argument so long as it was of a "heap object" levity. > > This could make the menagerie of ways to pack > {Small}{Mutable}Array{Array}# references into a > {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the > need for folks to descend into the use of the more evil structure > primitives we're talking about, and letting us keep a few more principles > around us. > > Then in the cases like `atomicModifyMutVar#` where it needs to actually be > in * rather than just a gcptr, due to the constructed field selectors it > introduces on the heap then we could keep the existing less polymorphic > type. > > -Edward > > On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones > wrote: > >> It was fun to meet and discuss this. >> >> >> >> Did someone volunteer to write a wiki page that describes the proposed >> design? And, I earnestly hope, also describes the menagerie of currently >> available array types and primops so that users can have some chance of >> picking the right one?! 
>> >> >> >> Thanks >> >> >> >> Simon >> >> >> >> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Ryan >> Newton >> *Sent:* 31 August 2015 23:11 >> *To:* Edward Kmett; Johan Tibell >> *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; >> Ryan Scott; Ryan Yates >> *Subject:* Re: ArrayArrays >> >> >> >> Dear Edward, Ryan Yates, and other interested parties -- >> >> >> >> So when should we meet up about this? >> >> >> >> May I propose the Tues afternoon break for everyone at ICFP who is >> interested in this topic? We can meet out in the coffee area and >> congregate around Edward Kmett, who is tall and should be easy to find ;-). >> >> >> >> I think Ryan is going to show us how to use his new primops for combined >> array + other fields in one heap object? >> >> >> >> On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett wrote: >> >> Without a custom primitive it doesn't help much there, you have to store >> the indirection to the mask. >> >> >> >> With a custom primitive it should cut the on heap root-to-leaf path of >> everything in the HAMT in half. A shorter HashMap was actually one of the >> motivating factors for me doing this. It is rather astoundingly difficult >> to beat the performance of HashMap, so I had to start cheating pretty >> badly. ;) >> >> >> >> -Edward >> >> >> >> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell >> wrote: >> >> I'd also be interested to chat at ICFP to see if I can use this for my >> HAMT implementation. >> >> >> >> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett wrote: >> >> Sounds good to me. Right now I'm just hacking up composable accessors for >> "typed slots" in a fairly lens-like fashion, and treating the set of slots >> I define and the 'new' function I build for the data type as its API, and >> build atop that. This could eventually graduate to template-haskell, but >> I'm not entirely satisfied with the solution I have. 
I currently >> distinguish between what I'm calling "slots" (things that point directly to >> another SmallMutableArrayArray# sans wrapper) and "fields" which point >> directly to the usual Haskell data types because unifying the two notions >> meant that I couldn't lift some coercions out "far enough" to make them >> vanish. >> >> >> >> I'll be happy to run through my current working set of issues in person >> and -- as things get nailed down further -- in a longer lived medium than >> in personal conversations. ;) >> >> >> >> -Edward >> >> >> >> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton wrote: >> >> I'd also love to meet up at ICFP and discuss this. I think the array >> primops plus a TH layer that lets us (ab)use them many times without too much >> marginal cost sounds great. And I'd like to learn how we could be either >> early users of, or help with, this infrastructure. >> >> >> >> CC'ing in Ryan Scott and Omer Agacan who may also be interested in >> dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student >> who is currently working on concurrent data structures in Haskell, but will >> not be at ICFP. >> >> >> >> >> >> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote: >> >> I completely agree. I would love to spend some time during ICFP and >> friends talking about what it could look like. My small-array-for-STM >> changes to the RTS can be seen here [1]. It is on a branch somewhere >> between 7.8 and 7.10 and includes irrelevant STM bits and some >> confusing naming choices (sorry), but should cover all the details >> needed to implement it for a non-STM context. The biggest surprise >> for me was following small array too closely and having a word/byte >> offset mismatch [2].
>> >> [1]: >> https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut >> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 >> >> Ryan >> >> >> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett wrote: >> > I'd love to have that last 10%, but it's a lot of work to get there and >> more >> > importantly I don't know quite what it should look like. >> > >> > On the other hand, I do have a pretty good idea of how the primitives >> above >> > could be banged out and tested in a long evening, well in time for >> 7.12. And >> > as noted earlier, those remain useful even if a nicer typed version >> with an >> > extra level of indirection to the sizes is built up after. >> > >> > The rest sounds like a good graduate student project for someone who has >> > graduate students lying around. Maybe somebody at Indiana University >> who has >> > an interest in type theory and parallelism can find us one. =) >> > >> > -Edward >> > >> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates >> wrote: >> >> >> >> I think from my perspective, the motivation for getting the type >> >> checker involved is primarily bringing this to the level where users >> >> could be expected to build these structures. It is reasonable to >> >> think that there are people who want to use STM (a context with >> >> mutation already) to implement a straightforward data structure that >> >> avoids the extra indirection penalty. There should be some places where >> >> knowing that things are field accesses rather than array indexing >> >> could be helpful, but I think GHC is good right now about handling >> >> constant offsets. In my code I don't do any bounds checking as I know >> >> I will only be accessing my arrays with constant indexes. I make >> >> wrappers for each field access and leave all the unsafe stuff in >> >> there. When things go wrong though, the compiler is no help. Maybe >> >> Template Haskell that generates the appropriate wrappers is the right >> >> direction to go.
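As an illustration of the hand-written wrapper style described above, here is a hedged sketch of a hypothetical two-slot node: the layout (slot 0 = an Int payload, slot 1 = the next node) and all names are invented for the example, every read uses a compile-time-constant index with no bounds check, and the unsafeCoerce is exactly the "lie about the element type" cost under discussion.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (MutableArray#, RealWorld, Any, Int#, readArray#)
import GHC.IO (IO(..))
import Unsafe.Coerce (unsafeCoerce)

-- Hypothetical layout: slot 0 holds an Int payload, slot 1 the next node.
data Node = Node (MutableArray# RealWorld Any)

-- The single unsafe entry point: no bounds check, constant indices only.
readSlot :: Node -> Int# -> IO Any
readSlot (Node arr) i = IO (readArray# arr i)

payload :: Node -> IO Int
payload n = unsafeCoerce <$> readSlot n 0#

next :: Node -> IO Node
next n = unsafeCoerce <$> readSlot n 1#
```

When a wrapper names the wrong slot, the coercion silently produces garbage, which is the "compiler is no help" failure mode Ryan mentions; generating these wrappers from a Template Haskell description of the layout would at least make the indices impossible to mistype.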
>> >> There is another benefit for me when working with these as arrays in >> >> that it is quite simple and direct (given the hoops already jumped >> >> through) to play with alignment. I can ensure two pointers are never >> >> on the same cache-line by just spacing things out in the array. >> >> >> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett >> wrote: >> >> > They just segfault at this level. ;) >> >> > >> >> > Sent from my iPhone >> >> > >> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton wrote: >> >> > >> >> > You presumably also save a bounds check on reads by hard-coding the >> >> > sizes? >> >> > >> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett >> wrote: >> >> >> >> >> >> Also there are 4 different "things" here, basically depending on two >> >> >> independent questions: >> >> >> >> >> >> a.) if you want to shove the sizes into the info table, and >> >> >> b.) if you want cardmarking. >> >> >> >> >> >> Versions with/without cardmarking for different sizes can be done >> >> >> pretty >> >> >> easily, but as noted, the infotable variants are pretty invasive. >> >> >> >> >> >> -Edward >> >> >> >> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett >> wrote: >> >> >>> >> >> >>> Well, on the plus side you'd save 16 bytes per object, which adds >> up >> >> >>> if >> >> >>> they were small enough and there are enough of them. You get a bit >> >> >>> better >> >> >>> locality of reference in terms of what fits in the first cache >> line of >> >> >>> them. >> >> >>> >> >> >>> -Edward >> >> >>> >> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton >> >> >>> wrote: >> >> >>>> >> >> >>>> Yes. And for the short term I can imagine places we will settle >> with >> >> >>>> arrays even if it means tracking lengths unnecessarily and >> >> >>>> unsafeCoercing >> >> >>>> pointers whose types don't actually match their siblings. 
>> >> >>>> >> >> >>>> Is there anything to recommend the hacks mentioned for fixed sized >> >> >>>> array >> >> >>>> objects *other* than using them to fake structs? (Much to >> >> >>>> derecommend, as >> >> >>>> you mentioned!) >> >> >>>> >> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett >> >> >>>> wrote: >> >> >>>>> >> >> >>>>> I think both are useful, but the one you suggest requires a lot >> more >> >> >>>>> plumbing and doesn't subsume all of the usecases of the other. >> >> >>>>> >> >> >>>>> -Edward >> >> >>>>> >> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton > > >> >> >>>>> wrote: >> >> >>>>>> >> >> >>>>>> So that primitive is an array like thing (Same pointed type, >> >> >>>>>> unbounded >> >> >>>>>> length) with extra payload. >> >> >>>>>> >> >> >>>>>> I can see how we can do without structs if we have arrays, >> >> >>>>>> especially >> >> >>>>>> with the extra payload at front. But wouldn't the general >> solution >> >> >>>>>> for >> >> >>>>>> structs be one that that allows new user data type defs for # >> >> >>>>>> types? >> >> >>>>>> >> >> >>>>>> >> >> >>>>>> >> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett >> >> >>>>>> wrote: >> >> >>>>>>> >> >> >>>>>>> Some form of MutableStruct# with a known number of words and a >> >> >>>>>>> known >> >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting >> >> >>>>>>> above, but >> >> >>>>>>> where the word counts were stored in the objects themselves. >> >> >>>>>>> >> >> >>>>>>> Given that it'd have a couple of words for those counts it'd >> >> >>>>>>> likely >> >> >>>>>>> want to be something we build in addition to MutVar# rather >> than a >> >> >>>>>>> replacement. >> >> >>>>>>> >> >> >>>>>>> On the other hand, if we had to fix those numbers and build >> info >> >> >>>>>>> tables that knew them, and typechecker support, for instance, >> it'd >> >> >>>>>>> get >> >> >>>>>>> rather invasive. 
>> >> >>>>>>> >> >> >>>>>>> Also, a number of things that we can do with the 'sized' >> versions >> >> >>>>>>> above, like working with evil unsized c-style arrays directly >> >> >>>>>>> inline at the >> >> >>>>>>> end of the structure cease to be possible, so it isn't even a >> pure >> >> >>>>>>> win if we >> >> >>>>>>> did the engineering effort. >> >> >>>>>>> >> >> >>>>>>> I think 90% of the needs I have are covered just by adding the >> one >> >> >>>>>>> primitive. The last 10% gets pretty invasive. >> >> >>>>>>> >> >> >>>>>>> -Edward >> >> >>>>>>> >> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton < >> rrnewton at gmail.com> >> >> >>>>>>> wrote: >> >> >>>>>>>> >> >> >>>>>>>> I like the possibility of a general solution for mutable >> structs >> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's >> hard. >> >> >>>>>>>> >> >> >>>>>>>> So, we can't unpack MutVar into constructors because of object >> >> >>>>>>>> identity problems. But what about directly supporting an >> >> >>>>>>>> extensible set of >> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) >> >> >>>>>>>> MutVar#? That >> >> >>>>>>>> may be too much work, but is it problematic otherwise? >> >> >>>>>>>> >> >> >>>>>>>> Needless to say, this is also critical if we ever want best in >> >> >>>>>>>> class >> >> >>>>>>>> lockfree mutable structures, just like their Stm and >> sequential >> >> >>>>>>>> counterparts. >> >> >>>>>>>> >> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >> >> >>>>>>>> wrote: >> >> >>>>>>>>> >> >> >>>>>>>>> At the very least I'll take this email and turn it into a >> short >> >> >>>>>>>>> article. >> >> >>>>>>>>> >> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and >> >> >>>>>>>>> maybe >> >> >>>>>>>>> make a ticket for it. 
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Thanks >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Simon >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >> >> >>>>>>>>> Sent: 27 August 2015 16:54 >> >> >>>>>>>>> To: Simon Peyton Jones >> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >> >> >>>>>>>>> Subject: Re: ArrayArrays >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It >> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> While those live in #, they are garbage collected objects, so >> >> >>>>>>>>> this >> >> >>>>>>>>> all lives on the heap. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> They were added to make some of the DPH stuff fast when it has >> >> >>>>>>>>> to >> >> >>>>>>>>> deal with nested arrays. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better thing. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> The Problem >> >> >>>>>>>>> >> >> >>>>>>>>> ----------------- >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Consider the scenario where you write a classic doubly-linked >> >> >>>>>>>>> list >> >> >>>>>>>>> in Haskell. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers >> >> >>>>>>>>> on >> >> >>>>>>>>> the heap.
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> >> >> >>>>>>>>> Maybe >> >> >>>>>>>>> DLL ~> DLL >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> That is 3 levels of indirection. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> We can trim one by simply unpacking the IORef with >> >> >>>>>>>>> -funbox-strict-fields or UNPACK >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and >> >> >>>>>>>>> worsening our representation. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> but now we're still stuck with a level of indirection >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> This means that every operation we perform on this structure >> >> >>>>>>>>> will >> >> >>>>>>>>> be about half of the speed of an implementation in most other >> >> >>>>>>>>> languages >> >> >>>>>>>>> assuming we're memory bound on loading things into cache! >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Making Progress >> >> >>>>>>>>> >> >> >>>>>>>>> ---------------------- >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I have been working on a number of data structures where the >> >> >>>>>>>>> indirection of going from something in * out to an object in >> # >> >> >>>>>>>>> which >> >> >>>>>>>>> contains the real pointer to my target and coming back >> >> >>>>>>>>> effectively doubles >> >> >>>>>>>>> my runtime. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the >> >> >>>>>>>>> MutVar# >> >> >>>>>>>>> onto the mutable list when we dirty it. 
There is a well >> defined >> >> >>>>>>>>> write-barrier. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I could change out the representation to use >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I can just store two pointers in the MutableArray# every time, >> >> >>>>>>>>> but >> >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the amount of >> >> >>>>>>>>> distinct >> >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per >> >> >>>>>>>>> object to 2. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to the >> >> >>>>>>>>> array >> >> >>>>>>>>> object and then chase it to the next DLL and chase that to the >> >> >>>>>>>>> next array. I >> >> >>>>>>>>> do get my two pointers together in memory though. I'm paying for >> >> >>>>>>>>> a card >> >> >>>>>>>>> marking table as well, which I don't particularly need with just >> >> >>>>>>>>> two >> >> >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" >> >> >>>>>>>>> machinery added >> >> >>>>>>>>> back in 7.10, which is just the old array code as a new data >> >> >>>>>>>>> type, which can >> >> >>>>>>>>> speed things up a bit when you don't have very big arrays: >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> But what if I wanted my object itself to live in # and have two >> >> >>>>>>>>> mutable fields and be able to share the same write barrier? >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types.
>> >> >>>>>>>>> What >> >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with the >> >> >>>>>>>>> impedance >> >> >>>>>>>>> mismatch between the imperative world and Haskell, and then just >> >> >>>>>>>>> let the >> >> >>>>>>>>> ArrayArray#'s hold other arrayarrays. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> now I need to make up a new Nil, which I can just make a >> >> >>>>>>>>> special >> >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even >> >> >>>>>>>>> abuse pattern >> >> >>>>>>>>> synonyms. Alternately I can exploit the internals further to >> >> >>>>>>>>> make this >> >> >>>>>>>>> cheaper. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Then I can use the readMutableArrayArrayArray# and >> >> >>>>>>>>> writeMutableArrayArrayArray# calls to directly access the preceding >> >> >>>>>>>>> and next >> >> >>>>>>>>> entry in the linked list. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a >> >> >>>>>>>>> strict world, and everything there lives in #. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> next :: DLL -> IO DLL >> >> >>>>>>>>> >> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArrayArray# m 1# s of >> >> >>>>>>>>> >> >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to >> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty >> >> >>>>>>>>> easily when they >> >> >>>>>>>>> are known strict and you chain operations of this sort!
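Rounding out the next sketch in the mail above into something closer to compilable form; the slot layout is an assumption for illustration (slot 0 = previous node, slot 1 = next node), and the primop names are the full ones from GHC 7.10's GHC.Prim.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO(..))

-- One lifted wrapper bootstraps us into the unlifted world;
-- everything the node points at is another MutableArrayArray#.
data DLL = DLL (MutableArrayArray# RealWorld)

-- Hypothetical layout: slot 0 = previous, slot 1 = next.
prev, next :: DLL -> IO DLL
prev (DLL m) = IO $ \s ->
  case readMutableArrayArrayArray# m 0# s of
    (# s', p #) -> (# s', DLL p #)
next (DLL m) = IO $ \s ->
  case readMutableArrayArrayArray# m 1# s of
    (# s', n #) -> (# s', DLL n #)

-- Writes go through the same slots; the array's write barrier covers both.
setNext :: DLL -> DLL -> IO ()
setNext (DLL m) (DLL n) = IO $ \s ->
  (# writeMutableArrayArrayArray# m 1# n s, () #)
```

With -O, the DLL wrappers are erased in chains of these operations, which is the optimization behavior claimed in the mail.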
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Cleaning it Up >> >> >>>>>>>>> >> >> >>>>>>>>> ------------------ >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Now I have one outermost indirection pointing to an array >> that >> >> >>>>>>>>> points directly to other arrays. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I'm stuck paying for a card marking table per object, but I >> can >> >> >>>>>>>>> fix >> >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and >> using a >> >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a >> >> >>>>>>>>> mixture of >> >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data >> structure. >> >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the >> >> >>>>>>>>> existing >> >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one of >> the >> >> >>>>>>>>> arguments it >> >> >>>>>>>>> takes. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> This is almost ideal, but not quite. I often have fields that >> >> >>>>>>>>> would >> >> >>>>>>>>> be best left unboxed. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can >> currently >> >> >>>>>>>>> at >> >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at a >> >> >>>>>>>>> boxed or at a >> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the >> int in >> >> >>>>>>>>> question in >> >> >>>>>>>>> there. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need >> to >> >> >>>>>>>>> store masks and administrivia as I walk down the tree. 
>> Having to >> >> >>>>>>>>> go off to >> >> >>>>>>>>> the side costs me the entire win from avoiding the first >> pointer >> >> >>>>>>>>> chase. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could >> >> >>>>>>>>> construct that had n words with unsafe access and m pointers >> to >> >> >>>>>>>>> other heap >> >> >>>>>>>>> objects, one that could put itself on the mutable list when >> any >> >> >>>>>>>>> of those >> >> >>>>>>>>> pointers changed then I could shed this last factor of two in >> >> >>>>>>>>> all >> >> >>>>>>>>> circumstances. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Prototype >> >> >>>>>>>>> >> >> >>>>>>>>> ------------- >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Over the last few days I've put together a small prototype >> >> >>>>>>>>> implementation with a few non-trivial imperative data >> structures >> >> >>>>>>>>> for things >> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and >> >> >>>>>>>>> order-maintenance. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> https://github.com/ekmett/structs >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Notable bits: >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of >> >> >>>>>>>>> link-cut >> >> >>>>>>>>> trees in this style. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that >> >> >>>>>>>>> make >> >> >>>>>>>>> it go fast. 
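The hash-array-mapped-trie case mentioned above is a good illustration of the residual cost: the bitmask has to live behind its own byte array, so reading it pays the side-trip the proposed n-words-plus-m-pointers heap object would eliminate. A hedged sketch, with a layout invented for the example:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO(..))

-- Hypothetical HAMT node layout: slot 0 is a ByteArray# holding the
-- bitmask, slots 1.. are the child nodes.
data Node = Node (MutableArrayArray# RealWorld)

-- Reading the mask costs an extra chase out to the side:
mask :: Node -> IO Word
mask (Node m) = IO $ \s ->
  case readByteArrayArray# m 0# s of              -- 1st: out to the ByteArray#
    (# s', ba #) -> (# s', W# (indexWordArray# ba 0#) #)  -- 2nd: read the word
```

If the mask word could sit directly in the node object, the first chase disappears, which is the factor of two being discussed.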
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost >> >> >>>>>>>>> all >> >> >>>>>>>>> the references to the LinkCut or Object data constructor get >> >> >>>>>>>>> optimized away, >> >> >>>>>>>>> and we're left with beautiful strict code directly mutating >> out >> >> >>>>>>>>> underlying >> >> >>>>>>>>> representation. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> At the very least I'll take this email and turn it into a >> short >> >> >>>>>>>>> article. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> -Edward >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones >> >> >>>>>>>>> wrote: >> >> >>>>>>>>> >> >> >>>>>>>>> Just to say that I have no idea what is going on in this >> thread. >> >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is there >> a >> >> >>>>>>>>> ticket? Is >> >> >>>>>>>>> there a wiki page? >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> If it?s important, an ab-initio wiki page + ticket would be a >> >> >>>>>>>>> good >> >> >>>>>>>>> thing. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Simon >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On >> Behalf >> >> >>>>>>>>> Of >> >> >>>>>>>>> Edward Kmett >> >> >>>>>>>>> Sent: 21 August 2015 05:25 >> >> >>>>>>>>> To: Manuel M T Chakravarty >> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs >> >> >>>>>>>>> Subject: Re: ArrayArrays >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's >> would be >> >> >>>>>>>>> very handy as well. 
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Consider right now if I have something like an >> order-maintenance >> >> >>>>>>>>> structure I have: >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# >> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s >> >> >>>>>>>>> (Upper s)) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# >> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s >> >> >>>>>>>>> (Lower s)) {-# >> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> The former contains, logically, a mutable integer and two >> >> >>>>>>>>> pointers, >> >> >>>>>>>>> one for forward and one for backwards. The latter is >> basically >> >> >>>>>>>>> the same >> >> >>>>>>>>> thing with a mutable reference up pointing at the structure >> >> >>>>>>>>> above. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> On the heap this is an object that points to a structure for >> the >> >> >>>>>>>>> bytearray, and points to another structure for each mutvar >> which >> >> >>>>>>>>> each point >> >> >>>>>>>>> to the other 'Upper' structure. So there is a level of >> >> >>>>>>>>> indirection smeared >> >> >>>>>>>>> over everything. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward link >> >> >>>>>>>>> from >> >> >>>>>>>>> the structure below to the structure above. 
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Converted into ArrayArray#s I'd get >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and >> >> >>>>>>>>> the >> >> >>>>>>>>> next 2 slots pointing to the previous and next previous >> objects, >> >> >>>>>>>>> represented >> >> >>>>>>>>> just as their MutableArrayArray#s. I can use >> >> >>>>>>>>> sameMutableArrayArray# on these >> >> >>>>>>>>> for object identity, which lets me check for the ends of the >> >> >>>>>>>>> lists by tying >> >> >>>>>>>>> things back on themselves. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> and below that >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing up >> to >> >> >>>>>>>>> an >> >> >>>>>>>>> upper structure. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I can then write a handful of combinators for getting out the >> >> >>>>>>>>> slots >> >> >>>>>>>>> in question, while it has gained a level of indirection >> between >> >> >>>>>>>>> the wrapper >> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one >> can >> >> >>>>>>>>> be basically >> >> >>>>>>>>> erased by ghc. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Unlike before I don't have several separate objects on the >> heap >> >> >>>>>>>>> for >> >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for the >> >> >>>>>>>>> object itself, >> >> >>>>>>>>> and the MutableByteArray# that it references to carry around >> the >> >> >>>>>>>>> mutable >> >> >>>>>>>>> int. 
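To make the Upper layout concrete, here is a hedged sketch of the slot accessors being described, using the layout stated in the mail (slot 0 = the MutableByteArray# carrying the mutable int, slots 1 and 2 = the previous and next Upper); the helper names are invented.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO(..))

data Upper = Upper (MutableArrayArray# RealWorld)

-- Slot 0: the MutableByteArray# holding the mutable integer.
key :: Upper -> IO Int
key (Upper m) = IO $ \s ->
  case readMutableByteArrayArray# m 0# s of
    (# s1, mba #) -> case readIntArray# mba 0# s1 of
      (# s2, i #) -> (# s2, I# i #)

-- Slot 2: the next Upper, stored as its MutableArrayArray#.
nextUpper :: Upper -> IO Upper
nextUpper (Upper m) = IO $ \s ->
  case readMutableArrayArrayArray# m 2# s of
    (# s', n #) -> (# s', Upper n #)

-- Object identity, e.g. for detecting list ends tied back on themselves.
sameUpper :: Upper -> Upper -> Bool
sameUpper (Upper a) (Upper b) = isTrue# (sameMutableArrayArray# a b)
```

The two heap objects mentioned in the mail are visible here: the node's MutableArrayArray# and the MutableByteArray# it points at.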
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> The only pain points are >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> 1.) the aforementioned limitation that currently prevents me >> >> >>>>>>>>> from >> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array >> into an >> >> >>>>>>>>> ArrayArray >> >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest of >> >> >>>>>>>>> Haskell, >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> and >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid >> the >> >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 >> pointers >> >> >>>>>>>>> wide. Card >> >> >>>>>>>>> marking doesn't help. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Alternately I could just try to do really evil things and >> >> >>>>>>>>> convert >> >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to >> >> >>>>>>>>> unsafeCoerce my way >> >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays >> >> >>>>>>>>> directly into the >> >> >>>>>>>>> SmallArray as slots, removing the limitation we see here by >> >> >>>>>>>>> aping the >> >> >>>>>>>>> MutableArrayArray# s API, but that gets really really >> dangerous! >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the >> >> >>>>>>>>> altar >> >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move >> them >> >> >>>>>>>>> and collect >> >> >>>>>>>>> them which rules out simpler Ptr and Addr based solutions. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> -Edward >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty >> >> >>>>>>>>> wrote: >> >> >>>>>>>>> >> >> >>>>>>>>> That?s an interesting idea. 
>> >> >>>>>>>>> >> >> >>>>>>>>> Manuel >> >> >>>>>>>>> >> >> >>>>>>>>> > Edward Kmett : >> >> >>>>>>>>> >> >> >>>>>>>>> > >> >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# >> and >> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the >> >> >>>>>>>>> > ArrayArray# entries >> >> >>>>>>>>> > are all directly unlifted avoiding a level of indirection >> for >> >> >>>>>>>>> > the containing >> >> >>>>>>>>> > structure is amazing, but I can only currently use it if my >> >> >>>>>>>>> > leaf level data >> >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. >> It'd be >> >> >>>>>>>>> > nice to be >> >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down at >> >> >>>>>>>>> > the leaves to >> >> >>>>>>>>> > hold lifted contents. >> >> >>>>>>>>> > >> >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to >> >> >>>>>>>>> > access >> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do >> that >> >> >>>>>>>>> > if i tried to >> >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a >> >> >>>>>>>>> > ByteArray# >> >> >>>>>>>>> > anyways, so it isn't like there is a safety story >> preventing >> >> >>>>>>>>> > this. >> >> >>>>>>>>> > >> >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection >> >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I >> >> >>>>>>>>> > could shoehorn a >> >> >>>>>>>>> > number of them into ArrayArrays if this worked. >> >> >>>>>>>>> > >> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary >> >> >>>>>>>>> > indirection compared to c/java and this could reduce that >> pain >> >> >>>>>>>>> > to just 1 >> >> >>>>>>>>> > level of unnecessary indirection. 
>> >> >>>>>>>>> > >> >> >>>>>>>>> > -Edward >> >> >>>>>>>>> >> >> >>>>>>>>> > _______________________________________________ >> >> >>>>>>>>> > ghc-devs mailing list >> >> >>>>>>>>> > ghc-devs at haskell.org >> >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> _______________________________________________ >> >> >>>>>>>>> ghc-devs mailing list >> >> >>>>>>>>> ghc-devs at haskell.org >> >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >> >>>>>>> >> >> >>>>>>> >> >> >>>>> >> >> >>> >> >> >> >> >> > >> >> > >> >> > _______________________________________________ >> >> > ghc-devs mailing list >> >> > ghc-devs at haskell.org >> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >> > >> > >> > >> >> >> >> >> >> >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekmett at gmail.com Mon Sep 7 20:31:57 2015 From: ekmett at gmail.com (Edward Kmett) Date: Mon, 7 Sep 2015 16:31:57 -0400 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> Message-ID: Indeed. I can CAS today with appropriately coerced primitives. -Edward On Mon, Sep 7, 2015 at 4:27 PM, Ryan Newton wrote: > Ah, incidentally that introduces an interesting difference between > atomicModify and CAS. CAS should be able to work on mutable locations in > that subset of # that are represented by a gcptr, whereas Edward pointed > out that atomicModify cannot. 
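[Editor's sketch, not from the thread: the lifted-world side of the CAS-vs-atomicModify distinction drawn above. `atomicModifyIORef'` already gives atomic read-modify-write on a `MutVar#`-backed `IORef`, but it works by allocating a lifted thunk/selector on the heap, which is exactly why it cannot be extended to unlifted slots; a plain compare-and-swap primop on the slot would sidestep that allocation. The `Stack`, `push`, and `popAll` names are invented for illustration.]

```haskell
import Data.IORef

-- A minimal atomic stack in the lifted world: atomicModifyIORef'
-- performs an atomic read-modify-write on the underlying MutVar#.
newtype Stack a = Stack (IORef [a])

newStack :: IO (Stack a)
newStack = Stack <$> newIORef []

-- Atomically push one element.
push :: Stack a -> a -> IO ()
push (Stack r) x = atomicModifyIORef' r (\xs -> (x : xs, ()))

-- Atomically take the entire contents, leaving the stack empty.
popAll :: Stack a -> IO [a]
popAll (Stack r) = atomicModifyIORef' r (\xs -> ([], xs))
```

[No analogue of `atomicModifyIORef'` can exist for a slot holding an unlifted gcptr, since the modify function would have to be a lifted closure; hence the suggestion that only CAS-style primops make sense there.]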
> > (Indeed, to use lock-free algorithms with these new unboxed mutable > structures we'll need CAS on the slots.) > > On Mon, Sep 7, 2015 at 4:16 PM, Edward Kmett wrote: > >> I had a brief discussion with Richard during the Haskell Symposium about >> how we might be able to let parametricity help a bit in reducing the space >> of necessary primops to a slightly more manageable level. >> >> Notably, it'd be interesting to explore the ability to allow >> parametricity over the portion of # that is just a gcptr. >> >> We could do this if the levity polymorphism machinery was tweaked a bit. >> You could envision the ability to abstract over things in both * and the >> subset of # that are represented by a gcptr, then modifying the existing >> array primitives to be parametric in that choice of levity for their >> argument so long as it was of a "heap object" levity. >> >> This could make the menagerie of ways to pack >> {Small}{Mutable}Array{Array}# references into a >> {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the >> need for folks to descend into the use of the more evil structure >> primitives we're talking about, and letting us keep a few more principles >> around us. >> >> Then in the cases like `atomicModifyMutVar#` where it needs to actually >> be in * rather than just a gcptr, due to the constructed field selectors it >> introduces on the heap then we could keep the existing less polymorphic >> type. >> >> -Edward >> >> On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones > > wrote: >> >>> It was fun to meet and discuss this. >>> >>> >>> >>> Did someone volunteer to write a wiki page that describes the proposed >>> design? And, I earnestly hope, also describes the menagerie of currently >>> available array types and primops so that users can have some chance of >>> picking the right one?! 
>>> >>> >>> >>> Thanks >>> >>> >>> >>> Simon >>> >>> >>> >>> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Ryan >>> Newton >>> *Sent:* 31 August 2015 23:11 >>> *To:* Edward Kmett; Johan Tibell >>> *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; >>> Ryan Scott; Ryan Yates >>> *Subject:* Re: ArrayArrays >>> >>> >>> >>> Dear Edward, Ryan Yates, and other interested parties -- >>> >>> >>> >>> So when should we meet up about this? >>> >>> >>> >>> May I propose the Tues afternoon break for everyone at ICFP who is >>> interested in this topic? We can meet out in the coffee area and >>> congregate around Edward Kmett, who is tall and should be easy to find ;-). >>> >>> >>> >>> I think Ryan is going to show us how to use his new primops for combined >>> array + other fields in one heap object? >>> >>> >>> >>> On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett wrote: >>> >>> Without a custom primitive it doesn't help much there, you have to store >>> the indirection to the mask. >>> >>> >>> >>> With a custom primitive it should cut the on heap root-to-leaf path of >>> everything in the HAMT in half. A shorter HashMap was actually one of the >>> motivating factors for me doing this. It is rather astoundingly difficult >>> to beat the performance of HashMap, so I had to start cheating pretty >>> badly. ;) >>> >>> >>> >>> -Edward >>> >>> >>> >>> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell >>> wrote: >>> >>> I'd also be interested to chat at ICFP to see if I can use this for my >>> HAMT implementation. >>> >>> >>> >>> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett wrote: >>> >>> Sounds good to me. Right now I'm just hacking up composable accessors >>> for "typed slots" in a fairly lens-like fashion, and treating the set of >>> slots I define and the 'new' function I build for the data type as its API, >>> and build atop that. 
This could eventually graduate to template-haskell, >>> but I'm not entirely satisfied with the solution I have. I currently >>> distinguish between what I'm calling "slots" (things that point directly to >>> another SmallMutableArrayArray# sans wrapper) and "fields" which point >>> directly to the usual Haskell data types because unifying the two notions >>> meant that I couldn't lift some coercions out "far enough" to make them >>> vanish. >>> >>> >>> >>> I'll be happy to run through my current working set of issues in person >>> and -- as things get nailed down further -- in a longer lived medium than >>> in personal conversations. ;) >>> >>> >>> >>> -Edward >>> >>> >>> >>> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton wrote: >>> >>> I'd also love to meet up at ICFP and discuss this. I think the array >>> primops plus a TH layer that lets us (ab)use them many times without too much >>> marginal cost sounds great. And I'd like to learn how we could be either >>> early users of, or help with, this infrastructure. >>> >>> >>> >>> CC'ing in Ryan Scott and Omer Agacan who may also be interested in >>> dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student >>> who is currently working on concurrent data structures in Haskell, but will >>> not be at ICFP. >>> >>> >>> >>> >>> >>> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote: >>> >>> I completely agree. I would love to spend some time during ICFP and >>> friends talking about what it could look like. My small array for STM >>> changes for the RTS can be seen here [1]. It is on a branch somewhere >>> between 7.8 and 7.10 and includes irrelevant STM bits and some >>> confusing naming choices (sorry), but should cover all the details >>> needed to implement it for a non-STM context. The biggest surprise >>> for me was following small array too closely and having a word/byte >>> offset mismatch [2]. 
>>> >>> [1]: >>> https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut >>> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 >>> >>> Ryan >>> >>> >>> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett wrote: >>> > I'd love to have that last 10%, but its a lot of work to get there and >>> more >>> > importantly I don't know quite what it should look like. >>> > >>> > On the other hand, I do have a pretty good idea of how the primitives >>> above >>> > could be banged out and tested in a long evening, well in time for >>> 7.12. And >>> > as noted earlier, those remain useful even if a nicer typed version >>> with an >>> > extra level of indirection to the sizes is built up after. >>> > >>> > The rest sounds like a good graduate student project for someone who >>> has >>> > graduate students lying around. Maybe somebody at Indiana University >>> who has >>> > an interest in type theory and parallelism can find us one. =) >>> > >>> > -Edward >>> > >>> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates >>> wrote: >>> >> >>> >> I think from my perspective, the motivation for getting the type >>> >> checker involved is primarily bringing this to the level where users >>> >> could be expected to build these structures. it is reasonable to >>> >> think that there are people who want to use STM (a context with >>> >> mutation already) to implement a straight forward data structure that >>> >> avoids extra indirection penalty. There should be some places where >>> >> knowing that things are field accesses rather then array indexing >>> >> could be helpful, but I think GHC is good right now about handling >>> >> constant offsets. In my code I don't do any bounds checking as I know >>> >> I will only be accessing my arrays with constant indexes. I make >>> >> wrappers for each field access and leave all the unsafe stuff in >>> >> there. When things go wrong though, the compiler is no help. 
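[Editor's sketch of the wrapper-per-field style Ryan describes: every logical field of the "struct" becomes a get/set pair over a fixed constant index into one backing mutable array, so all the index arithmetic lives in one place and call sites never see it. The `Counter` type and its field names are invented; the real code works over unlifted arrays rather than the boxed `array`-package machinery used here.]

```haskell
import Data.Array.IO

-- Toy "struct" with two Int fields packed into one unboxed IO array,
-- mirroring the constant-offset field-access style described above.
newtype Counter = Counter (IOUArray Int Int)

hitsIx, missesIx :: Int
hitsIx   = 0
missesIx = 1

newCounter :: IO Counter
newCounter = Counter <$> newArray (0, 1) 0

-- One wrapper per field; callers never touch the indices directly.
getHits, getMisses :: Counter -> IO Int
getHits   (Counter a) = readArray a hitsIx
getMisses (Counter a) = readArray a missesIx

bumpHits, bumpMisses :: Counter -> IO ()
bumpHits   (Counter a) = readArray a hitsIx   >>= writeArray a hitsIx   . (+ 1)
bumpMisses (Counter a) = readArray a missesIx >>= writeArray a missesIx . (+ 1)
```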
Maybe >>> >> template Haskell that generates the appropriate wrappers is the right >>> >> direction to go. >>> >> There is another benefit for me when working with these as arrays in >>> >> that it is quite simple and direct (given the hoops already jumped >>> >> through) to play with alignment. I can ensure two pointers are never >>> >> on the same cache-line by just spacing things out in the array. >>> >> >>> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett >>> wrote: >>> >> > They just segfault at this level. ;) >>> >> > >>> >> > Sent from my iPhone >>> >> > >>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton >>> wrote: >>> >> > >>> >> > You presumably also save a bounds check on reads by hard-coding the >>> >> > sizes? >>> >> > >>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett >>> wrote: >>> >> >> >>> >> >> Also there are 4 different "things" here, basically depending on >>> two >>> >> >> independent questions: >>> >> >> >>> >> >> a.) if you want to shove the sizes into the info table, and >>> >> >> b.) if you want cardmarking. >>> >> >> >>> >> >> Versions with/without cardmarking for different sizes can be done >>> >> >> pretty >>> >> >> easily, but as noted, the infotable variants are pretty invasive. >>> >> >> >>> >> >> -Edward >>> >> >> >>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett >>> wrote: >>> >> >>> >>> >> >>> Well, on the plus side you'd save 16 bytes per object, which adds >>> up >>> >> >>> if >>> >> >>> they were small enough and there are enough of them. You get a bit >>> >> >>> better >>> >> >>> locality of reference in terms of what fits in the first cache >>> line of >>> >> >>> them. >>> >> >>> >>> >> >>> -Edward >>> >> >>> >>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton >>> >> >>> wrote: >>> >> >>>> >>> >> >>>> Yes. 
And for the short term I can imagine places we will settle >>> with >>> >> >>>> arrays even if it means tracking lengths unnecessarily and >>> >> >>>> unsafeCoercing >>> >> >>>> pointers whose types don't actually match their siblings. >>> >> >>>> >>> >> >>>> Is there anything to recommend the hacks mentioned for fixed >>> sized >>> >> >>>> array >>> >> >>>> objects *other* than using them to fake structs? (Much to >>> >> >>>> derecommend, as >>> >> >>>> you mentioned!) >>> >> >>>> >>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett >>> >> >>>> wrote: >>> >> >>>>> >>> >> >>>>> I think both are useful, but the one you suggest requires a lot >>> more >>> >> >>>>> plumbing and doesn't subsume all of the usecases of the other. >>> >> >>>>> >>> >> >>>>> -Edward >>> >> >>>>> >>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton < >>> rrnewton at gmail.com> >>> >> >>>>> wrote: >>> >> >>>>>> >>> >> >>>>>> So that primitive is an array like thing (Same pointed type, >>> >> >>>>>> unbounded >>> >> >>>>>> length) with extra payload. >>> >> >>>>>> >>> >> >>>>>> I can see how we can do without structs if we have arrays, >>> >> >>>>>> especially >>> >> >>>>>> with the extra payload at front. But wouldn't the general >>> solution >>> >> >>>>>> for >>> >> >>>>>> structs be one that that allows new user data type defs for # >>> >> >>>>>> types? >>> >> >>>>>> >>> >> >>>>>> >>> >> >>>>>> >>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett >> > >>> >> >>>>>> wrote: >>> >> >>>>>>> >>> >> >>>>>>> Some form of MutableStruct# with a known number of words and a >>> >> >>>>>>> known >>> >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting >>> >> >>>>>>> above, but >>> >> >>>>>>> where the word counts were stored in the objects themselves. 
>>> >> >>>>>>> >>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd >>> >> >>>>>>> likely >>> >> >>>>>>> want to be something we build in addition to MutVar# rather >>> than a >>> >> >>>>>>> replacement. >>> >> >>>>>>> >>> >> >>>>>>> On the other hand, if we had to fix those numbers and build >>> info >>> >> >>>>>>> tables that knew them, and typechecker support, for instance, >>> it'd >>> >> >>>>>>> get >>> >> >>>>>>> rather invasive. >>> >> >>>>>>> >>> >> >>>>>>> Also, a number of things that we can do with the 'sized' >>> versions >>> >> >>>>>>> above, like working with evil unsized c-style arrays directly >>> >> >>>>>>> inline at the >>> >> >>>>>>> end of the structure cease to be possible, so it isn't even a >>> pure >>> >> >>>>>>> win if we >>> >> >>>>>>> did the engineering effort. >>> >> >>>>>>> >>> >> >>>>>>> I think 90% of the needs I have are covered just by adding >>> the one >>> >> >>>>>>> primitive. The last 10% gets pretty invasive. >>> >> >>>>>>> >>> >> >>>>>>> -Edward >>> >> >>>>>>> >>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton < >>> rrnewton at gmail.com> >>> >> >>>>>>> wrote: >>> >> >>>>>>>> >>> >> >>>>>>>> I like the possibility of a general solution for mutable >>> structs >>> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's >>> hard. >>> >> >>>>>>>> >>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of >>> object >>> >> >>>>>>>> identity problems. But what about directly supporting an >>> >> >>>>>>>> extensible set of >>> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even >>> replacing) >>> >> >>>>>>>> MutVar#? That >>> >> >>>>>>>> may be too much work, but is it problematic otherwise? >>> >> >>>>>>>> >>> >> >>>>>>>> Needless to say, this is also critical if we ever want best >>> in >>> >> >>>>>>>> class >>> >> >>>>>>>> lockfree mutable structures, just like their Stm and >>> sequential >>> >> >>>>>>>> counterparts. 
>>> >> >>>>>>>> >>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >>> >> >>>>>>>> wrote: >>> >> >>>>>>>>> >>> >> >>>>>>>>> At the very least I'll take this email and turn it into a >>> short >>> >> >>>>>>>>> article. >>> >> >>>>>>>>> >>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and >>> >> >>>>>>>>> maybe >>> >> >>>>>>>>> make a ticket for it. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Thanks >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Simon >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >>> >> >>>>>>>>> Sent: 27 August 2015 16:54 >>> >> >>>>>>>>> To: Simon Peyton Jones >>> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >>> >> >>>>>>>>> Subject: Re: ArrayArrays >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. >>> It >>> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or >>> ByteArray#'s. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> While those live in #, they are garbage collected objects, >>> so >>> >> >>>>>>>>> this >>> >> >>>>>>>>> all lives on the heap. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when it >>> has >>> >> >>>>>>>>> to >>> >> >>>>>>>>> deal with nested arrays. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better >>> thing. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> The Problem >>> >> >>>>>>>>> >>> >> >>>>>>>>> ----------------- >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Consider the scenario where you write a classic >>> doubly-linked >>> >> >>>>>>>>> list >>> >> >>>>>>>>> in Haskell. 
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3 >>> pointers >>> >> >>>>>>>>> on >>> >> >>>>>>>>> the heap. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> >>> >> >>>>>>>>> Maybe >>> >> >>>>>>>>> DLL ~> DLL >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> That is 3 levels of indirection. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with >>> >> >>>>>>>>> -funbox-strict-fields or UNPACK >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL >>> and >>> >> >>>>>>>>> worsening our representation. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> but now we're still stuck with a level of indirection >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> This means that every operation we perform on this structure >>> >> >>>>>>>>> will >>> >> >>>>>>>>> be about half of the speed of an implementation in most >>> other >>> >> >>>>>>>>> languages >>> >> >>>>>>>>> assuming we're memory bound on loading things into cache! 
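[Editor's sketch: a runnable, lifted-world version of the doubly-linked list just described, using the `Nil`-constructor representation; this is precisely the layout whose remaining `MutVar#` indirection the rest of the message works to eliminate. The helper names `singleton`, `insertAfter`, and `lengthFrom` are mine, not from the thread.]

```haskell
import Data.IORef

-- The representation under discussion: every hop still chases
-- DLL ~> MutVar# RealWorld DLL ~> DLL.
data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil

-- A one-node list whose links point at Nil.
singleton :: IO DLL
singleton = do
  p <- newIORef Nil
  n <- newIORef Nil
  pure (DLL p n)

-- Splice a fresh node in immediately after the given one.
insertAfter :: DLL -> IO DLL
insertAfter Nil = error "insertAfter: Nil"
insertAfter this@(DLL _ nextRef) = do
  after <- readIORef nextRef
  p <- newIORef this
  n <- newIORef after
  let node = DLL p n
  writeIORef nextRef node
  case after of
    DLL p' _ -> writeIORef p' node   -- re-link the old successor back
    Nil      -> pure ()
  pure node

-- Count nodes walking forward; each step is one IORef dereference.
lengthFrom :: DLL -> IO Int
lengthFrom Nil = pure 0
lengthFrom (DLL _ nextRef) = (1 +) <$> (readIORef nextRef >>= lengthFrom)
```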
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Making Progress >>> >> >>>>>>>>> >>> >> >>>>>>>>> ---------------------- >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I have been working on a number of data structures where the >>> >> >>>>>>>>> indirection of going from something in * out to an object >>> in # >>> >> >>>>>>>>> which >>> >> >>>>>>>>> contains the real pointer to my target and coming back >>> >> >>>>>>>>> effectively doubles >>> >> >>>>>>>>> my runtime. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the >>> >> >>>>>>>>> MutVar# >>> >> >>>>>>>>> onto the mutable list when we dirty it. There is a well >>> defined >>> >> >>>>>>>>> write-barrier. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I could change out the representation to use >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I can just store two pointers in the MutableArray# every >>> time, >>> >> >>>>>>>>> but >>> >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the >>> amount of >>> >> >>>>>>>>> distinct >>> >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per >>> >> >>>>>>>>> object to 2. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to >>> the >>> >> >>>>>>>>> array >>> >> >>>>>>>>> object and then chase it to the next DLL and chase that to >>> the >>> >> >>>>>>>>> next array. I >>> >> >>>>>>>>> do get my two pointers together in memory though. 
I'm >>> paying for >>> >> >>>>>>>>> a card >>> >> >>>>>>>>> marking table as well, which I don't particularly need with >>> just >>> >> >>>>>>>>> two >>> >> >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" >>> >> >>>>>>>>> machinery added >>> >> >>>>>>>>> back in 7.10, which is just the old array code as a new data >>> >> >>>>>>>>> type, which can >>> >> >>>>>>>>> speed things up a bit when you don't have very big arrays: >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> But what if I wanted my object itself to live in # and have >>> two >>> >> >>>>>>>>> mutable fields and be able to share the same write barrier? >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array >>> types. >>> >> >>>>>>>>> What >>> >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with >>> the >>> >> >>>>>>>>> impedance >>> >> >>>>>>>>> mismatch between the imperative world and Haskell, and then >>> just >>> >> >>>>>>>>> let the >>> >> >>>>>>>>> ArrayArray#'s hold other arrayarrays. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a >>> >> >>>>>>>>> special >>> >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even >>> >> >>>>>>>>> abuse pattern >>> >> >>>>>>>>> synonyms. Alternately I can exploit the internals further to >>> >> >>>>>>>>> make this >>> >> >>>>>>>>> cheaper. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and >>> >> >>>>>>>>> writeMutableArrayArray# calls to directly access the >>> preceding >>> >> >>>>>>>>> and next >>> >> >>>>>>>>> entry in the linked list. 
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' >>> into a >>> >> >>>>>>>>> strict world, and everything there lives in #. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> next :: DLL -> IO DLL >>> >> >>>>>>>>> >>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# m 1# s of >>> >> >>>>>>>>> >>> >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that >>> code to >>> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty >>> >> >>>>>>>>> easily when they >>> >> >>>>>>>>> are known strict and you chain operations of this sort! >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Cleaning it Up >>> >> >>>>>>>>> >>> >> >>>>>>>>> ------------------ >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Now I have one outermost indirection pointing to an array >>> that >>> >> >>>>>>>>> points directly to other arrays. 
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can >>> currently >>> >> >>>>>>>>> at >>> >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at a >>> >> >>>>>>>>> boxed or at a >>> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the >>> int in >>> >> >>>>>>>>> question in >>> >> >>>>>>>>> there. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need >>> to >>> >> >>>>>>>>> store masks and administrivia as I walk down the tree. >>> Having to >>> >> >>>>>>>>> go off to >>> >> >>>>>>>>> the side costs me the entire win from avoiding the first >>> pointer >>> >> >>>>>>>>> chase. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could >>> >> >>>>>>>>> construct that had n words with unsafe access and m >>> pointers to >>> >> >>>>>>>>> other heap >>> >> >>>>>>>>> objects, one that could put itself on the mutable list when >>> any >>> >> >>>>>>>>> of those >>> >> >>>>>>>>> pointers changed then I could shed this last factor of two >>> in >>> >> >>>>>>>>> all >>> >> >>>>>>>>> circumstances. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Prototype >>> >> >>>>>>>>> >>> >> >>>>>>>>> ------------- >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Over the last few days I've put together a small prototype >>> >> >>>>>>>>> implementation with a few non-trivial imperative data >>> structures >>> >> >>>>>>>>> for things >>> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and >>> >> >>>>>>>>> order-maintenance. 
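[Editor's sketch for the HAMT case just mentioned: the "masks and administrivia" carried down the tree is the per-node bitmap, and a child's position in a node's dense child array is recovered by a popCount over that bitmap, which is why keeping the bitmap adjacent to the children rather than behind another pointer chase matters. A minimal version of that indexing, with names of my own choosing:]

```haskell
import Data.Bits (popCount, testBit, setBit, shiftL, (.&.))
import Data.Word (Word32)

-- Each HAMT node carries a 32-bit bitmap: bit i is set iff the node
-- has a child for 5-bit hash fragment i. Children are densely packed.
type Bitmap = Word32

-- Does this node have a child for the given fragment?
member :: Int -> Bitmap -> Bool
member frag bm = testBit bm frag

-- Index of a fragment's child within the dense child array:
-- the number of set bits strictly below its bit position.
sparseIndex :: Int -> Bitmap -> Int
sparseIndex frag bm = popCount (bm .&. ((1 `shiftL` frag) - 1))
```

[For an absent fragment, `sparseIndex` also gives the insertion point, which is how the dense array stays sorted by fragment.]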
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> https://github.com/ekmett/structs >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Notable bits: >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of >>> >> >>>>>>>>> link-cut >>> >> >>>>>>>>> trees in this style. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts >>> that >>> >> >>>>>>>>> make >>> >> >>>>>>>>> it go fast. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, >>> almost >>> >> >>>>>>>>> all >>> >> >>>>>>>>> the references to the LinkCut or Object data constructor get >>> >> >>>>>>>>> optimized away, >>> >> >>>>>>>>> and we're left with beautiful strict code directly mutating >>> our >>> >> >>>>>>>>> underlying >>> >> >>>>>>>>> representation. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> At the very least I'll take this email and turn it into a >>> short >>> >> >>>>>>>>> article. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> -Edward >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones >>> >> >>>>>>>>> wrote: >>> >> >>>>>>>>> >>> >> >>>>>>>>> Just to say that I have no idea what is going on in this >>> thread. >>> >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is >>> there a >>> >> >>>>>>>>> ticket? Is >>> >> >>>>>>>>> there a wiki page? >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be >>> a >>> >> >>>>>>>>> good >>> >> >>>>>>>>> thing. 
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Simon >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On >>> Behalf >>> >> >>>>>>>>> Of >>> >> >>>>>>>>> Edward Kmett >>> >> >>>>>>>>> Sent: 21 August 2015 05:25 >>> >> >>>>>>>>> To: Manuel M T Chakravarty >>> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs >>> >> >>>>>>>>> Subject: Re: ArrayArrays >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's >>> would be >>> >> >>>>>>>>> very handy as well. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Consider right now if I have something like an >>> order-maintenance >>> >> >>>>>>>>> structure I have: >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) >>> {-# >>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s >>> >> >>>>>>>>> (Upper s)) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) >>> {-# >>> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s >>> >> >>>>>>>>> (Lower s)) {-# >>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> The former contains, logically, a mutable integer and two >>> >> >>>>>>>>> pointers, >>> >> >>>>>>>>> one for forward and one for backwards. The latter is >>> basically >>> >> >>>>>>>>> the same >>> >> >>>>>>>>> thing with a mutable reference up pointing at the structure >>> >> >>>>>>>>> above. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> On the heap this is an object that points to a structure >>> for the >>> >> >>>>>>>>> bytearray, and points to another structure for each mutvar >>> which >>> >> >>>>>>>>> each point >>> >> >>>>>>>>> to the other 'Upper' structure. 
So there is a level of >>> >> >>>>>>>>> indirection smeared >>> >> >>>>>>>>> over everything. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward link >>> >> >>>>>>>>> from >>> >> >>>>>>>>> the structure below to the structure above. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Converted into ArrayArray#s I'd get >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, >>> and >>> >> >>>>>>>>> the >>> >> >>>>>>>>> next 2 slots pointing to the previous and next >>> objects, >>> >> >>>>>>>>> represented >>> >> >>>>>>>>> just as their MutableArrayArray#s. I can use >>> >> >>>>>>>>> sameMutableArrayArray# on these >>> >> >>>>>>>>> for object identity, which lets me check for the ends of the >>> >> >>>>>>>>> lists by tying >>> >> >>>>>>>>> things back on themselves. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> and below that >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing >>> up to >>> >> >>>>>>>>> an >>> >> >>>>>>>>> upper structure. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I can then write a handful of combinators for getting out >>> the >>> >> >>>>>>>>> slots >>> >> >>>>>>>>> in question, while it has gained a level of indirection >>> between >>> >> >>>>>>>>> the wrapper >>> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one >>> can >>> >> >>>>>>>>> be basically >>> >> >>>>>>>>> erased by ghc. 
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Unlike before I don't have several separate objects on the >>> heap >>> >> >>>>>>>>> for >>> >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for >>> the >>> >> >>>>>>>>> object itself, >>> >> >>>>>>>>> and the MutableByteArray# that it references to carry >>> around the >>> >> >>>>>>>>> mutable >>> >> >>>>>>>>> int. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> The only pain points are >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> 1.) the aforementioned limitation that currently prevents me >>> >> >>>>>>>>> from >>> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array >>> into an >>> >> >>>>>>>>> ArrayArray >>> >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest of >>> >> >>>>>>>>> Haskell, >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> and >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid >>> the >>> >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 >>> pointers >>> >> >>>>>>>>> wide. Card >>> >> >>>>>>>>> marking doesn't help. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Alternately I could just try to do really evil things and >>> >> >>>>>>>>> convert >>> >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to >>> >> >>>>>>>>> unsafeCoerce my way >>> >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays >>> >> >>>>>>>>> directly into the >>> >> >>>>>>>>> SmallArray as slots, removing the limitation we see here by >>> >> >>>>>>>>> aping the >>> >> >>>>>>>>> MutableArrayArray# s API, but that gets really really >>> dangerous! 
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the >>> >> >>>>>>>>> altar >>> >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move >>> them >>> >> >>>>>>>>> and collect >>> >> >>>>>>>>> them which rules out simpler Ptr and Addr based solutions. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> -Edward >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty >>> >> >>>>>>>>> wrote: >>> >> >>>>>>>>> >>> >> >>>>>>>>> That?s an interesting idea. >>> >> >>>>>>>>> >>> >> >>>>>>>>> Manuel >>> >> >>>>>>>>> >>> >> >>>>>>>>> > Edward Kmett : >>> >> >>>>>>>>> >>> >> >>>>>>>>> > >>> >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# >>> and >>> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the >>> >> >>>>>>>>> > ArrayArray# entries >>> >> >>>>>>>>> > are all directly unlifted avoiding a level of indirection >>> for >>> >> >>>>>>>>> > the containing >>> >> >>>>>>>>> > structure is amazing, but I can only currently use it if >>> my >>> >> >>>>>>>>> > leaf level data >>> >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. >>> It'd be >>> >> >>>>>>>>> > nice to be >>> >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down >>> at >>> >> >>>>>>>>> > the leaves to >>> >> >>>>>>>>> > hold lifted contents. >>> >> >>>>>>>>> > >>> >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to >>> >> >>>>>>>>> > access >>> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do >>> that >>> >> >>>>>>>>> > if i tried to >>> >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a >>> >> >>>>>>>>> > ByteArray# >>> >> >>>>>>>>> > anyways, so it isn't like there is a safety story >>> preventing >>> >> >>>>>>>>> > this. 
>>> >> >>>>>>>>> > >>> >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection >>> >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I >>> >> >>>>>>>>> > could shoehorn a >>> >> >>>>>>>>> > number of them into ArrayArrays if this worked. >>> >> >>>>>>>>> > >>> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of >>> unnecessary >>> >> >>>>>>>>> > indirection compared to c/java and this could reduce that >>> pain >>> >> >>>>>>>>> > to just 1 >>> >> >>>>>>>>> > level of unnecessary indirection. >>> >> >>>>>>>>> > >>> >> >>>>>>>>> > -Edward >>> >> >>>>>>>>> >>> >> >>>>>>>>> > _______________________________________________ >>> >> >>>>>>>>> > ghc-devs mailing list >>> >> >>>>>>>>> > ghc-devs at haskell.org >>> >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> _______________________________________________ >>> >> >>>>>>>>> ghc-devs mailing list >>> >> >>>>>>>>> ghc-devs at haskell.org >>> >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >> >>>>>>> >>> >> >>>>>>> >>> >> >>>>> >>> >> >>> >>> >> >> >>> >> > >>> >> > >>> >> > _______________________________________________ >>> >> > ghc-devs mailing list >>> >> > ghc-devs at haskell.org >>> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >> > >>> > >>> > >>> >>> >>> >>> >>> >>> >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs at haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >>> >>> >>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dan.doel at gmail.com Mon Sep 7 21:23:38 2015 From: dan.doel at gmail.com (Dan Doel) Date: Mon, 7 Sep 2015 17:23:38 -0400 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> Message-ID: On Mon, Sep 7, 2015 at 4:16 PM, Edward Kmett wrote: > Notably, it'd be interesting to explore the ability to allow parametricity > over the portion of # that is just a gcptr. Which is also a necessary part of Ed Yang's unlifted types proposal. This portion of # becomes the `Unlifted` kind, and it should be possible to have parametric polymorphism for it (and if that isn't stated outright, several things in the proposal assume you have it). -- Dan From ezyang at mit.edu Mon Sep 7 21:35:58 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Mon, 07 Sep 2015 14:35:58 -0700 Subject: Unlifted data types In-Reply-To: <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <1441661177-sup-2150@sabre> Hello Simon, > There are several distinct things being mixed up. I've split the document into three (four?) distinct subproposals. Proposals 1 and 2 stand alone. > (1) First, a proposal to allow a data type to be declared to be unlifted. On its own, this is a pretty simple proposal: [snip] > > I would really like to see this articulated as a stand-alone proposal. It makes sense by itself, and is really pretty simple. This is now "Proposal 1". > (2) Second, we cannot expect levity polymorphism. Consider > map f (x:xs) = f x : map f xs > Is the (f x) a thunk or is it evaluated strictly? Unless you are going to clone the code for map (which levity polymorphism is there to avoid), we can't answer "it depends on the type of (f x)". 
So, no, I think levity polymorphism is out. > > So I vote against splitting # into two: plain will do just fine. Levity polymorphism will not work without generating two copies of 'map', but plain polymorphism over 'Unlifted' is useful (as Dan has also pointed out.) In any case, I've extracted this out into a separate subproposal "Proposal 1.1". https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes#Proposal1.1:PolymorphismoveranewUnliftedkind (reordering here.) > (4) Fourth, you don't mention a related suggestion, namely to allow > newtype T = MkT Int# > with T getting kind #. I see no difficulty here. We do have (T ~R Int#). It's just a useful way of wrapping a newtype around an unlifted type. This is now "Proposal 2". > (3) Third, the stuff about Force and suspend. Provided you do no more than write library code that uses the above new features I'm fine. But there seems to be lots of stuff that dances around the hope that (Force a) is represented the same way as 'a'. I don't know how to make this fly. Is there a coercion in FC? If so then (a ~R Force a). And that seems very doubtful since we must do some evaluation. I agree that we can't introduce a coercion between 'Force a' and 'a', for the reason you mentioned. (there's also a second reason which is that 'a ~R Force a' is not well-typed; 'a' and 'Force a' have different kinds.) I've imagined that we might be able to just continue representing Force explicitly in Core, and somehow "compile it away" at STG time, but I am definitely fuzzy about how this is supposed to work. Perhaps Force should not actually be a data type, and we should have 'force# :: a -> Force a' and 'unforce# :: Force a -> a' (the latter of which compiles to a no-op.)
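The obstacle Simon raises for `map` can be seen with two monomorphic variants (a sketch; ordinary lifted lists stand in here for the hypothetical unlifted-element case):

```haskell
{-# LANGUAGE BangPatterns #-}

-- Lifted elements: 'f x' is consed as a thunk; nothing is evaluated.
mapLazy :: (a -> b) -> [a] -> [b]
mapLazy f (x:xs) = f x : mapLazy f xs
mapLazy _ []     = []

-- Unlifted elements can never be thunks, so 'f x' would have to be
-- evaluated before the cons cell is built -- genuinely different code:
mapStrict :: (a -> b) -> [a] -> [b]
mapStrict f (x:xs) = let !y = f x in y : mapStrict f xs
mapStrict _ []     = []
```

One compiled copy of `map` cannot behave both ways, which is why levity polymorphism without code duplication is ruled out, while polymorphism restricted to the uniformly represented `Unlifted` kind remains feasible.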
Cheers, Edward From simonpj at microsoft.com Mon Sep 7 21:37:37 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 21:37:37 +0000 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> | Splitting # into two kinds is useful even if functions can't be levity | polymorphic. # contains a bunch of types that aren't represented | uniformly. Int# might be 32 bits while Double# is 64, etc. But | Unlifted would contain only types that are uniformly represented as | pointers, so you could write functions that are polymorphic over types | of kind Unlifted. Yes, I agree that's true, provided they are *not* also polymorphic over things of kind *. But it's an orthogonal proposal. What you say is already true of Array# and IORef#. Perhaps there are functions that are usefully polymorphic over boxed-but-unlifted things. But our users have not been crying out for this polymorphism despite the existence of a menagerie of existing such types, including Array# and IORef# Let's tackle things one at a time, with separate proposals and separate motivation. Simon | C++ style polymorphism-as-code-generation). | | ---- | | Also, with regard to the previous mail, it's not true that `suspend` | has to be a special form. All expressions with types of kind * are | 'special forms' in the necessary sense. | | -- Dan From simonpj at microsoft.com Mon Sep 7 21:52:27 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 21:52:27 +0000 Subject: Unlifted data types In-Reply-To: <1441661177-sup-2150@sabre> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <1441661177-sup-2150@sabre> Message-ID: <0f5878d44e584b6dae8fb7de6fdf1ca8@DB4PR30MB030.064d.mgd.msft.net> | I've split the document into three (four?) distinct subproposals. 
| Proposals 1 and 2 stand alone. I've re-numbered them 1,2,3,4, since 1.1 is (to me) a pretty major deal, stands in its own right, and certainly isn't a sub-proposal of (1). Under (new) 2, I'm very dubious about "Boxed levity polymorphism in types (and functions with extra code generation)". It's certainly true that we could generate two copies of the code for every function; but by generating three, or perhaps four copies we could also deal with Int# and Float#. Maybe one more for Double#. .NET does this on the fly, incidentally. Where do you stop? Also remember it's not just an issue of GC pointers. The semantics of the function changes, because things that are thunks for the lifted version become strict in the unlifted version. Your 'umap' is a bit more convincing. But for now (2) would be low on my priority list, until we encounter user pressure which (note) we have not encountered so far despite the range of boxed but unlifted types. Why is now the right time? (1) and (3) seem solid. I'll leave (4) for another message. Simon | -----Original Message----- | From: Edward Z. Yang [mailto:ezyang at mit.edu] | Sent: 07 September 2015 22:36 | To: Simon Peyton Jones | Cc: ghc-devs | Subject: RE: Unlifted data types | | Hello Simon, | | > There are several distinct things being mixed up. | | I've split the document into three (four?) distinct subproposals. | Proposals 1 and 2 stand alone. | | > (1) First, a proposal to allow a data type to be declared to be | unlifted. On its own, this is a pretty simple proposal: [snip] | > | > I would really like to see this articulated as a stand-alone proposal. | It makes sense by itself, and is really pretty simple. | | This is now "Proposal 1". | | > (2) Second, we cannot expect levity polymorphism. Consider | > map f (x:xs) = f x : map f xs | > Is the (f x) a thunk or is it evaluated strictly?
Unless you are going | to clone the code for map (which levity polymorphism is there to avoid), | we can't answer "it depends on the type of (f x)". So, no, I think | levity polymorphism is out. | > | > So I vote against splitting # into two: plain will do just fine. | | Levity polymorphism will not work without generating two copies of | 'map', but plain polymorphism over 'Unlifted' is useful (as Dan has | also pointed out.) In any case, I've extracted this out into a | separate subproposal "Proposal 1.1". | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes#Proposal1.1:Polym | orphismoveranewUnliftedkind | | (reordering here.) | | > (4) Fourth, you don't mention a related suggestion, namely to allow | > newtype T = MkT Int# | > with T getting kind #. I see no difficulty here. We do have (T ~R | Int#). It's just a useful way of wrapping a newtype around an unlifted | type. | | This is now "Proposal 2". | | > (3) Third, the stuff about Force and suspend. Provided you do no more | than write library code that uses the above new features I'm fine. But | there seems to be lots of stuff that dances around the hope that (Force | a) is represented the same way as 'a'. I don't' know how to make this | fly. Is there a coercion in FC? If so then (a ~R Force a). And that | seems very doubtful since we must do some evaluation. | | I agree that we can't introduce a coercion between 'Force a' and | 'a', for the reason you mentioned. (there's also a second reason which | is that 'a ~R Force a' is not well-typed; 'a' and 'Force a' have | different kinds.) | | I've imagined that we might be able to just continue representing | Force explicitly in Core, and somehow "compile it away" at STG time, | but I am definitely fuzzy about how this is supposed to work. Perhaps | Force should not actually be a data type, and we should have | 'force# :: a -> Force a' and 'unforce# :: Force a -> a' (the latter | of which compiles to a no-op.) 
| | Cheers, | Edward From simonpj at microsoft.com Mon Sep 7 21:55:09 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 21:55:09 +0000 Subject: Unlifted data types In-Reply-To: <1441661177-sup-2150@sabre> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <1441661177-sup-2150@sabre> Message-ID: <9cafcebc6d274b2385f202a4fd224174@DB4PR30MB030.064d.mgd.msft.net> | I agree that we can't introduce a coercion between 'Force a' and | 'a', for the reason you mentioned. (there's also a second reason which | is that 'a ~R Force a' is not well-typed; 'a' and 'Force a' have | different kinds.) | | I've imagined that we might be able to just continue representing | Force explicitly in Core, and somehow "compile it away" at STG time, | but I am definitely fuzzy about how this is supposed to work. Perhaps | Force should not actually be a data type, and we should have | 'force# :: a -> Force a' and 'unforce# :: Force a -> a' (the latter | of which compiles to a no-op.) I'm still doubtful. What is the problem you are trying to solve here? How does Force help us? Note that a singleton unboxed tuple (# e #) has the effect of suspending; e.g. f x = (# x+1 #) returns immediately, returning a pointer to a thunk for (x+1). I'm not sure if that is relevant.
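Simon's observation about singleton unboxed tuples, spelled out (plain GHC Haskell; the names are illustrative):

```haskell
{-# LANGUAGE UnboxedTuples #-}

-- 'suspend' returns at once: the (# ... #) result is passed back
-- unboxed, but its payload is a pointer to a thunk for (x + 1).
suspend :: Int -> (# Int #)
suspend x = (# x + 1 #)

-- Scrutinising the tuple does not force the payload; n here is still
-- a thunk, and x + 1 is only computed when n is finally demanded.
g :: Int -> Int
g y = case suspend y of (# n #) -> n
```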
Simon From simonpj at microsoft.com Mon Sep 7 21:56:56 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 7 Sep 2015 21:56:56 +0000 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> Message-ID: This could make the menagerie of ways to pack {Small}{Mutable}Array{Array}# references into a {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the need for folks to descend into the use of the more evil structure primitives we're talking about, and letting us keep a few more principles around us. I'm lost. Can you give some concrete examples that illustrate how levity polymorphism will help us? Simon From: Edward Kmett [mailto:ekmett at gmail.com] Sent: 07 September 2015 21:17 To: Simon Peyton Jones Cc: Ryan Newton; Johan Tibell; Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates Subject: Re: ArrayArrays I had a brief discussion with Richard during the Haskell Symposium about how we might be able to let parametricity help a bit in reducing the space of necessary primops to a slightly more manageable level. Notably, it'd be interesting to explore the ability to allow parametricity over the portion of # that is just a gcptr. We could do this if the levity polymorphism machinery was tweaked a bit. You could envision the ability to abstract over things in both * and the subset of # that are represented by a gcptr, then modifying the existing array primitives to be parametric in that choice of levity for their argument so long as it was of a "heap object" levity.
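What such a levity-parametric primitive might look like, purely as an illustration — `HeapLevity` and `readHeapArray#` are invented names, and no primop with this type existed at the time:

```haskell
-- Pseudo-Haskell sketch: a hypothetical kind classifying only the
-- gcptr subset of #, so that one compiled primop serves both lifted
-- payloads and boxed-but-unlifted ones like MutableByteArray# s.
readHeapArray# :: forall (v :: HeapLevity) (a :: TYPE v) s
                . MutableArray# s a -> Int# -> State# s -> (# State# s, a #)
```

Because every inhabitant of the hypothetical `HeapLevity` kind is represented uniformly as a GC'd pointer, a single copy of the generated code suffices — unlike polymorphism over all of `#`.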
This could make the menagerie of ways to pack {Small}{Mutable}Array{Array}# references into a {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the need for folks to descend into the use of the more evil structure primitives we're talking about, and letting us keep a few more principles around us. Then in the cases like `atomicModifyMutVar#` where it needs to actually be in * rather than just a gcptr, due to the constructed field selectors it introduces on the heap then we could keep the existing less polymorphic type. -Edward On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones > wrote: It was fun to meet and discuss this. Did someone volunteer to write a wiki page that describes the proposed design? And, I earnestly hope, also describes the menagerie of currently available array types and primops so that users can have some chance of picking the right one?! Thanks Simon From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Ryan Newton Sent: 31 August 2015 23:11 To: Edward Kmett; Johan Tibell Cc: Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates Subject: Re: ArrayArrays Dear Edward, Ryan Yates, and other interested parties -- So when should we meet up about this? May I propose the Tues afternoon break for everyone at ICFP who is interested in this topic? We can meet out in the coffee area and congregate around Edward Kmett, who is tall and should be easy to find ;-). I think Ryan is going to show us how to use his new primops for combined array + other fields in one heap object? On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett > wrote: Without a custom primitive it doesn't help much there, you have to store the indirection to the mask. With a custom primitive it should cut the on heap root-to-leaf path of everything in the HAMT in half. A shorter HashMap was actually one of the motivating factors for me doing this. 
It is rather astoundingly difficult to beat the performance of HashMap, so I had to start cheating pretty badly. ;) -Edward On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell > wrote: I'd also be interested to chat at ICFP to see if I can use this for my HAMT implementation. On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett > wrote: Sounds good to me. Right now I'm just hacking up composable accessors for "typed slots" in a fairly lens-like fashion, and treating the set of slots I define and the 'new' function I build for the data type as its API, and build atop that. This could eventually graduate to template-haskell, but I'm not entirely satisfied with the solution I have. I currently distinguish between what I'm calling "slots" (things that point directly to another SmallMutableArrayArray# sans wrapper) and "fields" which point directly to the usual Haskell data types because unifying the two notions meant that I couldn't lift some coercions out "far enough" to make them vanish. I'll be happy to run through my current working set of issues in person and -- as things get nailed down further -- in a longer lived medium than in personal conversations. ;) -Edward On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton > wrote: I'd also love to meet up at ICFP and discuss this. I think the array primops plus a TH layer that lets (ab)use them many times without too much marginal cost sounds great. And I'd like to learn how we could be either early users of, or help with, this infrastructure. CC'ing in Ryan Scot and Omer Agacan who may also be interested in dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student who is currently working on concurrent data structures in Haskell, but will not be at ICFP. On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates > wrote: I completely agree. I would love to spend some time during ICFP and friends talking about what it could look like. My small array for STM changes for the RTS can be seen here [1]. 
It is on a branch somewhere between 7.8 and 7.10 and includes irrelevant STM bits and some confusing naming choices (sorry), but should cover all the details needed to implement it for a non-STM context. The biggest surprise for me was following small array too closely and having a word/byte offset miss-match [2]. [1]: https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 Ryan On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett > wrote: > I'd love to have that last 10%, but its a lot of work to get there and more > importantly I don't know quite what it should look like. > > On the other hand, I do have a pretty good idea of how the primitives above > could be banged out and tested in a long evening, well in time for 7.12. And > as noted earlier, those remain useful even if a nicer typed version with an > extra level of indirection to the sizes is built up after. > > The rest sounds like a good graduate student project for someone who has > graduate students lying around. Maybe somebody at Indiana University who has > an interest in type theory and parallelism can find us one. =) > > -Edward > > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates > wrote: >> >> I think from my perspective, the motivation for getting the type >> checker involved is primarily bringing this to the level where users >> could be expected to build these structures. it is reasonable to >> think that there are people who want to use STM (a context with >> mutation already) to implement a straight forward data structure that >> avoids extra indirection penalty. There should be some places where >> knowing that things are field accesses rather then array indexing >> could be helpful, but I think GHC is good right now about handling >> constant offsets. In my code I don't do any bounds checking as I know >> I will only be accessing my arrays with constant indexes. 
I make >> wrappers for each field access and leave all the unsafe stuff in >> there. When things go wrong though, the compiler is no help. Maybe >> template Haskell that generates the appropriate wrappers is the right >> direction to go. >> There is another benefit for me when working with these as arrays in >> that it is quite simple and direct (given the hoops already jumped >> through) to play with alignment. I can ensure two pointers are never >> on the same cache-line by just spacing things out in the array. >> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett > wrote: >> > They just segfault at this level. ;) >> > >> > Sent from my iPhone >> > >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton > wrote: >> > >> > You presumably also save a bounds check on reads by hard-coding the >> > sizes? >> > >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett > wrote: >> >> >> >> Also there are 4 different "things" here, basically depending on two >> >> independent questions: >> >> >> >> a.) if you want to shove the sizes into the info table, and >> >> b.) if you want cardmarking. >> >> >> >> Versions with/without cardmarking for different sizes can be done >> >> pretty >> >> easily, but as noted, the infotable variants are pretty invasive. >> >> >> >> -Edward >> >> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett > wrote: >> >>> >> >>> Well, on the plus side you'd save 16 bytes per object, which adds up >> >>> if >> >>> they were small enough and there are enough of them. You get a bit >> >>> better >> >>> locality of reference in terms of what fits in the first cache line of >> >>> them. >> >>> >> >>> -Edward >> >>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton > >> >>> wrote: >> >>>> >> >>>> Yes. And for the short term I can imagine places we will settle with >> >>>> arrays even if it means tracking lengths unnecessarily and >> >>>> unsafeCoercing >> >>>> pointers whose types don't actually match their siblings. 
>> >>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed sized >> >>>> array >> >>>> objects *other* than using them to fake structs? (Much to >> >>>> derecommend, as >> >>>> you mentioned!) >> >>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett > >> >>>> wrote: >> >>>>> >> >>>>> I think both are useful, but the one you suggest requires a lot more >> >>>>> plumbing and doesn't subsume all of the usecases of the other. >> >>>>> >> >>>>> -Edward >> >>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton > >> >>>>> wrote: >> >>>>>> >> >>>>>> So that primitive is an array like thing (Same pointed type, >> >>>>>> unbounded >> >>>>>> length) with extra payload. >> >>>>>> >> >>>>>> I can see how we can do without structs if we have arrays, >> >>>>>> especially >> >>>>>> with the extra payload at front. But wouldn't the general solution >> >>>>>> for >> >>>>>> structs be one that that allows new user data type defs for # >> >>>>>> types? >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett > >> >>>>>> wrote: >> >>>>>>> >> >>>>>>> Some form of MutableStruct# with a known number of words and a >> >>>>>>> known >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting >> >>>>>>> above, but >> >>>>>>> where the word counts were stored in the objects themselves. >> >>>>>>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd >> >>>>>>> likely >> >>>>>>> want to be something we build in addition to MutVar# rather than a >> >>>>>>> replacement. >> >>>>>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build info >> >>>>>>> tables that knew them, and typechecker support, for instance, it'd >> >>>>>>> get >> >>>>>>> rather invasive. 
>> >>>>>>> >> >>>>>>> Also, a number of things that we can do with the 'sized' versions >> >>>>>>> above, like working with evil unsized c-style arrays directly >> >>>>>>> inline at the >> >>>>>>> end of the structure cease to be possible, so it isn't even a pure >> >>>>>>> win if we >> >>>>>>> did the engineering effort. >> >>>>>>> >> >>>>>>> I think 90% of the needs I have are covered just by adding the one >> >>>>>>> primitive. The last 10% gets pretty invasive. >> >>>>>>> >> >>>>>>> -Edward >> >>>>>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton > >> >>>>>>> wrote: >> >>>>>>>> >> >>>>>>>> I like the possibility of a general solution for mutable structs >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's hard. >> >>>>>>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of object >> >>>>>>>> identity problems. But what about directly supporting an >> >>>>>>>> extensible set of >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) >> >>>>>>>> MutVar#? That >> >>>>>>>> may be too much work, but is it problematic otherwise? >> >>>>>>>> >> >>>>>>>> Needless to say, this is also critical if we ever want best in >> >>>>>>>> class >> >>>>>>>> lockfree mutable structures, just like their Stm and sequential >> >>>>>>>> counterparts. >> >>>>>>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >> >>>>>>>> > wrote: >> >>>>>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a short >> >>>>>>>>> article. >> >>>>>>>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and >> >>>>>>>>> maybe >> >>>>>>>>> make a ticket for it. 
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Thanks >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Simon >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >> >>>>>>>>> Sent: 27 August 2015 16:54 >> >>>>>>>>> To: Simon Peyton Jones >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >> >>>>>>>>> Subject: Re: ArrayArrays >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> While those live in #, they are garbage collected objects, so >> >>>>>>>>> this >> >>>>>>>>> all lives on the heap. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when it has >> >>>>>>>>> to >> >>>>>>>>> deal with nested arrays. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better thing. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> The Problem >> >>>>>>>>> >> >>>>>>>>> ----------------- >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Consider the scenario where you write a classic doubly-linked >> >>>>>>>>> list >> >>>>>>>>> in Haskell. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers >> >>>>>>>>> on >> >>>>>>>>> the heap. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> >> >>>>>>>>> Maybe >> >>>>>>>>> DLL ~> DLL >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> That is 3 levels of indirection. 
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with >> >>>>>>>>> -funbox-strict-fields or UNPACK >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and >> >>>>>>>>> worsening our representation. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> but now we're still stuck with a level of indirection >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> This means that every operation we perform on this structure >> >>>>>>>>> will >> >>>>>>>>> be about half of the speed of an implementation in most other >> >>>>>>>>> languages >> >>>>>>>>> assuming we're memory bound on loading things into cache! >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Making Progress >> >>>>>>>>> >> >>>>>>>>> ---------------------- >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I have been working on a number of data structures where the >> >>>>>>>>> indirection of going from something in * out to an object in # >> >>>>>>>>> which >> >>>>>>>>> contains the real pointer to my target and coming back >> >>>>>>>>> effectively doubles >> >>>>>>>>> my runtime. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the >> >>>>>>>>> MutVar# >> >>>>>>>>> onto the mutable list when we dirty it. There is a well defined >> >>>>>>>>> write-barrier. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I could change out the representation to use >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I can just store two pointers in the MutableArray# every time, >> >>>>>>>>> but >> >>>>>>>>> this doesn't help _much_ directly. 
It has reduced the amount of >> >>>>>>>>> distinct >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per >> >>>>>>>>> object to 2. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to the >> >>>>>>>>> array >> >>>>>>>>> object and then chase it to the next DLL and chase that to the >> >>>>>>>>> next array. I >> >>>>>>>>> do get my two pointers together in memory though. I'm paying for >> >>>>>>>>> a card >> >>>>>>>>> marking table as well, which I don't particularly need with just >> >>>>>>>>> two >> >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" >> >>>>>>>>> machinery added >> >>>>>>>>> back in 7.10, which is just the old array code a a new data >> >>>>>>>>> type, which can >> >>>>>>>>> speed things up a bit when you don't have very big arrays: >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> But what if I wanted my object itself to live in # and have two >> >>>>>>>>> mutable fields and be able to share the sme write barrier? >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types. >> >>>>>>>>> What >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with the >> >>>>>>>>> impedence >> >>>>>>>>> mismatch between the imperative world and Haskell, and then just >> >>>>>>>>> let the >> >>>>>>>>> ArrayArray#'s hold other arrayarrays. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a >> >>>>>>>>> special >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even >> >>>>>>>>> abuse pattern >> >>>>>>>>> synonyms. Alternately I can exploit the internals further to >> >>>>>>>>> make this >> >>>>>>>>> cheaper. 
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and >> >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding >> >>>>>>>>> and next >> >>>>>>>>> entry in the linked list. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a >> >>>>>>>>> strict world, and everything there lives in #. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> next :: DLL -> IO DLL >> >>>>>>>>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# m 1# s of >> >>>>>>>>> >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty >> >>>>>>>>> easily when they >> >>>>>>>>> are known strict and you chain operations of this sort! >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Cleaning it Up >> >>>>>>>>> >> >>>>>>>>> ------------------ >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Now I have one outermost indirection pointing to an array that >> >>>>>>>>> points directly to other arrays. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I'm stuck paying for a card marking table per object, but I can >> >>>>>>>>> fix >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and using a >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a >> >>>>>>>>> mixture of >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data structure. >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the >> >>>>>>>>> existing >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one of the >> >>>>>>>>> arguments it >> >>>>>>>>> takes. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> This is almost ideal, but not quite. I often have fields that >> >>>>>>>>> would >> >>>>>>>>> be best left unboxed.
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can currently >> >>>>>>>>> at >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at a >> >>>>>>>>> boxed or at a >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the int in >> >>>>>>>>> question in >> >>>>>>>>> there. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to >> >>>>>>>>> store masks and administrivia as I walk down the tree. Having to >> >>>>>>>>> go off to >> >>>>>>>>> the side costs me the entire win from avoiding the first pointer >> >>>>>>>>> chase. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could >> >>>>>>>>> construct that had n words with unsafe access and m pointers to >> >>>>>>>>> other heap >> >>>>>>>>> objects, one that could put itself on the mutable list when any >> >>>>>>>>> of those >> >>>>>>>>> pointers changed then I could shed this last factor of two in >> >>>>>>>>> all >> >>>>>>>>> circumstances. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Prototype >> >>>>>>>>> >> >>>>>>>>> ------------- >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Over the last few days I've put together a small prototype >> >>>>>>>>> implementation with a few non-trivial imperative data structures >> >>>>>>>>> for things >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and >> >>>>>>>>> order-maintenance. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> https://github.com/ekmett/structs >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Notable bits: >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of >> >>>>>>>>> link-cut >> >>>>>>>>> trees in this style. 
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that >> >>>>>>>>> make >> >>>>>>>>> it go fast. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost >> >>>>>>>>> all >> >>>>>>>>> the references to the LinkCut or Object data constructor get >> >>>>>>>>> optimized away, >> >>>>>>>>> and we're left with beautiful strict code directly mutating our >> >>>>>>>>> underlying >> >>>>>>>>> representation. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a short >> >>>>>>>>> article. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> -Edward >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones >> >>>>>>>>> > wrote: >> >>>>>>>>> >> >>>>>>>>> Just to say that I have no idea what is going on in this thread. >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is there a >> >>>>>>>>> ticket? Is >> >>>>>>>>> there a wiki page? >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be a >> >>>>>>>>> good >> >>>>>>>>> thing. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Simon >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf >> >>>>>>>>> Of >> >>>>>>>>> Edward Kmett >> >>>>>>>>> Sent: 21 August 2015 05:25 >> >>>>>>>>> To: Manuel M T Chakravarty >> >>>>>>>>> Cc: Simon Marlow; ghc-devs >> >>>>>>>>> Subject: Re: ArrayArrays >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be >> >>>>>>>>> very handy as well.
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Consider right now that I have something like an order-maintenance >> >>>>>>>>> structure: >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s >> >>>>>>>>> (Upper s)) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s >> >>>>>>>>> (Lower s)) {-# >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> The former contains, logically, a mutable integer and two >> >>>>>>>>> pointers, >> >>>>>>>>> one for forward and one for backwards. The latter is basically >> >>>>>>>>> the same >> >>>>>>>>> thing with a mutable reference up pointing at the structure >> >>>>>>>>> above. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> On the heap this is an object that points to a structure for the >> >>>>>>>>> bytearray, and points to another structure for each mutvar which >> >>>>>>>>> each point >> >>>>>>>>> to the other 'Upper' structure. So there is a level of >> >>>>>>>>> indirection smeared >> >>>>>>>>> over everything. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward link >> >>>>>>>>> from >> >>>>>>>>> the structure below to the structure above. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Converted into ArrayArray#s I'd get >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and >> >>>>>>>>> the >> >>>>>>>>> next 2 slots pointing to the previous and next objects, >> >>>>>>>>> represented >> >>>>>>>>> just as their MutableArrayArray#s.
I can use >> >>>>>>>>> sameMutableArrayArray# on these >> >>>>>>>>> for object identity, which lets me check for the ends of the >> >>>>>>>>> lists by tying >> >>>>>>>>> things back on themselves. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> and below that >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing up to >> >>>>>>>>> an >> >>>>>>>>> upper structure. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I can then write a handful of combinators for getting out the >> >>>>>>>>> slots >> >>>>>>>>> in question, while it has gained a level of indirection between >> >>>>>>>>> the wrapper >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one can >> >>>>>>>>> be basically >> >>>>>>>>> erased by ghc. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Unlike before I don't have several separate objects on the heap >> >>>>>>>>> for >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for the >> >>>>>>>>> object itself, >> >>>>>>>>> and the MutableByteArray# that it references to carry around the >> >>>>>>>>> mutable >> >>>>>>>>> int. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> The only pain points are >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> 1.) the aforementioned limitation that currently prevents me >> >>>>>>>>> from >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array into an >> >>>>>>>>> ArrayArray >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest of >> >>>>>>>>> Haskell, >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> and >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 pointers >> >>>>>>>>> wide. Card >> >>>>>>>>> marking doesn't help. 
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Alternately I could just try to do really evil things and >> >>>>>>>>> convert >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to >> >>>>>>>>> unsafeCoerce my way >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays >> >>>>>>>>> directly into the >> >>>>>>>>> SmallArray as slots, removing the limitation we see here by >> >>>>>>>>> aping the >> >>>>>>>>> MutableArrayArray# s API, but that gets really really dangerous! >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the >> >>>>>>>>> altar >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move them >> >>>>>>>>> and collect >> >>>>>>>>> them which rules out simpler Ptr and Addr based solutions. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> -Edward >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty >> >>>>>>>>> > wrote: >> >>>>>>>>> >> >>>>>>>>> That's an interesting idea. >> >>>>>>>>> >> >>>>>>>>> Manuel >> >>>>>>>>> >> >>>>>>>>> > Edward Kmett >: >> >>>>>>>>> >> >>>>>>>>> > >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# and >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the >> >>>>>>>>> > ArrayArray# entries >> >>>>>>>>> > are all directly unlifted avoiding a level of indirection for >> >>>>>>>>> > the containing >> >>>>>>>>> > structure is amazing, but I can only currently use it if my >> >>>>>>>>> > leaf level data >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. It'd be >> >>>>>>>>> > nice to be >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down at >> >>>>>>>>> > the leaves to >> >>>>>>>>> > hold lifted contents.
>> >>>>>>>>> > >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to >> >>>>>>>>> > access >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do that >> >>>>>>>>> > if I tried to >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a >> >>>>>>>>> > ByteArray# >> >>>>>>>>> > anyways, so it isn't like there is a safety story preventing >> >>>>>>>>> > this. >> >>>>>>>>> > >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I >> >>>>>>>>> > could shoehorn a >> >>>>>>>>> > number of them into ArrayArrays if this worked. >> >>>>>>>>> > >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary >> >>>>>>>>> > indirection compared to C/Java and this could reduce that pain >> >>>>>>>>> > to just 1 >> >>>>>>>>> > level of unnecessary indirection. >> >>>>>>>>> > >> >>>>>>>>> > -Edward >> >>>>>>>>> >> >>>>>>>>> > _______________________________________________ >> >>>>>>>>> > ghc-devs mailing list >> >>>>>>>>> > ghc-devs at haskell.org >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> _______________________________________________ >> >>>>>>>>> ghc-devs mailing list >> >>>>>>>>> ghc-devs at haskell.org >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >>>>>>> >> >>>>>>> >> >>>>> >> >>> >> >> >> > >> > >> > _______________________________________________ >> > ghc-devs mailing list >> > ghc-devs at haskell.org >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > _______________________________________________ ghc-devs mailing list ghc-devs at haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... URL: From ezyang at mit.edu Mon Sep 7 22:08:48 2015 From: ezyang at mit.edu (Edward Z.
Yang) Date: Mon, 07 Sep 2015 15:08:48 -0700 Subject: Unlifted data types In-Reply-To: <9cafcebc6d274b2385f202a4fd224174@DB4PR30MB030.064d.mgd.msft.net> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <1441661177-sup-2150@sabre> <9cafcebc6d274b2385f202a4fd224174@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <1441663307-sup-612@sabre> Excerpts from Simon Peyton Jones's message of 2015-09-07 14:55:09 -0700: > I'm still doubtful. What is the problem you are trying to solve here? How does Force help us? The problem 'Force' is trying to solve is the fact that Haskell currently has many existing lifted data types, and they all have ~essentially identical unlifted versions. But for a user to write the lifted and unlifted version, they have to copy-paste their code or use 'Force'. > Note that a singleton unboxed tuple (# e #) has the effect of suspending; e.g. > f x = (# x+1 #) > return immediately, returning a pointer to a thunk for (x+1). I'm not sure if that is relevant. I don't think so? Unboxed tuples take a computation with kind * and represent it in kind #. But 'suspend' takes a computation in kind # and represents it in kind *. Edward From dan.doel at gmail.com Mon Sep 7 23:09:18 2015 From: dan.doel at gmail.com (Dan Doel) Date: Mon, 7 Sep 2015 19:09:18 -0400 Subject: Unlifted data types In-Reply-To: <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> Message-ID: On Mon, Sep 7, 2015 at 5:37 PM, Simon Peyton Jones wrote: > But it's an orthogonal proposal. What you say is already true of Array# and IORef#. Perhaps there are functions that are usefully polymorphic over boxed-but-unlifted things.
But our users have not been crying out for this polymorphism despite the existence of a menagerie of existing such types, including Array# and IORef# Well, evidently people over in the ArrayArray thread want it, for one. But also, if general unlifted types get accepted, there are many possible uses. For instance, people working with concurrency have to worry about work being done in the correct thread, and there are functions on MVars and whatnot that ensure that threads don't simply pass thunks between each other. But, if you have unlifted types, then you can have: data UMVar (a :: Unlifted) and then the type rules out the possibility of passing thunks through a reference (at least at the top level). But this requires polymorphism to avoid having to create a separate type for each unlifted type. This is also a use case of `Force`, since it is likely that we want to put ordinary data types in the MVars, just ensure that we aren't passing thunks with delayed work. ---- I'm kind of down on being polymorphic over choice of evaluation order, as well. At least without any further motivation. ---- Also, I'd still like to synthesize some of the redundancies introduced by the proposal. Perhaps it could be done by making `Force` a more primitive building block than !. I.E. data Nat = Zero | Suc !Nat can be, under this proposal, considered sugar for something like: data Nat = Zero | Suc_INTERNAL# (Force Nat) pattern Suc x = Suc_INTERNAL# (Force x) and all stipulations about what you can UNPACK are actually about Unlifted fields, rather than ! fields (which just inherit them from Force). That still leaves `Force LiftedSum` vs. `UnliftedSum`, though. And to be honest, I'm not sure we need arbitrary data types in Unlifted; Force (which would be primitive) might be enough. 
-- Dan From ekmett at gmail.com Mon Sep 7 23:14:35 2015 From: ekmett at gmail.com (Edward Kmett) Date: Mon, 7 Sep 2015 19:14:35 -0400 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> Message-ID: Assume we had the ability to talk about Levity in a new way and instead of just: data Levity = Lifted | Unlifted type * = TYPE 'Lifted type # = TYPE 'Unlifted we instead had a more nuanced notion of TYPE parameterized on another data type: data Levity = Lifted | Unlifted data Param = Composite | Simple Levity and we parameterized TYPE with a Param rather than Levity. Existing strange representations can continue to live in TYPE 'Composite (# Int# , Double #) :: TYPE 'Composite and we don't support parametricity in there, just as we currently don't allow parametricity in #. We can include the undefined example from Richard's talk: undefined :: forall (v :: Param). v and ultimately lift it into his pi type when it is available just as before. But we could consider TYPE ('Simple 'Unlifted) as a form of 'parametric #' covering unlifted things we're willing to allow polymorphism over because they are just pointers to something in the heap, that just happens to not be able to be _|_ or a thunk. In this setting, recalling that above I modified Richard's TYPE to take a Param instead of Levity, we can define a type alias for things that live as a simple pointer to a heap allocated object: type GC (l :: Levity) = TYPE ('Simple l) type * = GC 'Lifted and then we can look at existing primitives generalized: Array# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted MutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted SmallArray# :: forall (l :: Levity) (a :: GC l).
a -> GC 'Unlifted SmallMutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted MutVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted MVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted Weak#, StablePtr#, StableName#, etc. all can take similar modifications. Recall that an ArrayArray# was just an Array# hacked up to be able to hold onto the subset of # that is collectable. Almost all of the operations on these data types can work on the more general kind of argument. newArray# :: forall (s :: *) (l :: Levity) (a :: GC l). Int# -> a -> State# s -> (# State# s, MutableArray# s a #) writeArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# s a -> Int# -> a -> State# s -> State# s readArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# s a -> Int# -> State# s -> (# State# s, a #) etc. Only a couple of our existing primitives _can't_ generalize this way. The one that leaps to mind is atomicModifyMutVar, which would need to stay constrained to only work on arguments in *, because of the way it operates. With that we can still talk about MutableArray# s Int but now we can also talk about: MutableArray# s (MutableArray# s Int) without the layer of indirection through a box in * and without an explosion of primops. The same newFoo, readFoo, writeFoo machinery works for both kinds. The struct machinery doesn't get to take advantage of this, but it would let us clean house elsewhere in Prim and drastically improve the range of applicability of the existing primitives with nothing more than a small change to the levity machinery. I'm not attached to any of the names above, I coined them just to give us a concrete thing to talk about. 
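To make the payoff concrete, this is the sort of definition the generalized signatures would let us write. It is purely hypothetical: this does not typecheck in any released GHC, since an element type like MutableArray# s Int lives in # and cannot instantiate an ordinary type variable today; under the sketch above it would be permitted at kind GC 'Unlifted.

```haskell
-- Hypothetical code under the proposed generalization: the outer
-- MutableArray# holds its unlifted rows directly, with no box in *
-- between the two levels. (All outer slots initially share one row.)
newMatrix :: Int# -> Int# -> State# s
          -> (# State# s, MutableArray# s (MutableArray# s Int) #)
newMatrix n m s0 =
  case newArray# m 0 s0 of   -- row :: MutableArray# s Int, elements in *
    (# s1, row #) ->
      newArray# n row s1     -- instantiates a at kind GC 'Unlifted
```

The point is that the same newArray#/readArray#/writeArray# machinery serves both levels; today the inner reference would have to be boxed in a lifted wrapper or smuggled through an ArrayArray#.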
Here I'm only proposing we extend machinery in GHC.Prim this way, but an interesting 'now that the barn door is open' question is to consider that our existing Haskell data types often admit a similar form of parametricity and nothing in principle prevents this from working for Maybe or [], and once you permit inference to fire across all of GC l, it seems to me that you'd start to get those same capabilities there as well when LevityPolymorphism was turned on. -Edward On Mon, Sep 7, 2015 at 5:56 PM, Simon Peyton Jones wrote: > This could make the menagerie of ways to pack > {Small}{Mutable}Array{Array}# references into a > {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the > need for folks to descend into the use of the more evil structure > primitives we're talking about, and letting us keep a few more principles > around us. > > > > I'm lost. Can you give some concrete examples that illustrate how levity > polymorphism will help us? > > > Simon > > > > *From:* Edward Kmett [mailto:ekmett at gmail.com] > *Sent:* 07 September 2015 21:17 > *To:* Simon Peyton Jones > *Cc:* Ryan Newton; Johan Tibell; Simon Marlow; Manuel M T Chakravarty; > Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates > *Subject:* Re: ArrayArrays > > > > I had a brief discussion with Richard during the Haskell Symposium about > how we might be able to let parametricity help a bit in reducing the space > of necessary primops to a slightly more manageable level. > > > > Notably, it'd be interesting to explore the ability to allow parametricity > over the portion of # that is just a gcptr. > > > > We could do this if the levity polymorphism machinery was tweaked a bit. > You could envision the ability to abstract over things in both * and the > subset of # that are represented by a gcptr, then modifying the existing > array primitives to be parametric in that choice of levity for their > argument so long as it was of a "heap object" levity.
> > > > This could make the menagerie of ways to pack > {Small}{Mutable}Array{Array}# references into a > {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the > need for folks to descend into the use of the more evil structure > primitives we're talking about, and letting us keep a few more principles > around us. > > > > Then in the cases like `atomicModifyMutVar#` where it needs to actually be > in * rather than just a gcptr, due to the constructed field selectors it > introduces on the heap then we could keep the existing less polymorphic > type. > > > > -Edward > > > > On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones > wrote: > > It was fun to meet and discuss this. > > > > Did someone volunteer to write a wiki page that describes the proposed > design? And, I earnestly hope, also describes the menagerie of currently > available array types and primops so that users can have some chance of > picking the right one?! > > > > Thanks > > > > Simon > > > > *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Ryan > Newton > *Sent:* 31 August 2015 23:11 > *To:* Edward Kmett; Johan Tibell > *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; > Ryan Scott; Ryan Yates > *Subject:* Re: ArrayArrays > > > > Dear Edward, Ryan Yates, and other interested parties -- > > > > So when should we meet up about this? > > > > May I propose the Tues afternoon break for everyone at ICFP who is > interested in this topic? We can meet out in the coffee area and > congregate around Edward Kmett, who is tall and should be easy to find ;-). > > > > I think Ryan is going to show us how to use his new primops for combined > array + other fields in one heap object? > > > > On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett wrote: > > Without a custom primitive it doesn't help much there, you have to store > the indirection to the mask. 
> > > > With a custom primitive it should cut the on heap root-to-leaf path of > everything in the HAMT in half. A shorter HashMap was actually one of the > motivating factors for me doing this. It is rather astoundingly difficult > to beat the performance of HashMap, so I had to start cheating pretty > badly. ;) > > > > -Edward > > > > On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell > wrote: > > I'd also be interested to chat at ICFP to see if I can use this for my > HAMT implementation. > > > > On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett wrote: > > Sounds good to me. Right now I'm just hacking up composable accessors for > "typed slots" in a fairly lens-like fashion, and treating the set of slots > I define and the 'new' function I build for the data type as its API, and > build atop that. This could eventually graduate to template-haskell, but > I'm not entirely satisfied with the solution I have. I currently > distinguish between what I'm calling "slots" (things that point directly to > another SmallMutableArrayArray# sans wrapper) and "fields" which point > directly to the usual Haskell data types because unifying the two notions > meant that I couldn't lift some coercions out "far enough" to make them > vanish. > > > > I'll be happy to run through my current working set of issues in person > and -- as things get nailed down further -- in a longer lived medium than > in personal conversations. ;) > > > > -Edward > > > > On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton wrote: > > I'd also love to meet up at ICFP and discuss this. I think the array > primops plus a TH layer that lets (ab)use them many times without too much > marginal cost sounds great. And I'd like to learn how we could be either > early users of, or help with, this infrastructure. > > > > CC'ing in Ryan Scot and Omer Agacan who may also be interested in dropping > in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. 
student who is > currently working on concurrent data structures in Haskell, but will not be > at ICFP. > > > > > > On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote: > > I completely agree. I would love to spend some time during ICFP and > friends talking about what it could look like. My small array for STM > changes for the RTS can be seen here [1]. It is on a branch somewhere > between 7.8 and 7.10 and includes irrelevant STM bits and some > confusing naming choices (sorry), but should cover all the details > needed to implement it for a non-STM context. The biggest surprise > for me was following small array too closely and having a word/byte > offset miss-match [2]. > > [1]: > https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut > [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 > > Ryan > > > On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett wrote: > > I'd love to have that last 10%, but its a lot of work to get there and > more > > importantly I don't know quite what it should look like. > > > > On the other hand, I do have a pretty good idea of how the primitives > above > > could be banged out and tested in a long evening, well in time for 7.12. > And > > as noted earlier, those remain useful even if a nicer typed version with > an > > extra level of indirection to the sizes is built up after. > > > > The rest sounds like a good graduate student project for someone who has > > graduate students lying around. Maybe somebody at Indiana University who > has > > an interest in type theory and parallelism can find us one. =) > > > > -Edward > > > > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates wrote: > >> > >> I think from my perspective, the motivation for getting the type > >> checker involved is primarily bringing this to the level where users > >> could be expected to build these structures. 
it is reasonable to > >> think that there are people who want to use STM (a context with > >> mutation already) to implement a straight forward data structure that > >> avoids extra indirection penalty. There should be some places where > >> knowing that things are field accesses rather then array indexing > >> could be helpful, but I think GHC is good right now about handling > >> constant offsets. In my code I don't do any bounds checking as I know > >> I will only be accessing my arrays with constant indexes. I make > >> wrappers for each field access and leave all the unsafe stuff in > >> there. When things go wrong though, the compiler is no help. Maybe > >> template Haskell that generates the appropriate wrappers is the right > >> direction to go. > >> There is another benefit for me when working with these as arrays in > >> that it is quite simple and direct (given the hoops already jumped > >> through) to play with alignment. I can ensure two pointers are never > >> on the same cache-line by just spacing things out in the array. > >> > >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett wrote: > >> > They just segfault at this level. ;) > >> > > >> > Sent from my iPhone > >> > > >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton wrote: > >> > > >> > You presumably also save a bounds check on reads by hard-coding the > >> > sizes? > >> > > >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett > wrote: > >> >> > >> >> Also there are 4 different "things" here, basically depending on two > >> >> independent questions: > >> >> > >> >> a.) if you want to shove the sizes into the info table, and > >> >> b.) if you want cardmarking. > >> >> > >> >> Versions with/without cardmarking for different sizes can be done > >> >> pretty > >> >> easily, but as noted, the infotable variants are pretty invasive. 
> >> >> > >> >> -Edward > >> >> > >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett > wrote: > >> >>> > >> >>> Well, on the plus side you'd save 16 bytes per object, which adds up > >> >>> if > >> >>> they were small enough and there are enough of them. You get a bit > >> >>> better > >> >>> locality of reference in terms of what fits in the first cache line > of > >> >>> them. > >> >>> > >> >>> -Edward > >> >>> > >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton > >> >>> wrote: > >> >>>> > >> >>>> Yes. And for the short term I can imagine places we will settle > with > >> >>>> arrays even if it means tracking lengths unnecessarily and > >> >>>> unsafeCoercing > >> >>>> pointers whose types don't actually match their siblings. > >> >>>> > >> >>>> Is there anything to recommend the hacks mentioned for fixed sized > >> >>>> array > >> >>>> objects *other* than using them to fake structs? (Much to > >> >>>> derecommend, as > >> >>>> you mentioned!) > >> >>>> > >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett > >> >>>> wrote: > >> >>>>> > >> >>>>> I think both are useful, but the one you suggest requires a lot > more > >> >>>>> plumbing and doesn't subsume all of the usecases of the other. > >> >>>>> > >> >>>>> -Edward > >> >>>>> > >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton > >> >>>>> wrote: > >> >>>>>> > >> >>>>>> So that primitive is an array like thing (Same pointed type, > >> >>>>>> unbounded > >> >>>>>> length) with extra payload. > >> >>>>>> > >> >>>>>> I can see how we can do without structs if we have arrays, > >> >>>>>> especially > >> >>>>>> with the extra payload at front. But wouldn't the general > solution > >> >>>>>> for > >> >>>>>> structs be one that that allows new user data type defs for # > >> >>>>>> types? 
> >> >>>>>> > >> >>>>>> > >> >>>>>> > >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett > >> >>>>>> wrote: > >> >>>>>>> > >> >>>>>>> Some form of MutableStruct# with a known number of words and a > >> >>>>>>> known > >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting > >> >>>>>>> above, but > >> >>>>>>> where the word counts were stored in the objects themselves. > >> >>>>>>> > >> >>>>>>> Given that it'd have a couple of words for those counts it'd > >> >>>>>>> likely > >> >>>>>>> want to be something we build in addition to MutVar# rather > than a > >> >>>>>>> replacement. > >> >>>>>>> > >> >>>>>>> On the other hand, if we had to fix those numbers and build info > >> >>>>>>> tables that knew them, and typechecker support, for instance, > it'd > >> >>>>>>> get > >> >>>>>>> rather invasive. > >> >>>>>>> > >> >>>>>>> Also, a number of things that we can do with the 'sized' > versions > >> >>>>>>> above, like working with evil unsized c-style arrays directly > >> >>>>>>> inline at the > >> >>>>>>> end of the structure cease to be possible, so it isn't even a > pure > >> >>>>>>> win if we > >> >>>>>>> did the engineering effort. > >> >>>>>>> > >> >>>>>>> I think 90% of the needs I have are covered just by adding the > one > >> >>>>>>> primitive. The last 10% gets pretty invasive. > >> >>>>>>> > >> >>>>>>> -Edward > >> >>>>>>> > >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton < > rrnewton at gmail.com> > >> >>>>>>> wrote: > >> >>>>>>>> > >> >>>>>>>> I like the possibility of a general solution for mutable > structs > >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's > hard. > >> >>>>>>>> > >> >>>>>>>> So, we can't unpack MutVar into constructors because of object > >> >>>>>>>> identity problems. But what about directly supporting an > >> >>>>>>>> extensible set of > >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) > >> >>>>>>>> MutVar#? 
That > >> >>>>>>>> may be too much work, but is it problematic otherwise? > >> >>>>>>>> > >> >>>>>>>> Needless to say, this is also critical if we ever want best in > >> >>>>>>>> class > >> >>>>>>>> lockfree mutable structures, just like their Stm and sequential > >> >>>>>>>> counterparts. > >> >>>>>>>> > >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones > >> >>>>>>>> wrote: > >> >>>>>>>>> > >> >>>>>>>>> At the very least I'll take this email and turn it into a > short > >> >>>>>>>>> article. > >> >>>>>>>>> > >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and > >> >>>>>>>>> maybe > >> >>>>>>>>> make a ticket for it. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Thanks > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Simon > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] > >> >>>>>>>>> Sent: 27 August 2015 16:54 > >> >>>>>>>>> To: Simon Peyton Jones > >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs > >> >>>>>>>>> Subject: Re: ArrayArrays > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It > >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or > ByteArray#'s. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> While those live in #, they are garbage collected objects, so > >> >>>>>>>>> this > >> >>>>>>>>> all lives on the heap. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> They were added to make some of the DPH stuff fast when it has > >> >>>>>>>>> to > >> >>>>>>>>> deal with nested arrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm currently abusing them as a placeholder for a better > thing. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The Problem > >> >>>>>>>>> > >> >>>>>>>>> ----------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Consider the scenario where you write a classic doubly-linked > >> >>>>>>>>> list > >> >>>>>>>>> in Haskell. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers > >> >>>>>>>>> on > >> >>>>>>>>> the heap. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> > >> >>>>>>>>> Maybe > >> >>>>>>>>> DLL ~> DLL > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> That is 3 levels of indirection. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We can trim one by simply unpacking the IORef with > >> >>>>>>>>> -funbox-strict-fields or UNPACK > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and > >> >>>>>>>>> worsening our representation. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> but now we're still stuck with a level of indirection > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> This means that every operation we perform on this structure > >> >>>>>>>>> will > >> >>>>>>>>> be about half of the speed of an implementation in most other > >> >>>>>>>>> languages > >> >>>>>>>>> assuming we're memory bound on loading things into cache! 
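For concreteness, the boxed doubly-linked list described above can be written out as a runnable sketch. The Int payload and the helper names (mkNode, insertAfter, successor) are illustrative additions, not from the thread:

```haskell
import Data.IORef

-- Naive boxed representation: each neighbor link is an IORef wrapping
-- a Maybe, so following one link chases
-- DLL ~> IORef box ~> MutVar# cell ~> Maybe cell ~> DLL.
data DLL = DLL
  { payload :: Int
  , prevRef :: IORef (Maybe DLL)
  , nextRef :: IORef (Maybe DLL)
  }

mkNode :: Int -> IO DLL
mkNode x = DLL x <$> newIORef Nothing <*> newIORef Nothing

-- Link b directly after a.
insertAfter :: DLL -> DLL -> IO ()
insertAfter a b = do
  writeIORef (nextRef a) (Just b)
  writeIORef (prevRef b) (Just a)

successor :: DLL -> IO (Maybe DLL)
successor = readIORef . nextRef
```

Unpacking the IORefs and adding the Nil constructor, as described above, each shave one hop off this chain.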
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Making Progress > >> >>>>>>>>> > >> >>>>>>>>> ---------------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I have been working on a number of data structures where the > >> >>>>>>>>> indirection of going from something in * out to an object in # > >> >>>>>>>>> which > >> >>>>>>>>> contains the real pointer to my target and coming back > >> >>>>>>>>> effectively doubles > >> >>>>>>>>> my runtime. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the > >> >>>>>>>>> MutVar# > >> >>>>>>>>> onto the mutable list when we dirty it. There is a well > defined > >> >>>>>>>>> write-barrier. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I could change out the representation to use > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I can just store two pointers in the MutableArray# every time, > >> >>>>>>>>> but > >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the amount > of > >> >>>>>>>>> distinct > >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per > >> >>>>>>>>> object to 2. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I still have to go out to the heap from my DLL and get to the > >> >>>>>>>>> array > >> >>>>>>>>> object and then chase it to the next DLL and chase that to the > >> >>>>>>>>> next array. I > >> >>>>>>>>> do get my two pointers together in memory though. 
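Spelled out with raw primops from GHC.Exts, the two-pointer MutableArray# representation above might look like the sketch below. The slot convention (0# for the previous node, 1# for the next) and the helper names are mine, not from the thread:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO(..))

-- Each node is a single MutableArray# holding both neighbor pointers
-- side by side, so a walk touches 2 heap objects per node, not 3.
data DLL = DLL (MutableArray# RealWorld DLL) | Nil

newNode :: IO DLL
newNode = IO $ \s -> case newArray# 2# Nil s of
  (# s', arr #) -> (# s', DLL arr #)

setNext :: DLL -> DLL -> IO ()
setNext (DLL arr) n = IO $ \s -> (# writeArray# arr 1# n s, () #)
setNext Nil _ = pure ()

nextNode :: DLL -> IO DLL
nextNode (DLL arr) = IO $ \s -> readArray# arr 1# s
nextNode Nil = pure Nil

-- Pointer identity on the backing arrays stands in for node equality.
sameNode :: DLL -> DLL -> Bool
sameNode (DLL a) (DLL b) = isTrue# (sameMutableArray# a b)
sameNode Nil Nil = True
sameNode _ _ = False
```

When these operations are chained strictly, GHC can remove the intermediate DLL wrappers, which is what makes this representation pay off.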
I'm paying > for > >> >>>>>>>>> a card > >> >>>>>>>>> marking table as well, which I don't particularly need with > just > >> >>>>>>>>> two > >> >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" > >> >>>>>>>>> machinery added > >> >>>>>>>>> back in 7.10, which is just the old array code as a new data > >> >>>>>>>>> type, which can > >> >>>>>>>>> speed things up a bit when you don't have very big arrays: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> But what if I wanted my object itself to live in # and have > two > >> >>>>>>>>> mutable fields and be able to share the same write barrier? > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types. > >> >>>>>>>>> What > >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with the > >> >>>>>>>>> impedance > >> >>>>>>>>> mismatch between the imperative world and Haskell, and then > just > >> >>>>>>>>> let the > >> >>>>>>>>> ArrayArray#'s hold other arrayarrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a > >> >>>>>>>>> special > >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even > >> >>>>>>>>> abuse pattern > >> >>>>>>>>> synonyms. Alternately I can exploit the internals further to > >> >>>>>>>>> make this > >> >>>>>>>>> cheaper. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Then I can use the readMutableArrayArray# and > >> >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding > >> >>>>>>>>> and next > >> >>>>>>>>> entry in the linked list. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' > into a > >> >>>>>>>>> strict world, and everything there lives in #. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> next :: DLL -> IO DLL > >> >>>>>>>>> > >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# m 1# s of > >> >>>>>>>>> > >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code > to > >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty > >> >>>>>>>>> easily when they > >> >>>>>>>>> are known strict and you chain operations of this sort! > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Cleaning it Up > >> >>>>>>>>> > >> >>>>>>>>> ------------------ > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Now I have one outermost indirection pointing to an array that > >> >>>>>>>>> points directly to other arrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm stuck paying for a card marking table per object, but I > can > >> >>>>>>>>> fix > >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and using > a > >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a > >> >>>>>>>>> mixture of > >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data > structure. > >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the > >> >>>>>>>>> existing > >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one of the > >> >>>>>>>>> arguments it > >> >>>>>>>>> takes. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> This is almost ideal, but not quite. I often have fields that > >> >>>>>>>>> would > >> >>>>>>>>> be best left unboxed. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> was able to unpack the Int, but we lost that. We can currently > >> >>>>>>>>> at > >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at a > >> >>>>>>>>> boxed or at a > >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the int > in > >> >>>>>>>>> question in > >> >>>>>>>>> there. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to > >> >>>>>>>>> store masks and administrivia as I walk down the tree. Having > to > >> >>>>>>>>> go off to > >> >>>>>>>>> the side costs me the entire win from avoiding the first > pointer > >> >>>>>>>>> chase. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could > >> >>>>>>>>> construct that had n words with unsafe access and m pointers > to > >> >>>>>>>>> other heap > >> >>>>>>>>> objects, one that could put itself on the mutable list when > any > >> >>>>>>>>> of those > >> >>>>>>>>> pointers changed then I could shed this last factor of two in > >> >>>>>>>>> all > >> >>>>>>>>> circumstances. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Prototype > >> >>>>>>>>> > >> >>>>>>>>> ------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Over the last few days I've put together a small prototype > >> >>>>>>>>> implementation with a few non-trivial imperative data > structures > >> >>>>>>>>> for things > >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and > >> >>>>>>>>> order-maintenance. 
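As background for the mask remark above: a HAMT node keeps a bitmap beside a compressed child array, and every lookup step converts a hash fragment into an array index by counting the set bits below it in the bitmap. That is why parking the mask in a side object costs a pointer chase on every step down the tree. A minimal sketch of that indexing step (childIndex is an illustrative name, not from the thread):

```haskell
import Data.Bits (bit, popCount, testBit, (.&.))
import Data.Word (Word16)

-- Compressed-node indexing: the bitmap records which of the 16
-- possible children exist; present children are packed densely,
-- so a child's array index is the popCount of the bits below it.
childIndex :: Word16 -> Int -> Maybe Int
childIndex bitmap frag
  | testBit bitmap frag = Just (popCount (bitmap .&. (bit frag - 1)))
  | otherwise           = Nothing
```

In the struct-style layout under discussion, the bitmap would ideally live as an unboxed word in the same heap object as the child pointers.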
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> https://github.com/ekmett/structs > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Notable bits: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of > >> >>>>>>>>> link-cut > >> >>>>>>>>> trees in this style. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that > >> >>>>>>>>> make > >> >>>>>>>>> it go fast. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost > >> >>>>>>>>> all > >> >>>>>>>>> the references to the LinkCut or Object data constructor get > >> >>>>>>>>> optimized away, > >> >>>>>>>>> and we're left with beautiful strict code directly mutating > our > >> >>>>>>>>> underlying > >> >>>>>>>>> representation. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> At the very least I'll take this email and turn it into a > short > >> >>>>>>>>> article. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> -Edward > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones > >> >>>>>>>>> wrote: > >> >>>>>>>>> > >> >>>>>>>>> Just to say that I have no idea what is going on in this > thread. > >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is there a > >> >>>>>>>>> ticket? Is > >> >>>>>>>>> there a wiki page? > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be a > >> >>>>>>>>> good > >> >>>>>>>>> thing. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Simon > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On > Behalf > >> >>>>>>>>> Of > >> >>>>>>>>> Edward Kmett > >> >>>>>>>>> Sent: 21 August 2015 05:25 > >> >>>>>>>>> To: Manuel M T Chakravarty > >> >>>>>>>>> Cc: Simon Marlow; ghc-devs > >> >>>>>>>>> Subject: Re: ArrayArrays > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would > be > >> >>>>>>>>> very handy as well. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Consider right now if I have something like an > order-maintenance > >> >>>>>>>>> structure I have: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# > >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s > >> >>>>>>>>> (Upper s)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# > >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s > >> >>>>>>>>> (Lower s)) {-# > >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The former contains, logically, a mutable integer and two > >> >>>>>>>>> pointers, > >> >>>>>>>>> one for forward and one for backwards. The latter is basically > >> >>>>>>>>> the same > >> >>>>>>>>> thing with a mutable reference up pointing at the structure > >> >>>>>>>>> above. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On the heap this is an object that points to a structure for > the > >> >>>>>>>>> bytearray, and points to another structure for each mutvar > which > >> >>>>>>>>> each point > >> >>>>>>>>> to the other 'Upper' structure. So there is a level of > >> >>>>>>>>> indirection smeared > >> >>>>>>>>> over everything. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> So this is a pair of doubly linked lists with an upward link > >> >>>>>>>>> from > >> >>>>>>>>> the structure below to the structure above. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Converted into ArrayArray#s I'd get > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and > >> >>>>>>>>> the > >> >>>>>>>>> next 2 slots pointing to the previous and next previous > objects, > >> >>>>>>>>> represented > >> >>>>>>>>> just as their MutableArrayArray#s. I can use > >> >>>>>>>>> sameMutableArrayArray# on these > >> >>>>>>>>> for object identity, which lets me check for the ends of the > >> >>>>>>>>> lists by tying > >> >>>>>>>>> things back on themselves. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> and below that > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing up > to > >> >>>>>>>>> an > >> >>>>>>>>> upper structure. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I can then write a handful of combinators for getting out the > >> >>>>>>>>> slots > >> >>>>>>>>> in question, while it has gained a level of indirection > between > >> >>>>>>>>> the wrapper > >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one can > >> >>>>>>>>> be basically > >> >>>>>>>>> erased by ghc. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Unlike before I don't have several separate objects on the > heap > >> >>>>>>>>> for > >> >>>>>>>>> each thing. I only have 2 now. 
The MutableArrayArray# for the > >> >>>>>>>>> object itself, > >> >>>>>>>>> and the MutableByteArray# that it references to carry around > the > >> >>>>>>>>> mutable > >> >>>>>>>>> int. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The only pain points are > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> 1.) the aforementioned limitation that currently prevents me > >> >>>>>>>>> from > >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array into > an > >> >>>>>>>>> ArrayArray > >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest of > >> >>>>>>>>> Haskell, > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> and > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid > the > >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 > pointers > >> >>>>>>>>> wide. Card > >> >>>>>>>>> marking doesn't help. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Alternately I could just try to do really evil things and > >> >>>>>>>>> convert > >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to > >> >>>>>>>>> unsafeCoerce my way > >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays > >> >>>>>>>>> directly into the > >> >>>>>>>>> SmallArray as slots, removing the limitation we see here by > >> >>>>>>>>> aping the > >> >>>>>>>>> MutableArrayArray# s API, but that gets really really > dangerous! > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the > >> >>>>>>>>> altar > >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move them > >> >>>>>>>>> and collect > >> >>>>>>>>> them which rules out simpler Ptr and Addr based solutions. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> -Edward > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty > >> >>>>>>>>> wrote: > >> >>>>>>>>> > >> >>>>>>>>> That?s an interesting idea. > >> >>>>>>>>> > >> >>>>>>>>> Manuel > >> >>>>>>>>> > >> >>>>>>>>> > Edward Kmett : > >> >>>>>>>>> > >> >>>>>>>>> > > >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# and > >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the > >> >>>>>>>>> > ArrayArray# entries > >> >>>>>>>>> > are all directly unlifted avoiding a level of indirection > for > >> >>>>>>>>> > the containing > >> >>>>>>>>> > structure is amazing, but I can only currently use it if my > >> >>>>>>>>> > leaf level data > >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. It'd > be > >> >>>>>>>>> > nice to be > >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down at > >> >>>>>>>>> > the leaves to > >> >>>>>>>>> > hold lifted contents. > >> >>>>>>>>> > > >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to > >> >>>>>>>>> > access > >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do > that > >> >>>>>>>>> > if i tried to > >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a > >> >>>>>>>>> > ByteArray# > >> >>>>>>>>> > anyways, so it isn't like there is a safety story preventing > >> >>>>>>>>> > this. > >> >>>>>>>>> > > >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection > >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I > >> >>>>>>>>> > could shoehorn a > >> >>>>>>>>> > number of them into ArrayArrays if this worked. 
> >> >>>>>>>>> > > >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary > >> >>>>>>>>> > indirection compared to c/java and this could reduce that > pain > >> >>>>>>>>> > to just 1 > >> >>>>>>>>> > level of unnecessary indirection. > >> >>>>>>>>> > > >> >>>>>>>>> > -Edward > >> >>>>>>>>> > >> >>>>>>>>> > _______________________________________________ > >> >>>>>>>>> > ghc-devs mailing list > >> >>>>>>>>> > ghc-devs at haskell.org > >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> _______________________________________________ > >> >>>>>>>>> ghc-devs mailing list > >> >>>>>>>>> ghc-devs at haskell.org > >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> >>>>>>> > >> >>>>>>> > >> >>>>> > >> >>> > >> >> > >> > > >> > > >> > _______________________________________________ > >> > ghc-devs mailing list > >> > ghc-devs at haskell.org > >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> > > > > > > > > > > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kolmodin at gmail.com Tue Sep 8 06:11:40 2015 From: kolmodin at gmail.com (Lennart Kolmodin) Date: Tue, 8 Sep 2015 07:11:40 +0100 Subject: AnonymousSums data con syntax In-Reply-To: <1441657274.28403.7.camel@joachim-breitner.de> References: <9eb2c9041f6142ce947a4b323c0b2bff@DB4PR30MB030.064d.mgd.msft.net> <1441657274.28403.7.camel@joachim-breitner.de> Message-ID: 2015-09-07 21:21 GMT+01:00 Joachim Breitner : > Hi, > > Am Montag, den 07.09.2015, 19:25 +0000 schrieb Simon Peyton Jones: > > > Are we okay with stealing some operator sections for this? E.G. (x > > > > > ). 
I think the boxed sums larger than 2 choices are all > technically overlapping with sections. > > > > I hadn't thought of that. I suppose that in distfix notation we > > could require spaces > > (x | |) > > since vertical bar by itself isn't an operator. But then (_||) x > > might feel more compact. > > > > Also a section (x ||) isn't valid in a pattern, so we would not need > > to require spaces there. > > > > But my gut feel is: yes, with AnonymousSums we should just steal the > > syntax. It won't hurt existing code (since it won't use > > AnonymousSums), and if you *are* using AnonymousSums then the distfix > > notation is probably more valuable than the sections for an operator > > you probably aren't using. > > I wonder if this syntax for constructors is really that great. Yes, > there is similarity with the type constructor (which is nice), but for > the data constructor, do we really want a unary encoding and have our > users count bars? > > I believe the user (and also us, having to read core) would be better > served by some syntax that involves plain numbers. > I reacted the same way to the proposed syntax. Imagine already having an anonymous sum type and then deciding to add another constructor. Naturally you'd have to update your code to handle the new constructor, but you also need to update the code for all other constructors as well by adding another bar in the right place. That seems unnecessary and there's no need to do that for named sum types. What about explicitly stating the index as a number? (1 | Int) :: ( String | Int | Bool ) (#1 | Int #) :: (# String | Int | Bool #) case sum of (0 | myString ) -> ... (1 | myInt ) -> ... (2 | myBool ) -> ... This allows you to at least add new constructors at the end without changing existing code. Is it harder to resolve by type inference since we're not stating the number of constructors? If so we could do something similar to Joachim's proposal: case sum of (0 of 3 | myString ) -> ... 
(1 of 3 | myInt ) -> ... (2 of 3 | myBool ) -> ... .. and at least you don't have to count bars. > Given that of is already a keyword, how about something involving "3 > of 4"? For example > > (Put# True in 3 of 5) :: (# a | b | Bool | d | e #) > > and > > case sum of > (Put# x in 1 of 3) -> ... > (Put# x in 2 of 3) -> ... > (Put# x in 3 of 3) -> ... > > (If "as" were a keyword, (Put# x as 2 of 3) would sound even better.) > > > I don?t find this particular choice very great, but something with > numbers rather than ASCII art seems to make more sense here. Is there > something even better? > > Greetings, > Joachim > > > > > -- > Joachim ?nomeata? Breitner > mail at joachim-breitner.de ? http://www.joachim-breitner.de/ > Jabber: nomeata at joachim-breitner.de ? GPG-Key: 0xF0FBF51F > Debian Developer: nomeata at debian.org > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marlowsd at gmail.com Tue Sep 8 07:40:29 2015 From: marlowsd at gmail.com (Simon Marlow) Date: Tue, 8 Sep 2015 08:40:29 +0100 Subject: ArrayArrays In-Reply-To: References: <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <55EE90ED.1040609@gmail.com> This would be very cool, however it's questionable whether it's worth it. Without any unlifted kind, we need - ArrayArray# - a set of new/read/write primops for every element type, either built-in or made from unsafeCoerce# With the unlifted kind, we would need - ArrayArray# - one set of new/read/write primops With levity polymorphism, we would need - none of this, Array# can be used So having an unlifted kind already kills a lot of the duplication, polymorphism only kills a bit more. 
Cheers Simon On 08/09/2015 00:14, Edward Kmett wrote: > Assume we had the ability to talk about Levity in a new way and instead > of just: > > data Levity = Lifted | Unlifted > > type * = TYPE 'Lifted > type # = TYPE 'Unlifted > > we replace had a more nuanced notion of TYPE parameterized on another > data type: > > data Levity = Lifted | Unlifted > data Param = Composite | Simple Levity > > and we parameterized TYPE with a Param rather than Levity. > > Existing strange representations can continue to live in TYPE 'Composite > > (# Int# , Double #) :: TYPE 'Composite > > and we don't support parametricity in there, just like, currently we > don't allow parametricity in #. > > We can include the undefined example from Richard's talk: > > undefined :: forall (v :: Param). v > > and ultimately lift it into his pi type when it is available just as before. > > But we could let consider TYPE ('Simple 'Unlifted) as a form of > 'parametric #' covering unlifted things we're willing to allow > polymorphism over because they are just pointers to something in the > heap, that just happens to not be able to be _|_ or a thunk. > > In this setting, recalling that above, I modified Richard's TYPE to take > a Param instead of Levity, we can define a type alias for things that > live as a simple pointer to a heap allocated object: > > type GC (l :: Levity) = TYPE ('Simple l) > type * = GC 'Lifted > > and then we can look at existing primitives generalized: > > Array# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted > MutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted > SmallArray# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted > SmallMutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC > 'Unlifted > MutVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted > MVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted > > Weak#, StablePtr#, StableName#, etc. all can take similar modifications. 
> > Recall that an ArrayArray# was just an Array# hacked up to be able to > hold onto the subset of # that is collectable. > > Almost all of the operations on these data types can work on the more > general kind of argument. > > newArray# :: forall (s :: *) (l :: Levity) (a :: GC l). Int# -> a -> > State# s -> (# State# s, MutableArray# s a #) > > writeArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# > s a -> Int# -> a -> State# s -> State# s > > readArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# s > a -> Int# -> State# s -> (# State# s, a #) > > etc. > > Only a couple of our existing primitives _can't_ generalize this way. > The one that leaps to mind is atomicModifyMutVar, which would need to > stay constrained to only work on arguments in *, because of the way it > operates. > > With that we can still talk about > > MutableArray# s Int > > but now we can also talk about: > > MutableArray# s (MutableArray# s Int) > > without the layer of indirection through a box in * and without an > explosion of primops. The same newFoo, readFoo, writeFoo machinery works > for both kinds. > > The struct machinery doesn't get to take advantage of this, but it would > let us clean house elsewhere in Prim and drastically improve the range > of applicability of the existing primitives with nothing more than a > small change to the levity machinery. > > I'm not attached to any of the names above, I coined them just to give > us a concrete thing to talk about. > > Here I'm only proposing we extend machinery in GHC.Prim this way, but an > interesting 'now that the barn door is open' question is to consider > that our existing Haskell data types often admit a similar form of > parametricity and nothing in principle prevents this from working for > Maybe or [] and once you permit inference to fire across all of GC l > then it seems to me that you'd start to get those same capabilities > there as well when LevityPolymorphism was turned on. 
> > -Edward > > On Mon, Sep 7, 2015 at 5:56 PM, Simon Peyton Jones > > wrote: > > This could make the menagerie of ways to pack > {Small}{Mutable}Array{Array}# references into a > {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing > the need for folks to descend into the use of the more evil > structure primitives we're talking about, and letting us keep a few > more principles around us.____ > > __ __ > > I?m lost. Can you give some concrete examples that illustrate how > levity polymorphism will help us?____ > > > Simon____ > > __ __ > > *From:*Edward Kmett [mailto:ekmett at gmail.com ] > *Sent:* 07 September 2015 21:17 > *To:* Simon Peyton Jones > *Cc:* Ryan Newton; Johan Tibell; Simon Marlow; Manuel M T > Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates > *Subject:* Re: ArrayArrays____ > > __ __ > > I had a brief discussion with Richard during the Haskell Symposium > about how we might be able to let parametricity help a bit in > reducing the space of necessarily primops to a slightly more > manageable level. ____ > > __ __ > > Notably, it'd be interesting to explore the ability to allow > parametricity over the portion of # that is just a gcptr.____ > > __ __ > > We could do this if the levity polymorphism machinery was tweaked a > bit. 
You could envision the ability to abstract over things in both > * and the subset of # that are represented by a gcptr, then > modifying the existing array primitives to be parametric in that > choice of levity for their argument so long as it was of a "heap > object" levity.____ > > __ __ > > This could make the menagerie of ways to pack > {Small}{Mutable}Array{Array}# references into a > {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing > the need for folks to descend into the use of the more evil > structure primitives we're talking about, and letting us keep a few > more principles around us.____ > > __ __ > > Then in the cases like `atomicModifyMutVar#` where it needs to > actually be in * rather than just a gcptr, due to the constructed > field selectors it introduces on the heap then we could keep the > existing less polymorphic type.____ > > __ __ > > -Edward____ > > __ __ > > On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones > > wrote:____ > > It was fun to meet and discuss this.____ > > ____ > > Did someone volunteer to write a wiki page that describes the > proposed design? And, I earnestly hope, also describes the > menagerie of currently available array types and primops so that > users can have some chance of picking the right one?!____ > > ____ > > Thanks____ > > ____ > > Simon____ > > ____ > > *From:*ghc-devs [mailto:ghc-devs-bounces at haskell.org > ] *On Behalf Of *Ryan Newton > *Sent:* 31 August 2015 23:11 > *To:* Edward Kmett; Johan Tibell > *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; > ghc-devs; Ryan Scott; Ryan Yates > *Subject:* Re: ArrayArrays____ > > ____ > > Dear Edward, Ryan Yates, and other interested parties -- ____ > > ____ > > So when should we meet up about this?____ > > ____ > > May I propose the Tues afternoon break for everyone at ICFP who > is interested in this topic? 
We can meet out in the coffee area > and congregate around Edward Kmett, who is tall and should be > easy to find ;-).____ > > ____ > > I think Ryan is going to show us how to use his new primops for > combined array + other fields in one heap object?____ > > ____ > > On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett > wrote:____ > > Without a custom primitive it doesn't help much there, you > have to store the indirection to the mask.____ > > ____ > > With a custom primitive it should cut the on heap > root-to-leaf path of everything in the HAMT in half. A > shorter HashMap was actually one of the motivating factors > for me doing this. It is rather astoundingly difficult to > beat the performance of HashMap, so I had to start cheating > pretty badly. ;)____ > > ____ > > -Edward____ > > ____ > > On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell > > > wrote:____ > > I'd also be interested to chat at ICFP to see if I can > use this for my HAMT implementation.____ > > ____ > > On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett > > wrote:____ > > Sounds good to me. Right now I'm just hacking up > composable accessors for "typed slots" in a fairly > lens-like fashion, and treating the set of slots I > define and the 'new' function I build for the data > type as its API, and build atop that. This could > eventually graduate to template-haskell, but I'm not > entirely satisfied with the solution I have. I > currently distinguish between what I'm calling > "slots" (things that point directly to another > SmallMutableArrayArray# sans wrapper) and "fields" > which point directly to the usual Haskell data types > because unifying the two notions meant that I > couldn't lift some coercions out "far enough" to > make them vanish.____ > > ____ > > I'll be happy to run through my current working set > of issues in person and -- as things get nailed down > further -- in a longer lived medium than in personal > conversations. 
;)____ > > ____ > > -Edward____ > > ____ > > On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton > > > wrote:____ > > I'd also love to meet up at ICFP and discuss > this. I think the array primops plus a TH layer > that lets us (ab)use them many times without too > much marginal cost sounds great. And I'd like > to learn how we could be either early users of, > or help with, this infrastructure.____ > > ____ > > CC'ing in Ryan Scott and Omer Agacan who may also > be interested in dropping in on such discussions > @ICFP, and Chao-Hong Chen, a Ph.D. student who > is currently working on concurrent data > structures in Haskell, but will not be at ICFP.____ > > ____ > > ____ > > On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates > > wrote:____ > > I completely agree. I would love to spend > some time during ICFP and > friends talking about what it could look > like. My small array for STM > changes for the RTS can be seen here [1]. > It is on a branch somewhere > between 7.8 and 7.10 and includes irrelevant > STM bits and some > confusing naming choices (sorry), but should > cover all the details > needed to implement it for a non-STM > context. The biggest surprise > for me was following small array too closely > and having a word/byte > offset mismatch [2]. > > [1]: > https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut > [2]: > https://ghc.haskell.org/trac/ghc/ticket/10413 > > Ryan____ > > > On Fri, Aug 28, 2015 at 10:09 PM, Edward > Kmett > wrote: > > I'd love to have that last 10%, but it's a > lot of work to get there and more > > importantly I don't know quite what it > should look like. > > > > On the other hand, I do have a pretty > good idea of how the primitives above > > could be banged out and tested in a long > evening, well in time for 7.12. And > > as noted earlier, those remain useful > even if a nicer typed version with an > > extra level of indirection to the sizes > is built up after.
> > > > The rest sounds like a good graduate > student project for someone who has > > graduate students lying around. Maybe > somebody at Indiana University who has > > an interest in type theory and > parallelism can find us one. =) > > > > -Edward > > > > On Fri, Aug 28, 2015 at 8:48 PM, Ryan > Yates > wrote: > >> > >> I think from my perspective, the > motivation for getting the type > >> checker involved is primarily bringing > this to the level where users > >> could be expected to build these > structures. It is reasonable to > >> think that there are people who want to > use STM (a context with > >> mutation already) to implement a > straightforward data structure that > >> avoids the extra indirection penalty. There > should be some places where > >> knowing that things are field accesses > rather than array indexing > >> could be helpful, but I think GHC is > good right now about handling > >> constant offsets. In my code I don't do > any bounds checking as I know > >> I will only be accessing my arrays with > constant indexes. I make > >> wrappers for each field access and leave > all the unsafe stuff in > >> there. When things go wrong though, the > compiler is no help. Maybe > >> Template Haskell that generates the > appropriate wrappers is the right > >> direction to go. > >> There is another benefit for me when > working with these as arrays in > >> that it is quite simple and direct > (given the hoops already jumped > >> through) to play with alignment. I can > ensure two pointers are never > >> on the same cache line by just spacing > things out in the array. > >> > >> On Fri, Aug 28, 2015 at 7:33 PM, Edward > Kmett > wrote: > >> > They just segfault at this level. ;) > >> > > >> > Sent from my iPhone > >> > > >> > On Aug 28, 2015, at 7:25 PM, Ryan > Newton > wrote: > >> > > >> > You presumably also save a bounds > check on reads by hard-coding the > >> > sizes?
> >> > > >> > On Fri, Aug 28, 2015 at 3:39 PM, > Edward Kmett > wrote: > >> >> > >> >> Also there are 4 different "things" > here, basically depending on two > >> >> independent questions: > >> >> > >> >> a.) if you want to shove the sizes > into the info table, and > >> >> b.) if you want cardmarking. > >> >> > >> >> Versions with/without cardmarking for > different sizes can be done > >> >> pretty > >> >> easily, but as noted, the infotable > variants are pretty invasive. > >> >> > >> >> -Edward > >> >> > >> >> On Fri, Aug 28, 2015 at 6:36 PM, > Edward Kmett > wrote: > >> >>> > >> >>> Well, on the plus side you'd save 16 > bytes per object, which adds up > >> >>> if > >> >>> they were small enough and there are > enough of them. You get a bit > >> >>> better > >> >>> locality of reference in terms of > what fits in the first cache line of > >> >>> them. > >> >>> > >> >>> -Edward > >> >>> > >> >>> On Fri, Aug 28, 2015 at 6:14 PM, > Ryan Newton > > >> >>> wrote: > >> >>>> > >> >>>> Yes. And for the short term I can > imagine places we will settle with > >> >>>> arrays even if it means tracking > lengths unnecessarily and > >> >>>> unsafeCoercing > >> >>>> pointers whose types don't actually > match their siblings. > >> >>>> > >> >>>> Is there anything to recommend the > hacks mentioned for fixed sized > >> >>>> array > >> >>>> objects *other* than using them to > fake structs? (Much to > >> >>>> derecommend, as > >> >>>> you mentioned!) > >> >>>> > >> >>>> On Fri, Aug 28, 2015 at 3:07 PM > Edward Kmett > > >> >>>> wrote: > >> >>>>> > >> >>>>> I think both are useful, but the > one you suggest requires a lot more > >> >>>>> plumbing and doesn't subsume all > of the usecases of the other. > >> >>>>> > >> >>>>> -Edward > >> >>>>> > >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, > Ryan Newton > > >> >>>>> wrote: > >> >>>>>> > >> >>>>>> So that primitive is an array > like thing (Same pointed type, > >> >>>>>> unbounded > >> >>>>>> length) with extra payload. 
> >> >>>>>> > >> >>>>>> I can see how we can do without > structs if we have arrays, > >> >>>>>> especially > >> >>>>>> with the extra payload at front. > But wouldn't the general solution > >> >>>>>> for > >> >>>>>> structs be one that allows > new user data type defs for # > >> >>>>>> types? > >> >>>>>> > >> >>>>>> > >> >>>>>> > >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM > Edward Kmett > > >> >>>>>> wrote: > >> >>>>>>> > >> >>>>>>> Some form of MutableStruct# with > a known number of words and a > >> >>>>>>> known > >> >>>>>>> number of pointers is basically > what Ryan Yates was suggesting > >> >>>>>>> above, but > >> >>>>>>> where the word counts were > stored in the objects themselves. > >> >>>>>>> > >> >>>>>>> Given that it'd have a couple of > words for those counts it'd > >> >>>>>>> likely > >> >>>>>>> want to be something we build in > addition to MutVar# rather than a > >> >>>>>>> replacement. > >> >>>>>>> > >> >>>>>>> On the other hand, if we had to > fix those numbers and build info > >> >>>>>>> tables that knew them, and > typechecker support, for instance, it'd > >> >>>>>>> get > >> >>>>>>> rather invasive. > >> >>>>>>> > >> >>>>>>> Also, a number of things that we > can do with the 'sized' versions > >> >>>>>>> above, like working with evil > unsized C-style arrays directly > >> >>>>>>> inline at the > >> >>>>>>> end of the structure cease to be > possible, so it isn't even a pure > >> >>>>>>> win if we > >> >>>>>>> did the engineering effort. > >> >>>>>>> > >> >>>>>>> I think 90% of the needs I have > are covered just by adding the one > >> >>>>>>> primitive. The last 10% gets > pretty invasive. > >> >>>>>>> > >> >>>>>>> -Edward > >> >>>>>>> > >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, > Ryan Newton > > >> >>>>>>> wrote: > >> >>>>>>>> > >> >>>>>>>> I like the possibility of a > general solution for mutable structs > >> >>>>>>>> (like Ed said), and I'm trying > to fully understand why it's hard.
> >> >>>>>>>> > >> >>>>>>>> So, we can't unpack MutVar into > constructors because of object > >> >>>>>>>> identity problems. But what > about directly supporting an > >> >>>>>>>> extensible set of > >> >>>>>>>> unlifted MutStruct# objects, > generalizing (and even replacing) > >> >>>>>>>> MutVar#? That > >> >>>>>>>> may be too much work, but is it > problematic otherwise? > >> >>>>>>>> > >> >>>>>>>> Needless to say, this is also > critical if we ever want best in > >> >>>>>>>> class > >> >>>>>>>> lockfree mutable structures, > just like their Stm and sequential > >> >>>>>>>> counterparts. > >> >>>>>>>> > >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM > Simon Peyton Jones > >> >>>>>>>> > wrote: > >> >>>>>>>>> > >> >>>>>>>>> At the very least I'll take > this email and turn it into a short > >> >>>>>>>>> article. > >> >>>>>>>>> > >> >>>>>>>>> Yes, please do make it into a > wiki page on the GHC Trac, and > >> >>>>>>>>> maybe > >> >>>>>>>>> make a ticket for it. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Thanks > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Simon > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> From: Edward Kmett > [mailto:ekmett at gmail.com > ] > >> >>>>>>>>> Sent: 27 August 2015 16:54 > >> >>>>>>>>> To: Simon Peyton Jones > >> >>>>>>>>> Cc: Manuel M T Chakravarty; > Simon Marlow; ghc-devs > >> >>>>>>>>> Subject: Re: ArrayArrays > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> An ArrayArray# is just an > Array# with a modified invariant. It > >> >>>>>>>>> points directly to other > unlifted ArrayArray#'s or ByteArray#'s. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> While those live in #, they > are garbage collected objects, so > >> >>>>>>>>> this > >> >>>>>>>>> all lives on the heap. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> They were added to make some > of the DPH stuff fast when it has > >> >>>>>>>>> to > >> >>>>>>>>> deal with nested arrays. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm currently abusing them as > a placeholder for a better thing. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The Problem > >> >>>>>>>>> > >> >>>>>>>>> ----------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Consider the scenario where > you write a classic doubly-linked > >> >>>>>>>>> list > >> >>>>>>>>> in Haskell. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (IORef (Maybe > DLL)) (IORef (Maybe DLL)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Chasing from one DLL to the > next requires following 3 pointers > >> >>>>>>>>> on > >> >>>>>>>>> the heap. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> > MutVar# RealWorld (Maybe DLL) ~> > >> >>>>>>>>> Maybe > >> >>>>>>>>> DLL ~> DLL > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> That is 3 levels of indirection. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We can trim one by simply > unpacking the IORef with > >> >>>>>>>>> -funbox-strict-fields or UNPACK > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We can trim another by adding > a 'Nil' constructor for DLL and > >> >>>>>>>>> worsening our representation. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL !(IORef DLL) > !(IORef DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> but now we're still stuck with > a level of indirection > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL > ~> DLL > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> This means that every > operation we perform on this structure > >> >>>>>>>>> will > >> >>>>>>>>> be about half of the speed of > an implementation in most other > >> >>>>>>>>> languages > >> >>>>>>>>> assuming we're memory bound on > loading things into cache!
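[Editor's note: the two boxed representations Edward describes above can be exercised directly. The sketch below is illustrative only — the `lengthFrom` helper and the two-node example are not from the thread:]

```haskell
import Data.IORef

-- Naive version: three indirections per hop
-- (DLL ~> IORef ~> MutVar# ~> Maybe ~> DLL).
data DLL0 = DLL0 (IORef (Maybe DLL0)) (IORef (Maybe DLL0))

-- Trimmed version: strict fields plus an explicit Nil drop the Maybe box.
data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil

-- Walk forward from a node, counting nodes (helper invented for this sketch).
lengthFrom :: DLL -> IO Int
lengthFrom Nil         = pure 0
lengthFrom (DLL _ nxt) = do
  rest <- readIORef nxt
  n    <- lengthFrom rest
  pure (n + 1)

main :: IO ()
main = do
  aPrev <- newIORef Nil
  aNext <- newIORef Nil
  bPrev <- newIORef Nil
  bNext <- newIORef Nil
  let a = DLL aPrev aNext
      b = DLL bPrev bNext
  writeIORef aNext b   -- link a -> b
  writeIORef bPrev a   -- link b -> a
  lengthFrom a >>= print  -- prints 2
```

Every hop still goes through a MutVar# on the heap, which is exactly the overhead the ArrayArray# encoding that follows is meant to remove.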
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Making Progress > >> >>>>>>>>> > >> >>>>>>>>> ---------------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I have been working on a > number of data structures where the > >> >>>>>>>>> indirection of going from > something in * out to an object in # > >> >>>>>>>>> which > >> >>>>>>>>> contains the real pointer to > my target and coming back > >> >>>>>>>>> effectively doubles > >> >>>>>>>>> my runtime. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We go out to the MutVar# > because we are allowed to put the > >> >>>>>>>>> MutVar# > >> >>>>>>>>> onto the mutable list when we > dirty it. There is a well defined > >> >>>>>>>>> write-barrier. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I could change out the > representation to use > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (MutableArray# > RealWorld DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I can just store two pointers > in the MutableArray# every time, > >> >>>>>>>>> but > >> >>>>>>>>> this doesn't help _much_ > directly. It has reduced the amount of > >> >>>>>>>>> distinct > >> >>>>>>>>> addresses in memory I touch on > a walk of the DLL from 3 per > >> >>>>>>>>> object to 2. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I still have to go out to the > heap from my DLL and get to the > >> >>>>>>>>> array > >> >>>>>>>>> object and then chase it to > the next DLL and chase that to the > >> >>>>>>>>> next array. I > >> >>>>>>>>> do get my two pointers > together in memory though. 
I'm paying for > >> >>>>>>>>> a card > >> >>>>>>>>> marking table as well, which I > don't particularly need with just > >> >>>>>>>>> two > >> >>>>>>>>> pointers, but we can shed that > with the "SmallMutableArray#" > >> >>>>>>>>> machinery added > >> >>>>>>>>> back in 7.10, which is just > the old array code as a new data > >> >>>>>>>>> type, which can > >> >>>>>>>>> speed things up a bit when you > don't have very big arrays: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL > (SmallMutableArray# RealWorld DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> But what if I wanted my object > itself to live in # and have two > >> >>>>>>>>> mutable fields and be able to > share the same write barrier? > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> An ArrayArray# points directly > to other unlifted array types. > >> >>>>>>>>> What > >> >>>>>>>>> if we have one # -> * wrapper > on the outside to deal with the > >> >>>>>>>>> impedance > >> >>>>>>>>> mismatch between the > imperative world and Haskell, and then just > >> >>>>>>>>> let the > >> >>>>>>>>> ArrayArray#'s hold other > arrayarrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL > (MutableArrayArray# RealWorld) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> now I need to make up a new > Nil, which I can just make be a > >> >>>>>>>>> special > >> >>>>>>>>> MutableArrayArray# I allocate > on program startup. I can even > >> >>>>>>>>> abuse pattern > >> >>>>>>>>> synonyms. Alternately I can > exploit the internals further to > >> >>>>>>>>> make this > >> >>>>>>>>> cheaper. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Then I can use the > readMutableArrayArray# and > >> >>>>>>>>> writeMutableArrayArray# calls > to directly access the preceding > >> >>>>>>>>> and next > >> >>>>>>>>> entry in the linked list.
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> So now we have one DLL wrapper > which just 'bootstraps me' into a > >> >>>>>>>>> strict world, and everything > there lives in #. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> next :: DLL -> IO DLL > >> >>>>>>>>> > >> >>>>>>>>> next (DLL m) = IO $ \s -> case > readMutableArrayArray# m 1# s of > >> >>>>>>>>> > >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> It turns out GHC is quite > happy to optimize all of that code to > >> >>>>>>>>> keep things unboxed. The 'DLL' > wrappers get removed pretty > >> >>>>>>>>> easily when they > >> >>>>>>>>> are known strict and you chain > operations of this sort! > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Cleaning it Up > >> >>>>>>>>> > >> >>>>>>>>> ------------------ > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Now I have one outermost > indirection pointing to an array that > >> >>>>>>>>> points directly to other arrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm stuck paying for a card > marking table per object, but I can > >> >>>>>>>>> fix > >> >>>>>>>>> that by duplicating the code > for MutableArrayArray# and using a > >> >>>>>>>>> SmallMutableArray#. I can hack > up primops that let me store a > >> >>>>>>>>> mixture of > >> >>>>>>>>> SmallMutableArray# fields and > normal ones in the data structure. > >> >>>>>>>>> Operationally, I can even do > so by just unsafeCoercing the > >> >>>>>>>>> existing > >> >>>>>>>>> SmallMutableArray# primitives > to change the kind of one of the > >> >>>>>>>>> arguments it > >> >>>>>>>>> takes. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> This is almost ideal, but not > quite. I often have fields that > >> >>>>>>>>> would > >> >>>>>>>>> be best left unboxed.
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLLInt = DLL !Int !(IORef > DLL) !(IORef DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> was able to unpack the Int, > but we lost that. We can currently > >> >>>>>>>>> at > >> >>>>>>>>> best point one of the entries > of the SmallMutableArray# at a > >> >>>>>>>>> boxed or at a > >> >>>>>>>>> MutableByteArray# for all of > our misc. data and shove the int in > >> >>>>>>>>> question in > >> >>>>>>>>> there. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> e.g. if I were to implement a > hash-array-mapped-trie I need to > >> >>>>>>>>> store masks and administrivia > as I walk down the tree. Having to > >> >>>>>>>>> go off to > >> >>>>>>>>> the side costs me the entire > win from avoiding the first pointer > >> >>>>>>>>> chase. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> But, if like Ryan suggested, > we had a heap object we could > >> >>>>>>>>> construct that had n words > with unsafe access and m pointers to > >> >>>>>>>>> other heap > >> >>>>>>>>> objects, one that could put > itself on the mutable list when any > >> >>>>>>>>> of those > >> >>>>>>>>> pointers changed then I could > shed this last factor of two in > >> >>>>>>>>> all > >> >>>>>>>>> circumstances. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Prototype > >> >>>>>>>>> > >> >>>>>>>>> ------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Over the last few days I've > put together a small prototype > >> >>>>>>>>> implementation with a few > non-trivial imperative data structures > >> >>>>>>>>> for things > >> >>>>>>>>> like Tarjan's link-cut trees, > the list labeling problem and > >> >>>>>>>>> order-maintenance. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> https://github.com/ekmett/structs > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Notable bits: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Data.Struct.Internal.LinkCut > provides an implementation of > >> >>>>>>>>> link-cut > >> >>>>>>>>> trees in this style. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Data.Struct.Internal provides > the rather horrifying guts that > >> >>>>>>>>> make > >> >>>>>>>>> it go fast. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Once compiled with -O or -O2, > if you look at the core, almost > >> >>>>>>>>> all > >> >>>>>>>>> the references to the LinkCut > or Object data constructor get > >> >>>>>>>>> optimized away, > >> >>>>>>>>> and we're left with beautiful > strict code directly mutating our > >> >>>>>>>>> underlying > >> >>>>>>>>> representation. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> At the very least I'll take > this email and turn it into a short > >> >>>>>>>>> article. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> -Edward > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 > AM, Simon Peyton Jones > >> >>>>>>>>> > wrote: > >> >>>>>>>>> > >> >>>>>>>>> Just to say that I have no > idea what is going on in this thread. > >> >>>>>>>>> What is ArrayArray? What is > the issue in general? Is there a > >> >>>>>>>>> ticket? Is > >> >>>>>>>>> there a wiki page? > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> If it's important, an > ab-initio wiki page + ticket would be a > >> >>>>>>>>> good > >> >>>>>>>>> thing.
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Simon > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> From: ghc-devs > [mailto:ghc-devs-bounces at haskell.org > ] On Behalf > >> >>>>>>>>> Of > >> >>>>>>>>> Edward Kmett > >> >>>>>>>>> Sent: 21 August 2015 05:25 > >> >>>>>>>>> To: Manuel M T Chakravarty > >> >>>>>>>>> Cc: Simon Marlow; ghc-devs > >> >>>>>>>>> Subject: Re: ArrayArrays > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> When (ab)using them for this > purpose, SmallArrayArray's would be > >> >>>>>>>>> very handy as well. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Consider right now if I have > something like an order-maintenance > >> >>>>>>>>> structure I have: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Upper s = Upper {-# > UNPACK #-} !(MutableByteArray s) {-# > >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper > s)) {-# UNPACK #-} !(MutVar s > >> >>>>>>>>> (Upper s)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Lower s = Lower {-# > UNPACK #-} !(MutVar s (Upper s)) {-# > >> >>>>>>>>> UNPACK #-} !(MutableByteArray > s) {-# UNPACK #-} !(MutVar s > >> >>>>>>>>> (Lower s)) {-# > >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The former contains, > logically, a mutable integer and two > >> >>>>>>>>> pointers, > >> >>>>>>>>> one for forward and one for > backwards. The latter is basically > >> >>>>>>>>> the same > >> >>>>>>>>> thing with a mutable reference > up pointing at the structure > >> >>>>>>>>> above. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On the heap this is an object > that points to a structure for the > >> >>>>>>>>> bytearray, and points to > another structure for each mutvar which > >> >>>>>>>>> each point > >> >>>>>>>>> to the other 'Upper' > structure. So there is a level of > >> >>>>>>>>> indirection smeared > >> >>>>>>>>> over everything. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> So this is a pair of doubly > linked lists with an upward link > >> >>>>>>>>> from > >> >>>>>>>>> the structure below to the > structure above. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Converted into ArrayArray#s > I'd get > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Upper s = Upper > (MutableArrayArray# s) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> w/ the first slot being a > pointer to a MutableByteArray#, and > >> >>>>>>>>> the > >> >>>>>>>>> next 2 slots pointing to the > previous and next previous objects, > >> >>>>>>>>> represented > >> >>>>>>>>> just as their > MutableArrayArray#s. I can use > >> >>>>>>>>> sameMutableArrayArray# on these > >> >>>>>>>>> for object identity, which > lets me check for the ends of the > >> >>>>>>>>> lists by tying > >> >>>>>>>>> things back on themselves. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> and below that > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Lower s = Lower > (MutableArrayArray# s) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> is similar, with an extra > MutableArrayArray slot pointing up to > >> >>>>>>>>> an > >> >>>>>>>>> upper structure. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I can then write a handful of > combinators for getting out the > >> >>>>>>>>> slots > >> >>>>>>>>> in question, while it has > gained a level of indirection between > >> >>>>>>>>> the wrapper > >> >>>>>>>>> to put it in * and the > MutableArrayArray# s in #, that one can > >> >>>>>>>>> be basically > >> >>>>>>>>> erased by ghc. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Unlike before I don't have > several separate objects on the heap > >> >>>>>>>>> for > >> >>>>>>>>> each thing. I only have 2 now. 
> The MutableArrayArray# for the > >> >>>>>>>>> object itself, > >> >>>>>>>>> and the MutableByteArray# that > it references to carry around the > >> >>>>>>>>> mutable > >> >>>>>>>>> int. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The only pain points are > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> 1.) the aforementioned > limitation that currently prevents me > >> >>>>>>>>> from > >> >>>>>>>>> stuffing normal boxed data > through a SmallArray or Array into an > >> >>>>>>>>> ArrayArray > >> >>>>>>>>> leaving me in a little ghetto > disconnected from the rest of > >> >>>>>>>>> Haskell, > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> and > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> 2.) the lack of > SmallArrayArray's, which could let us avoid the > >> >>>>>>>>> card marking overhead. These > objects are all small, 3-4 pointers > >> >>>>>>>>> wide. Card > >> >>>>>>>>> marking doesn't help. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Alternately I could just try > to do really evil things and > >> >>>>>>>>> convert > >> >>>>>>>>> the whole mess to SmallArrays > and then figure out how to > >> >>>>>>>>> unsafeCoerce my way > >> >>>>>>>>> to glory, stuffing the #'d > references to the other arrays > >> >>>>>>>>> directly into the > >> >>>>>>>>> SmallArray as slots, removing > the limitation we see here by > >> >>>>>>>>> aping the > >> >>>>>>>>> MutableArrayArray# s API, but > that gets really really dangerous! > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm pretty much willing to > sacrifice almost anything on the > >> >>>>>>>>> altar > >> >>>>>>>>> of speed here, but I'd like to > be able to let the GC move them > >> >>>>>>>>> and collect > >> >>>>>>>>> them which rules out simpler > Ptr and Addr based solutions. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> -Edward > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 > PM, Manuel M T Chakravarty > >> >>>>>>>>> > wrote: > >> >>>>>>>>> > >> >>>>>>>>> That?s an interesting idea. > >> >>>>>>>>> > >> >>>>>>>>> Manuel > >> >>>>>>>>> > >> >>>>>>>>> > Edward Kmett > >: > >> >>>>>>>>> > >> >>>>>>>>> > > >> >>>>>>>>> > Would it be possible to add > unsafe primops to add Array# and > >> >>>>>>>>> > SmallArray# entries to an > ArrayArray#? The fact that the > >> >>>>>>>>> > ArrayArray# entries > >> >>>>>>>>> > are all directly unlifted > avoiding a level of indirection for > >> >>>>>>>>> > the containing > >> >>>>>>>>> > structure is amazing, but I > can only currently use it if my > >> >>>>>>>>> > leaf level data > >> >>>>>>>>> > can be 100% unboxed and > distributed among ByteArray#s. It'd be > >> >>>>>>>>> > nice to be > >> >>>>>>>>> > able to have the ability to > put SmallArray# a stuff down at > >> >>>>>>>>> > the leaves to > >> >>>>>>>>> > hold lifted contents. > >> >>>>>>>>> > > >> >>>>>>>>> > I accept fully that if I > name the wrong type when I go to > >> >>>>>>>>> > access > >> >>>>>>>>> > one of the fields it'll lie > to me, but I suppose it'd do that > >> >>>>>>>>> > if i tried to > >> >>>>>>>>> > use one of the members that > held a nested ArrayArray# as a > >> >>>>>>>>> > ByteArray# > >> >>>>>>>>> > anyways, so it isn't like > there is a safety story preventing > >> >>>>>>>>> > this. > >> >>>>>>>>> > > >> >>>>>>>>> > I've been hunting for ways > to try to kill the indirection > >> >>>>>>>>> > problems I get with Haskell > and mutable structures, and I > >> >>>>>>>>> > could shoehorn a > >> >>>>>>>>> > number of them into > ArrayArrays if this worked. 
> >> >>>>>>>>> > > >> >>>>>>>>> > Right now I'm stuck paying > for 2 or 3 levels of unnecessary > >> >>>>>>>>> > indirection compared to > c/java and this could reduce that pain > >> >>>>>>>>> > to just 1 > >> >>>>>>>>> > level of unnecessary > indirection. > >> >>>>>>>>> > > >> >>>>>>>>> > -Edward > >> >>>>>>>>> > >> >>>>>>>>> > > _______________________________________________ > >> >>>>>>>>> > ghc-devs mailing list > >> >>>>>>>>> > ghc-devs at haskell.org > > >> >>>>>>>>> > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > _______________________________________________ > >> >>>>>>>>> ghc-devs mailing list > >> >>>>>>>>> ghc-devs at haskell.org > > >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> >>>>>>> > >> >>>>>>> > >> >>>>> > >> >>> > >> >> > >> > > >> > > >> > > _______________________________________________ > >> > ghc-devs mailing list > >> > ghc-devs at haskell.org > > >> > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> > > > > >____ > > ____ > > ____ > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs____ > > ____ > > ____ > > __ __ > > From simonpj at microsoft.com Tue Sep 8 07:40:51 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Tue, 8 Sep 2015 07:40:51 +0000 Subject: Unlifted data types In-Reply-To: <1441663307-sup-612@sabre> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <1441661177-sup-2150@sabre> <9cafcebc6d274b2385f202a4fd224174@DB4PR30MB030.064d.mgd.msft.net> <1441663307-sup-612@sabre> Message-ID: <11b6bb1806894856b0fcedda6884e083@DB4PR30MB030.064d.mgd.msft.net> | The problem 'Force' is trying to solve is the fact that Haskell | currently has many existing lifted data types, and they all have | ~essentially identical 
unlifted versions. But for a user to write the | lifted and unlifted version, they have to copy paste their code or use | 'Force'. But (Force [a]) will only be head-strict. You still have to make an essentially-identical version if you want a strict list. Ditto all components of a data structure. Is the gain (of head-strictness) really worth it? Incidentally, on the Unlifted-vs-# discussion, I'm not against making the distinction. I can see advantages in carving out a strict subset of Haskell, which this would help to do. Simon From simonpj at microsoft.com Tue Sep 8 07:52:12 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Tue, 8 Sep 2015 07:52:12 +0000 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> Message-ID: | And to | be honest, I'm not sure we need arbitrary data types in Unlifted; | Force (which would be primitive) might be enough. That's an interesting thought. But presumably you'd have to use 'suspend' (a terrible name) a lot: type StrictList a = Force (StrictList' a) data StrictList' a = Nil | Cons !a (StrictList a) mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs = mapStrict' f (suspend xs) mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' f Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) That doesn't look terribly convenient. | ensure that threads don't simply | pass thunks between each other. But, if you have unlifted types, then | you can have: | | data UMVar (a :: Unlifted) | | and then the type rules out the possibility of passing thunks through | a reference (at least at the top level). Really? Presumably UMVar is a new primitive? With a family of operations like MVar? 
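[Editor's note: the hand-written StrictList that Simon sketches above compiles and runs in today's GHC without any Force/suspend machinery. A self-contained sketch follows; the fromList/toList helpers are added here for illustration and are not from the thread:]

```haskell
-- A list strict in both element and tail: forcing the outermost
-- constructor forces the entire spine and every element, which is the
-- behaviour a head-strict (Force [a]) does not give you.
data StrictList a = Nil | Cons !a !(StrictList a)

fromList :: [a] -> StrictList a
fromList = foldr Cons Nil

toList :: StrictList a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

mapStrict :: (a -> b) -> StrictList a -> StrictList b
mapStrict _ Nil         = Nil
mapStrict f (Cons x xs) = Cons (f x) (mapStrict f xs)

main :: IO ()
main = print (toList (mapStrict (+ 1) (fromList [1, 2, 3 :: Int])))
-- prints [2,3,4]
```

This is exactly the "essentially identical unlifted version" one would hope not to have to write by hand.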
If so can't we just define

  newtype UMVar a = UMVar (MVar a)

  putUMVar :: UMVar a -> a -> IO ()
  putUMVar (UMVar v) x = x `seq` putMVar v x

I don't see Force helping here.

Simon

From marlowsd at gmail.com  Tue Sep  8 07:53:00 2015
From: marlowsd at gmail.com (Simon Marlow)
Date: Tue, 8 Sep 2015 08:53:00 +0100
Subject: Unpacking sum types
In-Reply-To: 
References: 
Message-ID: <55EE93DC.7050409@gmail.com>

On 07/09/2015 15:35, Simon Peyton Jones wrote:
> Good start.
>
> I have updated the page to separate the source-language design (what the
> programmer sees) from the implementation.
>
> And I have included boxed sums as well -- it would be deeply strange not
> to do so.

How did you envisage implementing anonymous boxed sums?  What is their
heap representation?

One option is to use some kind of generic object with a dynamic number
of pointers and non-pointers, and one field for the tag.  The layout
would need to be stored in the object.  This isn't a particularly
efficient representation, though.  Perhaps there could be a family of
smaller specialised versions for common sizes.

Do we have a use case for the boxed version, or is it just for
consistency?

Cheers
Simon

> Looks good to me!
>
> Simon
>
> *From:* Johan Tibell [mailto:johan.tibell at gmail.com]
> *Sent:* 01 September 2015 18:24
> *To:* Simon Peyton Jones; Simon Marlow; Ryan Newton
> *Cc:* ghc-devs at haskell.org
> *Subject:* RFC: Unpacking sum types
>
> I have a draft design for unpacking sum types that I'd like some
> feedback on. In particular feedback both on:
>
> * the writing and clarity of the proposal and
>
> * the proposal itself.
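[For readers following the unpacked-sums RFC, the transformation can be pictured with ordinary Haskell. This is an editorial sketch, not code from the thread: the names EitherU, leftU, rightU, and caseU are invented here, and the actual design shares payload slots by representation rather than allocating one slot per alternative.]

```haskell
-- Hand-rolled picture of unpacking the sum (# Int | Bool #) into a
-- product: one tag field plus a payload slot per alternative.
-- All names are invented for illustration.

data EitherU = EitherU
  !Int   -- tag: 0 = the Int alternative, 1 = the Bool alternative
  Int    -- payload slot for the Int alternative (dummy when unused)
  Bool   -- payload slot for the Bool alternative (dummy when unused)

leftU :: Int -> EitherU
leftU i = EitherU 0 i False        -- dummy value fills the unused slot

rightU :: Bool -> EitherU
rightU b = EitherU 1 0 b

-- The case-of-sum that the compiler would turn into a tag switch:
caseU :: (Int -> r) -> (Bool -> r) -> EitherU -> r
caseU f g (EitherU t i b)
  | t == 0    = f i
  | otherwise = g b
```

For example, `caseU show (\b -> if b then "yes" else "no") (rightU True)` evaluates to `"yes"`, with no Either box allocated.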
> https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes
>
> -- Johan

From mail at joachim-breitner.de  Tue Sep  8 08:14:43 2015
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Tue, 08 Sep 2015 10:14:43 +0200
Subject: Unpacking sum types
In-Reply-To: <55EE93DC.7050409@gmail.com>
References: <55EE93DC.7050409@gmail.com>
Message-ID: <1441700083.1307.7.camel@joachim-breitner.de>

Hi,

On Tuesday, 08.09.2015 at 08:53 +0100, Simon Marlow wrote:
> On 07/09/2015 15:35, Simon Peyton Jones wrote:
> > Good start.
> >
> > I have updated the page to separate the source-language design (what the
> > programmer sees) from the implementation.
> >
> > And I have included boxed sums as well -- it would be deeply strange not
> > to do so.
>
> How did you envisage implementing anonymous boxed sums?  What is their
> heap representation?
>
> One option is to use some kind of generic object with a dynamic number
> of pointers and non-pointers, and one field for the tag.

Why a dynamic number of pointers? All constructors of an anonymous sum
type contain precisely one pointer (just like Left and Right do), as
they are normal boxed, polymorphic data types.

Also the constructors

  (0 of 1 | _ )
  (0 of 2 | _ )
  (0 of 3 | _ )

(using Lennart's syntax here) can all use the same info-table: At
runtime, we only care about the tag, not the arity of the sum type.

So just like for products, we could statically generate info tables for
the constructors

  (0 of ? | _ )
  (1 of ? | _ )
  ...
  (63 of ? | _ )

and simply do not support more than these. (Or, if we really want to
support these, start to nest them. 63^2 will already go a long way... :-))

Greetings,
Joachim

--
Joachim 'nomeata' Breitner
mail at joachim-breitner.de | http://www.joachim-breitner.de/
Jabber: nomeata at joachim-breitner.de | GPG-Key: 0xF0FBF51F
Debian Developer: nomeata at debian.org

-------------- next part --------------
A non-text attachment was scrubbed...
From simonpj at microsoft.com  Tue Sep  8 08:28:50 2015
From: simonpj at microsoft.com (Simon Peyton Jones)
Date: Tue, 8 Sep 2015 08:28:50 +0000
Subject: AnonymousSums data con syntax
In-Reply-To: 
References: <9eb2c9041f6142ce947a4b323c0b2bff@DB4PR30MB030.064d.mgd.msft.net>
 <1441657274.28403.7.camel@joachim-breitner.de>
Message-ID: 

I can see the force of this discussion about data type constructors for
sums, but:

- We already do this for tuples: (,,,,) is a type constructor and you
  have to count commas.  We could use a number here but we don't.

- Likewise tuple sections.  (,,e,) means (\x y z. (x,y,e,z))

I do not expect big sums in practice.

That said, (2/5| True) instead of (|True|||) would be ok I suppose.  Or
something like that.

Simon

From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Lennart Kolmodin
Sent: 08 September 2015 07:12
To: Joachim Breitner
Cc: ghc-devs at haskell.org
Subject: Re: AnonymousSums data con syntax

2015-09-07 21:21 GMT+01:00 Joachim Breitner:

Hi,

On Monday, 07.09.2015 at 19:25 +0000, Simon Peyton Jones wrote:
> > Are we okay with stealing some operator sections for this? E.G. (x ||).
> > I think the boxed sums larger than 2 choices are all technically
> > overlapping with sections.
>
> I hadn't thought of that.  I suppose that in distfix notation we
> could require spaces
>     (x | |)
> since vertical bar by itself isn't an operator.  But then (_||) x
> might feel more compact.
>
> Also a section (x ||) isn't valid in a pattern, so we would not need
> to require spaces there.
>
> But my gut feel is: yes, with AnonymousSums we should just steal the
> syntax.  It won't hurt existing code (since it won't use
> AnonymousSums), and if you *are* using AnonymousSums then the distfix
> notation is probably more valuable than the sections for an operator
> you probably aren't using.
I wonder if this syntax for constructors is really that great.  Yes,
there is similarity with the type constructor (which is nice), but for
the data constructor, do we really want a unary encoding and have our
users count bars?  I believe the user (and also us, having to read core)
would be better served by some syntax that involves plain numbers.

I reacted the same way to the proposed syntax.

Imagine already having an anonymous sum type and then deciding to add
another constructor. Naturally you'd have to update your code to handle
the new constructor, but you also need to update the code for all other
constructors as well by adding another bar in the right place. That
seems unnecessary, and there's no need to do that for named sum types.

What about explicitly stating the index as a number?

  (1 | Int)    :: ( String | Int | Bool )
  (#1 | Int #) :: (# String | Int | Bool #)

  case sum of
    (0 | myString) -> ...
    (1 | myInt)    -> ...
    (2 | myBool)   -> ...

This allows you to at least add new constructors at the end without
changing existing code. Is it harder to resolve by type inference since
we're not stating the number of constructors?

If so we could do something similar to Joachim's proposal:

  case sum of
    (0 of 3 | myString) -> ...
    (1 of 3 | myInt)    -> ...
    (2 of 3 | myBool)   -> ...

.. and at least you don't have to count bars.

Given that of is already a keyword, how about something involving
"3 of 4"? For example

  (Put# True in 3 of 5) :: (# a | b | Bool | d | e #)

and

  case sum of
    (Put# x in 1 of 3) -> ...
    (Put# x in 2 of 3) -> ...
    (Put# x in 3 of 3) -> ...

(If "as" were a keyword, (Put# x as 2 of 3) would sound even better.)

I don't find this particular choice very great, but something with
numbers rather than ASCII art seems to make more sense here.  Is there
something even better?

Greetings,
Joachim

--
Joachim 'nomeata' Breitner
mail at joachim-breitner.de | http://www.joachim-breitner.de/
Jabber: nomeata at joachim-breitner.de
GPG-Key: 0xF0FBF51F
Debian Developer: nomeata at debian.org

From ekmett at gmail.com  Tue Sep  8 08:29:36 2015
From: ekmett at gmail.com (Edward Kmett)
Date: Tue, 8 Sep 2015 04:29:36 -0400
Subject: ArrayArrays
In-Reply-To: <55EE90ED.1040609@gmail.com>
References: <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com>
 <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net>
 <55EE90ED.1040609@gmail.com>
Message-ID: 

Once you start to include all the other primitive types there is a bit
more of an explosion. MVar#, TVar#, MutVar#, Small variants, etc. can
all be modified to carry unlifted content.

Being able to be parametric over that choice would permit a number of
things in user land to do the same thing with an open-ended set of
design possibilities that are rather hard to contemplate in advance.
e.g. being able to abstract over them could let you just use a normal
(,) to carry around unlifted parametric data types or being able to talk
about [MVar# s a] drastically reducing the number of one off data types
we need to invent.

If you can talk about the machinery mentioned above then you can have
typeclasses parameterized on an argument that could be either unlifted
or lifted.

I'm not willing to fight too hard for it, but it feels more like the
"right" solution than retaining a cut-and-paste copy of the same code
and bifurcating further on each argument you want to consider such a
degree of freedom.

As such it seems like a pretty big win for a comparatively minor change
to the levity polymorphism machinery.

-Edward

On Tue, Sep 8, 2015 at 3:40 AM, Simon Marlow wrote:

> This would be very cool, however it's questionable whether it's worth it.
> Without any unlifted kind, we need
>   - ArrayArray#
>   - a set of new/read/write primops for every element type,
>     either built-in or made from unsafeCoerce#
>
> With the unlifted kind, we would need
>   - ArrayArray#
>   - one set of new/read/write primops
>
> With levity polymorphism, we would need
>   - none of this, Array# can be used
>
> So having an unlifted kind already kills a lot of the duplication,
> polymorphism only kills a bit more.
>
> Cheers
> Simon
>
> On 08/09/2015 00:14, Edward Kmett wrote:
>
>> Assume we had the ability to talk about Levity in a new way and instead
>> of just:
>>
>>     data Levity = Lifted | Unlifted
>>
>>     type * = TYPE 'Lifted
>>     type # = TYPE 'Unlifted
>>
>> we instead had a more nuanced notion of TYPE parameterized on another
>> data type:
>>
>>     data Levity = Lifted | Unlifted
>>     data Param = Composite | Simple Levity
>>
>> and we parameterized TYPE with a Param rather than Levity.
>>
>> Existing strange representations can continue to live in TYPE 'Composite
>>
>>     (# Int#, Double #) :: TYPE 'Composite
>>
>> and we don't support parametricity in there, just like, currently, we
>> don't allow parametricity in #.
>>
>> We can include the undefined example from Richard's talk:
>>
>>     undefined :: forall (v :: Param). v
>>
>> and ultimately lift it into his pi type when it is available just as
>> before.
>>
>> But we could then consider TYPE ('Simple 'Unlifted) as a form of
>> 'parametric #' covering unlifted things we're willing to allow
>> polymorphism over because they are just pointers to something in the
>> heap, that just happens to not be able to be _|_ or a thunk.
>> >> In this setting, recalling that above, I modified Richard's TYPE to take >> a Param instead of Levity, we can define a type alias for things that >> live as a simple pointer to a heap allocated object: >> >> type GC (l :: Levity) = TYPE ('Simple l) >> type * = GC 'Lifted >> >> and then we can look at existing primitives generalized: >> >> Array# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted >> MutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted >> SmallArray# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted >> SmallMutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC >> 'Unlifted >> MutVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted >> MVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted >> >> Weak#, StablePtr#, StableName#, etc. all can take similar modifications. >> >> Recall that an ArrayArray# was just an Array# hacked up to be able to >> hold onto the subset of # that is collectable. >> >> Almost all of the operations on these data types can work on the more >> general kind of argument. >> >> newArray# :: forall (s :: *) (l :: Levity) (a :: GC l). Int# -> a -> >> State# s -> (# State# s, MutableArray# s a #) >> >> writeArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# >> s a -> Int# -> a -> State# s -> State# s >> >> readArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# s >> a -> Int# -> State# s -> (# State# s, a #) >> >> etc. >> >> Only a couple of our existing primitives _can't_ generalize this way. >> The one that leaps to mind is atomicModifyMutVar, which would need to >> stay constrained to only work on arguments in *, because of the way it >> operates. >> >> With that we can still talk about >> >> MutableArray# s Int >> >> but now we can also talk about: >> >> MutableArray# s (MutableArray# s Int) >> >> without the layer of indirection through a box in * and without an >> explosion of primops. 
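[To see what the generalized signatures would buy at use sites, a hypothetical fragment follows. This is an editorial sketch: nothing below compiles with today's GHC, and `Levity` and `GC` are the names coined in the quoted message.]

```haskell
-- Hypothetical: one primop pair serves lifted and unlifted elements.
--
--   readArray#  :: forall (l :: Levity) s (a :: GC l).
--                  MutableArray# s a -> Int# -> State# s -> (# State# s, a #)
--   writeArray# :: forall (l :: Levity) s (a :: GC l).
--                  MutableArray# s a -> Int# -> a -> State# s -> State# s
--
-- Both instantiations below would go through that single primop, with
-- no lifted box between the outer and inner arrays in the second case:
--
--   MutableArray# s Int                     -- a :: GC 'Lifted
--   MutableArray# s (MutableArray# s Int)   -- a :: GC 'Unlifted
```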
The same newFoo, readFoo, writeFoo machinery works
>> for both kinds.
>>
>> The struct machinery doesn't get to take advantage of this, but it would
>> let us clean house elsewhere in Prim and drastically improve the range
>> of applicability of the existing primitives with nothing more than a
>> small change to the levity machinery.
>>
>> I'm not attached to any of the names above, I coined them just to give
>> us a concrete thing to talk about.
>>
>> Here I'm only proposing we extend machinery in GHC.Prim this way, but an
>> interesting 'now that the barn door is open' question is to consider
>> that our existing Haskell data types often admit a similar form of
>> parametricity and nothing in principle prevents this from working for
>> Maybe or [] and once you permit inference to fire across all of GC l
>> then it seems to me that you'd start to get those same capabilities
>> there as well when LevityPolymorphism was turned on.
>>
>> -Edward
>>
>> On Mon, Sep 7, 2015 at 5:56 PM, Simon Peyton Jones wrote:
>>
>>     This could make the menagerie of ways to pack
>>     {Small}{Mutable}Array{Array}# references into a
>>     {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing
>>     the need for folks to descend into the use of the more evil
>>     structure primitives we're talking about, and letting us keep a few
>>     more principles around us.
>>
>>     I'm lost.
Can you give some concrete examples that illustrate how
>>     levity polymorphism will help us?
>>
>>     Simon
>>
>>     *From:* Edward Kmett [mailto:ekmett at gmail.com]
>>     *Sent:* 07 September 2015 21:17
>>     *To:* Simon Peyton Jones
>>     *Cc:* Ryan Newton; Johan Tibell; Simon Marlow; Manuel M T
>>     Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates
>>     *Subject:* Re: ArrayArrays
>>
>>     I had a brief discussion with Richard during the Haskell Symposium
>>     about how we might be able to let parametricity help a bit in
>>     reducing the space of necessary primops to a slightly more
>>     manageable level.
>>
>>     Notably, it'd be interesting to explore the ability to allow
>>     parametricity over the portion of # that is just a gcptr.
>>
>>     We could do this if the levity polymorphism machinery was tweaked a
>>     bit. You could envision the ability to abstract over things in both
>>     * and the subset of # that are represented by a gcptr, then
>>     modifying the existing array primitives to be parametric in that
>>     choice of levity for their argument so long as it was of a "heap
>>     object" levity.
>>
>>     This could make the menagerie of ways to pack
>>     {Small}{Mutable}Array{Array}# references into a
>>     {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing
>>     the need for folks to descend into the use of the more evil
>>     structure primitives we're talking about, and letting us keep a few
>>     more principles around us.
>>
>>     Then in the cases like `atomicModifyMutVar#` where it needs to
>>     actually be in * rather than just a gcptr, due to the constructed
>>     field selectors it introduces on the heap then we could keep the
>>     existing less polymorphic type.
>>
>>     -Edward
>>
>>     On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones wrote:
>>
>>         It was fun to meet and discuss this.
>>
>>         Did someone volunteer to write a
wiki page that describes the
>>         proposed design? And, I earnestly hope, also describes the
>>         menagerie of currently available array types and primops so that
>>         users can have some chance of picking the right one?!
>>
>>         Thanks
>>
>>         Simon
>>
>>         *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org]
>>         *On Behalf Of* Ryan Newton
>>         *Sent:* 31 August 2015 23:11
>>         *To:* Edward Kmett; Johan Tibell
>>         *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen;
>>         ghc-devs; Ryan Scott; Ryan Yates
>>         *Subject:* Re: ArrayArrays
>>
>>         Dear Edward, Ryan Yates, and other interested parties --
>>
>>         So when should we meet up about this?
>>
>>         May I propose the Tues afternoon break for everyone at ICFP who
>>         is interested in this topic? We can meet out in the coffee area
>>         and congregate around Edward Kmett, who is tall and should be
>>         easy to find ;-).
>>
>>         I think Ryan is going to show us how to use his new primops for
>>         combined array + other fields in one heap object?
>>
>>         On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett wrote:
>>
>>             Without a custom primitive it doesn't help much there, you
>>             have to store the indirection to the mask.
>>
>>             With a custom primitive it should cut the on heap
>>             root-to-leaf path of everything in the HAMT in half. A
>>             shorter HashMap was actually one of the motivating factors
>>             for me doing this. It is rather astoundingly difficult to
>>             beat the performance of HashMap, so I had to start cheating
>>             pretty badly. ;)
>>
>>             -Edward
>>
>>             On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell wrote:
>>
>>                 I'd also be interested to chat at ICFP to see if I can
>>                 use this for my HAMT implementation.
>>
>>                 On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett wrote:
>>
>>                     Sounds good to me.
Right now I'm just hacking up
>>                     composable accessors for "typed slots" in a fairly
>>                     lens-like fashion, and treating the set of slots I
>>                     define and the 'new' function I build for the data
>>                     type as its API, and build atop that. This could
>>                     eventually graduate to template-haskell, but I'm not
>>                     entirely satisfied with the solution I have. I
>>                     currently distinguish between what I'm calling
>>                     "slots" (things that point directly to another
>>                     SmallMutableArrayArray# sans wrapper) and "fields"
>>                     which point directly to the usual Haskell data types
>>                     because unifying the two notions meant that I
>>                     couldn't lift some coercions out "far enough" to
>>                     make them vanish.
>>
>>                     I'll be happy to run through my current working set
>>                     of issues in person and -- as things get nailed down
>>                     further -- in a longer lived medium than in personal
>>                     conversations. ;)
>>
>>                     -Edward
>>
>>                     On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton wrote:
>>
>>                         I'd also love to meet up at ICFP and discuss
>>                         this. I think the array primops plus a TH layer
>>                         that lets us (ab)use them many times without too
>>                         much marginal cost sounds great. And I'd like
>>                         to learn how we could be either early users of,
>>                         or help with, this infrastructure.
>>
>>                         CC'ing in Ryan Scot and Omer Agacan who may also
>>                         be interested in dropping in on such discussions
>>                         @ICFP, and Chao-Hong Chen, a Ph.D. student who
>>                         is currently working on concurrent data
>>                         structures in Haskell, but will not be at ICFP.
>>
>>                         On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote:
>>
>>                             I completely agree. I would love to spend
>>                             some time during ICFP and friends talking
>>                             about what it could look like. My small
>>                             array for STM changes for the RTS can be
>>                             seen here [1].
>>                             It is on a branch somewhere between 7.8 and
>>                             7.10 and includes irrelevant STM bits and
>>                             some confusing naming choices (sorry), but
>>                             should cover all the details needed to
>>                             implement it for a non-STM context. The
>>                             biggest surprise for me was following small
>>                             array too closely and having a word/byte
>>                             offset mismatch [2].
>>
>>                             [1]: https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut
>>                             [2]: https://ghc.haskell.org/trac/ghc/ticket/10413
>>
>>                             Ryan
>>
>>                             On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett wrote:
>>                             > I'd love to have that last 10%, but it's a
>>                             > lot of work to get there and more
>>                             > importantly I don't know quite what it
>>                             > should look like.
>>                             >
>>                             > On the other hand, I do have a pretty
>>                             > good idea of how the primitives above
>>                             > could be banged out and tested in a long
>>                             > evening, well in time for 7.12. And
>>                             > as noted earlier, those remain useful
>>                             > even if a nicer typed version with an
>>                             > extra level of indirection to the sizes
>>                             > is built up after.
>>                             >
>>                             > The rest sounds like a good graduate
>>                             > student project for someone who has
>>                             > graduate students lying around. Maybe
>>                             > somebody at Indiana University who has
>>                             > an interest in type theory and
>>                             > parallelism can find us one. =)
>>                             >
>>                             > -Edward
>>                             >
>>                             > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates wrote:
>>                             >>
>>                             >> I think from my perspective, the
>>                             >> motivation for getting the type
>>                             >> checker involved is primarily bringing
>>                             >> this to the level where users
>>                             >> could be expected to build these
>>                             >> structures. It is reasonable to
>>                             >> think that there are people who want to
>>                             >> use STM (a context with
>>                             >> mutation already) to implement a
>>                             >> straightforward data structure that
>>                             >> avoids extra indirection penalty.
There >> should be some places where >> >> knowing that things are field accesses >> rather then array indexing >> >> could be helpful, but I think GHC is >> good right now about handling >> >> constant offsets. In my code I don't do >> any bounds checking as I know >> >> I will only be accessing my arrays with >> constant indexes. I make >> >> wrappers for each field access and leave >> all the unsafe stuff in >> >> there. When things go wrong though, the >> compiler is no help. Maybe >> >> template Haskell that generates the >> appropriate wrappers is the right >> >> direction to go. >> >> There is another benefit for me when >> working with these as arrays in >> >> that it is quite simple and direct >> (given the hoops already jumped >> >> through) to play with alignment. I can >> ensure two pointers are never >> >> on the same cache-line by just spacing >> things out in the array. >> >> >> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward >> Kmett > > wrote: >> >> > They just segfault at this level. ;) >> >> > >> >> > Sent from my iPhone >> >> > >> >> > On Aug 28, 2015, at 7:25 PM, Ryan >> Newton > > wrote: >> >> > >> >> > You presumably also save a bounds >> check on reads by hard-coding the >> >> > sizes? >> >> > >> >> > On Fri, Aug 28, 2015 at 3:39 PM, >> Edward Kmett > > wrote: >> >> >> >> >> >> Also there are 4 different "things" >> here, basically depending on two >> >> >> independent questions: >> >> >> >> >> >> a.) if you want to shove the sizes >> into the info table, and >> >> >> b.) if you want cardmarking. >> >> >> >> >> >> Versions with/without cardmarking for >> different sizes can be done >> >> >> pretty >> >> >> easily, but as noted, the infotable >> variants are pretty invasive. >> >> >> >> >> >> -Edward >> >> >> >> >> >> On Fri, Aug 28, 2015 at 6:36 PM, >> Edward Kmett > > wrote: >> >> >>> >> >> >>> Well, on the plus side you'd save 16 >> bytes per object, which adds up >> >> >>> if >> >> >>> they were small enough and there are >> enough of them. 
You get a bit >> >> >>> better >> >> >>> locality of reference in terms of >> what fits in the first cache line of >> >> >>> them. >> >> >>> >> >> >>> -Edward >> >> >>> >> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, >> Ryan Newton > > >> >> >>> wrote: >> >> >>>> >> >> >>>> Yes. And for the short term I can >> imagine places we will settle with >> >> >>>> arrays even if it means tracking >> lengths unnecessarily and >> >> >>>> unsafeCoercing >> >> >>>> pointers whose types don't actually >> match their siblings. >> >> >>>> >> >> >>>> Is there anything to recommend the >> hacks mentioned for fixed sized >> >> >>>> array >> >> >>>> objects *other* than using them to >> fake structs? (Much to >> >> >>>> derecommend, as >> >> >>>> you mentioned!) >> >> >>>> >> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM >> Edward Kmett > > >> >> >>>> wrote: >> >> >>>>> >> >> >>>>> I think both are useful, but the >> one you suggest requires a lot more >> >> >>>>> plumbing and doesn't subsume all >> of the usecases of the other. >> >> >>>>> >> >> >>>>> -Edward >> >> >>>>> >> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, >> Ryan Newton > > >> >> >>>>> wrote: >> >> >>>>>> >> >> >>>>>> So that primitive is an array >> like thing (Same pointed type, >> >> >>>>>> unbounded >> >> >>>>>> length) with extra payload. >> >> >>>>>> >> >> >>>>>> I can see how we can do without >> structs if we have arrays, >> >> >>>>>> especially >> >> >>>>>> with the extra payload at front. >> But wouldn't the general solution >> >> >>>>>> for >> >> >>>>>> structs be one that that allows >> new user data type defs for # >> >> >>>>>> types? 
>> >> >>>>>> >> >> >>>>>> >> >> >>>>>> >> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM >> Edward Kmett > > >> >> >>>>>> wrote: >> >> >>>>>>> >> >> >>>>>>> Some form of MutableStruct# with >> a known number of words and a >> >> >>>>>>> known >> >> >>>>>>> number of pointers is basically >> what Ryan Yates was suggesting >> >> >>>>>>> above, but >> >> >>>>>>> where the word counts were >> stored in the objects themselves. >> >> >>>>>>> >> >> >>>>>>> Given that it'd have a couple of >> words for those counts it'd >> >> >>>>>>> likely >> >> >>>>>>> want to be something we build in >> addition to MutVar# rather than a >> >> >>>>>>> replacement. >> >> >>>>>>> >> >> >>>>>>> On the other hand, if we had to >> fix those numbers and build info >> >> >>>>>>> tables that knew them, and >> typechecker support, for instance, it'd >> >> >>>>>>> get >> >> >>>>>>> rather invasive. >> >> >>>>>>> >> >> >>>>>>> Also, a number of things that we >> can do with the 'sized' versions >> >> >>>>>>> above, like working with evil >> unsized c-style arrays directly >> >> >>>>>>> inline at the >> >> >>>>>>> end of the structure cease to be >> possible, so it isn't even a pure >> >> >>>>>>> win if we >> >> >>>>>>> did the engineering effort. >> >> >>>>>>> >> >> >>>>>>> I think 90% of the needs I have >> are covered just by adding the one >> >> >>>>>>> primitive. The last 10% gets >> pretty invasive. >> >> >>>>>>> >> >> >>>>>>> -Edward >> >> >>>>>>> >> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, >> Ryan Newton > > >> >> >>>>>>> wrote: >> >> >>>>>>>> >> >> >>>>>>>> I like the possibility of a >> general solution for mutable structs >> >> >>>>>>>> (like Ed said), and I'm trying >> to fully understand why it's hard. >> >> >>>>>>>> >> >> >>>>>>>> So, we can't unpack MutVar into >> constructors because of object >> >> >>>>>>>> identity problems. 
But what >> about directly supporting an >> >> >>>>>>>> extensible set of >> >> >>>>>>>> unlifted MutStruct# objects, >> generalizing (and even replacing) >> >> >>>>>>>> MutVar#? That >> >> >>>>>>>> may be too much work, but is it >> problematic otherwise? >> >> >>>>>>>> >> >> >>>>>>>> Needless to say, this is also >> critical if we ever want best in >> >> >>>>>>>> class >> >> >>>>>>>> lockfree mutable structures, >> just like their Stm and sequential >> >> >>>>>>>> counterparts. >> >> >>>>>>>> >> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM >> Simon Peyton Jones >> >> >>>>>>>> > > wrote: >> >> >>>>>>>>> >> >> >>>>>>>>> At the very least I'll take >> this email and turn it into a short >> >> >>>>>>>>> article. >> >> >>>>>>>>> >> >> >>>>>>>>> Yes, please do make it into a >> wiki page on the GHC Trac, and >> >> >>>>>>>>> maybe >> >> >>>>>>>>> make a ticket for it. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Thanks >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Simon >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> From: Edward Kmett >> [mailto:ekmett at gmail.com >> ] >> >> >>>>>>>>> Sent: 27 August 2015 16:54 >> >> >>>>>>>>> To: Simon Peyton Jones >> >> >>>>>>>>> Cc: Manuel M T Chakravarty; >> Simon Marlow; ghc-devs >> >> >>>>>>>>> Subject: Re: ArrayArrays >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> An ArrayArray# is just an >> Array# with a modified invariant. It >> >> >>>>>>>>> points directly to other >> unlifted ArrayArray#'s or ByteArray#'s. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> While those live in #, they >> are garbage collected objects, so >> >> >>>>>>>>> this >> >> >>>>>>>>> all lives on the heap. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> They were added to make some >> of the DPH stuff fast when it has >> >> >>>>>>>>> to >> >> >>>>>>>>> deal with nested arrays. 
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I'm currently abusing them as >> a placeholder for a better thing. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> The Problem >> >> >>>>>>>>> >> >> >>>>>>>>> ----------------- >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Consider the scenario where >> you write a classic doubly-linked >> >> >>>>>>>>> list >> >> >>>>>>>>> in Haskell. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL (IORef (Maybe >> DLL) (IORef (Maybe DLL) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Chasing from one DLL to the >> next requires following 3 pointers >> >> >>>>>>>>> on >> >> >>>>>>>>> the heap. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> >> MutVar# RealWorld (Maybe DLL) ~> >> >> >>>>>>>>> Maybe >> >> >>>>>>>>> DLL ~> DLL >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> That is 3 levels of indirection. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> We can trim one by simply >> unpacking the IORef with >> >> >>>>>>>>> -funbox-strict-fields or UNPACK >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> We can trim another by adding >> a 'Nil' constructor for DLL and >> >> >>>>>>>>> worsening our representation. 
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL !(IORef DLL) >> !(IORef DLL) | Nil >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> but now we're still stuck with >> a level of indirection >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL >> ~> DLL >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> This means that every >> operation we perform on this structure >> >> >>>>>>>>> will >> >> >>>>>>>>> be about half of the speed of >> an implementation in most other >> >> >>>>>>>>> languages >> >> >>>>>>>>> assuming we're memory bound on >> loading things into cache! >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Making Progress >> >> >>>>>>>>> >> >> >>>>>>>>> ---------------------- >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I have been working on a >> number of data structures where the >> >> >>>>>>>>> indirection of going from >> something in * out to an object in # >> >> >>>>>>>>> which >> >> >>>>>>>>> contains the real pointer to >> my target and coming back >> >> >>>>>>>>> effectively doubles >> >> >>>>>>>>> my runtime. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> We go out to the MutVar# >> because we are allowed to put the >> >> >>>>>>>>> MutVar# >> >> >>>>>>>>> onto the mutable list when we >> dirty it. There is a well defined >> >> >>>>>>>>> write-barrier. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I could change out the >> representation to use >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL (MutableArray# >> RealWorld DLL) | Nil >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I can just store two pointers >> in the MutableArray# every time, >> >> >>>>>>>>> but >> >> >>>>>>>>> this doesn't help _much_ >> directly. 
It has reduced the amount of
>> >> >>>>>>>>> distinct addresses in memory I touch on a walk of the DLL from
>> >> >>>>>>>>> 3 per object to 2.
>> >> >>>>>>>>>
>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to the
>> >> >>>>>>>>> array object and then chase it to the next DLL and chase that
>> >> >>>>>>>>> to the next array. I do get my two pointers together in memory
>> >> >>>>>>>>> though. I'm paying for a card marking table as well, which I
>> >> >>>>>>>>> don't particularly need with just two pointers, but we can shed
>> >> >>>>>>>>> that with the "SmallMutableArray#" machinery added back in
>> >> >>>>>>>>> 7.10, which is just the old array code as a new data type,
>> >> >>>>>>>>> which can speed things up a bit when you don't have very big
>> >> >>>>>>>>> arrays:
>> >> >>>>>>>>>
>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil
>> >> >>>>>>>>>
>> >> >>>>>>>>> But what if I wanted my object itself to live in # and have two
>> >> >>>>>>>>> mutable fields and be able to share the same write barrier?
>> >> >>>>>>>>>
>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types.
>> >> >>>>>>>>> What if we have one # -> * wrapper on the outside to deal with
>> >> >>>>>>>>> the impedance mismatch between the imperative world and
>> >> >>>>>>>>> Haskell, and then just let the ArrayArray#'s hold other
>> >> >>>>>>>>> arrayarrays.
>> >> >>>>>>>>>
>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld)
>> >> >>>>>>>>>
>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a
>> >> >>>>>>>>> special MutableArrayArray# I allocate on program startup. I can
>> >> >>>>>>>>> even abuse pattern synonyms. Alternately I can exploit the
>> >> >>>>>>>>> internals further to make this cheaper.
>> >> >>>>>>>>>
>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and
>> >> >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding
>> >> >>>>>>>>> and next entry in the linked list.
>> >> >>>>>>>>>
>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into
>> >> >>>>>>>>> a strict world, and everything there lives in #.
>> >> >>>>>>>>>
>> >> >>>>>>>>> next :: DLL -> IO DLL
>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# m s of
>> >> >>>>>>>>>   (# s', n #) -> (# s', DLL n #)
>> >> >>>>>>>>>
>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to
>> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty
>> >> >>>>>>>>> easily when they are known strict and you chain

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From simonpj at microsoft.com Tue Sep 8 08:31:45 2015
From: simonpj at microsoft.com (Simon Peyton Jones)
Date: Tue, 8 Sep 2015 08:31:45 +0000
Subject: Unpacking sum types
In-Reply-To: <55EE93DC.7050409@gmail.com>
References: <55EE93DC.7050409@gmail.com>
Message-ID: 

| How did you envisage implementing anonymous boxed sums? What is their
| heap representation?
*Exactly* like tuples; that is, we have a family of data type declarations: data (a|b) = (_|) a | (|_) b data (a|b|c) = (_||) a | (|_|) b | (||_) c ..etc. Simon | | One option is to use some kind of generic object with a dynamic number | of pointers and non-pointers, and one field for the tag. The layout | would need to be stored in the object. This isn't a particularly | efficient representation, though. Perhaps there could be a family of | smaller specialised versions for common sizes. | | Do we have a use case for the boxed version, or is it just for | consistency? | | Cheers | Simon | | | > Looks good to me! | > | > Simon | > | > *From:*Johan Tibell [mailto:johan.tibell at gmail.com] | > *Sent:* 01 September 2015 18:24 | > *To:* Simon Peyton Jones; Simon Marlow; Ryan Newton | > *Cc:* ghc-devs at haskell.org | > *Subject:* RFC: Unpacking sum types | > | > I have a draft design for unpacking sum types that I'd like some | > feedback on. In particular feedback both on: | > | > * the writing and clarity of the proposal and | > | > * the proposal itself. | > | > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes | > | > -- Johan | > From marlowsd at gmail.com Tue Sep 8 08:54:36 2015 From: marlowsd at gmail.com (Simon Marlow) Date: Tue, 8 Sep 2015 09:54:36 +0100 Subject: Unpacking sum types In-Reply-To: References: <55EE93DC.7050409@gmail.com> Message-ID: <55EEA24C.9080504@gmail.com> On 08/09/2015 09:31, Simon Peyton Jones wrote: > | How did you envisage implementing anonymous boxed sums? What is their > | heap representation? > > *Exactly* like tuples; that is, we have a family of data type declarations: > > data (a|b) = (_|) a > | (|_) b > > data (a|b|c) = (_||) a > | (|_|) b > | (||_) c > ..etc. I see, but then you can't have multiple fields, like ( (# Int,Bool #) |) You'd have to box the inner tuple too. Ok, I suppose. 
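The family-of-data-types encoding above can be written out with ordinary named declarations; a minimal sketch (the `Sum2`/`Alt...` names are invented here for illustration — the proposal's `(_|)`-style constructors are not valid source syntax today):

```haskell
-- Hypothetical stand-ins for the anonymous sums (a|b) and (a|b|c);
-- one data type per arity, just like the boxed tuple family.
data Sum2 a b   = Alt2of1 a | Alt2of2 b
data Sum3 a b c = Alt3of1 a | Alt3of2 b | Alt3of3 c

-- Marlow's point: an alternative carrying an unboxed pair
-- (# Int, Bool #) cannot appear directly; its payload must be boxed:
example :: Sum2 (Int, Bool) Char
example = Alt2of1 (42, True)
```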
Cheers
Simon

> Simon
>
> |
> | One option is to use some kind of generic object with a dynamic number
> | of pointers and non-pointers, and one field for the tag. The layout
> | would need to be stored in the object. This isn't a particularly
> | efficient representation, though. Perhaps there could be a family of
> | smaller specialised versions for common sizes.
> |
> | Do we have a use case for the boxed version, or is it just for
> | consistency?
> |
> | Cheers
> | Simon
> |
> |
> | > Looks good to me!
> | >
> | > Simon
> | >
> | > *From:*Johan Tibell [mailto:johan.tibell at gmail.com]
> | > *Sent:* 01 September 2015 18:24
> | > *To:* Simon Peyton Jones; Simon Marlow; Ryan Newton
> | > *Cc:* ghc-devs at haskell.org
> | > *Subject:* RFC: Unpacking sum types
> | >
> | > I have a draft design for unpacking sum types that I'd like some
> | > feedback on. In particular feedback both on:
> | >
> | > * the writing and clarity of the proposal and
> | >
> | > * the proposal itself.
> | >
> | > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes
> | >
> | > -- Johan
> | >

From marlowsd at gmail.com Tue Sep 8 08:56:46 2015
From: marlowsd at gmail.com (Simon Marlow)
Date: Tue, 8 Sep 2015 09:56:46 +0100
Subject: ArrayArrays
In-Reply-To: 
References: <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net>
 <55EE90ED.1040609@gmail.com>
Message-ID: <55EEA2CE.6020606@gmail.com>

On 08/09/2015 09:29, Edward Kmett wrote:
> Once you start to include all the other primitive types there is a bit
> more of an explosion. MVar#, TVar#, MutVar#, Small variants, etc. can
> all be modified to carry unlifted content.

Yep, that's a fair point.

Cheers
Simon

> Being able to be parametric over that choice would permit a number of
> things in user land to do the same thing with an open-ended set of
> design possibilities that are rather hard to contemplate in advance.
> e.g. being able to abstract over them could let you just use a normal
> (,) to carry around unlifted parametric data types or being able to talk
> about [MVar# s a] drastically reducing the number of one off data types
> we need to invent.
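The "one off data types" Edward mentions are the boxed wrappers each unlifted type needs today before it can sit inside ordinary lifted data; a sketch of the pattern (the `MVarBox` name is mine):

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (MVar#, RealWorld)

-- Without levity polymorphism, MVar# RealWorld a (kind #) cannot be a
-- list element, so we box it first -- this is essentially how the
-- lifted MVar in GHC.MVar is defined.
data MVarBox a = MVarBox (MVar# RealWorld a)

countVars :: [MVarBox Int] -> Int
countVars = length
```

With parametricity over the "gcptr" subset of `#`, `[MVar# RealWorld a]` itself would kind-check and the wrapper (one indirection per element) goes away.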
> If you can talk about the machinery mentioned above then you can have
> typeclasses parameterized on an argument that could be either unlifted
> or lifted.
>
> I'm not willing to fight too hard for it, but it feels more like the
> "right" solution than retaining a cut-and-paste copy of the same code
> and bifurcating further on each argument you want to consider such a
> degree of freedom.
>
> As such it seems like a pretty big win for a comparatively minor change
> to the levity polymorphism machinery.
>
> -Edward
>
> On Tue, Sep 8, 2015 at 3:40 AM, Simon Marlow wrote:
>
> This would be very cool, however it's questionable whether it's
> worth it.
>
> Without any unlifted kind, we need
>   - ArrayArray#
>   - a set of new/read/write primops for every element type,
>     either built-in or made from unsafeCoerce#
>
> With the unlifted kind, we would need
>   - ArrayArray#
>   - one set of new/read/write primops
>
> With levity polymorphism, we would need
>   - none of this, Array# can be used
>
> So having an unlifted kind already kills a lot of the duplication,
> polymorphism only kills a bit more.
>
> Cheers
> Simon
>
> On 08/09/2015 00:14, Edward Kmett wrote:
>
> Assume we had the ability to talk about Levity in a new way and
> instead of just:
>
> data Levity = Lifted | Unlifted
>
> type * = TYPE 'Lifted
> type # = TYPE 'Unlifted
>
> we instead had a more nuanced notion of TYPE parameterized on another
> data type:
>
> data Levity = Lifted | Unlifted
> data Param = Composite | Simple Levity
>
> and we parameterized TYPE with a Param rather than Levity.
>
> Existing strange representations can continue to live in TYPE
> 'Composite
>
> (# Int#, Double #) :: TYPE 'Composite
>
> and we don't support parametricity in there, just like, currently we
> don't allow parametricity in #.
>
> We can include the undefined example from Richard's talk:
>
> undefined :: forall (v :: Param). v
>
> and ultimately lift it into his pi type when it is available just
> as before.
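For comparison, the factoring GHC eventually shipped is close in shape to the Param sketch above; shown here as a reference point, not as part of the 2015 proposal (kinds as exported from GHC.Exts in GHC 9.2+):

```haskell
-- How GHC eventually factored TYPE, with RuntimeRep playing roughly
-- the role of Param, and BoxedRep carving out the "gcptr" subset:
--
--   TYPE :: RuntimeRep -> Type
--
--   data RuntimeRep = BoxedRep Levity        -- GC-managed pointers
--                   | IntRep | DoubleRep     -- unboxed values
--                   | TupleRep [RuntimeRep]  -- the "Composite" cases
--                   | SumRep [RuntimeRep]
--                   -- ... and other unboxed representations
--
--   data Levity = Lifted | Unlifted
--
--   type LiftedRep   = 'BoxedRep 'Lifted    -- this email's GC 'Lifted
--   type UnliftedRep = 'BoxedRep 'Unlifted  -- this email's GC 'Unlifted
```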
> But we could consider TYPE ('Simple 'Unlifted) as a form of
> 'parametric #' covering unlifted things we're willing to allow
> polymorphism over because they are just pointers to something in the
> heap, that just happens to not be able to be _|_ or a thunk.
>
> In this setting, recalling that above, I modified Richard's TYPE to
> take a Param instead of Levity, we can define a type alias for things
> that live as a simple pointer to a heap allocated object:
>
> type GC (l :: Levity) = TYPE ('Simple l)
> type * = GC 'Lifted
>
> and then we can look at existing primitives generalized:
>
> Array# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted
> MutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted
> SmallArray# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted
> SmallMutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted
> MutVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted
> MVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted
>
> Weak#, StablePtr#, StableName#, etc. all can take similar
> modifications.
>
> Recall that an ArrayArray# was just an Array# hacked up to be able to
> hold onto the subset of # that is collectable.
>
> Almost all of the operations on these data types can work on the more
> general kind of argument.
>
> newArray# :: forall (s :: *) (l :: Levity) (a :: GC l). Int# -> a ->
> State# s -> (# State# s, MutableArray# s a #)
>
> writeArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray#
> s a -> Int# -> a -> State# s -> State# s
>
> readArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# s
> a -> Int# -> State# s -> (# State# s, a #)
>
> etc.
>
> Only a couple of our existing primitives _can't_ generalize this way.
> The one that leaps to mind is atomicModifyMutVar, which would need to
> stay constrained to only work on arguments in *, because of the way it
> operates.
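The reason `atomicModifyMutVar` is the odd one out can be seen at the lifted level: it wins atomicity by installing an unevaluated thunk in the cell, and thunks exist only at lifted types. A sketch using the `IORef` wrapper around `MutVar#`:

```haskell
import Data.IORef

-- atomicModifyIORef (built on atomicModifyMutVar#) CASes a *thunk*
-- for the new value into the cell instead of evaluating it eagerly:
bump :: IORef Int -> IO Int
bump ref = atomicModifyIORef ref (\old -> (old + 1, old))
-- After bump, the cell holds the suspension `old + 1` until forced.
-- An unlifted element type has no representation for that suspension,
-- so the primop must stay restricted to arguments in *.
```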
> With that we can still talk about
>
> MutableArray# s Int
>
> but now we can also talk about:
>
> MutableArray# s (MutableArray# s Int)
>
> without the layer of indirection through a box in * and without an
> explosion of primops. The same newFoo, readFoo, writeFoo machinery
> works for both kinds.
>
> The struct machinery doesn't get to take advantage of this, but it
> would let us clean house elsewhere in Prim and drastically improve the
> range of applicability of the existing primitives with nothing more
> than a small change to the levity machinery.
>
> I'm not attached to any of the names above, I coined them just to give
> us a concrete thing to talk about.
>
> Here I'm only proposing we extend machinery in GHC.Prim this way, but
> an interesting 'now that the barn door is open' question is to consider
> that our existing Haskell data types often admit a similar form of
> parametricity and nothing in principle prevents this from working for
> Maybe or [] and once you permit inference to fire across all of GC l
> then it seems to me that you'd start to get those same capabilities
> there as well when LevityPolymorphism was turned on.
>
> -Edward
>
> On Mon, Sep 7, 2015 at 5:56 PM, Simon Peyton Jones wrote:
>
> This could make the menagerie of ways to pack
> {Small}{Mutable}Array{Array}# references into a
> {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing
> the need for folks to descend into the use of the more evil
> structure primitives we're talking about, and letting us keep a few
> more principles around us.
>
> I'm lost.
> Can you give some concrete examples that illustrate how levity
> polymorphism will help us?
>
> Simon
>
> *From:* Edward Kmett [mailto:ekmett at gmail.com]
> *Sent:* 07 September 2015 21:17
> *To:* Simon Peyton Jones
> *Cc:* Ryan Newton; Johan Tibell; Simon Marlow; Manuel M T Chakravarty;
> Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates
> *Subject:* Re: ArrayArrays
>
> I had a brief discussion with Richard during the Haskell Symposium
> about how we might be able to let parametricity help a bit in reducing
> the space of necessary primops to a slightly more manageable level.
>
> Notably, it'd be interesting to explore the ability to allow
> parametricity over the portion of # that is just a gcptr.
>
> We could do this if the levity polymorphism machinery was tweaked a
> bit. You could envision the ability to abstract over things in both *
> and the subset of # that are represented by a gcptr, then modifying
> the existing array primitives to be parametric in that choice of
> levity for their argument so long as it was of a "heap object" levity.
>
> This could make the menagerie of ways to pack
> {Small}{Mutable}Array{Array}# references into a
> {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing
> the need for folks to descend into the use of the more evil structure
> primitives we're talking about, and letting us keep a few more
> principles around us.
>
> Then in the cases like `atomicModifyMutVar#` where it needs to
> actually be in * rather than just a gcptr, due to the constructed
> field selectors it introduces on the heap then we could keep the
> existing less polymorphic type.
>
> -Edward
>
> On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones wrote:
>
> It was fun to meet and discuss this.
>
> Did someone volunteer to write a wiki page that describes the
> proposed
> design? And, I earnestly hope, also describes the menagerie of
> currently available array types and primops so that users can have
> some chance of picking the right one?!
>
> Thanks
>
> Simon
>
> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf
> Of *Ryan Newton
> *Sent:* 31 August 2015 23:11
> *To:* Edward Kmett; Johan Tibell
> *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs;
> Ryan Scott; Ryan Yates
> *Subject:* Re: ArrayArrays
>
> Dear Edward, Ryan Yates, and other interested parties --
>
> So when should we meet up about this?
>
> May I propose the Tues afternoon break for everyone at ICFP who is
> interested in this topic? We can meet out in the coffee area and
> congregate around Edward Kmett, who is tall and should be easy to
> find ;-).
>
> I think Ryan is going to show us how to use his new primops for
> combined array + other fields in one heap object?
>
> On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett wrote:
>
> Without a custom primitive it doesn't help much there, you have to
> store the indirection to the mask.
>
> With a custom primitive it should cut the on heap root-to-leaf path
> of everything in the HAMT in half. A shorter HashMap was actually one
> of the motivating factors for me doing this. It is rather
> astoundingly difficult to beat the performance of HashMap, so I had
> to start cheating pretty badly. ;)
>
> -Edward
>
> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell wrote:
>
> I'd also be interested to chat at ICFP to see if I can use this for
> my HAMT implementation.
>
> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett wrote:
>
> Sounds good to me.
> Right now I'm just hacking up composable accessors for "typed slots"
> in a fairly lens-like fashion, and treating the set of slots I define
> and the 'new' function I build for the data type as its API, and
> build atop that. This could eventually graduate to template-haskell,
> but I'm not entirely satisfied with the solution I have. I currently
> distinguish between what I'm calling "slots" (things that point
> directly to another SmallMutableArrayArray# sans wrapper) and
> "fields" which point directly to the usual Haskell data types because
> unifying the two notions meant that I couldn't lift some coercions
> out "far enough" to make them vanish.
>
> I'll be happy to run through my current working set of issues in
> person and -- as things get nailed down further -- in a longer lived
> medium than in personal conversations. ;)
>
> -Edward
>
> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton wrote:
>
> I'd also love to meet up at ICFP and discuss this. I think the array
> primops plus a TH layer that lets (ab)use them many times without too
> much marginal cost sounds great. And I'd like to learn how we could
> be either early users of, or help with, this infrastructure.
>
> CC'ing in Ryan Scot and Omer Agacan who may also be interested in
> dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D.
> student who is currently working on concurrent data structures in
> Haskell, but will not be at ICFP.
>
> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote:
>
> I completely agree. I would love to spend some time during ICFP and
> friends talking about what it could look like. My small array for STM
> changes for the RTS can be seen here [1].
> It is on a branch somewhere > between 7.8 and 7.10 and includes > irrelevant > STM bits and some > confusing naming choices (sorry), > but should > cover all the details > needed to implement it for a non-STM > context. The biggest surprise > for me was following small array > too closely > and having a word/byte > offset miss-match [2]. > > [1]: > https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut > [2]: > https://ghc.haskell.org/trac/ghc/ticket/10413 > > Ryan____ > > > On Fri, Aug 28, 2015 at 10:09 PM, > Edward > Kmett > >> wrote: > > I'd love to have that last 10%, > but its a > lot of work to get there and more > > importantly I don't know quite > what it > should look like. > > > > On the other hand, I do have a > pretty > good idea of how the primitives above > > could be banged out and tested > in a long > evening, well in time for 7.12. And > > as noted earlier, those remain > useful > even if a nicer typed version with an > > extra level of indirection to > the sizes > is built up after. > > > > The rest sounds like a good graduate > student project for someone who has > > graduate students lying around. > Maybe > somebody at Indiana University who has > > an interest in type theory and > parallelism can find us one. =) > > > > -Edward > > > > On Fri, Aug 28, 2015 at 8:48 PM, > Ryan > Yates > >> wrote: > >> > >> I think from my perspective, the > motivation for getting the type > >> checker involved is primarily > bringing > this to the level where users > >> could be expected to build these > structures. it is reasonable to > >> think that there are people who > want to > use STM (a context with > >> mutation already) to implement a > straight forward data structure that > >> avoids extra indirection > penalty. There > should be some places where > >> knowing that things are field > accesses > rather then array indexing > >> could be helpful, but I think > GHC is > good right now about handling > >> constant offsets. 
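The "field accesses rather than array indexing at constant offsets" idea can be sketched with the `primitive` package (assumed available; the `Node` type, accessor names, and the 0/1 slot layout are my invention for illustration):

```haskell
import Control.Monad.Primitive (RealWorld)
import Data.Primitive.SmallArray

-- Treat a two-slot SmallMutableArray as a struct {prev, next} and hide
-- the constant offsets behind named accessors, so no magic indices
-- leak into user code and GHC sees only constant-offset reads/writes.
newtype Node = Node (SmallMutableArray RealWorld Node)

prevIx, nextIx :: Int
prevIx = 0
nextIx = 1

readNext :: Node -> IO Node
readNext (Node a) = readSmallArray a nextIx

writeNext :: Node -> Node -> IO ()
writeNext (Node a) n = writeSmallArray a nextIx n
```

When the offsets are known constants like this, the bounds-check-free unsafe variants can be confined to these few wrappers, which is the containment Ryan describes.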
In my code I > don't do > any bounds checking as I know > >> I will only be accessing my > arrays with > constant indexes. I make > >> wrappers for each field access > and leave > all the unsafe stuff in > >> there. When things go wrong > though, the > compiler is no help. Maybe > >> template Haskell that generates the > appropriate wrappers is the right > >> direction to go. > >> There is another benefit for me > when > working with these as arrays in > >> that it is quite simple and direct > (given the hoops already jumped > >> through) to play with > alignment. I can > ensure two pointers are never > >> on the same cache-line by just > spacing > things out in the array. > >> > >> On Fri, Aug 28, 2015 at 7:33 > PM, Edward > Kmett > >> wrote: > >> > They just segfault at this > level. ;) > >> > > >> > Sent from my iPhone > >> > > >> > On Aug 28, 2015, at 7:25 PM, Ryan > Newton > >> wrote: > >> > > >> > You presumably also save a bounds > check on reads by hard-coding the > >> > sizes? > >> > > >> > On Fri, Aug 28, 2015 at 3:39 PM, > Edward Kmett > >> wrote: > >> >> > >> >> Also there are 4 different > "things" > here, basically depending on two > >> >> independent questions: > >> >> > >> >> a.) if you want to shove the > sizes > into the info table, and > >> >> b.) if you want cardmarking. > >> >> > >> >> Versions with/without > cardmarking for > different sizes can be done > >> >> pretty > >> >> easily, but as noted, the > infotable > variants are pretty invasive. > >> >> > >> >> -Edward > >> >> > >> >> On Fri, Aug 28, 2015 at 6:36 PM, > Edward Kmett > >> wrote: > >> >>> > >> >>> Well, on the plus side > you'd save 16 > bytes per object, which adds up > >> >>> if > >> >>> they were small enough and > there are > enough of them. You get a bit > >> >>> better > >> >>> locality of reference in > terms of > what fits in the first cache line of > >> >>> them. 
> >> >>> > >> >>> -Edward > >> >>> > >> >>> On Fri, Aug 28, 2015 at > 6:14 PM, > Ryan Newton > >> > >> >>> wrote: > >> >>>> > >> >>>> Yes. And for the short > term I can > imagine places we will settle with > >> >>>> arrays even if it means > tracking > lengths unnecessarily and > >> >>>> unsafeCoercing > >> >>>> pointers whose types don't > actually > match their siblings. > >> >>>> > >> >>>> Is there anything to > recommend the > hacks mentioned for fixed sized > >> >>>> array > >> >>>> objects *other* than using > them to > fake structs? (Much to > >> >>>> derecommend, as > >> >>>> you mentioned!) > >> >>>> > >> >>>> On Fri, Aug 28, 2015 at > 3:07 PM > Edward Kmett > >> > >> >>>> wrote: > >> >>>>> > >> >>>>> I think both are useful, > but the > one you suggest requires a lot more > >> >>>>> plumbing and doesn't > subsume all > of the usecases of the other. > >> >>>>> > >> >>>>> -Edward > >> >>>>> > >> >>>>> On Fri, Aug 28, 2015 at > 5:51 PM, > Ryan Newton > >> > >> >>>>> wrote: > >> >>>>>> > >> >>>>>> So that primitive is an > array > like thing (Same pointed type, > >> >>>>>> unbounded > >> >>>>>> length) with extra payload. > >> >>>>>> > >> >>>>>> I can see how we can do > without > structs if we have arrays, > >> >>>>>> especially > >> >>>>>> with the extra payload > at front. > But wouldn't the general solution > >> >>>>>> for > >> >>>>>> structs be one that that > allows > new user data type defs for # > >> >>>>>> types? > >> >>>>>> > >> >>>>>> > >> >>>>>> > >> >>>>>> On Fri, Aug 28, 2015 at > 4:43 PM > Edward Kmett > >> > >> >>>>>> wrote: > >> >>>>>>> > >> >>>>>>> Some form of > MutableStruct# with > a known number of words and a > >> >>>>>>> known > >> >>>>>>> number of pointers is > basically > what Ryan Yates was suggesting > >> >>>>>>> above, but > >> >>>>>>> where the word counts were > stored in the objects themselves. 
> >> >>>>>>> > >> >>>>>>> Given that it'd have a > couple of > words for those counts it'd > >> >>>>>>> likely > >> >>>>>>> want to be something we > build in > addition to MutVar# rather than a > >> >>>>>>> replacement. > >> >>>>>>> > >> >>>>>>> On the other hand, if > we had to > fix those numbers and build info > >> >>>>>>> tables that knew them, and > typechecker support, for instance, it'd > >> >>>>>>> get > >> >>>>>>> rather invasive. > >> >>>>>>> > >> >>>>>>> Also, a number of > things that we > can do with the 'sized' versions > >> >>>>>>> above, like working > with evil > unsized c-style arrays directly > >> >>>>>>> inline at the > >> >>>>>>> end of the structure > cease to be > possible, so it isn't even a pure > >> >>>>>>> win if we > >> >>>>>>> did the engineering effort. > >> >>>>>>> > >> >>>>>>> I think 90% of the > needs I have > are covered just by adding the one > >> >>>>>>> primitive. The last 10% > gets > pretty invasive. > >> >>>>>>> > >> >>>>>>> -Edward > >> >>>>>>> > >> >>>>>>> On Fri, Aug 28, 2015 at > 5:30 PM, > Ryan Newton > >> > >> >>>>>>> wrote: > >> >>>>>>>> > >> >>>>>>>> I like the possibility > of a > general solution for mutable structs > >> >>>>>>>> (like Ed said), and > I'm trying > to fully understand why it's hard. > >> >>>>>>>> > >> >>>>>>>> So, we can't unpack > MutVar into > constructors because of object > >> >>>>>>>> identity problems. But > what > about directly supporting an > >> >>>>>>>> extensible set of > >> >>>>>>>> unlifted MutStruct# > objects, > generalizing (and even replacing) > >> >>>>>>>> MutVar#? That > >> >>>>>>>> may be too much work, > but is it > problematic otherwise? > >> >>>>>>>> > >> >>>>>>>> Needless to say, this > is also > critical if we ever want best in > >> >>>>>>>> class > >> >>>>>>>> lockfree mutable > structures, > just like their Stm and sequential > >> >>>>>>>> counterparts. 
> >> >>>>>>>> > >> >>>>>>>> On Fri, Aug 28, 2015 > at 4:43 AM > Simon Peyton Jones > >> >>>>>>>> > >> wrote: > >> >>>>>>>>> > >> >>>>>>>>> At the very least > I'll take > this email and turn it into a short > >> >>>>>>>>> article. > >> >>>>>>>>> > >> >>>>>>>>> Yes, please do make > it into a > wiki page on the GHC Trac, and > >> >>>>>>>>> maybe > >> >>>>>>>>> make a ticket for it. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Thanks > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Simon > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> From: Edward Kmett > [mailto:ekmett at gmail.com > > >] > >> >>>>>>>>> Sent: 27 August 2015 > 16:54 > >> >>>>>>>>> To: Simon Peyton Jones > >> >>>>>>>>> Cc: Manuel M T > Chakravarty; > Simon Marlow; ghc-devs > >> >>>>>>>>> Subject: Re: ArrayArrays > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> An ArrayArray# is just an > Array# with a modified invariant. It > >> >>>>>>>>> points directly to other > unlifted ArrayArray#'s or ByteArray#'s. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> While those live in > #, they > are garbage collected objects, so > >> >>>>>>>>> this > >> >>>>>>>>> all lives on the heap. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> They were added to > make some > of the DPH stuff fast when it has > >> >>>>>>>>> to > >> >>>>>>>>> deal with nested arrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm currently abusing > them as > a placeholder for a better thing. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The Problem > >> >>>>>>>>> > >> >>>>>>>>> ----------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Consider the scenario > where > you write a classic doubly-linked > >> >>>>>>>>> list > >> >>>>>>>>> in Haskell. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL (IORef > (Maybe > DLL) (IORef (Maybe DLL) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Chasing from one DLL > to the > next requires following 3 pointers > >> >>>>>>>>> on > >> >>>>>>>>> the heap. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> DLL ~> IORef (Maybe > DLL) ~> > MutVar# RealWorld (Maybe DLL) ~> > >> >>>>>>>>> Maybe > >> >>>>>>>>> DLL ~> DLL > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> That is 3 levels of > indirection. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We can trim one by simply > unpacking the IORef with > >> >>>>>>>>> -funbox-strict-fields > or UNPACK > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We can trim another > by adding > a 'Nil' constructor for DLL and > >> >>>>>>>>> worsening our > representation. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL > !(IORef DLL) > !(IORef DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> but now we're still > stuck with > a level of indirection > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> DLL ~> MutVar# > RealWorld DLL > ~> DLL > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> This means that every > operation we perform on this structure > >> >>>>>>>>> will > >> >>>>>>>>> be about half of the > speed of > an implementation in most other > >> >>>>>>>>> languages > >> >>>>>>>>> assuming we're memory > bound on > loading things into cache! 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Making Progress > >> >>>>>>>>> > >> >>>>>>>>> ---------------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I have been working on a > number of data structures where the > >> >>>>>>>>> indirection of going from > something in * out to an object in # > >> >>>>>>>>> which > >> >>>>>>>>> contains the real > pointer to > my target and coming back > >> >>>>>>>>> effectively doubles > >> >>>>>>>>> my runtime. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We go out to the MutVar# > because we are allowed to put the > >> >>>>>>>>> MutVar# > >> >>>>>>>>> onto the mutable list > when we > dirty it. There is a well defined > >> >>>>>>>>> write-barrier. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I could change out the > representation to use > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL > (MutableArray# > RealWorld DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I can just store two > pointers > in the MutableArray# every time, > >> >>>>>>>>> but > >> >>>>>>>>> this doesn't help _much_ > directly. It has reduced the amount of > >> >>>>>>>>> distinct > >> >>>>>>>>> addresses in memory I > touch on > a walk of the DLL from 3 per > >> >>>>>>>>> object to 2. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I still have to go > out to the > heap from my DLL and get to the > >> >>>>>>>>> array > >> >>>>>>>>> object and then chase > it to > the next DLL and chase that to the > >> >>>>>>>>> next array. I > >> >>>>>>>>> do get my two pointers > together in memory though. 
I'm paying for a card marking table as well, which I don't particularly need with just two pointers, but we can shed that with the "SmallMutableArray#" machinery added back in 7.10, which is just the old array code as a new data type, which can speed things up a bit when you don't have very big arrays: > >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil > >> >>>>>>>>> But what if I wanted my object itself to live in # and have two mutable fields and be able to share the same write barrier? > >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types. What if we have one # -> * wrapper on the outside to deal with the impedance mismatch between the imperative world and Haskell, and then just let the ArrayArray#'s hold other arrayarrays. > >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) > >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a special MutableArrayArray# I allocate on program startup. I can even abuse pattern synonyms. Alternately I can exploit the internals further to make this cheaper. > >> >>>>>>>>> Then I can use the readMutableArrayArray# and writeMutableArrayArray# calls to directly access the preceding and next entry in the linked list. 
> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a strict world, and everything there lives in #. > >> >>>>>>>>> next :: DLL -> IO DLL > >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# m 1# s of > >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) > >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to keep things unboxed. The 'DLL' wrappers get removed pretty easily when they are known strict and you chain From kane at kane.cx Tue Sep 8 10:12:33 2015 From: kane at kane.cx (David Kraeutmann) Date: Tue, 8 Sep 2015 12:12:33 +0200 Subject: AnonymousSums data con syntax In-Reply-To: References: <9eb2c9041f6142ce947a4b323c0b2bff@DB4PR30MB030.064d.mgd.msft.net> <1441657274.28403.7.camel@joachim-breitner.de> Message-ID: For what it's worth, I feel like (|True|||) looks better than (2/5|True) or (2 of 5|True). Not sure if the confusion w/r/t (x||) as an or-section or a 3-ary anonymous sum is worth it, though. On Tue, Sep 8, 2015 at 10:28 AM, Simon Peyton Jones wrote: > I can see the force of this discussion about data type constructors for > sums, but > > • We already do this for tuples: (,,,,) is a type constructor and > you have to count commas. We could use a number here but we don't. > > • Likewise tuple sections. (,,e,) means (\x y z. (x,y,e,z)) > > I do not expect big sums in practice. > > That said, (2/5| True) instead of (|True|||) would be ok I suppose. Or > something like that. 
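For readers following the syntax debate: whichever concrete notation wins, an anonymous boxed sum of arity n behaves like an ordinary parameterised data type with n constructors. A rough model in plain Haskell (the names Sum3 and In3_* are invented for this sketch; the real feature uses built-in syntax rather than named types):

```haskell
-- Hypothetical stand-in for a 3-ary anonymous boxed sum.
data Sum3 a b c = In3_1 a | In3_2 b | In3_3 c

-- The "counting bars" objection in concrete form: with the bar syntax,
-- (|True|) vs (|True||) encodes both the alternative *and* the arity,
-- so adding a fourth alternative changes the spelling of every existing
-- constructor; with named constructors it would not.
classify :: Sum3 String Int Bool -> String
classify (In3_1 s) = s
classify (In3_2 n) = "int: " ++ show n
classify (In3_3 b) = "bool: " ++ show b
```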
> > Simon > > From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Lennart Kolmodin > Sent: 08 September 2015 07:12 > To: Joachim Breitner > Cc: ghc-devs at haskell.org > Subject: Re: AnonymousSums data con syntax > > 2015-09-07 21:21 GMT+01:00 Joachim Breitner : > > Hi, > > Am Montag, den 07.09.2015, 19:25 +0000 schrieb Simon Peyton Jones: >> > Are we okay with stealing some operator sections for this? E.G. (x ||). >> > I think the boxed sums larger than 2 choices are all technically >> > overlapping with sections. >> >> I hadn't thought of that. I suppose that in distfix notation we >> could require spaces >> (x | |) >> since vertical bar by itself isn't an operator. But then (_||) x >> might feel more compact. >> >> Also a section (x ||) isn't valid in a pattern, so we would not need >> to require spaces there. >> >> But my gut feel is: yes, with AnonymousSums we should just steal the >> syntax. It won't hurt existing code (since it won't use >> AnonymousSums), and if you *are* using AnonymousSums then the distfix >> notation is probably more valuable than the sections for an operator >> you probably aren't using. > > I wonder if this syntax for constructors is really that great. Yes, there > is similarity with the type constructor (which is nice), but for > the data constructor, do we really want a unary encoding and have our > users count bars? > > I believe the user (and also us, having to read core) would be better > served by some syntax that involves plain numbers. > > I reacted the same way to the proposed syntax. > > Imagine already having an anonymous sum type and then deciding to add > another constructor. Naturally you'd have to update your code to handle the > new constructor, but you also need to update the code for all other > constructors as well by adding another bar in the right place. That seems > unnecessary and there's no need to do that for named sum types. 
> > What about explicitly stating the index as a number? > > (1 | Int) :: ( String | Int | Bool ) > > (#1 | Int #) :: (# String | Int | Bool #) > > case sum of > > (0 | myString ) -> ... > > (1 | myInt ) -> ... > > (2 | myBool ) -> ... > > This allows you to at least add new constructors at the end without changing > existing code. > > Is it harder to resolve by type inference since we're not stating the number > of constructors? If so we could do something similar to Joachim's proposal; > > case sum of > > (0 of 3 | myString ) -> ... > > (1 of 3 | myInt ) -> ... > > (2 of 3 | myBool ) -> ... > > .. and at least you don't have to count bars. > > > Given that of is already a keyword, how about something involving "3 > of 4"? For example > > (Put# True in 3 of 5) :: (# a | b | Bool | d | e #) > > and > > case sum of > (Put# x in 1 of 3) -> ... > (Put# x in 2 of 3) -> ... > (Put# x in 3 of 3) -> ... > > (If "as" were a keyword, (Put# x as 2 of 3) would sound even better.) > > I don't find this particular choice very great, but something with > numbers rather than ASCII art seems to make more sense here. Is there > something even better? > > Greetings, > Joachim > > -- > Joachim “nomeata” Breitner > mail at joachim-breitner.de • http://www.joachim-breitner.de/ > Jabber: nomeata at joachim-breitner.de • 
GPG-Key: 0xF0FBF51F > Debian Developer: nomeata at debian.org > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From simonpj at microsoft.com Tue Sep 8 11:48:21 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Tue, 8 Sep 2015 11:48:21 +0000 Subject: Unpacking sum types In-Reply-To: <55EEA24C.9080504@gmail.com> References: <55EE93DC.7050409@gmail.com> <55EEA24C.9080504@gmail.com> Message-ID: | I see, but then you can't have multiple fields, like | | ( (# Int,Bool #) |) | | You'd have to box the inner tuple too. Ok, I suppose. Well of course! It's just a parameterised data type, like a tuple. But, just like unboxed tuples, you could have an unboxed tuple (or sum) inside an unboxed tuple. (# (# Int,Bool #) | Int #) Simon | -----Original Message----- | From: Simon Marlow [mailto:marlowsd at gmail.com] | Sent: 08 September 2015 09:55 | To: Simon Peyton Jones; Johan Tibell; Ryan Newton | Cc: ghc-devs at haskell.org | Subject: Re: Unpacking sum types | | On 08/09/2015 09:31, Simon Peyton Jones wrote: | > | How did you envisage implementing anonymous boxed sums? What is | > | their heap representation? | > | > *Exactly* like tuples; that is, we have a family of data type | declarations: | > | > data (a|b) = (_|) a | > | (|_) b | > | > data (a|b|c) = (_||) a | > | (|_|) b | > | (||_) c | > ..etc. | | I see, but then you can't have multiple fields, like | | ( (# Int,Bool #) |) | | You'd have to box the inner tuple too. Ok, I suppose. | | Cheers | Simon | | | > Simon | > | > | | > | One option is to use some kind of generic object with a dynamic | > | number of pointers and non-pointers, and one field for the tag. | > | The layout would need to be stored in the object. 
This isn't a | > | particularly efficient representation, though. Perhaps there | could | > | be a family of smaller specialised versions for common sizes. | > | | > | Do we have a use case for the boxed version, or is it just for | > | consistency? | > | | > | Cheers | > | Simon | > | | > | | > | > Looks good to me! | > | > | > | > Simon | > | > | > | > *From:*Johan Tibell [mailto:johan.tibell at gmail.com] > *Sent:* 01 | > | September 2015 18:24 > *To:* Simon Peyton Jones; Simon Marlow; Ryan | > | Newton > *Cc:* ghc-devs at haskell.org > *Subject:* RFC: Unpacking | > | sum types > > I have a draft design for unpacking sum types that | > | I'd like some > feedback on. In particular feedback both on: | > | > | > | > * the writing and clarity of the proposal and | > | > | > | > * the proposal itself. | > | > | > | > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes | > | > | > | > -- Johan | > | > | > From simonpj at microsoft.com Tue Sep 8 11:50:27 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Tue, 8 Sep 2015 11:50:27 +0000 Subject: ArrayArrays In-Reply-To: References: <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> <55EE90ED.1040609@gmail.com> Message-ID: <1634726badf84376a7a583283a76ac4e@DB4PR30MB030.064d.mgd.msft.net> I'm not willing to fight too hard for it, but it feels more like the "right" solution than retaining a cut-and-paste copy of the same code and bifurcating further on each argument you want to consider such a degree of freedom. Like I say, I'm not against allowing polymorphism over unlifted-but-boxed types, and I can see the advantages. But it's a separate proposal in its own right. 
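Simon's nesting example can be written out in full. This sketch uses the concrete syntax that the UnboxedSums extension eventually settled on; at the time of this thread the extension did not exist yet, so treat it as illustrative rather than normative:

```haskell
{-# LANGUAGE UnboxedSums, UnboxedTuples, MagicHash #-}

-- An unboxed tuple nested inside an unboxed sum, as in
-- (# (# Int,Bool #) | Int #): the inner tuple is not boxed; its
-- fields are flattened into the sum's representation.
type T = (# (# Int, Bool #) | Int #)

pick :: T -> Int
pick (# (# n, b #) | #) = if b then n else negate n
pick (# | m #)          = m
```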
Simon From: Edward Kmett [mailto:ekmett at gmail.com] Sent: 08 September 2015 09:30 To: Simon Marlow Cc: Simon Peyton Jones; Ryan Newton; Johan Tibell; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates Subject: Re: ArrayArrays Once you start to include all the other primitive types there is a bit more of an explosion. MVar#, TVar#, MutVar#, Small variants, etc. can all be modified to carry unlifted content. Being able to be parametric over that choice would permit a number of things in user land to do the same thing with an open-ended set of design possibilities that are rather hard to contemplate in advance. e.g. being able to abstract over them could let you just use a normal (,) to carry around unlifted parametric data types or being able to talk about [MVar# s a] drastically reducing the number of one off data types we need to invent. If you can talk about the machinery mentioned above then you can have typeclasses parameterized on an argument that could be either unlifted or lifted. I'm not willing to fight too hard for it, but it feels more like the "right" solution than retaining a cut-and-paste copy of the same code and bifurcating further on each argument you want to consider such a degree of freedom. As such it seems like a pretty big win for a comparatively minor change to the levity polymorphism machinery. -Edward On Tue, Sep 8, 2015 at 3:40 AM, Simon Marlow > wrote: This would be very cool, however it's questionable whether it's worth it. Without any unlifted kind, we need - ArrayArray# - a set of new/read/write primops for every element type, either built-in or made from unsafeCoerce# With the unlifted kind, we would need - ArrayArray# - one set of new/read/write primops With levity polymorphism, we would need - none of this, Array# can be used So having an unlifted kind already kills a lot of the duplication, polymorphism only kills a bit more. 
Cheers Simon On 08/09/2015 00:14, Edward Kmett wrote: Assume we had the ability to talk about Levity in a new way and instead of just: data Levity = Lifted | Unlifted type * = TYPE 'Lifted type # = TYPE 'Unlifted we replace had a more nuanced notion of TYPE parameterized on another data type: data Levity = Lifted | Unlifted data Param = Composite | Simple Levity and we parameterized TYPE with a Param rather than Levity. Existing strange representations can continue to live in TYPE 'Composite (# Int# , Double #) :: TYPE 'Composite and we don't support parametricity in there, just like, currently we don't allow parametricity in #. We can include the undefined example from Richard's talk: undefined :: forall (v :: Param). v and ultimately lift it into his pi type when it is available just as before. But we could let consider TYPE ('Simple 'Unlifted) as a form of 'parametric #' covering unlifted things we're willing to allow polymorphism over because they are just pointers to something in the heap, that just happens to not be able to be _|_ or a thunk. In this setting, recalling that above, I modified Richard's TYPE to take a Param instead of Levity, we can define a type alias for things that live as a simple pointer to a heap allocated object: type GC (l :: Levity) = TYPE ('Simple l) type * = GC 'Lifted and then we can look at existing primitives generalized: Array# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted MutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted SmallArray# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted SmallMutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted MutVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted MVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted Weak#, StablePtr#, StableName#, etc. all can take similar modifications. Recall that an ArrayArray# was just an Array# hacked up to be able to hold onto the subset of # that is collectable. 
Almost all of the operations on these data types can work on the more general kind of argument. newArray# :: forall (s :: *) (l :: Levity) (a :: GC l). Int# -> a -> State# s -> (# State# s, MutableArray# s a #) writeArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# s a -> Int# -> a -> State# s -> State# s readArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# s a -> Int# -> State# s -> (# State# s, a #) etc. Only a couple of our existing primitives _can't_ generalize this way. The one that leaps to mind is atomicModifyMutVar, which would need to stay constrained to only work on arguments in *, because of the way it operates. With that we can still talk about MutableArray# s Int but now we can also talk about: MutableArray# s (MutableArray# s Int) without the layer of indirection through a box in * and without an explosion of primops. The same newFoo, readFoo, writeFoo machinery works for both kinds. The struct machinery doesn't get to take advantage of this, but it would let us clean house elsewhere in Prim and drastically improve the range of applicability of the existing primitives with nothing more than a small change to the levity machinery. I'm not attached to any of the names above, I coined them just to give us a concrete thing to talk about. Here I'm only proposing we extend machinery in GHC.Prim this way, but an interesting 'now that the barn door is open' question is to consider that our existing Haskell data types often admit a similar form of parametricity and nothing in principle prevents this from working for Maybe or [] and once you permit inference to fire across all of GC l then it seems to me that you'd start to get those same capabilities there as well when LevityPolymorphism was turned on. 
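To make the payoff concrete, here is what user code could look like under the generalised signatures sketched above. This does not compile on any released GHC; newArray# is given the proposed levity-polymorphic type in which the element may itself be unlifted:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

-- Sketch against the *proposed* primop types only: the outer
-- MutableArray# stores the inner MutableArray# directly, with no box
-- in * and no ArrayArray# special case.
nested :: State# s -> (# State# s, MutableArray# s (MutableArray# s Int) #)
nested s0 =
  case newArray# 8# 0 s0 of      -- inner: 8 lifted Ints
    (# s1, inner #) ->
      newArray# 4# inner s1      -- outer: 4 slots holding the unlifted inner array
```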
-Edward On Mon, Sep 7, 2015 at 5:56 PM, Simon Peyton Jones >> wrote: This could make the menagerie of ways to pack {Small}{Mutable}Array{Array}# references into a {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the need for folks to descend into the use of the more evil structure primitives we're talking about, and letting us keep a few more principles around us.____ __ __ I?m lost. Can you give some concrete examples that illustrate how levity polymorphism will help us?____ Simon____ __ __ *From:*Edward Kmett [mailto:ekmett at gmail.com >] *Sent:* 07 September 2015 21:17 *To:* Simon Peyton Jones *Cc:* Ryan Newton; Johan Tibell; Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates *Subject:* Re: ArrayArrays____ __ __ I had a brief discussion with Richard during the Haskell Symposium about how we might be able to let parametricity help a bit in reducing the space of necessarily primops to a slightly more manageable level. ____ __ __ Notably, it'd be interesting to explore the ability to allow parametricity over the portion of # that is just a gcptr.____ __ __ We could do this if the levity polymorphism machinery was tweaked a bit. 
You could envision the ability to abstract over things in both * and the subset of # that are represented by a gcptr, then modifying the existing array primitives to be parametric in that choice of levity for their argument so long as it was of a "heap object" levity.____ __ __ This could make the menagerie of ways to pack {Small}{Mutable}Array{Array}# references into a {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the need for folks to descend into the use of the more evil structure primitives we're talking about, and letting us keep a few more principles around us.____ __ __ Then in the cases like `atomicModifyMutVar#` where it needs to actually be in * rather than just a gcptr, due to the constructed field selectors it introduces on the heap then we could keep the existing less polymorphic type.____ __ __ -Edward____ __ __ On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones >> wrote:____ It was fun to meet and discuss this.____ ____ Did someone volunteer to write a wiki page that describes the proposed design? And, I earnestly hope, also describes the menagerie of currently available array types and primops so that users can have some chance of picking the right one?!____ ____ Thanks____ ____ Simon____ ____ *From:*ghc-devs [mailto:ghc-devs-bounces at haskell.org >] *On Behalf Of *Ryan Newton *Sent:* 31 August 2015 23:11 *To:* Edward Kmett; Johan Tibell *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates *Subject:* Re: ArrayArrays____ ____ Dear Edward, Ryan Yates, and other interested parties -- ____ ____ So when should we meet up about this?____ ____ May I propose the Tues afternoon break for everyone at ICFP who is interested in this topic? 
We can meet out in the coffee area and congregate around Edward Kmett, who is tall and should be easy to find ;-).____ ____ I think Ryan is going to show us how to use his new primops for combined array + other fields in one heap object?____ ____ On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett >> wrote:____ Without a custom primitive it doesn't help much there, you have to store the indirection to the mask.____ ____ With a custom primitive it should cut the on heap root-to-leaf path of everything in the HAMT in half. A shorter HashMap was actually one of the motivating factors for me doing this. It is rather astoundingly difficult to beat the performance of HashMap, so I had to start cheating pretty badly. ;)____ ____ -Edward____ ____ On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell >> wrote:____ I'd also be interested to chat at ICFP to see if I can use this for my HAMT implementation.____ ____ On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett >> wrote:____ Sounds good to me. Right now I'm just hacking up composable accessors for "typed slots" in a fairly lens-like fashion, and treating the set of slots I define and the 'new' function I build for the data type as its API, and build atop that. This could eventually graduate to template-haskell, but I'm not entirely satisfied with the solution I have. I currently distinguish between what I'm calling "slots" (things that point directly to another SmallMutableArrayArray# sans wrapper) and "fields" which point directly to the usual Haskell data types because unifying the two notions meant that I couldn't lift some coercions out "far enough" to make them vanish.____ ____ I'll be happy to run through my current working set of issues in person and -- as things get nailed down further -- in a longer lived medium than in personal conversations. ;)____ ____ -Edward____ ____ On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton >> wrote:____ I'd also love to meet up at ICFP and discuss this. 
I think the array primops plus a TH layer that lets (ab)use them many times without too much marginal cost sounds great. And I'd like to learn how we could be either early users of, or help with, this infrastructure.____ ____ CC'ing in Ryan Scot and Omer Agacan who may also be interested in dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student who is currently working on concurrent data structures in Haskell, but will not be at ICFP.____ ____ ____ On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates >> wrote:____ I completely agree. I would love to spend some time during ICFP and friends talking about what it could look like. My small array for STM changes for the RTS can be seen here [1]. It is on a branch somewhere between 7.8 and 7.10 and includes irrelevant STM bits and some confusing naming choices (sorry), but should cover all the details needed to implement it for a non-STM context. The biggest surprise for me was following small array too closely and having a word/byte offset miss-match [2]. [1]: https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 Ryan____ On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett >> wrote: > I'd love to have that last 10%, but its a lot of work to get there and more > importantly I don't know quite what it should look like. > > On the other hand, I do have a pretty good idea of how the primitives above > could be banged out and tested in a long evening, well in time for 7.12. And > as noted earlier, those remain useful even if a nicer typed version with an > extra level of indirection to the sizes is built up after. > > The rest sounds like a good graduate student project for someone who has > graduate students lying around. Maybe somebody at Indiana University who has > an interest in type theory and parallelism can find us one. 
=) > > -Edward > > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates >> wrote: >> >> I think from my perspective, the motivation for getting the type >> checker involved is primarily bringing this to the level where users >> could be expected to build these structures. it is reasonable to >> think that there are people who want to use STM (a context with >> mutation already) to implement a straight forward data structure that >> avoids extra indirection penalty. There should be some places where >> knowing that things are field accesses rather then array indexing >> could be helpful, but I think GHC is good right now about handling >> constant offsets. In my code I don't do any bounds checking as I know >> I will only be accessing my arrays with constant indexes. I make >> wrappers for each field access and leave all the unsafe stuff in >> there. When things go wrong though, the compiler is no help. Maybe >> template Haskell that generates the appropriate wrappers is the right >> direction to go. >> There is another benefit for me when working with these as arrays in >> that it is quite simple and direct (given the hoops already jumped >> through) to play with alignment. I can ensure two pointers are never >> on the same cache-line by just spacing things out in the array. >> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett >> wrote: >> > They just segfault at this level. ;) >> > >> > Sent from my iPhone >> > >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton >> wrote: >> > >> > You presumably also save a bounds check on reads by hard-coding the >> > sizes? >> > >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett >> wrote: >> >> >> >> Also there are 4 different "things" here, basically depending on two >> >> independent questions: >> >> >> >> a.) if you want to shove the sizes into the info table, and >> >> b.) if you want cardmarking. 
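Ryan's cache-line spacing trick is easy to quantify. Assuming 8-byte slots and 64-byte cache lines (typical, but machine-dependent):

```haskell
-- If the array payload is line-aligned, two slots can share a cache
-- line only when their indices fall within the same block of
-- (line size / slot size) slots.
slotsPerLine :: Int
slotsPerLine = 64 `div` 8

-- Placing the i-th frequently-written pointer at this index guarantees
-- no two of them ever share a line, at the cost of wasted slots.
hotSlot :: Int -> Int
hotSlot i = i * slotsPerLine
```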
>> >> >> >> Versions with/without cardmarking for different sizes can be done >> >> pretty >> >> easily, but as noted, the infotable variants are pretty invasive. >> >> >> >> -Edward >> >> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett >> wrote: >> >>> >> >>> Well, on the plus side you'd save 16 bytes per object, which adds up >> >>> if >> >>> they were small enough and there are enough of them. You get a bit >> >>> better >> >>> locality of reference in terms of what fits in the first cache line of >> >>> them. >> >>> >> >>> -Edward >> >>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton >> >> >>> wrote: >> >>>> >> >>>> Yes. And for the short term I can imagine places we will settle with >> >>>> arrays even if it means tracking lengths unnecessarily and >> >>>> unsafeCoercing >> >>>> pointers whose types don't actually match their siblings. >> >>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed sized >> >>>> array >> >>>> objects *other* than using them to fake structs? (Much to >> >>>> derecommend, as >> >>>> you mentioned!) >> >>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett >> >> >>>> wrote: >> >>>>> >> >>>>> I think both are useful, but the one you suggest requires a lot more >> >>>>> plumbing and doesn't subsume all of the usecases of the other. >> >>>>> >> >>>>> -Edward >> >>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton >> >> >>>>> wrote: >> >>>>>> >> >>>>>> So that primitive is an array like thing (Same pointed type, >> >>>>>> unbounded >> >>>>>> length) with extra payload. >> >>>>>> >> >>>>>> I can see how we can do without structs if we have arrays, >> >>>>>> especially >> >>>>>> with the extra payload at front. But wouldn't the general solution >> >>>>>> for >> >>>>>> structs be one that that allows new user data type defs for # >> >>>>>> types? 
>> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett >> >> >>>>>> wrote: >> >>>>>>> >> >>>>>>> Some form of MutableStruct# with a known number of words and a >> >>>>>>> known >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting >> >>>>>>> above, but >> >>>>>>> where the word counts were stored in the objects themselves. >> >>>>>>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd >> >>>>>>> likely >> >>>>>>> want to be something we build in addition to MutVar# rather than a >> >>>>>>> replacement. >> >>>>>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build info >> >>>>>>> tables that knew them, and typechecker support, for instance, it'd >> >>>>>>> get >> >>>>>>> rather invasive. >> >>>>>>> >> >>>>>>> Also, a number of things that we can do with the 'sized' versions >> >>>>>>> above, like working with evil unsized c-style arrays directly >> >>>>>>> inline at the >> >>>>>>> end of the structure cease to be possible, so it isn't even a pure >> >>>>>>> win if we >> >>>>>>> did the engineering effort. >> >>>>>>> >> >>>>>>> I think 90% of the needs I have are covered just by adding the one >> >>>>>>> primitive. The last 10% gets pretty invasive. >> >>>>>>> >> >>>>>>> -Edward >> >>>>>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton >> >> >>>>>>> wrote: >> >>>>>>>> >> >>>>>>>> I like the possibility of a general solution for mutable structs >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's hard. >> >>>>>>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of object >> >>>>>>>> identity problems. But what about directly supporting an >> >>>>>>>> extensible set of >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) >> >>>>>>>> MutVar#? That >> >>>>>>>> may be too much work, but is it problematic otherwise? 
>> >>>>>>>> >> >>>>>>>> Needless to say, this is also critical if we ever want best in >> >>>>>>>> class >> >>>>>>>> lockfree mutable structures, just like their Stm and sequential >> >>>>>>>> counterparts. >> >>>>>>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >> >>>>>>>> >> wrote: >> >>>>>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a short >> >>>>>>>>> article. >> >>>>>>>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and >> >>>>>>>>> maybe >> >>>>>>>>> make a ticket for it. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Thanks >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Simon >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com >] >> >>>>>>>>> Sent: 27 August 2015 16:54 >> >>>>>>>>> To: Simon Peyton Jones >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >> >>>>>>>>> Subject: Re: ArrayArrays >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> While those live in #, they are garbage collected objects, so >> >>>>>>>>> this >> >>>>>>>>> all lives on the heap. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when it has >> >>>>>>>>> to >> >>>>>>>>> deal with nested arrays. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better thing. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> The Problem >> >>>>>>>>> >> >>>>>>>>> ----------------- >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Consider the scenario where you write a classic doubly-linked >> >>>>>>>>> list >> >>>>>>>>> in Haskell. 
>> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) >> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers on the heap. >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~> DLL >> >>>>>>>>> That is 3 levels of indirection. >> >>>>>>>>> We can trim one by simply unpacking the IORef with -funbox-strict-fields or UNPACK >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and worsening our representation. >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >> >>>>>>>>> but now we're still stuck with a level of indirection >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >> >>>>>>>>> This means that every operation we perform on this structure will be about half of the speed of an implementation in most other languages assuming we're memory bound on loading things into cache! >> >>>>>>>>> Making Progress >> >>>>>>>>> ---------------------- >> >>>>>>>>> I have been working on a number of data structures where the indirection of going from something in * out to an object in # which contains the real pointer to my target and coming back effectively doubles my runtime. 
>> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the >> >>>>>>>>> MutVar# >> >>>>>>>>> onto the mutable list when we dirty it. There is a well-defined >> >>>>>>>>> write-barrier. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I could change out the representation to use >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I can just store two pointers in the MutableArray# every time, >> >>>>>>>>> but >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the amount of >> >>>>>>>>> distinct >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per >> >>>>>>>>> object to 2. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to the >> >>>>>>>>> array >> >>>>>>>>> object and then chase it to the next DLL and chase that to the >> >>>>>>>>> next array. I >> >>>>>>>>> do get my two pointers together in memory though. I'm paying for >> >>>>>>>>> a card >> >>>>>>>>> marking table as well, which I don't particularly need with just >> >>>>>>>>> two >> >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" >> >>>>>>>>> machinery added >> >>>>>>>>> back in 7.10, which is just the old array code as a new data >> >>>>>>>>> type, which can >> >>>>>>>>> speed things up a bit when you don't have very big arrays: >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> But what if I wanted my object itself to live in # and have two >> >>>>>>>>> mutable fields and be able to share the same write barrier? >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types. 
>> >>>>>>>>> What >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with the >> >>>>>>>>> impedance >> >>>>>>>>> mismatch between the imperative world and Haskell, and then just >> >>>>>>>>> let the >> >>>>>>>>> ArrayArray#'s hold other arrayarrays. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a >> >>>>>>>>> special >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even >> >>>>>>>>> abuse pattern >> >>>>>>>>> synonyms. Alternately I can exploit the internals further to >> >>>>>>>>> make this >> >>>>>>>>> cheaper. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and >> >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding >> >>>>>>>>> and next >> >>>>>>>>> entry in the linked list. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a >> >>>>>>>>> strict world, and everything there lives in #. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> next :: DLL -> IO DLL >> >>>>>>>>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# m 1# s of >> >>>>>>>>> >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty >> >>>>>>>>> easily when they >> >>>>>>>>> are known strict and you chain -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mail at andres-loeh.de Tue Sep 8 11:59:42 2015 From: mail at andres-loeh.de (Andres Loeh) Date: Tue, 8 Sep 2015 13:59:42 +0200 Subject: Proposal: Automatic derivation of Lift In-Reply-To: References: Message-ID: I don't think there's any fundamental reason why unboxed fields prevent a Generic instance, as long as we're happy that unboxed values will be re-boxed in the generic representation. It simply seems as if nobody has thought of implementing this. As an example, consider the following hand-written example which works just fine: {-# LANGUAGE MagicHash, KindSignatures, PolyKinds, TypeOperators, TypeFamilies #-} module GenUnboxed where import GHC.Exts import GHC.Generics import Generics.Deriving.Eq data UPair = UPair Int# Char# instance Generic UPair where type Rep UPair = K1 R Int :*: K1 R Char from (UPair x y) = K1 (I# x) :*: K1 (C# y) to (K1 (I# x) :*: K1 (C# y)) = UPair x y instance GEq UPair test :: Bool test = let p = UPair 3# 'x'# in geq p p Cheers, Andres On Mon, Sep 7, 2015 at 10:02 PM, Ryan Scott wrote: > Unlifted types can't be used polymorphically or in instance > declarations, so this makes it impossible to do something like > > instance Generic Int# > > or store an Int# in one branch of a (:*:), preventing generics from > doing anything in #-land. (unless someone has found a way to hack > around this). > > I would be okay with implementing a generics-based approach, but we'd > have to add a caveat that it will only work out-of-the-box on GHC 8.0 > or later, due to TH's need to look up package information. (We could > give users the ability to specify a package name manually as a > workaround.) > > If this were added, where would be the best place to put it? th-lift? > generic-deriving? template-haskell? A new package (lift-generics)? > > Ryan S. > > On Mon, Sep 7, 2015 at 3:10 PM, Matthew Pickering > wrote: >> Continuing my support of the generics route. Is there a fundamental >> reason why it couldn't handle unlifted types? 
Given their relative >> paucity, it seems like a fair compromise to generically define lift >> instances for all normal data types but require TH for unlifted types. >> This approach seems much smoother from a maintenance perspective. >> >> On Mon, Sep 7, 2015 at 5:26 PM, Ryan Scott wrote: >>> There is a Lift typeclass defined in template-haskell [1] which, when >>> a data type is an instance, permits it to be directly used in a TH >>> quotation, like so >>> >>> data Example = Example >>> >>> instance Lift Example where >>> lift Example = conE (mkNameG_d "" "" "Example") >>> >>> e :: Example >>> e = [| Example |] >>> >>> Making Lift instances for most data types is straightforward and >>> mechanical, so the proposal is to allow automatic derivation of Lift >>> via a -XDeriveLift extension: >>> >>> data Example = Example deriving Lift >>> >>> This is actually a pretty old proposal [2], dating back to >>> 2007. I wanted to have this feature for my needs, so I submitted a >>> proof-of-concept at the GHC Trac issue page [3]. >>> >>> The question now is: do we really want to bake this feature into GHC? >>> Since not many people opined on the Trac page, I wanted to submit this >>> here for wider visibility and to have a discussion. >>> >>> Here are some arguments I have heard against this feature (please tell >>> me if I am misrepresenting your opinion): >>> >>> * We already have a th-lift package [4] on Hackage which allows >>> derivation of Lift via Template Haskell functions. In addition, if >>> you're using Lift, chances are you're also using the -XTemplateHaskell >>> extension in the first place, so th-lift should be suitable. >>> * The same functionality could be added via GHC generics (as of GHC >>> 7.12/8.0, which adds the ability to reify a datatype's package name >>> [5]), if -XTemplateHaskell can't be used. 
>>> * Adding another -XDerive- extension places a burden on GHC devs to >>> maintain it in the future in response to further Template Haskell >>> changes. >>> >>> Here are my (opinionated) responses to each of these: >>> >>> * th-lift isn't as fully-featured as a -XDerive- extension at the >>> moment, since it can't do sophisticated type inference [6] or derive >>> for data families. This is something that could be addressed with a >>> patch to th-lift, though. >>> * GHC generics wouldn't be enough to handle unlifted types like Int#, >>> Char#, or Double# (which other -XDerive- extensions do). >>> * This is a subjective measurement, but in terms of the amount of code >>> I had to add, -XDeriveLift was substantially simpler than other >>> -XDerive extensions, because there are fewer weird corner cases. Plus, >>> I'd volunteer to maintain it :) >>> >>> Simon PJ wanted to know if other Template Haskell programmers would >>> find -XDeriveLift useful. Would you be able to use it? Would you like >>> to see a solution other than putting it into GHC? I'd love to hear >>> feedback so we can bring some closure to this 8-year-old feature >>> request. >>> >>> Ryan S. 
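The generics route weighed above can be sketched end to end. This is a hedged illustration only: the names `GLift`, `GLiftArgs`, and `genericLift` are invented here and are not an existing API, unlifted fields are not handled, and `mkName` drops the package/module qualifier that a real implementation would recover via the package-name reification mentioned in the thread.

```haskell
{-# LANGUAGE FlexibleContexts, FlexibleInstances, TypeOperators #-}

import GHC.Generics
import Language.Haskell.TH (appE, conE, mkName)
import Language.Haskell.TH.Syntax (Exp (..), Lift (..), Q, nameBase, runQ)

-- Rebuild a constructor application as a TH expression by walking the
-- generic representation of a value.
class GLift f where
  glift :: f a -> Q Exp

instance GLift f => GLift (M1 D d f) where
  glift (M1 x) = glift x

instance (GLift f, GLift g) => GLift (f :+: g) where
  glift (L1 x) = glift x
  glift (R1 x) = glift x

instance (Constructor c, GLiftArgs f) => GLift (M1 C c f) where
  glift m@(M1 x) =
    -- mkName loses the defining package/module; see the caveat above.
    foldl appE (conE (mkName (conName m))) (gliftArgs x)

-- Collect the lifted fields of a single constructor.
class GLiftArgs f where
  gliftArgs :: f a -> [Q Exp]

instance GLiftArgs U1 where
  gliftArgs U1 = []

instance Lift c => GLiftArgs (K1 i c) where
  gliftArgs (K1 x) = [lift x]

instance GLiftArgs f => GLiftArgs (M1 S s f) where
  gliftArgs (M1 x) = gliftArgs x

instance (GLiftArgs f, GLiftArgs g) => GLiftArgs (f :*: g) where
  gliftArgs (x :*: y) = gliftArgs x ++ gliftArgs y

-- The generic fallback a user without -XDeriveLift could reach for.
genericLift :: (Generic a, GLift (Rep a)) => a -> Q Exp
genericLift = glift . from
```

Any type with a `Generic` instance whose fields are themselves `Lift`-able then gets a lifting function for free; for example, `genericLift (Just (3 :: Int))` builds the expression `Just 3`.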
>>> >>> ----- >>> [1] http://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH-Syntax.html#t:Lift >>> [2] https://mail.haskell.org/pipermail/template-haskell/2007-October/000635.html >>> [3] https://ghc.haskell.org/trac/ghc/ticket/1830 >>> [4] http://hackage.haskell.org/package/th-lift >>> [5] https://ghc.haskell.org/trac/ghc/ticket/10030 >>> [6] https://ghc.haskell.org/trac/ghc/ticket/1830#comment:11 >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs at haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From simonpj at microsoft.com Tue Sep 8 12:03:05 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Tue, 8 Sep 2015 12:03:05 +0000 Subject: Proposal: Automatic derivation of Lift In-Reply-To: References: Message-ID: <6962f286ec8346f6866f25b7112352f6@DB4PR30MB030.064d.mgd.msft.net> | I don't think there's any fundamental reason why unboxed fields | prevent a Generic instance, as long as we're happy that unboxed values | will be re-boxed in the generic representation. It simply seems as if Interesting and quite reasonable idea, as an extension to `deriving(Generic)`. Make a ticket? Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of | Andres Loeh | Sent: 08 September 2015 13:00 | To: Ryan Scott | Cc: GHC developers | Subject: Re: Proposal: Automatic derivation of Lift | | I don't think there's any fundamental reason why unboxed fields | prevent a Generic instance, as long as we're happy that unboxed values | will be re-boxed in the generic representation. It simply seems as if | nobody has thought of implementing this. 
As an example, consider the | following hand-written example which works just fine: | | {-# LANGUAGE MagicHash, KindSignatures, PolyKinds, TypeOperators, | TypeFamilies #-} module GenUnboxed where | | import GHC.Exts | import GHC.Generics | import Generics.Deriving.Eq | | data UPair = UPair Int# Char# | | instance Generic UPair where | type Rep UPair = K1 R Int :*: K1 R Char | from (UPair x y) = K1 (I# x) :*: K1 (C# y) | to (K1 (I# x) :*: K1 (C# y)) = UPair x y | | instance GEq UPair | | test :: Bool | test = let p = UPair 3# 'x'# in geq p p | | Cheers, | Andres | | On Mon, Sep 7, 2015 at 10:02 PM, Ryan Scott | wrote: | > Unlifted types can't be used polymorphically or in instance | > declarations, so this makes it impossible to do something like | > | > instance Generic Int# | > | > or store an Int# in one branch of a (:*:), preventing generics from | > doing anything in #-land. (unless someone has found a way to hack | > around this). | > | > I would be okay with implementing a generics-based approach, but | we'd | > have to add a caveat that it will only work out-of-the-box on GHC | 8.0 | > or later, due to TH's need to look up package information. (We could | > give users the ability to specify a package name manually as a | > workaround.) | > | > If this were added, where would be the best place to put it? th- | lift? | > generic-deriving? template-haskell? A new package (lift-generics)? | > | > Ryan S. | > | > On Mon, Sep 7, 2015 at 3:10 PM, Matthew Pickering | > wrote: | >> Continuing my support of the generics route. Is there a fundamental | >> reason why it couldn't handle unlifted types? Given their relative | >> paucity, it seems like a fair compromise to generically define lift | >> instances for all normal data types but require TH for unlifted | types. | >> This approach seems much smoother from a maintenance perspective. 
| >> | >> On Mon, Sep 7, 2015 at 5:26 PM, Ryan Scott | wrote: | >>> There is a Lift typeclass defined in template-haskell [1] which, | >>> when a data type is an instance, permits it to be directly used in | a | >>> TH quotation, like so | >>> | >>> data Example = Example | >>> | >>> instance Lift Example where | >>> lift Example = conE (mkNameG_d "" | >>> "" "Example") | >>> | >>> e :: Example | >>> e = [| Example |] | >>> | >>> Making Lift instances for most data types is straightforward and | >>> mechanical, so the proposal is to allow automatic derivation of | Lift | >>> via a -XDeriveLift extension: | >>> | >>> data Example = Example deriving Lift | >>> | >>> This is actually a pretty a pretty old proposal [2], dating back | to | >>> 2007. I wanted to have this feature for my needs, so I submitted a | >>> proof-of-concept at the GHC Trac issue page [3]. | >>> | >>> The question now is: do we really want to bake this feature into | GHC? | >>> Since not many people opined on the Trac page, I wanted to submit | >>> this here for wider visibility and to have a discussion. | >>> | >>> Here are some arguments I have heard against this feature (please | >>> tell me if I am misrepresenting your opinion): | >>> | >>> * We already have a th-lift package [4] on Hackage which allows | >>> derivation of Lift via Template Haskell functions. In addition, if | >>> you're using Lift, chances are you're also using the | >>> -XTemplateHaskell extension in the first place, so th-lift should | be suitable. | >>> * The same functionality could be added via GHC generics (as of | GHC | >>> 7.12/8.0, which adds the ability to reify a datatype's package | name | >>> [5]), if -XTemplateHaskell can't be used. | >>> * Adding another -XDerive- extension places a burden on GHC devs | to | >>> maintain it in the future in response to further Template Haskell | >>> changes. 
| >>> | >>> Here are my (opinionated) responses to each of these: | >>> | >>> * th-lift isn't as fully-featured as a -XDerive- extension at the | >>> moment, since it can't do sophisticated type inference [6] or | derive | >>> for data families. This is something that could be addressed with | a | >>> patch to th-lift, though. | >>> * GHC generics wouldn't be enough to handle unlifted types like | >>> Int#, Char#, or Double# (which other -XDerive- extensions do). | >>> * This is a subjective measurement, but in terms of the amount of | >>> code I had to add, -XDeriveLift was substantially simpler than | other | >>> -XDerive extensions, because there are fewer weird corner cases. | >>> Plus, I'd volunteer to maintain it :) | >>> | >>> Simon PJ wanted to know if other Template Haskell programmers | would | >>> find -XDeriveLift useful. Would you be able to use it? Would you | >>> like to see a solution other than putting it into GHC? I'd love to | >>> hear feedback so we can bring some closure to this 8-year-old | >>> feature request. | >>> | >>> Ryan S. 
| >>> | >>> ----- | >>> [1] | >>> http://hackage.haskell.org/package/template-haskell- | 2.10.0.0/docs/La | >>> nguage-Haskell-TH-Syntax.html#t:Lift | >>> [2] | >>> https://mail.haskell.org/pipermail/template-haskell/2007- | October/000 | >>> 635.html [3] https://ghc.haskell.org/trac/ghc/ticket/1830 | >>> [4] http://hackage.haskell.org/package/th-lift | >>> [5] https://ghc.haskell.org/trac/ghc/ticket/10030 | >>> [6] https://ghc.haskell.org/trac/ghc/ticket/1830#comment:11 | >>> _______________________________________________ | >>> ghc-devs mailing list | >>> ghc-devs at haskell.org | >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs | > _______________________________________________ | > ghc-devs mailing list | > ghc-devs at haskell.org | > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs | _______________________________________________ | ghc-devs mailing list | ghc-devs at haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From eir at cis.upenn.edu Tue Sep 8 12:05:26 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Tue, 8 Sep 2015 08:05:26 -0400 Subject: Unpacking sum types In-Reply-To: References: <55EE93DC.7050409@gmail.com> <55EEA24C.9080504@gmail.com> Message-ID: <9BA63ECF-DADE-4C32-B707-D494F13E4CE2@cis.upenn.edu> I just added two design notes to the wiki page: 1. If we're stealing syntax, we're stealing quite a few operators. Things like (#|), and (|#) in terms, along with the otherwise-quite-reasonable (x ||). We're also stealing things like (||) and (#||#|) in types. The fact that we're stealing (||) at the type level is quite unfortunate, to me. I won't fight against a growing tide on this issue, but I favor not changing the lexer here and requiring lots of spaces. 2. A previous email in this thread mentioned a (0 of 2 | ...) syntax for data constructors. This might be better than forcing writers and readers to count vertical bars. (Of course, we already require counting commas.) 
Glad to see this coming together! Richard On Sep 8, 2015, at 7:48 AM, Simon Peyton Jones wrote: > | I see, but then you can't have multiple fields, like > | > | ( (# Int,Bool #) |) > | > | You'd have to box the inner tuple too. Ok, I suppose. > > Well of course! It's just a parameterised data type, like a tuple. But, just like unboxed tuples, you could have an unboxed tuple (or sum) inside an unboxed tuple. > > (# (# Int,Bool #) | Int #) > > Simon > > | -----Original Message----- > | From: Simon Marlow [mailto:marlowsd at gmail.com] > | Sent: 08 September 2015 09:55 > | To: Simon Peyton Jones; Johan Tibell; Ryan Newton > | Cc: ghc-devs at haskell.org > | Subject: Re: Unpacking sum types > | > | On 08/09/2015 09:31, Simon Peyton Jones wrote: > | > | How did you envisage implementing anonymous boxed sums? What is > | > | their heap representation? > | > > | > *Exactly* like tuples; that is, we have a family of data type > | declarations: > | > > | > data (a|b) = (_|) a > | > | (|_) b > | > > | > data (a|b|c) = (_||) a > | > | (|_|) b > | > | (||_) c > | > ..etc. > | > | I see, but then you can't have multiple fields, like > | > | ( (# Int,Bool #) |) > | > | You'd have to box the inner tuple too. Ok, I suppose. > | > | Cheers > | Simon > | > | > | > Simon > | > > | > | > | > | One option is to use some kind of generic object with a dynamic > | > | number of pointers and non-pointers, and one field for the tag. > | > | The layout would need to be stored in the object. This isn't a > | > | particularly efficient representation, though. Perhaps there > | could > | > | be a family of smaller specialised versions for common sizes. > | > | > | > | Do we have a use case for the boxed version, or is it just for > | > | consistency? > | > | > | > | Cheers > | > | Simon > | > | > | > | > | > | > Looks good to me! 
> | > | > > | > | > Simon > | > | > > | > | > *From:*Johan Tibell [mailto:johan.tibell at gmail.com] > *Sent:* > | 01 > | > | September 2015 18:24 > *To:* Simon Peyton Jones; Simon Marlow; > | Ryan > | > | Newton > *Cc:* ghc-devs at haskell.org > *Subject:* RFC: Unpacking > | > | sum types > > I have a draft design for unpacking sum types that > | > | I'd like some > feedback on. In particular feedback both on: > | > | > > | > | > * the writing and clarity of the proposal and > | > | > > | > | > * the proposal itself. > | > | > > | > | > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes > | > | > > | > | > -- Johan > | > | > > | > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From simonpj at microsoft.com Tue Sep 8 12:10:19 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Tue, 8 Sep 2015 12:10:19 +0000 Subject: ArrayArrays In-Reply-To: <55EE90ED.1040609@gmail.com> References: <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> <55EE90ED.1040609@gmail.com> Message-ID: <734d9e915b8446a8ae79c131d6f50c9d@DB4PR30MB030.064d.mgd.msft.net> | Without any unlifted kind, we need | - ArrayArray# | - a set of new/read/write primops for every element type, | either built-in or made from unsafeCoerce# | | With the unlifted kind, we would need | - ArrayArray# | - one set of new/read/write primops | | With levity polymorphism, we would need | - none of this, Array# can be used I don't think levity polymorphism will work here. The code for a function needs to know whether an intermediate value of type 'a' is strict or not. It HAS to choose (unless we compile two versions of every function). So I don't see how to be polymorphic over a type variable that can range over both lifted and unlifted types. 
The only reason that 'error' is levity-polymorphic over both lifted and unlifted types is that it never returns! error :: forall (a :: AnyKind). String -> a the code for error never manipulates a value of type 'a', so all is well. But it's an incredibly special case. Simon From simonpj at microsoft.com Tue Sep 8 12:35:00 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Tue, 8 Sep 2015 12:35:00 +0000 Subject: Shared data type for extension flags In-Reply-To: References: Message-ID: <72a513a1fc4345b0a48d9727b7b10d53@DB4PR30MB030.064d.mgd.msft.net> Yes, we'd have to broaden the description of the package. I defer to Edward Yang and Duncan Coutts who have a clearer idea of the architecture in this area. Simon From: Michael Smith [mailto:michael at diglumi.com] Sent: 02 September 2015 17:27 To: Simon Peyton Jones; Matthew Pickering Cc: GHC developers Subject: Re: Shared data type for extension flags The package description for that is "The GHC compiler's view of the GHC package database format", and this doesn't really have to do with the package database format. Would it be okay to put this in there anyway? On Wed, Sep 2, 2015, 07:33 Simon Peyton Jones > wrote: we already have such a shared library, I think: bin-package-db. would that do? Simon From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Michael Smith Sent: 02 September 2015 09:21 To: Matthew Pickering Cc: GHC developers Subject: Re: Shared data type for extension flags That sounds like a good approach. Are there other things that would go nicely in a shared package like this, in addition to the extension data type? On Wed, Sep 2, 2015 at 1:00 AM, Matthew Pickering > wrote: Surely the easiest way here (including for other tooling - ie haskell-src-exts) is to create a package which just provides this enumeration. GHC, cabal, th, haskell-src-exts and so on then all depend on this package rather than creating their own enumeration. 
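The shared-package idea can be sketched concretely. Everything here is hypothetical — no such package exists, and the module name, type name, and constructor subset are invented for illustration:

```haskell
-- Contents of a hypothetical shared module (say,
-- Language.Haskell.Extension.Shared) that ghc, Cabal, template-haskell,
-- and haskell-src-exts could all depend on, instead of each maintaining
-- (and converting between) its own copy of the enumeration.
data KnownExtension
  = OverloadedStrings
  | TemplateHaskell
  | MagicHash
  | DeriveGeneric
  -- ... one constructor per language extension
  deriving (Eq, Ord, Show, Read, Enum, Bounded)
```

Deriving `Enum`, `Bounded`, `Show`, and `Read` gives every consumer the same total enumeration and a stable textual form to convert through, which is exactly what the back-and-forth conversions described in the ticket would otherwise re-implement.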
On Wed, Sep 2, 2015 at 9:47 AM, Michael Smith > wrote: > #10820 on Trac [1] and D1200 on Phabricator [2] discuss adding the > capability > to Template Haskell to detect which language extensions are enabled. > Unfortunately, > since template-haskell can't depend on ghc (as ghc depends on > template-haskell), > it can't simply re-export the ExtensionFlag type from DynFlags to the user. > > There is a second data type encoding the list of possible language > extensions in > the Cabal package, in Language.Haskell.Extension [3]. But template-haskell > doesn't already depend on Cabal, and doing so seems like it would cause > difficulties, as the two packages can be upgraded separately. > > So adding this new feature to Template Haskell requires introducing a > *third* > data type for language extensions. It also requires enumerating this full > list > in two more places, to convert back and forth between the TH Extension data > type > and GHC's internal ExtensionFlag data type. > > Is there another way here? Can there be one single shared data type for this > somehow? > > [1] https://ghc.haskell.org/trac/ghc/ticket/10820 > [2] https://phabricator.haskell.org/D1200 > [3] > https://hackage.haskell.org/package/Cabal-1.22.4.0/docs/Language-Haskell-Extension.html > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eir at cis.upenn.edu Tue Sep 8 13:45:40 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Tue, 8 Sep 2015 09:45:40 -0400 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> I have put up an alternate set of proposals on https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes These sidestep around `Force` and `suspend` but probably have other problems. They make heavy use of levity polymorphism. Back story: this all was developed in a late-evening Haskell Symposium session that took place in the hotel bar. It seems Edward and I walked away with quite different understandings of what had taken place. I've written up my understanding. Most likely, the Right Idea is a combination of this all! See what you think. Thanks! Richard On Sep 8, 2015, at 3:52 AM, Simon Peyton Jones wrote: > | And to > | be honest, I'm not sure we need arbitrary data types in Unlifted; > | Force (which would be primitive) might be enough. > > That's an interesting thought. But presumably you'd have to use 'suspend' (a terrible name) a lot: > > type StrictList a = Force (StrictList' a) > data StrictList' a = Nil | Cons !a (StrictList a) > > mapStrict :: (a -> b) -> StrictList a -> StrictList b > mapStrict f xs = mapStrict' f (suspend xs) > > mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b > mapStrict' f Nil = Nil > mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) > > > That doesn't look terribly convenient. > > | ensure that threads don't simply > | pass thunks between each other. But, if you have unlifted types, then > | you can have: > | > | data UMVar (a :: Unlifted) > | > | and then the type rules out the possibility of passing thunks through > | a reference (at least at the top level). > > Really? Presumably UMVar is a new primitive? 
With a family of operations like MVar? If so can't we just define > newtype UMVar a = UMVar (MVar a) > putUMVar :: UMVar a -> a -> IO () > putUMVar (UMVar v) x = x `seq` putMVar v x > > I don't see Force helping here. > > Simon > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From jan.stolarek at p.lodz.pl Tue Sep 8 14:15:02 2015 From: jan.stolarek at p.lodz.pl (Jan Stolarek) Date: Tue, 8 Sep 2015 16:15:02 +0200 Subject: Unlifted data types In-Reply-To: <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> References: <1441353701-sup-9422@sabre> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> Message-ID: <201509081615.03167.jan.stolarek@p.lodz.pl> I think the wiki page is imprecise when it says: > data unlifted UBool = UTrue | UFalse > > Intuitively, if you have x :: UBool in scope, you are guaranteed to have UTrue or UFalse, and > not bottom. But I still can say: foo :: UBool foo = foo and now foo contains bottom. I know that any attempt to use foo will lead to its immediate evaluation, but that is not exactly the same as "not containing a bottom". Or am I missing something here? Janek On Tuesday, 8 September 2015, Richard Eisenberg wrote: > I have put up an alternate set of proposals on > > https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes > > These sidestep around `Force` and `suspend` but probably have other > problems. They make heavy use of levity polymorphism. > > Back story: this all was developed in a late-evening Haskell Symposium > session that took place in the hotel bar. It seems Edward and I walked away > with quite different understandings of what had taken place. I've written > up my understanding. Most likely, the Right Idea is a combination of this > all! > > See what you think. > > Thanks! 
> Richard > > On Sep 8, 2015, at 3:52 AM, Simon Peyton Jones wrote: > > | And to > > | be honest, I'm not sure we need arbitrary data types in Unlifted; > > | Force (which would be primitive) might be enough. > > > > That's an interesting thought. But presumably you'd have to use > > 'suspend' (a terrible name) a lot: > > > > type StrictList a = Force (StrictList' a) > > data StrictList' a = Nil | Cons !a (StrictList a) > > > > mapStrict :: (a -> b) -> StrictList a -> StrictList b > > mapStrict f xs = mapStrict' f (suspend xs) > > > > mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b > > mapStrict' f Nil = Nil > > mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) > > > > > > That doesn't look terribly convenient. > > > > | ensure that threads don't simply > > | pass thunks between each other. But, if you have unlifted types, then > > | you can have: > > | > > | data UMVar (a :: Unlifted) > > | > > | and then the type rules out the possibility of passing thunks through > > | a reference (at least at the top level). > > > > Really? Presumably UMVar is a new primitive? With a family of operations > > like MVar? If so can't we just define newtype UMVar a = UMV (MVar a) > > putUMVar :: UMVar a -> a -> IO () > > putUMVar (UMVar v) x = x `seq` putMVar v x > > > > I don't see Force helping here. 
> > > > Simon > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From eir at cis.upenn.edu Tue Sep 8 14:56:31 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Tue, 8 Sep 2015 10:56:31 -0400 Subject: Unlifted data types In-Reply-To: <201509081615.03167.jan.stolarek@p.lodz.pl> References: <1441353701-sup-9422@sabre> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> <201509081615.03167.jan.stolarek@p.lodz.pl> Message-ID: <6F1B0D49-C676-44A7-BA30-BE810EC63847@cis.upenn.edu> On Sep 8, 2015, at 10:15 AM, Jan Stolarek wrote: > But I still can say: > > foo :: UBool > foo = foo > > ... Or am I missing > something here? I'm afraid you are. Top-level variables may not have an unlifted type, for exactly this reason. If you were to do this on a local let, your program would loop when it hits the let, so there's no problem there. Richard From simonpj at microsoft.com Tue Sep 8 14:58:23 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Tue, 8 Sep 2015 14:58:23 +0000 Subject: Unlifted data types In-Reply-To: <201509081615.03167.jan.stolarek@p.lodz.pl> References: <1441353701-sup-9422@sabre> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> <201509081615.03167.jan.stolarek@p.lodz.pl> Message-ID: | > data unlifted UBool = UTrue | UFalse | > | > Intuitively, if you have x :: UBool in scope, you are guaranteed to | > have UTrue or UFalse, and not bottom. | | But I still can say: | | foo :: UBool | foo = foo | | and now foo contains bottom. You definitely CANNOT have a top-level declaration for a value of an unlifted type, any more than you can have for an Int# or unboxed tuple today. That should resolve your question. 
Simon From marlowsd at gmail.com Tue Sep 8 14:58:38 2015 From: marlowsd at gmail.com (Simon Marlow) Date: Tue, 8 Sep 2015 15:58:38 +0100 Subject: ArrayArrays In-Reply-To: <734d9e915b8446a8ae79c131d6f50c9d@DB4PR30MB030.064d.mgd.msft.net> References: <325b043066bb48a79f254b75ba9753ee@DB4PR30MB030.064d.mgd.msft.net> <55EE90ED.1040609@gmail.com> <734d9e915b8446a8ae79c131d6f50c9d@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <55EEF79E.604@gmail.com> On 08/09/2015 13:10, Simon Peyton Jones wrote: > > | Without any unlifted kind, we need > | - ArrayArray# > | - a set of new/read/write primops for every element type, > | either built-in or made from unsafeCoerce# > | > | With the unlifted kind, we would need > | - ArrayArray# > | - one set of new/read/write primops > | > | With levity polymorphism, we would need > | - none of this, Array# can be used > > I don't think levity polymorphism will work here. The code for a function needs to know whether an intermediate value of type 'a' is strict or not. It HAS to choose (unless we compile two versions of every function). So I don't see how to be polymorphic over a type variable that can range over both lifted and unlifted types. > > The only reason that 'error' is levity-polymorphic over both lifted and unlifted types is that it never returns! > error :: forall (a :: AnyKind). String -> a > the code for error never manipulates a value of type 'a', so all is well. But it's an incredibly special case. I think there's a bit of confusion here, Ed's email a bit earlier described the proposal for the third option above: https://mail.haskell.org/pipermail/ghc-devs/2015-September/009867.html For generalising these primops it would be fine, there are no thunks being built. 
Cheers Simon From jan.stolarek at p.lodz.pl Tue Sep 8 15:26:14 2015 From: jan.stolarek at p.lodz.pl (Jan Stolarek) Date: Tue, 8 Sep 2015 17:26:14 +0200 Subject: Unlifted data types In-Reply-To: <6F1B0D49-C676-44A7-BA30-BE810EC63847@cis.upenn.edu> References: <1441353701-sup-9422@sabre> <201509081615.03167.jan.stolarek@p.lodz.pl> <6F1B0D49-C676-44A7-BA30-BE810EC63847@cis.upenn.edu> Message-ID: <201509081726.14448.jan.stolarek@p.lodz.pl> > Top-level variables may not have an unlifted type Ah, that makes much more sense now. Thanks. Janek From alan.zimm at gmail.com Tue Sep 8 18:49:21 2015 From: alan.zimm at gmail.com (Alan & Kim Zimmerman) Date: Tue, 8 Sep 2015 20:49:21 +0200 Subject: Haskell Error Messages Message-ID: Is there currently any planned work around making the haskell error messages able to support something like the ones in IDRIS, as shown in David Christianson's talk "A Pretty printer that says what it means" at HIW? https://www.youtube.com/watch?v=m7BBCcIDXSg&list=PLnqUlCo055hVfNkQHP7z43r10yNo-mc7B&index=10 Alan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryan.gl.scott at gmail.com Tue Sep 8 19:01:31 2015 From: ryan.gl.scott at gmail.com (Ryan Scott) Date: Tue, 8 Sep 2015 15:01:31 -0400 Subject: Proposal: Automatic derivation of Lift In-Reply-To: References: Message-ID: Sorry, I forgot to reply-all earlier. > I hacked this up quickly just to show that it works in principle. In > practice, I think it's good to not just represent Int# as Int, but as > something like UInt where > > data UInt = UInt Int# > > i.e., is isomorphic to an Int, but distinguishable. Alternatively, > have a generic "unboxed" flag that could be inserted as a tag into the > surrounding K. I suppose we'd have to decide which is easier for programmers to use. Do we introduce UInt, UChar, et al. 
and require that users define instances of the desired typeclass for them: instance Lift UInt where lift (UInt i) = litE (intPrimL (toInteger (I# i))) or do we introduce an unboxed flag and require users to write generic GLift instances using that flag: instance GLift (K1 Unboxed Int) where lift (K1 (I# i)) = litE (intPrimL (toInteger (I# i))) The former has the advantage that you wouldn't need to change the GLift code to distinguish between (K1 Unboxed Int) and (K1 R Int), which might be a potential source of confusion for programmers. On the other hand, having an Unboxed flag requires only introducing one new data type, as opposed to a separate data type for each of the unlifted types that we want to work over. Ryan S. On Tue, Sep 8, 2015 at 7:59 AM, Andres Loeh wrote: > I don't think there's any fundamental reason why unboxed fields > prevent a Generic instance, as long as we're happy that unboxed values > will be re-boxed in the generic representation. It simply seems as if > nobody has thought of implementing this. As an example, consider the > following hand-written example which works just fine: > > {-# LANGUAGE MagicHash, KindSignatures, PolyKinds, TypeOperators, > TypeFamilies #-} > module GenUnboxed where > > import GHC.Exts > import GHC.Generics > import Generics.Deriving.Eq > > data UPair = UPair Int# Char# > > instance Generic UPair where > type Rep UPair = K1 R Int :*: K1 R Char > from (UPair x y) = K1 (I# x) :*: K1 (C# y) > to (K1 (I# x) :*: K1 (C# y)) = UPair x y > > instance GEq UPair > > test :: Bool > test = let p = UPair 3# 'x'# in geq p p > > Cheers, > Andres > > On Mon, Sep 7, 2015 at 10:02 PM, Ryan Scott wrote: >> Unlifted types can't be used polymorphically or in instance >> declarations, so this makes it impossible to do something like >> >> instance Generic Int# >> >> or store an Int# in one branch of a (:*:), preventing generics from >> doing anything in #-land. (unless someone has found a way to hack
>> >> I would be okay with implementing a generics-based approach, but we'd >> have to add a caveat that it will only work out-of-the-box on GHC 8.0 >> or later, due to TH's need to look up package information. (We could >> give users the ability to specify a package name manually as a >> workaround.) >> >> If this were added, where would be the best place to put it? th-lift? >> generic-deriving? template-haskell? A new package (lift-generics)? >> >> Ryan S. >> >> On Mon, Sep 7, 2015 at 3:10 PM, Matthew Pickering >> wrote: >>> Continuing my support of the generics route. Is there a fundamental >>> reason why it couldn't handle unlifted types? Given their relative >>> paucity, it seems like a fair compromise to generically define lift >>> instances for all normal data types but require TH for unlifted types. >>> This approach seems much smoother from a maintenance perspective. >>> >>> On Mon, Sep 7, 2015 at 5:26 PM, Ryan Scott wrote: >>>> There is a Lift typeclass defined in template-haskell [1] which, when >>>> a data type is an instance, permits it to be directly used in a TH >>>> quotation, like so >>>> >>>> data Example = Example >>>> >>>> instance Lift Example where >>>> lift Example = conE (mkNameG_d "" "" "Example") >>>> >>>> e :: Example >>>> e = $( [| Example |] ) >>>> >>>> Making Lift instances for most data types is straightforward and >>>> mechanical, so the proposal is to allow automatic derivation of Lift >>>> via a -XDeriveLift extension: >>>> >>>> data Example = Example deriving Lift >>>> >>>> This is actually a pretty old proposal [2], dating back to >>>> 2007. I wanted to have this feature for my needs, so I submitted a >>>> proof-of-concept at the GHC Trac issue page [3]. >>>> >>>> The question now is: do we really want to bake this feature into GHC? >>>> Since not many people opined on the Trac page, I wanted to submit this >>>> here for wider visibility and to have a discussion.
>>>> >>>> Here are some arguments I have heard against this feature (please tell >>>> me if I am misrepresenting your opinion): >>>> >>>> * We already have a th-lift package [4] on Hackage which allows >>>> derivation of Lift via Template Haskell functions. In addition, if >>>> you're using Lift, chances are you're also using the -XTemplateHaskell >>>> extension in the first place, so th-lift should be suitable. >>>> * The same functionality could be added via GHC generics (as of GHC >>>> 7.12/8.0, which adds the ability to reify a datatype's package name >>>> [5]), if -XTemplateHaskell can't be used. >>>> * Adding another -XDerive- extension places a burden on GHC devs to >>>> maintain it in the future in response to further Template Haskell >>>> changes. >>>> >>>> Here are my (opinionated) responses to each of these: >>>> >>>> * th-lift isn't as fully-featured as a -XDerive- extension at the >>>> moment, since it can't do sophisticated type inference [6] or derive >>>> for data families. This is something that could be addressed with a >>>> patch to th-lift, though. >>>> * GHC generics wouldn't be enough to handle unlifted types like Int#, >>>> Char#, or Double# (which other -XDerive- extensions do). >>>> * This is a subjective measurement, but in terms of the amount of code >>>> I had to add, -XDeriveLift was substantially simpler than other >>>> -XDerive extensions, because there are fewer weird corner cases. Plus, >>>> I'd volunteer to maintain it :) >>>> >>>> Simon PJ wanted to know if other Template Haskell programmers would >>>> find -XDeriveLift useful. Would you be able to use it? Would you like >>>> to see a solution other than putting it into GHC? I'd love to hear >>>> feedback so we can bring some closure to this 8-year-old feature >>>> request. >>>> >>>> Ryan S. 
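For concreteness, the instance that -XDeriveLift would generate has the same mechanical shape one writes by hand today; a sketch (`Shape` is an illustrative type; recent template-haskell versions also expect a `liftTyped` method, omitted here):

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH (appE, conE, runQ)
import Language.Haskell.TH.Syntax (Lift (..))

data Shape = Circle Int | Square Int

-- The mechanical instance `deriving Lift` would produce: one clause
-- per constructor, lifting each field recursively.
instance Lift Shape where
  lift (Circle r) = conE 'Circle `appE` lift r
  lift (Square s) = conE 'Square `appE` lift s

main :: IO ()
main = runQ (lift (Circle 3)) >>= print
```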
>>>> >>>> ----- >>>> [1] http://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH-Syntax.html#t:Lift >>>> [2] https://mail.haskell.org/pipermail/template-haskell/2007-October/000635.html >>>> [3] https://ghc.haskell.org/trac/ghc/ticket/1830 >>>> [4] http://hackage.haskell.org/package/th-lift >>>> [5] https://ghc.haskell.org/trac/ghc/ticket/10030 >>>> [6] https://ghc.haskell.org/trac/ghc/ticket/1830#comment:11 >>>> _______________________________________________ >>>> ghc-devs mailing list >>>> ghc-devs at haskell.org >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From eir at cis.upenn.edu Wed Sep 9 01:37:18 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Tue, 8 Sep 2015 21:37:18 -0400 Subject: Haskell Error Messages In-Reply-To: References: Message-ID: <28F3A3E0-3209-4DB6-8455-E6AC740C7E2C@cis.upenn.edu> Ticket #8809 (https://ghc.haskell.org/trac/ghc/ticket/8809) seems the best spot to look for this. Richard On Sep 8, 2015, at 2:49 PM, "Alan & Kim Zimmerman" wrote: > Is there currently any planned work around making the haskell error messages able to support something like the ones in IDRIS, as shown in David Christianson's talk "A Pretty printer that says what it means" at HIW? > > https://www.youtube.com/watch?v=m7BBCcIDXSg&list=PLnqUlCo055hVfNkQHP7z43r10yNo-mc7B&index=10 > > Alan > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dan.doel at gmail.com Wed Sep 9 02:43:55 2015 From: dan.doel at gmail.com (Dan Doel) Date: Tue, 8 Sep 2015 22:43:55 -0400 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> Message-ID: On Tue, Sep 8, 2015 at 3:52 AM, Simon Peyton Jones wrote: > | And to > | be honest, I'm not sure we need arbitrary data types in Unlifted; > | Force (which would be primitive) might be enough. > > That's an interesting thought. But presumably you'd have to use 'suspend' (a terrible name) a lot: > > type StrictList a = Force (StrictList' a) > data StrictList' a = Nil | Cons !a (StrictList a) > > mapStrict :: (a -> b) -> StrictList a -> StrictList b > mapStrict f xs = mapStrict' f (suspend xs) > > mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b > mapStrict' f Nil = Nil > mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) > > > That doesn't look terribly convenient. It's missing the part that makes it convenient. type StrictList a = Force (StrictList' a) data StrictList' a = Nil' | Cons' !a (StrictList a) pattern Nil = Force Nil' pattern Cons x xs = Force (Cons' x xs) mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f Nil = Nil mapStrict f (Cons x xs) = Cons (f x) (mapStrict f xs) But, really, my point is that we already almost have StrictList _today_: data StrictList a = Nil | Cons !a !(StrictList a) The only difference between this and the previous definition (denotationally, at least) is the outer-most level. That's why I liked the original proposal (which probably disappeared too fast for most people to read it), which was more like being able to talk about `!a` as a thing in itself. It's the only semantic gap in being able to define totally unlifted data types right now. So maybe it's also the only operational gap that needs to be plugged, as well. 
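Dan's observation that strict fields already get us everything but the outermost level is easy to check; a small self-contained sketch:

```haskell
-- Strict in the element and the tail: forcing the outermost
-- constructor forces the entire spine and every element.  Only the
-- outermost thunk is still lifted, which is the remaining gap.
data StrictList a = Nil | Cons !a !(StrictList a)

fromList :: [a] -> StrictList a
fromList = foldr Cons Nil

sumS :: StrictList Int -> Int
sumS Nil         = 0
sumS (Cons x xs) = x + sumS xs

main :: IO ()
main = print (sumS (fromList [1 .. 10]))  -- 55
```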
But that was vetoed because `!a` in a data declaration doesn't make a constructor with type `!a -> ...`, but `a -> ...` which evaluates. > Really? Presumably UMVar is a new primitive? With a family of operations like MVar? If so can't we just define > newtype UMVar a = UMV (MVar a) > putUMVar :: UMVar a -> a -> IO () > putUMVar (UMVar v) x = x `seq` putMVar v x > > I don't see Force helping here. Yes, true. It only helps ensure that the implementation is correct, rather than enabling a previously impossible construction. Kind of like certain uses of GADTs vs. phantom types. But the ArrayArray people already want UMVar (and the like) anyway, because it cuts out a layer of indirection for types that are already unlifted. -- Dan From dan.doel at gmail.com Wed Sep 9 02:57:23 2015 From: dan.doel at gmail.com (Dan Doel) Date: Tue, 8 Sep 2015 22:57:23 -0400 Subject: Unlifted data types In-Reply-To: <201509081726.14448.jan.stolarek@p.lodz.pl> References: <1441353701-sup-9422@sabre> <201509081615.03167.jan.stolarek@p.lodz.pl> <6F1B0D49-C676-44A7-BA30-BE810EC63847@cis.upenn.edu> <201509081726.14448.jan.stolarek@p.lodz.pl> Message-ID: I would say, by the way, that your question still makes some sense. When discussing strict evaluation, one can think of _values_ of a type and _expressions_ of a type. The (denotational) semantics of values would be unlifted, while expressions are lifted. And functions take values to expressions. So: f :: Int# -> Int# f x = f x is a valid definition, even though the result of the function is bottom, but has type Int#. And you are allowed: let i :: Int# i = error "whoa" in ... just not the same definition at the top level, since it'd be annoying for GHC to have to worry about evaluating all possible strict errors in every reachable module eagerly on program start up (I assume that's the/a reason). That may not be the only way to think about it, but it's how I tend to. 
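Dan's values-versus-expressions reading matches what GHC does for Int# today: a local unlifted binding is evaluated as soon as the let is reached. A sketch (`blowUp` and `addTwo` are illustrative names):

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (..), Int#, (+#))

-- The let-bound Int# is evaluated when the let is entered, so the
-- error fires here even though nothing "demands" y lazily.
blowUp :: Int -> Int
blowUp (I# x) =
  let y :: Int#
      y = case (error "whoa" :: Int) of I# e -> e
  in I# (x +# y)

-- The error-free path behaves normally.
addTwo :: Int -> Int
addTwo (I# x) =
  let y :: Int#
      y = x +# 2#
  in I# y

main :: IO ()
main = print (addTwo 40)
```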
-- Dan On Tue, Sep 8, 2015 at 11:26 AM, Jan Stolarek wrote: >> Top-level variables may not have an unlifted type > Ah, that makes much more sense now. Thanks. > > Janek > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From dan.doel at gmail.com Wed Sep 9 04:14:25 2015 From: dan.doel at gmail.com (Dan Doel) Date: Wed, 9 Sep 2015 00:14:25 -0400 Subject: Unpacking sum types In-Reply-To: <9BA63ECF-DADE-4C32-B707-D494F13E4CE2@cis.upenn.edu> References: <55EE93DC.7050409@gmail.com> <55EEA24C.9080504@gmail.com> <9BA63ECF-DADE-4C32-B707-D494F13E4CE2@cis.upenn.edu> Message-ID: I don't think any #-based operators are stolen at the term level, because # is required at both ends. `(#| x #)` is not a legal operator section (nor is `(#| x |#)`), and (#|_#) is not an operator name. The boxed version only steals operators because you can shove the entire thing to one side. I might have missed something, though. The type level steals type operators involving #, though. On Tue, Sep 8, 2015 at 8:05 AM, Richard Eisenberg wrote: > I just added two design notes to the wiki page: > 1. If we're stealing syntax, we're stealing quite a few operators. Things like (#|), and (|#) in terms, along with the otherwise-quite-reasonable (x ||). We're also stealing things like (||) and (#||#|) in types. The fact that we're stealing (||) at the type level is quite unfortunate, to me. I won't fight against a growing tide on this issue, but I favor not changing the lexer here and requiring lots of spaces. > > 2. A previous email in this thread mentioned a (0 of 2 | ...) syntax for data constructors. This might be better than forcing writers and readers to count vertical bars. (Of course, we already require counting commas.) > > Glad to see this coming together! 
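For readers following along, the anonymous unboxed sum syntax under discussion eventually landed (in GHC 8.2) in roughly this form; a small sketch:

```haskell
{-# LANGUAGE UnboxedSums, UnboxedTuples #-}

-- (# Int | Bool #) is an anonymous, unboxed sum; the position of the
-- payload relative to the bars selects the alternative.
describe :: (# Int | Bool #) -> String
describe (# i | #) = "int: " ++ show i
describe (# | b #) = "bool: " ++ show b

pickInt :: Int -> String
pickInt i = describe (# i | #)

pickBool :: Bool -> String
pickBool b = describe (# | b #)

main :: IO ()
main = do
  putStrLn (pickInt 7)
  putStrLn (pickBool True)
```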
> Richard > > On Sep 8, 2015, at 7:48 AM, Simon Peyton Jones wrote: > >> | I see, but then you can't have multiple fields, like >> | >> | ( (# Int,Bool #) |) >> | >> | You'd have to box the inner tuple too. Ok, I suppose. >> >> Well of course! It's just a parameterised data type, like a tuple. But, just like unboxed tuples, you could have an unboxed tuple (or sum) inside an unboxed tuple. >> >> (# (# Int,Bool #) | Int #) >> >> Simon >> >> | -----Original Message----- >> | From: Simon Marlow [mailto:marlowsd at gmail.com] >> | Sent: 08 September 2015 09:55 >> | To: Simon Peyton Jones; Johan Tibell; Ryan Newton >> | Cc: ghc-devs at haskell.org >> | Subject: Re: Unpacking sum types >> | >> | On 08/09/2015 09:31, Simon Peyton Jones wrote: >> | > | How did you envisage implementing anonymous boxed sums? What is >> | > | their heap representation? >> | > >> | > *Exactly* like tuples; that is, we have a family of data type >> | declarations: >> | > >> | > data (a|b) = (_|) a >> | > | (|_) b >> | > >> | > data (a|b|c) = (_||) a >> | > | (|_|) b >> | > | (||_) c >> | > ..etc. >> | >> | I see, but then you can't have multiple fields, like >> | >> | ( (# Int,Bool #) |) >> | >> | You'd have to box the inner tuple too. Ok, I suppose. >> | >> | Cheers >> | Simon >> | >> | >> | > Simon >> | > >> | > | >> | > | One option is to use some kind of generic object with a dynamic >> | > | number of pointers and non-pointers, and one field for the tag. >> | > | The layout would need to be stored in the object. This isn't a >> | > | particularly efficient representation, though. Perhaps there >> | could >> | > | be a family of smaller specialised versions for common sizes. >> | > | >> | > | Do we have a use case for the boxed version, or is it just for >> | > | consistency? >> | > | >> | > | Cheers >> | > | Simon >> | > | >> | > | >> | > | > Looks good to me! 
>> | > | > >> | > | > Simon >> | > | > >> | > | > *From:*Johan Tibell [mailto:johan.tibell at gmail.com] > *Sent:* >> | 01 >> | > | September 2015 18:24 > *To:* Simon Peyton Jones; Simon Marlow; >> | Ryan >> | > | Newton > *Cc:* ghc-devs at haskell.org > *Subject:* RFC: Unpacking >> | > | sum types > > I have a draft design for unpacking sum types that >> | > | I'd like some > feedback on. In particular feedback both on: >> | > | > >> | > | > * the writing and clarity of the proposal and >> | > | > >> | > | > * the proposal itself. >> | > | > >> | > | > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes >> | > | > >> | > | > -- Johan >> | > | > >> | > >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From greg at gregweber.info Wed Sep 9 04:20:12 2015 From: greg at gregweber.info (Greg Weber) Date: Tue, 8 Sep 2015 21:20:12 -0700 Subject: Proposal: accept pull requests on GitHub In-Reply-To: <1469c7be53ed4f0dab3872de9fe5ad54@DB4PR30MB030.064d.mgd.msft.net> References: <55E7453A.90309@gmail.com> <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> <55E76572.3050405@nh2.me> <1469c7be53ed4f0dab3872de9fe5ad54@DB4PR30MB030.064d.mgd.msft.net> Message-ID: > (I'm tempted naively to ask: is there an automated way to go from a GitHub PR to a Phab ticket? Then we could convert the former (if someone wants to submit that way) into the latter.) yes, a github PR is just a branch. Thanks for bringing the discussion back on track to a productive approach. This suggestion is what some of us were getting at and it would be better to just limit the discussion to this idea. 
On Mon, Sep 7, 2015 at 6:47 AM, Simon Peyton Jones wrote: > I am very much at the ignorant end of this debate: I'll just use whatever > I'm told to use. But I do resonate with this observation from Austin: > > | For one, having two code review tools of any form is completely > | bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to > | switch, we should just switch. Having to have people decide how to > | contribute with two tools is as crazy as having two VCSs and just a > | way of asking people to get *more* confused, and have us answer more > | questions. That's something we need to avoid. > > As a code contributor and reviewer, this is awkward. As a contributor, how > do I choose? As a reviewer I'm presumably forced to learn both tools. > > But I'll go with the flow... I do not have a well-informed opinion about > the tradeoffs. > > (I'm tempted naively to ask: is there an automated way to go from a GitHub > PR to a Phab ticket? Then we could convert the former (if someone wants to > submit that way) into the latter.) > > Simon > > > | -----Original Message----- > | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of > | Austin Seipp > | Sent: 03 September 2015 05:42 > | To: Niklas Hamb?chen > | Cc: Simon Marlow; ghc-devs at haskell.org > | Subject: Re: Proposal: accept pull requests on GitHub > | > | (JFYI: I hate to announce my return with a giant novel of negative- > | nancy-ness about a proposal that just came up. I'm sorry about this!) > | > | TL;DR: I'm strongly -1 on this, because I think it introduces a lot of > | associated costs for everyone, the benefits aren't really clear, and I > | think it obscures the real core issue about "how do we get more > | contributors" and how to make that happen. Needless to say, GitHub > | does not magically solve both of these AFAICS. 
> | > | As is probably already widely known, I'm fairly against GitHub because > | I think at best its tools are mediocre and inappropriate for GHC - but > | I also don't think this proposal or the alternatives stemming from it > | are very good, and that it reduces visibility of the real, core > | complaints about what is wrong. Some of those problems may be with > | Phabricator, but it's hard to sort the wheat from the chaff, so to > | speak. > | > | For one, having two code review tools of any form is completely > | bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to > | switch, we should just switch. Having to have people decide how to > | contribute with two tools is as crazy as having two VCSs and just a > | way of asking people to get *more* confused, and have us answer more > | questions. That's something we need to avoid. > | > | For the same reason, I'm also not a fan of 'use third party thing to > | augment other thing to remove its deficiencies making it OK', because > | the problem is _it adds surface area_ and other problems in other > | cases. It is a solution that should be considered a last resort, > | because it is a logical solution that applies to everything. If we > | have a bot that moves GH PRs into Phab and then review them there, the > | surface area of what we have to maintain and explain has suddenly > | exploded: because now instead of 1 thing we have 3 things (GH, Phab, > | bot) and the 3 interactions between them, for a multiplier of *six* > | things we have to deal with. And then we use reviewable,io, because GH > | reviews are terrible, adding a 4th mechanism? It's rube goldberg-ian. > | We can logically 'automate' everything in all ways to make all > | contributors happy, but there's a real *cognitive* overhead to this > | and humans don't scale as well as computers do. It is not truly > | 'automated away' if the cognitive burden is still there. 
> | > | I also find it extremely strange to tell people "By the way, this > | method in which you've contributed, as was requested by community > | members, is actually a complete proxy for the real method of > | contributing, you can find all your imported code here". How is this > | supposed to make contribution *easier* as opposed to just more > | confusing? Now you've got the impression you're using "the real thing" > | when in reality it's shoved off somewhere else to have the nitpicking > | done. Just using Phabricator would be less complicated, IMO, and much > | more direct. > | > | The same thing goes for reviewable.io. Adding it as a layer over > | GitHub just makes the surface area larger, and puts less under our > | control. And is it going to exist in the same form in 2 or 3 years? > | Will it continue to offer the same tools, the same workflows that we > | "like", and what happens when we hit a wall? It's easy to say > | "probably" or "sure" to all this, until we hit something we dislike > | and have no possibility of fixing. > | > | And once you do all this, BTW, you can 'never go back'. It seems so > | easy to just say 'submit pull requests' once and nothing else, right? > | Wrong. Once you commit to that infrastructure, it is *there* and > | simply taking it out from under the feet of those using it is not only > | unfortunate, it is *a huge timesink to undo it all*. Which amounts to > | it never happening. Oh, but you can import everything elsewhere! The > | problem is you *can't* import everything, but more importantly you > | can't *import my memories in another way*, so it's a huge blow to > | contributors to ask them about these mental time sinks, then to forget > | them all. And as your project grows, this becomes more of a memory as > | you made a first and last choice to begin with. > | > | Phabricator was 'lucky' here because it had the gateway into being the > | first review tool for us. 
But that wasn't because it was *better* than > | GitHub. It was because we were already using it, and it did not > | interact badly with our other tools or force us to compromise things - > | so the *cost* was low. The cost is immeasurably higher by default > | against GitHub because of this, at least to me. That's just how it is > | sometimes. > | > | Keep in mind there is a cost to everything and how you fix it. GitHub > | is not a simple patch to add a GHC feature. It is a question that > | fundamentally concerns itself with the future of the project for a > | long time. The costs must be analyzed more aggressively. Again, > | Phabricator had 'first child' preferential treatment. That's not > | something we can undo now. > | > | I know this sounds like a lot of ad hoc mumbo jumbo, but please bear > | with me: we need to identify the *root issue* here to fix it. > | Otherwise we will pay for the costs of an improper fix for a long > | time, and we are going to keep having this conversation over, and over > | again. And we need to weigh in the cost of fixing it, which is why I > | mention that so much. > | > | So with all this in mind, you're back to just using GitHub. But again > | GitHub is quite mediocre at best. So what is the point of all this? > | It's hinted at here: > | > | > the number of contributions will go up, commits will be smaller, and > | there will be more of them per pull request (contributors will be able > | to put style changes and refactorings into separate commits, without > | jumping through a bunch of hoops). > | > | The real hint is that "the number of contributions will go up". That's > | a noble goal and I think it's at the heart of this proposal. > | > | Here's the meat of it question: what is the cost of achieving this > | goal? 
That is, what amount of work is sufficient to make this goal > | realizable, and finally - why is GitHub *the best use of our time for > | achieving this?* That's one aspect of the cost - that it's the best > | use of the time. I feel like this is fundamentally why I always seem > | to never 'get' this argument, and I'm sure it's very frustrating on > | behalf of the people who have talked to me about it and like GitHub. > | But I feel like I've never gotten a straight answer for GHC. > | > | If the goal is actually "make more people contribute", that's pretty > | broad. I can make that very easy: give everyone who ever submits a > | patch push access. This is a legitimate way to run large projects that > | has worked. People will almost certainly be more willing to commit, > | especially when overhead on patch submission is reduced so much. Why > | not just do that instead? It's not like we even mandate code review, > | although we could. You could reasonably trust CI to catch and revert > | things a lot of the time for people who commit directly to master. We > | all do it sometimes. > | > | I'm being serious about this. I can start doing that tomorrow because > | the *cost is low*, both now and reasonably speaking into some > | foreseeable future. It is one of many solutions to raw heart of the > | proposal. GitHub is not a low cost move, but also, it is a *long term > | cost* because of the technical deficiencies it won't aim to address > | (merge commits are ugly, branch reviews are weak, ticket/PR namespace > | overlaps with Trac, etc etc) or that we'll have to work around. > | > | That means that if we want GitHub to fix the "give us more > | contributors" problem, and it has a high cost, it not only has _to fix > | the problem_, it also has to do that well enough to offset its cost. I > | don't think it's clear that is the case right now, among a lot of > | other solutions. 
> | > | I don't think the root issue is "We _need_ GitHub to get more > | contributors". It sounds like the complaint is more "I don't like how > | Phabricator works right now". That's an important distinction, because > | the latter is not only more specific, it's more actionable: > | > | - Things like Arcanist can be tracked as a Git submodule. There is > | little to no pain in this, it's low cost, and it can always be > | synchronized with Phabricator. This eliminates the "Must clone > | arcanist" and "need to upgrade arcanist" points. > | > | - Similarly when Phabricator sometimes kills a lot of builds, it's > | because I do an upgrade. That's mostly an error on my part and I can > | simply schedule upgrades regularly, barring hotfixes or somesuch. That > | should basically eliminate these. The other build issues are from > | picking the wrong base commit from the revision, I think, which I > | believe should be fixable upstream (I need to get a solid example of > | one that isn't a mega ultra patch.) > | > | - If Harbormaster is not building dependent patches as mentioned in > | WhyNotPhabricator, that is a bug, and I have not been aware of it. > | Please make me aware of it so I can file bugs! I seriously don't look > | at _every_ patch, I need to know this. That could have probably been > | fixed ASAP otherwise. > | > | - We can get rid of the awkwardness of squashes etc by using > | Phabricator's "immutable" history, although it introduces merge > | commits. Whether this is acceptable is up to debate (I dislike merge > | commits, but could live with it). > | > | - I do not understand point #3, about answering questions. Here's > | the reality: every single one of those cases is *almost always an > | error*. That's not a joke. Forgetting to commit a file, amending > | changes in the working tree, and specifying a reviewer are all total > | errors as it stands today. Why is this a minus? It catches a useful > | class of 'interaction bugs'. 
If it's because sometimes Phabricator > | yells about build arifacts in the tree, those should be .gitignore'd. > | If it's because you have to 'git stash' sometimes, this is fairly > | trivial IMO. Finally, specifying reviewers IS inconvenient, but > | currently needed. We could easily assign a '#reviewers' tag that would > | add default reviewers. > | - In the future, Phabricator will hopefully be able to > | automatically assign the right reviewers to every single incoming > | patch, based on the source file paths in the tree, using the Owners > | tool. Technically, we could do that today if we wanted, it's just a > | little more effort to add more Herald rules. This will be far, far > | more robust than anything GitHub can offer, and eliminates point #3. > | > | - Styling, linting etc errors being included, because reviews are > | hard to create: This is tangential IMO. We need to just bite the > | bullet on this and settle on some lint and coding styles, and apply > | them to the tree uniformly. The reality is *nobody ever does style > | changes on their own*, and they are always accompanied by a diff, and > | they always have to redo the work of pulling them out, Phab or not. > | Literally 99% of the time we ask for this, it happens this way. > | Perhaps instead we should just eliminate this class of work by just > | running linters over all of the source code at once, and being happy > | with it. > | > | Doing this in fact has other benefits: like `arc lint` will always > | _correctly_ report when linting errors are violated. And we can reject > | patches that violate them, because they will always be accurate. > | > | - As for some of the quotes, some of them are funny, but the real > | message lies in the context. :) In particular, there have been several > | cases (such as the DWARF work) where the idea was "write 30 commits > | and put them on Phabricator". 
News flash: *this is bad*, no matter > | whether you're using Phabricator or not, because it makes reviewing > | the whole thing immensely difficult from a reviewer perspective. The > | point here is that we can clear this up by being more communicative > | about what we expect of authors of large patches, and communicating > | your intent ASAP so we can get patches in as fast as possible. Writing > | a patch is the easiest part of the work. > | > | And more: > | > | - Clean up the documentation, it's a mess. It feels nice that > | everything has clear, lucid explanations on the wiki, but the wiki is > | ridiculously massive and we have a tendency for 'link creep' where we > | spread things out. The contributors docs could probably stand to be > | streamlined. We would have to do this anyway, moving to GitHub or not. > | > | - Improve the homepage, directly linking to this aforementioned > | page. > | > | - Make it clear what we expect of contributors. I feel like a lot of > | this could be explained by having a 5 minute drive-by guide for > | patches, and then a longer 10-minute guide about A) How to style > | things, B) How to format your patches if you're going to contribute > | regularly, C) Why it is this way, and D) finally links to all the > | other things you need to know. People going into Phabricator expecting > | it to behave like GitHub is a problem (more a cultural problem IMO but > | that's another story), and if this can't be directly fixed, the best > | thing to do is make it clear why it isn't. > | > | Those are just some of the things OTTOMH, but this email is already > | way too long. This is what I mean though: fixing most of these is > | going to have *seriously smaller cost* than moving to GitHub. It does > | not account for "The GitHub factor" of people contributing "just > | because it's on GitHub", but again, that value has to outweigh the > | other costs. I'm not seriously convinced it does. > | > | I know it's work to fix these things. 
But GitHub doesn't really > | magically make a lot of our needs go away, and it's not going to > | magically fix things like style or lint errors, the fact Travis-CI is > | still pretty insufficient for us in the long term (and Harbormaster is > | faster, on our own hardware, too), or that it will cause needlessly > | higher amounts of spam through Trac and GitHub itself. I don't think > | settling on it as - what seems to be - a first resort, is a really > | good idea. > | > | > | On Wed, Sep 2, 2015 at 4:09 PM, Niklas Hambüchen wrote: > | > On 02/09/15 22:42, Kosyrev Serge wrote: > | >> As a wild idea -- did anyone look at /Gitlab/ instead? > | > > | > Hi, yes. It does not currently have a sufficient review > | functionality > | > (cannot handle multiple revisions easily). > | > > | > On 02/09/15 20:51, Simon Marlow wrote: > | >> It might feel better > | >> for the author, but discovering what changed between two branches > | of > | >> multiple commits on github is almost impossible. > | > > | > I disagree with the first part of this: When the UI of the review > | tool > | > is good, it is easy to follow. But there's no open-source > | > implementation of that around. > | > > | > I agree that it is not easy to follow on Github. > | > _______________________________________________ > | > ghc-devs mailing list > | > ghc-devs at haskell.org > | > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > | > > | > | > | > | -- > | Regards, > | > | Austin Seipp, Haskell Consultant > | Well-Typed LLP, http://www.well-typed.com/ > | _______________________________________________ > | ghc-devs mailing list > | ghc-devs at haskell.org > | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gershomb at gmail.com Wed Sep 9 05:18:10 2015 From: gershomb at gmail.com (Gershom B) Date: Wed, 9 Sep 2015 01:18:10 -0400 Subject: [haskell-infrastructure] the platform has outgrown Travis-CI In-Reply-To: References: Message-ID: Mark: Is this still an issue? I don't know if phab is the right place -- would have to check with Austin. But if this is a roadblock on the platform, we should certainly do something :-) -g On Sun, Jun 7, 2015 at 12:33 PM, Mark Lentczner wrote: > I finally figured out what was wrong with the Travis CI build for Haskell > Platform, and got it all working w/hvr's .debs of GHC (for the boot > compiler)... and ran smack into this: > > Your test run exceeded 50 minutes. > > > SO... I'd like to find another CI solution. Is phabricator.haskell.org an > option? Can we / should we create a project there? > > - Mark > > _______________________________________________ > haskell-infrastructure mailing list > haskell-infrastructure at community.galois.com > http://community.galois.com/mailman/listinfo/haskell-infrastructure > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Wed Sep 9 08:26:11 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Wed, 9 Sep 2015 08:26:11 +0000 Subject: Haskell Error Messages In-Reply-To: References: Message-ID: <2ece273850d841c9ada011cb18e3ad9c@DB4PR30MB030.064d.mgd.msft.net> Is there currently any planned work around making the haskell error messages able to support something like the ones in IDRIS, as shown in David Christianson's talk "A Pretty printer that says what it means" at HIW? Not that I know of, but it would be a Good Thing. 
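For concreteness, the idea behind the talk being discussed — error messages as annotated documents rather than flat strings, so tools can inspect them — can be sketched as follows. All the names here (Doc, GhcAnn, etc.) are hypothetical illustrations, not GHC's actual API:

```haskell
{-# LANGUAGE DeriveFunctor #-}
module ErrSketch where

-- An error message is a document whose subtrees can carry semantic
-- annotations (types, terms, ...) instead of being rendered to String
-- up front.
data Doc ann
  = Text String
  | Ann ann (Doc ann)      -- a sub-document tagged with semantic info
  | Doc ann :<>: Doc ann   -- horizontal composition
  | Vcat [Doc ann]         -- vertical composition
  deriving Functor

-- Placeholder annotations; in a real compiler these would be Type,
-- CoreExpr, source spans, and so on.
data GhcAnn = AnnType String | AnnTerm String

-- Plain-text rendering simply discards the annotations ...
render :: Doc ann -> String
render (Text s)   = s
render (Ann _ d)  = render d
render (d :<>: e) = render d ++ render e
render (Vcat ds)  = unlines (map render ds)

-- ... but an IDE can still traverse the structure and recover them.
annotations :: Doc ann -> [ann]
annotations (Text _)   = []
annotations (Ann a d)  = a : annotations d
annotations (d :<>: e) = annotations d ++ annotations e
annotations (Vcat ds)  = concatMap annotations ds
```

The point of the design is that the same value serves both the terminal (via render) and richer clients (via annotations), which is what the Idris-style presentation demonstrates.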
Simon From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Alan & Kim Zimmerman Sent: 08 September 2015 19:49 To: ghc-devs at haskell.org Subject: Haskell Error Messages Is there currently any planned work around making the haskell error messages able to support something like the ones in IDRIS, as shown in David Christianson's talk "A Pretty printer that says what it means" at HIW? https://www.youtube.com/watch?v=m7BBCcIDXSg&list=PLnqUlCo055hVfNkQHP7z43r10yNo-mc7B&index=10 Alan -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.zimm at gmail.com Wed Sep 9 10:31:29 2015 From: alan.zimm at gmail.com (Alan & Kim Zimmerman) Date: Wed, 9 Sep 2015 12:31:29 +0200 Subject: Haskell Error Messages In-Reply-To: <28F3A3E0-3209-4DB6-8455-E6AC740C7E2C@cis.upenn.edu> References: <28F3A3E0-3209-4DB6-8455-E6AC740C7E2C@cis.upenn.edu> Message-ID: That is indeed the ticket capturing the issue. Does anyone have plans to work on it? Alan On Wed, Sep 9, 2015 at 3:37 AM, Richard Eisenberg wrote: > Ticket #8809 (https://ghc.haskell.org/trac/ghc/ticket/8809) seems the > best spot to look for this. > > Richard > > On Sep 8, 2015, at 2:49 PM, "Alan & Kim Zimmerman" > wrote: > > Is there currently any planned work around making the haskell error > messages able to support something like the ones in IDRIS, as shown in > David Christianson's talk "A Pretty printer that says what it means" at HIW? > > > https://www.youtube.com/watch?v=m7BBCcIDXSg&list=PLnqUlCo055hVfNkQHP7z43r10yNo-mc7B&index=10 > > Alan > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From merijn at inconsistent.nl Wed Sep 9 11:14:36 2015 From: merijn at inconsistent.nl (Merijn Verstraaten) Date: Wed, 9 Sep 2015 13:14:36 +0200 Subject: Proposal: Automatic derivation of Lift In-Reply-To: References: Message-ID: <7A720C63-6BD1-4557-AF9A-C8C059AA7298@inconsistent.nl> I proposed automated derivation of Lift earlier (this year even?), but it got shot down as "needless and trivial to do using TH", so if people are now in favour consider me a very strong +1. This would make it significantly easier to implement an efficient version of https://ghc.haskell.org/trac/ghc/wiki/ValidateMonoLiterals proposal as a library using just TH. Cheers, Merijn > On 8 Sep 2015, at 21:01, Ryan Scott wrote: > > Sorry, I forgot to reply-all earlier. > >> I hacked this up quickly just to show that it works in principle. In >> practice, I think it's good to not just represent Int# as Int, but as >> something like UInt where >> >> data UInt = UInt Int# >> >> i.e., is isomorphic to an Int, but distinguishable. Alternatively, >> have a generic "unboxed" flag that could be inserted as a tag into the >> surrounding K. > > I suppose we'd have to decide which is easier for programmers to use. > Do we introduce UInt, UChar, et al. and require that users define > instances of the desired typeclass for them: > > instance Lift UInt where > lift (UInt i) = litE (intPrimL (I# i)) > > or do we introduce an unboxed flag and require users to write generic > GLift instances using that flag: > > instance GLift (K1 Unboxed Int) where > lift (K1 (Int i)) = litE (intPrimL (I# i)) > > The former has the advantage that you wouldn't need to change the > GLift code to distinguish between (K1 Unboxed Int) and (K1 R Int), > which might be a potential source of confusion for programmers. On the > other hand, having an Unboxed flag requires only introducing one new > data type, as opposed to a separate data type for each of the unlifted > types that we want to work over. > > Ryan S. 
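For readers skimming the thread: the class being derived is tiny, and a hand-written instance of the kind -XDeriveLift would generate mechanically looks like the sketch below (Colour is a made-up example type, not from the proposal):

```haskell
{-# LANGUAGE TemplateHaskell #-}
module LiftColour where

import Language.Haskell.TH.Syntax (Lift (..), runQ)

data Colour = RGB Int Int Int

-- What `deriving Lift` would write for us: lift each field via its
-- own Lift instance and rebuild the constructor application as a
-- Template Haskell expression.
instance Lift Colour where
  lift (RGB r g b) = [| RGB r g b |]
```

Deriving would mechanize exactly this per-constructor traversal; the corner cases in the thread (Int# fields, data families, sophisticated type inference) are precisely where the mechanization stops being trivial for th-lift or a generics-based approach.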
> > On Tue, Sep 8, 2015 at 7:59 AM, Andres Loeh wrote: >> I don't think there's any fundamental reason why unboxed fields >> prevent a Generic instance, as long as we're happy that unboxed values >> will be re-boxed in the generic representation. It simply seems as if >> nobody has thought of implementing this. As an example, consider the >> following hand-written example which works just fine: >> >> {-# LANGUAGE MagicHash, KindSignatures, PolyKinds, TypeOperators, >> TypeFamilies #-} >> module GenUnboxed where >> >> import GHC.Exts >> import GHC.Generics >> import Generics.Deriving.Eq >> >> data UPair = UPair Int# Char# >> >> instance Generic UPair where >> type Rep UPair = K1 R Int :*: K1 R Char >> from (UPair x y) = K1 (I# x) :*: K1 (C# y) >> to (K1 (I# x) :*: K1 (C# y)) = UPair x y >> >> instance GEq UPair >> >> test :: Bool >> test = let p = UPair 3# 'x'# in geq p p >> >> Cheers, >> Andres >> >> On Mon, Sep 7, 2015 at 10:02 PM, Ryan Scott wrote: >>> Unlifted types can't be used polymorphically or in instance >>> declarations, so this makes it impossible to do something like >>> >>> instance Generic Int# >>> >>> or store an Int# in one branch of a (:*:), preventing generics from >>> doing anything in #-land. (unless someone has found a way to hack >>> around this). >>> >>> I would be okay with implementing a generics-based approach, but we'd >>> have to add a caveat that it will only work out-of-the-box on GHC 8.0 >>> or later, due to TH's need to look up package information. (We could >>> give users the ability to specify a package name manually as a >>> workaround.) >>> >>> If this were added, where would be the best place to put it? th-lift? >>> generic-deriving? template-haskell? A new package (lift-generics)? >>> >>> Ryan S. >>> >>> On Mon, Sep 7, 2015 at 3:10 PM, Matthew Pickering >>> wrote: >>>> Continuing my support of the generics route. Is there a fundamental >>>> reason why it couldn't handle unlifted types? 
Given their relative >>>> paucity, it seems like a fair compromise to generically define lift >>>> instances for all normal data types but require TH for unlifted types. >>>> This approach seems much smoother from a maintenance perspective. >>>> >>>> On Mon, Sep 7, 2015 at 5:26 PM, Ryan Scott wrote: >>>>> There is a Lift typeclass defined in template-haskell [1] which, when >>>>> a data type is an instance, permits it to be directly used in a TH >>>>> quotation, like so >>>>> >>>>> data Example = Example >>>>> >>>>> instance Lift Example where >>>>> lift Example = conE (mkNameG_d "" "" "Example") >>>>> >>>>> e :: Example >>>>> e = [| Example |] >>>>> >>>>> Making Lift instances for most data types is straightforward and >>>>> mechanical, so the proposal is to allow automatic derivation of Lift >>>>> via a -XDeriveLift extension: >>>>> >>>>> data Example = Example deriving Lift >>>>> >>>>> This is actually a pretty old proposal [2], dating back to >>>>> 2007. I wanted to have this feature for my needs, so I submitted a >>>>> proof-of-concept at the GHC Trac issue page [3]. >>>>> >>>>> The question now is: do we really want to bake this feature into GHC? >>>>> Since not many people opined on the Trac page, I wanted to submit this >>>>> here for wider visibility and to have a discussion. >>>>> >>>>> Here are some arguments I have heard against this feature (please tell >>>>> me if I am misrepresenting your opinion): >>>>> >>>>> * We already have a th-lift package [4] on Hackage which allows >>>>> derivation of Lift via Template Haskell functions. In addition, if >>>>> you're using Lift, chances are you're also using the -XTemplateHaskell >>>>> extension in the first place, so th-lift should be suitable. >>>>> * The same functionality could be added via GHC generics (as of GHC >>>>> 7.12/8.0, which adds the ability to reify a datatype's package name >>>>> [5]), if -XTemplateHaskell can't be used. 
>>>>> * Adding another -XDerive- extension places a burden on GHC devs to >>>>> maintain it in the future in response to further Template Haskell >>>>> changes. >>>>> >>>>> Here are my (opinionated) responses to each of these: >>>>> >>>>> * th-lift isn't as fully-featured as a -XDerive- extension at the >>>>> moment, since it can't do sophisticated type inference [6] or derive >>>>> for data families. This is something that could be addressed with a >>>>> patch to th-lift, though. >>>>> * GHC generics wouldn't be enough to handle unlifted types like Int#, >>>>> Char#, or Double# (which other -XDerive- extensions do). >>>>> * This is a subjective measurement, but in terms of the amount of code >>>>> I had to add, -XDeriveLift was substantially simpler than other >>>>> -XDerive extensions, because there are fewer weird corner cases. Plus, >>>>> I'd volunteer to maintain it :) >>>>> >>>>> Simon PJ wanted to know if other Template Haskell programmers would >>>>> find -XDeriveLift useful. Would you be able to use it? Would you like >>>>> to see a solution other than putting it into GHC? I'd love to hear >>>>> feedback so we can bring some closure to this 8-year-old feature >>>>> request. >>>>> >>>>> Ryan S. 
>>>>> >>>>> ----- >>>>> [1] http://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH-Syntax.html#t:Lift >>>>> [2] https://mail.haskell.org/pipermail/template-haskell/2007-October/000635.html >>>>> [3] https://ghc.haskell.org/trac/ghc/ticket/1830 >>>>> [4] http://hackage.haskell.org/package/th-lift >>>>> [5] https://ghc.haskell.org/trac/ghc/ticket/10030 >>>>> [6] https://ghc.haskell.org/trac/ghc/ticket/1830#comment:11 >>>>> _______________________________________________ >>>>> ghc-devs mailing list >>>>> ghc-devs at haskell.org >>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs at haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP using GPGMail URL: From jmcf125 at openmailbox.org Wed Sep 9 11:22:26 2015 From: jmcf125 at openmailbox.org (jmcf125 at openmailbox.org) Date: Wed, 9 Sep 2015 12:22:26 +0100 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: <55EFEC9F.6070001@centrum.cz> References: <20150907181811.GA1668@jmcf125-Acer-Arch.home> <55EDE771.9010404@centrum.cz> <20150908203157.GA1557@jmcf125-Acer-Arch.home> <55EFEC9F.6070001@centrum.cz> Message-ID: <20150909111818.GA1439@jmcf125-Acer-Arch.home> Hi, sorry for not sending a CC to the mailing list, works differently from others I've used > > $ ls -R /home/jmcf125/ghc-raspberry-pi/sysroot/* > >/home/jmcf125/ghc-raspberry-pi/sysroot/lib: > >libcursesw.so libncurses.so libncurses++.so libncurses.so.5@ libncurses.so.5.9* libncursesw.so@ libncursesw.so.5@ libncursesw.so.5.9* > > > >/home/jmcf125/ghc-raspberry-pi/sysroot/usr: > >include/ > > > >/home/jmcf125/ghc-raspberry-pi/sysroot/usr/include: > >cursesapp.h cursesf.h curses.h cursesm.h cursesp.h cursesw.h cursslk.h ncurses_dll.h ncurses.h@ slcurses.h > > > This is misunderstadning, you need whole sysroot package for ARMv6. That > means basically copy whole / or just google for it. > > The reason is simple curses headers include another headers and those will > not be available in your way. Ditto for libs which depends at least on libc. That was actually just the 1st thing I tried, then as I said: > >I also tried extracting the ncurses official package for ARMv6 itself to > >sysroot, and got another error message, where the compiler did work. I > >begun to wonder whether I need a full ARMv6 system in sysroot to make > >this work, and as my Raspberry Pi is on an external hard drive, I > >mounted it on /mnt/jrpi, and changed the --sysroot option to it. > >Finally, configure worked! So, when doing that, configure works (the logs were for the previous attempts). 
But why is that when I try to make, with a working configure, and a real sysroot, that I get the following error? > > $ make -j5 > >(...) > >"inplace/bin/mkdirhier" compiler/stage2/doc/html/ghc//. > ><> > >"inplace/bin/mkdirhier" utils/hsc2hs/dist-install/build/tmp//. > >"inplace/bin/mkdirhier" utils/ghc-pkg/dist-install/build/tmp//. > >"inplace/bin/mkdirhier" utils/ghctags/dist-install/build/tmp//. > ><> > >"inplace/bin/mkdirhier" utils/ghc-pwd/dist-install/build/tmp//. > >"inplace/bin/mkdirhier" utils/ghc-cabal/dist-install/build/tmp//. > >"inplace/bin/mkdirhier" utils/hpc/dist-install/build/tmp//. > >"inplace/bin/mkdirhier" utils/runghc/dist-install/build/tmp//. > >docs/users_guide/users_guide.xml > >make[1]: docs/users_guide/users_guide.xml: Command not found > >docs/users_guide/what_glasgow_exts_does.gen.xml > >docs/users_guide/ghc.mk:24: recipe for target 'docs/users_guide/users_guide.xml' failed > >make[1]: *** [docs/users_guide/users_guide.xml] Error 127 > >make[1]: *** Waiting for unfinished jobs.... 
> >make[1]: docs/users_guide/what_glasgow_exts_does.gen.xml: Command not found > >docs/users_guide/ghc.mk:24: recipe for target 'docs/users_guide/what_glasgow_exts_does.gen.xml' failed > >make[1]: *** [docs/users_guide/what_glasgow_exts_does.gen.xml] Error 127 > >Writing utils/haddock/doc/haddock/license.html for section(license) > >Writing utils/haddock/doc/haddock/ch01s03.html for section > >Writing utils/haddock/doc/haddock/ch01s04.html for section > >Writing utils/haddock/doc/haddock/introduction.html for chapter(introduction) > >Writing utils/haddock/doc/haddock/invoking.html for chapter(invoking) > >Writing utils/haddock/doc/haddock/ch03s02.html for section > >Writing utils/haddock/doc/haddock/ch03s03.html for section > >Writing utils/haddock/doc/haddock/ch03s04.html for section > >Writing utils/haddock/doc/haddock/ch03s05.html for section > >Writing utils/haddock/doc/haddock/hyperlinking.html for section(hyperlinking) > >Writing utils/haddock/doc/haddock/module-attributes.html for section(module-attributes) > >Writing utils/haddock/doc/haddock/ch03s08.html for section > >Writing utils/haddock/doc/haddock/markup.html for chapter(markup) > >Writing utils/haddock/doc/haddock/ix01.html for index > >Writing utils/haddock/doc/haddock/index.html for book(haddock) > >cp mk/fptools.css utils/haddock/doc/haddock/ > >Makefile:71: recipe for target 'all' failed > >make: *** [all] Error 2 ghc-stage1 logically does not work, although it is created. I hope this is clearer. Cheers, João Miguel From simonpj at microsoft.com Wed Sep 9 11:31:08 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Wed, 9 Sep 2015 11:31:08 +0000 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> Message-ID: | (denotationally, at least) is the outer-most level. 
That's why I liked | the original proposal (which probably disappeared too fast for most | people to read it), which was more like being able to talk about `!a` | as a thing in itself. It's the only semantic gap in being able to That's another interesting idea. Currently ! is an annotation on a data constructor argument (only). It is not a type-former; so that !Int or !(Int ->Int) are not types. I think your point is that if we spell "Force" as "!" then it becomes a first-class type former. Let's try that: * t and !t are distinct types * You get from one to the other by using ! as a term-level data constructor, so you can use it in terms and in pattern matching. Thus - if (e::t) then (!e :: !t) - if (e::!t) then case of !x -> blah here (x::t) * As with newtype constructors, pattern-matching on !t doesn't imply evaluation; after all, it's already evaluated. * In Richard's notation, the type constructor (!) has kind (!) :: TYPE 'Boxed l -> TYPE 'Boxed Unlifted The wrinkle for data types is this. The declaration data T = MkT !Int !a produces a data constructor with type MkT :: forall a. Int -> a -> T But if (!t) is a first-class type, then you'd expect to get a data constructor with type MkT :: forall a. !Int -> !a -> T and that in turn would force many calls to look like (MkT !e1 !e2). But we could re-cast the special treatment for the top-level bang on data constructor arguments, to say that we get a worker/wrapper story so that the programmer's eye view is indeed that MkT :: forall a. Int -> a -> T, and the compiler generates the eval code to match things up. (Which is what happens now.) But if you wrote, say, data S = MkS (!Int, !a) then this magic would not happen and you really would get MkS :: (!Int, !a) -> S Interesting. Simon | -----Original Message----- | From: Dan Doel [mailto:dan.doel at gmail.com] | Sent: 09 September 2015 03:44 | To: Simon Peyton Jones | Cc: Edward Z. 
Yang; ghc-devs | Subject: Re: Unlifted data types | | On Tue, Sep 8, 2015 at 3:52 AM, Simon Peyton Jones | wrote: | > | And to | > | be honest, I'm not sure we need arbitrary data types in Unlifted; | > | Force (which would be primitive) might be enough. | > | > That's an interesting thought. But presumably you'd have to use | 'suspend' (a terrible name) a lot: | > | > type StrictList a = Force (StrictList' a) data StrictList' a = Nil | | > Cons !a (StrictList a) | > | > mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs | = | > mapStrict' f (suspend xs) | > | > mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' | f | > Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) | > | > | > That doesn't look terribly convenient. | | It's missing the part that makes it convenient. | | type StrictList a = Force (StrictList' a) | data StrictList' a = Nil' | Cons' !a (StrictList a) | pattern Nil = Force Nil' | pattern Cons x xs = Force (Cons' x xs) | | mapStrict :: (a -> b) -> StrictList a -> StrictList b | mapStrict f Nil = Nil | mapStrict f (Cons x xs) = Cons (f x) (mapStrict f xs) | | But, really, my point is that we already almost have StrictList | _today_: | | data StrictList a = Nil | Cons !a !(StrictList a) | | The only difference between this and the previous definition | (denotationally, at least) is the outer-most level. That's why I liked | the original proposal (which probably disappeared too fast for most | people to read it), which was more like being able to talk about `!a` | as a thing in itself. It's the only semantic gap in being able to | define totally unlifted data types right now. So maybe it's also the | only operational gap that needs to be plugged, as well. | | But that was vetoed because `!a` in a data declaration doesn't make a | constructor with type `!a -> ...`, but `a -> ...` which evaluates. | | > Really? Presumably UMVar is a new primitive? With a family of | operations like MVar? 
If so can't we just define | > newtype UMVar a = UMV (MVar a) | > putUMVar :: UMVar a -> a -> IO () | > putUMVar (UMVar v) x = x `seq` putMVar v x | > | > I don't see Force helping here. | | Yes, true. It only helps ensure that the implementation is correct, | rather than enabling a previously impossible construction. Kind of | like certain uses of GADTs vs. phantom types. | | But the ArrayArray people already want UMVar (and the like) anyway, | because it cuts out a layer of indirection for types that are already | unlifted. | | -- Dan From eir at cis.upenn.edu Wed Sep 9 12:17:24 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Wed, 9 Sep 2015 08:17:24 -0400 Subject: Proposal: accept pull requests on GitHub In-Reply-To: References: <55E7453A.90309@gmail.com> <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> <55E76572.3050405@nh2.me> <1469c7be53ed4f0dab3872de9fe5ad54@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <94294117-CFB0-40C3-BC3C-54AD4B624C2F@cis.upenn.edu> > > (I'm tempted naively to ask: is there an automated way to go from a GitHub PR to a Phab ticket? Then we could convert the former (if someone wants to submit that way) into the latter.) Or: is there a way contributors can create a Phab differential off a GitHub branch? This bypasses the GitHub PR but still provides a similar ease-of-use. This one seems rather easy to imagine. Richard From simonpj at microsoft.com Wed Sep 9 12:28:01 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Wed, 9 Sep 2015 12:28:01 +0000 Subject: Unlifted data types In-Reply-To: <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> Message-ID: I like your suggestion! 
I think it'd be better to have TYPE :: TypeShape -> * data TypeShape = Unboxed | Boxed Levity data Levity = Lifted | Unlifted Now the awkward "unboxed/lifted" combination doesn't arise. Now, 'error' is "TypeShape-polymorphic": error :: forall (ts :: TypeShape) (a :: TYPE ts). String -> a What functions (if any) are "Levity-polymorphic"? That is, they have identical code regardless of whether their (boxed) type args are lifted or unlifted? Answer: data constructors. As Richard says Cons :: forall (v1::Levity) (v2::Levity) (a::TYPE (Boxed v1)). a -> List v1 v2 a -> List v1 v2 a Why can it be levity-polymorphic? Because data constructors guarantee to construct no intermediate values (which they would be unsure whether or not to evaluate). Typically, though, functions over lists would not be levity-polymorphic, for reasons previously discussed. The awkward spot is the runtime system. Currently it goes to some lengths to ensure that it never introduces an indirection for a boxed-but-unlifted type. Simon Marlow would know exactly where. So I suspect that we WOULD need two versions of each (levity-polymorphic) data constructor, alas. And we'd need to know which version to use at every call site, which means the code would need to be levity-monomorphic. So we really would get very little levity polymorphism indeed. I think. All this fits nicely with Dan Doel's point about making ! a first class type constructor. Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of | Richard Eisenberg | Sent: 08 September 2015 14:46 | To: ghc-devs | Subject: Re: Unlifted data types | | I have put up an alternate set of proposals on | | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes | | These sidestep around `Force` and `suspend` but probably have other | problems. They make heavy use of levity polymorphism. | | Back story: this all was developed in a late-evening Haskell Symposium | session that took place in the hotel bar. 
It seems Edward and I walked | away with quite different understandings of what had taken place. I've | written up my understanding. Most likely, the Right Idea is a | combination of this all! | | See what you think. | | Thanks! | Richard | | On Sep 8, 2015, at 3:52 AM, Simon Peyton Jones | wrote: | | > | And to | > | be honest, I'm not sure we need arbitrary data types in Unlifted; | > | Force (which would be primitive) might be enough. | > | > That's an interesting thought. But presumably you'd have to use | 'suspend' (a terrible name) a lot: | > | > type StrictList a = Force (StrictList' a) data StrictList' a = Nil | | > Cons !a (StrictList a) | > | > mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs | = | > mapStrict' f (suspend xs) | > | > mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' | f | > Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) | > | > | > That doesn't look terribly convenient. | > | > | ensure that threads don't simply | > | pass thunks between each other. But, if you have unlifted types, | > | then you can have: | > | | > | data UMVar (a :: Unlifted) | > | | > | and then the type rules out the possibility of passing thunks | > | through a reference (at least at the top level). | > | > Really? Presumably UMVar is a new primitive? With a family of | operations like MVar? If so can't we just define | > newtype UMVar a = UMV (MVar a) | > putUMVar :: UMVar a -> a -> IO () | > putUMVar (UMVar v) x = x `seq` putMVar v x | > | > I don't see Force helping here. 
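Simon's strict-MVar sketch in the quote above already works in today's GHC (modulo a constructor-name slip: the newtype constructor is UMV, but the pattern says UMVar). A self-contained version, with the missing operations filled in as illustrative assumptions:

```haskell
module StrictMVar (UMVar, newUMVar, putUMVar, takeUMVar) where

import Control.Concurrent.MVar

-- A strict MVar: forcing the value before putMVar guarantees that no
-- thunk is ever stored (at the top level). This is the invariant the
-- proposed (a :: Unlifted) kind would enforce in the types rather
-- than by convention.
newtype UMVar a = UMV (MVar a)

newUMVar :: IO (UMVar a)
newUMVar = UMV <$> newEmptyMVar

putUMVar :: UMVar a -> a -> IO ()
putUMVar (UMV v) x = x `seq` putMVar v x

takeUMVar :: UMVar a -> IO a
takeUMVar (UMV v) = takeMVar v
```

As the thread notes, the newtype only enforces the invariant if clients stick to this module's API; the point of unlifted kinds (and of the ArrayArray-style primitives) is to make the guarantee structural and to drop the extra indirection.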
| > | > Simon | > _______________________________________________ | > ghc-devs mailing list | > ghc-devs at haskell.org | > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs | | _______________________________________________ | ghc-devs mailing list | ghc-devs at haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From karel.gardas at centrum.cz Wed Sep 9 12:32:12 2015 From: karel.gardas at centrum.cz (Karel Gardas) Date: Wed, 09 Sep 2015 14:32:12 +0200 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: <20150909111818.GA1439@jmcf125-Acer-Arch.home> References: <20150907181811.GA1668@jmcf125-Acer-Arch.home> <55EDE771.9010404@centrum.cz> <20150908203157.GA1557@jmcf125-Acer-Arch.home> <55EFEC9F.6070001@centrum.cz> <20150909111818.GA1439@jmcf125-Acer-Arch.home> Message-ID: <55F026CC.7090001@centrum.cz> On 09/ 9/15 01:22 PM, jmcf125 at openmailbox.org wrote: > So, when doing that, configure works (the logs were for the previous > attempts). But why is that when I try to make, with a working configure, > and a real sysroot, that I get the following error? >>> $ make -j5 >>> (...) >>> "inplace/bin/mkdirhier" compiler/stage2/doc/html/ghc//. >>> <> >>> "inplace/bin/mkdirhier" utils/hsc2hs/dist-install/build/tmp//. >>> "inplace/bin/mkdirhier" utils/ghc-pkg/dist-install/build/tmp//. >>> "inplace/bin/mkdirhier" utils/ghctags/dist-install/build/tmp//. >>> <> >>> "inplace/bin/mkdirhier" utils/ghc-pwd/dist-install/build/tmp//. >>> "inplace/bin/mkdirhier" utils/ghc-cabal/dist-install/build/tmp//. >>> "inplace/bin/mkdirhier" utils/hpc/dist-install/build/tmp//. >>> "inplace/bin/mkdirhier" utils/runghc/dist-install/build/tmp//. 
>>> docs/users_guide/users_guide.xml >>> make[1]: docs/users_guide/users_guide.xml: Command not found >>> docs/users_guide/what_glasgow_exts_does.gen.xml >>> docs/users_guide/ghc.mk:24: recipe for target 'docs/users_guide/users_guide.xml' failed >>> make[1]: *** [docs/users_guide/users_guide.xml] Error 127 >>> make[1]: *** Waiting for unfinished jobs.... >>> make[1]: docs/users_guide/what_glasgow_exts_does.gen.xml: Command not found >>> docs/users_guide/ghc.mk:24: recipe for target 'docs/users_guide/what_glasgow_exts_does.gen.xml' failed >>> make[1]: *** [docs/users_guide/what_glasgow_exts_does.gen.xml] Error 127 >>> Writing utils/haddock/doc/haddock/license.html for section(license) >>> Writing utils/haddock/doc/haddock/ch01s03.html for section >>> Writing utils/haddock/doc/haddock/ch01s04.html for section >>> Writing utils/haddock/doc/haddock/introduction.html for chapter(introduction) >>> Writing utils/haddock/doc/haddock/invoking.html for chapter(invoking) >>> Writing utils/haddock/doc/haddock/ch03s02.html for section >>> Writing utils/haddock/doc/haddock/ch03s03.html for section >>> Writing utils/haddock/doc/haddock/ch03s04.html for section >>> Writing utils/haddock/doc/haddock/ch03s05.html for section >>> Writing utils/haddock/doc/haddock/hyperlinking.html for section(hyperlinking) >>> Writing utils/haddock/doc/haddock/module-attributes.html for section(module-attributes) >>> Writing utils/haddock/doc/haddock/ch03s08.html for section >>> Writing utils/haddock/doc/haddock/markup.html for chapter(markup) >>> Writing utils/haddock/doc/haddock/ix01.html for index >>> Writing utils/haddock/doc/haddock/index.html for book(haddock) >>> cp mk/fptools.css utils/haddock/doc/haddock/ >>> Makefile:71: recipe for target 'all' failed >>> make: *** [all] Error 2 > > ghc-stage1 logically does not work, although it is created. How exactly ghc-stage1 behaves? What does ghc-stage1 --info tells you? 
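If the failures really are confined to the documentation build (the "Command not found" for users_guide.xml suggests the DocBook toolchain variables came up empty at configure time), one pragmatic workaround with the make-based build is to switch the docs off in mk/build.mk before rebuilding. The variable names below are from the build system of that era; check them against mk/build.mk.sample in your tree:

```make
# mk/build.mk -- sketch: disable documentation for a cross build.
HADDOCK_DOCS       = NO
BUILD_DOCBOOK_HTML = NO
BUILD_DOCBOOK_PS   = NO
BUILD_DOCBOOK_PDF  = NO
```

That sidesteps the doc errors so the real question — whether ghc-stage1 itself works — can be answered in isolation.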
Karel From eir at cis.upenn.edu Wed Sep 9 12:35:04 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Wed, 9 Sep 2015 08:35:04 -0400 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> Message-ID: On Sep 9, 2015, at 8:28 AM, Simon Peyton Jones wrote: > I think it'd be better to have > > TYPE :: TypeShape -> * > > data TypeShape = Unboxed | Boxed Levity > data Levity = Lifted | Unlifted > Yes, of course. > So we really would get very little levity polymorphism indeed. I think. That's right. The levity polymorphism is, essentially, only to have a nice type inference story. Once the code gets passed to the back end, the polymorphism would have to be removed. My idea was to use it to allow users to gloss (somewhat) over the ! vs. no-! distinction by having the compiler do the Right Thing during inference. Richard From simonpj at microsoft.com Wed Sep 9 12:40:27 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Wed, 9 Sep 2015 12:40:27 +0000 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> Message-ID: | That's right. The levity polymorphism is, essentially, only to have a | nice type inference story. Once the code gets passed to the back end, | the polymorphism would have to be removed. My idea was to use it to | allow users to gloss (somewhat) over the ! vs. no-! distinction by | having the compiler do the Right Thing during inference. Can you be more specific? What does "gloss over" mean?
S | -----Original Message----- | From: Richard Eisenberg [mailto:eir at cis.upenn.edu] | Sent: 09 September 2015 13:35 | To: Simon Peyton Jones | Cc: ghc-devs | Subject: Re: Unlifted data types | | | On Sep 9, 2015, at 8:28 AM, Simon Peyton Jones | wrote: | | > I think it'd be better to have | > | > TYPE :: TypeShape -> * | > | > data TypeShape = Unboxed | Boxed Levity data Levity = Lifted | | > Unlifted | > | | Yes, of course. | | > So we really would get very little levity polymorphism ineed. I | think. | | That's right. The levity polymorphism is, essentially, only to have a | nice type inference story. Once the code gets passed to the back end, | the polymorphism would have to be removed. My idea was to use it to | allow users to gloss (somewhat) over the ! vs. no-! distinction by | having the compiler to the Right Thing during inference. | | Richard From eir at cis.upenn.edu Wed Sep 9 12:44:53 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Wed, 9 Sep 2015 08:44:53 -0400 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> Message-ID: On Sep 9, 2015, at 8:40 AM, Simon Peyton Jones wrote: > Can you be more specific? What does "gloss over" mean? I mean that, for example, `length` will work over both strict lists and lazy lists. It will infer the strictness of its argument through ordinary type inference. So users have to be aware of strictness, but they will be able to use the same functions in both cases. 
Richard From simonpj at microsoft.com Wed Sep 9 12:57:44 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Wed, 9 Sep 2015 12:57:44 +0000 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> Message-ID: <072d804f206c47aeb49ca7d610d120e5@DB4PR30MB030.064d.mgd.msft.net> | I mean that, for example, `length` will work over both strict lists | and lazy lists. It will infer the strictness of its argument through | ordinary type inference. So users have to be aware of strictness, but | they will be able to use the same functions in both cases. I didn't understand that. You mean that 'length' will be levity-polymorphic, but 'map' will not? What are the rules for determining which is which? (Which functions, exactly, can be levity-polymorphic?) From jmcf125 at openmailbox.org Wed Sep 9 12:59:26 2015 From: jmcf125 at openmailbox.org (jmcf125 at openmailbox.org) Date: Wed, 9 Sep 2015 13:59:26 +0100 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: <55F026CC.7090001@centrum.cz> References: <20150907181811.GA1668@jmcf125-Acer-Arch.home> <55EDE771.9010404@centrum.cz> <20150908203157.GA1557@jmcf125-Acer-Arch.home> <55EFEC9F.6070001@centrum.cz> <20150909111818.GA1439@jmcf125-Acer-Arch.home> <55F026CC.7090001@centrum.cz> Message-ID: <20150909125926.GB1439@jmcf125-Acer-Arch.home> > How exactly ghc-stage1 behaves? What does ghc-stage1 --info tells you? $ ../inplace/bin/ghc-stage1 --make HelloWorld.lhs HelloWorld.lhs:1:1: Could not find module ‘Prelude’ There are files missing in the ‘base-4.8.1.0’ package, try running 'ghc-pkg check'. Use -v to see a list of the files searched for. $ ../inplace/bin/ghc-pkg check -v >ghc-pkg.check.log 2>&1 -------------- next part -------------- A non-text attachment was scrubbed...
Name: HelloWorld.lhs Type: text/x-literate-haskell Size: 95 bytes Desc: not available URL: From oleg.grenrus at iki.fi Wed Sep 9 13:03:19 2015 From: oleg.grenrus at iki.fi (Oleg Grenrus) Date: Wed, 9 Sep 2015 16:03:19 +0300 Subject: Proposal: accept pull requests on GitHub In-Reply-To: <94294117-CFB0-40C3-BC3C-54AD4B624C2F@cis.upenn.edu> References: <55E7453A.90309@gmail.com> <87mvx4mu2x.fsf@andromedae.feelingofgreen.ru> <55E76572.3050405@nh2.me> <1469c7be53ed4f0dab3872de9fe5ad54@DB4PR30MB030.064d.mgd.msft.net> <94294117-CFB0-40C3-BC3C-54AD4B624C2F@cis.upenn.edu> Message-ID: <6012D2DF-2255-4117-86A2-E6D45462B0D2@iki.fi> As a junior ghc contributor, I have to comment that git push -u my-fork my-branch and arc diff are about the same ‘cognitive load’. Yes, one must have arc on the machine, but if the right version could live as a submodule in the GHC tree: even better. And a bit tangential comment: I can imagine some isolated component being developed by responsible people in the way they find most productive, and then pushed to the central repository, in smaller or larger chunks. So not a single central repository, but more like a tree. Yet Austin/GHCHQ would need to care only about the root. For example someone with push access could say ‘I’ll review and accept GitHub PRs touching only the base library in the GHC tree, and submit Phab differentials upstream’. The con is that the communication distance between the original contributor and the end reviewer in Phabricator will increase, but it could still work if the person in between is committed enough. Yet base is a bit of a bad example, as almost every change is first discussed on the libraries-list. In some sense, this is ‘Phab differential off a GitHub branch’, but with a real person in between, not the script. And maybe someone does something like that already, but not publicly. Git is a distributed system, after all. Oleg G.
> On 09 Sep 2015, at 15:17, Richard Eisenberg wrote: > > >>> (I'm tempted naively to ask: is there an automated way to go from a GitHub PR to a Phab ticket? Then we could convert the former (if someone wants to submit that way) into the latter.) > > Or: is there a way contributors can create a Phab differential off a GitHub branch? This bypasses the GitHub PR but still provides a similar ease-of-use. This one seems rather easy to imagine. > > Richard > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP using GPGMail URL: From eir at cis.upenn.edu Wed Sep 9 13:03:38 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Wed, 9 Sep 2015 09:03:38 -0400 Subject: Unlifted data types In-Reply-To: <072d804f206c47aeb49ca7d610d120e5@DB4PR30MB030.064d.mgd.msft.net> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> <072d804f206c47aeb49ca7d610d120e5@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <3481E4D1-F4DD-47BA-9818-665F22928CAD@cis.upenn.edu> On Sep 9, 2015, at 8:57 AM, Simon Peyton Jones wrote: > | I mean that, for example, `length` will work over both strict lists > | and lazy lists. It will infer the strictness of its argument through > | ordinary type inference. So users have to be aware of strictness, but > | they will be able to use the same functions in both cases. > > I didn't understand that. You mean that 'length' will be levity-polymorphic, but 'map' will not? What are the rules for determining which is which? (Which functions, exactly, can be levity-polymorphic??? 
> No functions (excepting `error` and friends) are truly levity polymorphic. We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it. For example, `map :: (a -> b) -> [a] -> [b]` has 4 places where levity polymorphism comes into play: the type a, the type b, the source list, and the result list. Any of these could be lazy or strict. And, I believe, all the combinations make sense. This means that we have one source Haskell declaration -- map -- that corresponds to 16 compiled functions. Obviously we wish to optimize somehow, and perhaps if we can't, this idea is bogus. But this is the basic idea. To be clear, nothing is different about `length` than `map` -- the fact that length is a consumer is utterly irrelevant in my example. Richard From jmcf125 at openmailbox.org Wed Sep 9 13:09:06 2015 From: jmcf125 at openmailbox.org (jmcf125 at openmailbox.org) Date: Wed, 9 Sep 2015 14:09:06 +0100 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: <20150909125926.GB1439@jmcf125-Acer-Arch.home> References: <20150907181811.GA1668@jmcf125-Acer-Arch.home> <55EDE771.9010404@centrum.cz> <20150908203157.GA1557@jmcf125-Acer-Arch.home> <55EFEC9F.6070001@centrum.cz> <20150909111818.GA1439@jmcf125-Acer-Arch.home> <55F026CC.7090001@centrum.cz> <20150909125926.GB1439@jmcf125-Acer-Arch.home> Message-ID: <20150909130906.GE1439@jmcf125-Acer-Arch.home> > How exactly ghc-stage1 behaves? What does ghc-stage1 --info tells you? Again, sorry, I really thought I included this. 
[("Project name","The Glorious Glasgow Haskell Compilation System") ,("GCC extra via C opts"," -fwrapv") ,("C compiler command","arm-linux-gnueabihf-gcc-sysroot") ,("C compiler flags"," -fno-stack-protector") ,("C compiler link flags"," -fuse-ld=gold -Wl,-z,noexecstack") ,("Haskell CPP command","arm-linux-gnueabihf-gcc-sysroot") ,("Haskell CPP flags","-E -undef -traditional ") ,("ld command","/home/jmcf125/ghc-raspberry-pi/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin/arm-linux-gnueabihf-ld.gold") ,("ld flags"," -z noexecstack") ,("ld supports compact unwind","YES") ,("ld supports build-id","YES") ,("ld supports filelist","NO") ,("ld is GNU ld","YES") ,("ar command","/home/jmcf125/ghc-raspberry-pi/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin/arm-linux-gnueabihf-ar") ,("ar flags","q") ,("ar supports at file","YES") ,("touch command","touch") ,("dllwrap command","/bin/false") ,("windres command","/bin/false") ,("libtool command","libtool") ,("perl command","/usr/bin/perl") ,("cross compiling","YES") ,("target os","OSLinux") ,("target arch","ArchARM {armISA = ARMv6, armISAExt = [VFPv2], armABI = HARD}") ,("target word size","4") ,("target has GNU nonexec stack","False") ,("target has .ident directive","True") ,("target has subsections via symbols","False") ,("Unregisterised","NO") ,("LLVM llc command","llc") ,("LLVM opt command","opt") ,("Project version","7.10.2") ,("Project Git commit id","0da488c4438d88c9252e0b860426b8e74b5fc9e8") ,("Booter version","7.10.1") ,("Stage","1") ,("Build platform","x86_64-unknown-linux") ,("Host platform","x86_64-unknown-linux") ,("Target platform","arm-unknown-linux") ,("Have interpreter","YES") ,("Object splitting supported","NO") ,("Have native code generator","NO") ,("Support SMP","YES") ,("Tables next to code","YES") ,("RTS ways","l debug thr thr_debug thr_l ") ,("Support dynamic-too","YES") ,("Support parallel --make","YES") ,("Support reexported-modules","YES") ,("Support thinning and renaming 
package flags","YES") ,("Uses package keys","YES") ,("Dynamic by default","NO") ,("GHC Dynamic","NO") ,("Leading underscore","NO") ,("Debug on","False") ,("LibDir","/home/jmcf125/ghc-raspberry-pi/ghc/inplace/lib") ,("Global Package DB","/home/jmcf125/ghc-raspberry-pi/ghc/inplace/lib/package.conf.d") ] From karel.gardas at centrum.cz Wed Sep 9 13:23:19 2015 From: karel.gardas at centrum.cz (Karel Gardas) Date: Wed, 09 Sep 2015 15:23:19 +0200 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: <20150909125926.GB1439@jmcf125-Acer-Arch.home> References: <20150907181811.GA1668@jmcf125-Acer-Arch.home> <55EDE771.9010404@centrum.cz> <20150908203157.GA1557@jmcf125-Acer-Arch.home> <55EFEC9F.6070001@centrum.cz> <20150909111818.GA1439@jmcf125-Acer-Arch.home> <55F026CC.7090001@centrum.cz> <20150909125926.GB1439@jmcf125-Acer-Arch.home> Message-ID: <55F032C7.9020307@centrum.cz> So ghc-stage1 is working. Good! Now just to find why your base is broken, please rebuild ghc completely and this time does not use any -j 5 option. It'll use just one core, but will stop on the first error. Let's see how far you get. Karel On 09/ 9/15 02:59 PM, jmcf125 at openmailbox.org wrote: >> How exactly ghc-stage1 behaves? What does ghc-stage1 --info tells you? > $ ../inplace/bin/ghc-stage1 --make HelloWorld.lhs > > HelloWorld.lhs:1:1: > Could not find module ?Prelude? > There are files missing in the ?base-4.8.1.0? package, > try running 'ghc-pkg check'. > Use -v to see a list of the files searched for. 
> > $ ../inplace/bin/ghc-pkg check -v >ghc-pkg.check.log 2>&1 > From jmcf125 at openmailbox.org Wed Sep 9 14:21:14 2015 From: jmcf125 at openmailbox.org (jmcf125 at openmailbox.org) Date: Wed, 9 Sep 2015 15:21:14 +0100 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: <55F032C7.9020307@centrum.cz> References: <20150907181811.GA1668@jmcf125-Acer-Arch.home> <55EDE771.9010404@centrum.cz> <20150908203157.GA1557@jmcf125-Acer-Arch.home> <55EFEC9F.6070001@centrum.cz> <20150909111818.GA1439@jmcf125-Acer-Arch.home> <55F026CC.7090001@centrum.cz> <20150909125926.GB1439@jmcf125-Acer-Arch.home> <55F032C7.9020307@centrum.cz> Message-ID: <20150909142114.GG1439@jmcf125-Acer-Arch.home> > So ghc-stage1 is working. Good! Now just to find why your base is broken, > please rebuild ghc completely and this time does not use any -j 5 option. > It'll use just one core, but will stop on the first error. Let's see how far > you get. Ah. Alright, it took a while longer. $ ./configure --target=arm-linux-gnueabihf --with-gcc=arm-linux-gnueabihf-gcc-sysroot --enable-unregisterised && make (...) "inplace/bin/ghc-stage1" -hisuf hi -osuf o -hcsuf hc -static -H64m -O0 -this-package-key ghcpr_8TmvWUcS1U1IKHT0levwg3 -hide-all-packages -i -ilibraries/ghc-prim/. -ilibraries/ghc-prim/dist-install/build -ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/dist-install/build -Ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/. -optP-include -optPlibraries/ghc-prim/dist-install/build/autogen/cabal_macros.h -package-key rts -this-package-key ghc-prim -XHaskell2010 -O -fllvm -no-user-package-db -rtsopts -odir libraries/ghc-prim/dist-install/build -hidir libraries/ghc-prim/dist-install/build -stubdir libraries/ghc-prim/dist-install/build -c libraries/ghc-prim/./GHC/CString.hs -o libraries/ghc-prim/dist-install/build/GHC/CString.o You are using a new version of LLVM that hasn't been tested yet! We will try though... 
opt: /tmp/ghc23881_0/ghc_1.ll:7:6: error: unexpected type in metadata definition !0 = metadata !{metadata !"top", i8* null} ^ libraries/ghc-prim/ghc.mk:4: recipe for target 'libraries/ghc-prim/dist-install/build/GHC/CString.o' failed make[1]: *** [libraries/ghc-prim/dist-install/build/GHC/CString.o] Error 1 Makefile:71: recipe for target 'all' failed make: *** [all] Error 2 This is weird, I think I'm not even using LLVM. Before this, there were loads of things like these: /tmp/ghc23862_0/ghc_2.hc:1528:30: warning: function called through a non-compatible type [enabled by default] ((void (*)(void *))(W_)&barf)((void *)(W_)&c7G_str);; ^ /tmp/ghc23862_0/ghc_2.hc:1528:30: note: if this code is reached, the program will abort João Miguel From karel.gardas at centrum.cz Wed Sep 9 15:46:44 2015 From: karel.gardas at centrum.cz (Karel Gardas) Date: Wed, 09 Sep 2015 17:46:44 +0200 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: <20150909142114.GG1439@jmcf125-Acer-Arch.home> References: <20150907181811.GA1668@jmcf125-Acer-Arch.home> <55EDE771.9010404@centrum.cz> <20150908203157.GA1557@jmcf125-Acer-Arch.home> <55EFEC9F.6070001@centrum.cz> <20150909111818.GA1439@jmcf125-Acer-Arch.home> <55F026CC.7090001@centrum.cz> <20150909125926.GB1439@jmcf125-Acer-Arch.home> <55F032C7.9020307@centrum.cz> <20150909142114.GG1439@jmcf125-Acer-Arch.home> Message-ID: <55F05464.8090507@centrum.cz> On 09/ 9/15 04:21 PM, jmcf125 at openmailbox.org wrote: >> So ghc-stage1 is working. Good! Now just to find why your base is broken, >> please rebuild ghc completely and this time does not use any -j 5 option. >> It'll use just one core, but will stop on the first error. Let's see how far >> you get. > Ah. Alright, it took a while longer. > > $ ./configure --target=arm-linux-gnueabihf --with-gcc=arm-linux-gnueabihf-gcc-sysroot --enable-unregisterised && make > (...)
> "inplace/bin/ghc-stage1" -hisuf hi -osuf o -hcsuf hc -static -H64m -O0 -this-package-key ghcpr_8TmvWUcS1U1IKHT0levwg3 -hide-all-packages -i -ilibraries/ghc-prim/. -ilibraries/ghc-prim/dist-install/build -ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/dist-install/build -Ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/. -optP-include -optPlibraries/ghc-prim/dist-install/build/autogen/cabal_macros.h -package-key rts -this-package-key ghc-prim -XHaskell2010 -O -fllvm -no-user-package-db -rtsopts -odir libraries/ghc-prim/dist-install/build -hidir libraries/ghc-prim/dist-install/build -stubdir libraries/ghc-prim/dist-install/build -c libraries/ghc-prim/./GHC/CString.hs -o libraries/ghc-prim/dist-install/build/GHC/CString.o > You are using a new version of LLVM that hasn't been tested yet! > We will try though... ^ OK you can see this. > opt: /tmp/ghc23881_0/ghc_1.ll:7:6: error: unexpected type in metadata definition > !0 = metadata !{metadata !"top", i8* null} > ^ > libraries/ghc-prim/ghc.mk:4: recipe for target 'libraries/ghc-prim/dist-install/build/GHC/CString.o' failed > make[1]: *** [libraries/ghc-prim/dist-install/build/GHC/CString.o] Error 1 > Makefile:71: recipe for target 'all' failed > make: *** [all] Error 2 > > This is weird, I think I'm not even using LLVM. This is not weird at all! GHC does not provide ARM NCG and so it is using LLVM if you compile ARM registerised build. So what about to start with the least pain and install supported LLVM version? 
Cheers, Karel From rwbarton at gmail.com Wed Sep 9 15:59:43 2015 From: rwbarton at gmail.com (Reid Barton) Date: Wed, 9 Sep 2015 11:59:43 -0400 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: <55F05464.8090507@centrum.cz> References: <20150907181811.GA1668@jmcf125-Acer-Arch.home> <55EDE771.9010404@centrum.cz> <20150908203157.GA1557@jmcf125-Acer-Arch.home> <55EFEC9F.6070001@centrum.cz> <20150909111818.GA1439@jmcf125-Acer-Arch.home> <55F026CC.7090001@centrum.cz> <20150909125926.GB1439@jmcf125-Acer-Arch.home> <55F032C7.9020307@centrum.cz> <20150909142114.GG1439@jmcf125-Acer-Arch.home> <55F05464.8090507@centrum.cz> Message-ID: On Wed, Sep 9, 2015 at 11:46 AM, Karel Gardas wrote: > On 09/ 9/15 04:21 PM, jmcf125 at openmailbox.org wrote: > >> So ghc-stage1 is working. Good! Now just to find why your base is broken, >>> please rebuild ghc completely and this time does not use any -j 5 option. >>> It'll use just one core, but will stop on the first error. Let's see how >>> far >>> you get. >>> >> Ah. Alright, it took a while longer. >> >> $ ./configure --target=arm-linux-gnueabihf >> --with-gcc=arm-linux-gnueabihf-gcc-sysroot --enable-unregisterised && make >> (...) >> "inplace/bin/ghc-stage1" -hisuf hi -osuf o -hcsuf hc -static -H64m -O0 >> -this-package-key ghcpr_8TmvWUcS1U1IKHT0levwg3 -hide-all-packages -i >> -ilibraries/ghc-prim/. -ilibraries/ghc-prim/dist-install/build >> -ilibraries/ghc-prim/dist-install/build/autogen >> -Ilibraries/ghc-prim/dist-install/build >> -Ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/. 
>> -optP-include >> -optPlibraries/ghc-prim/dist-install/build/autogen/cabal_macros.h >> -package-key rts -this-package-key ghc-prim -XHaskell2010 -O -fllvm >> -no-user-package-db -rtsopts -odir >> libraries/ghc-prim/dist-install/build -hidir >> libraries/ghc-prim/dist-install/build -stubdir >> libraries/ghc-prim/dist-install/build -c >> libraries/ghc-prim/./GHC/CString.hs -o >> libraries/ghc-prim/dist-install/build/GHC/CString.o >> You are using a new version of LLVM that hasn't been tested yet! >> We will try though... >> > > ^ OK you can see this. > > opt: /tmp/ghc23881_0/ghc_1.ll:7:6: error: unexpected type in metadata >> definition >> !0 = metadata !{metadata !"top", i8* null} >> ^ >> libraries/ghc-prim/ghc.mk:4: recipe for target >> 'libraries/ghc-prim/dist-install/build/GHC/CString.o' failed >> make[1]: *** [libraries/ghc-prim/dist-install/build/GHC/CString.o] Error 1 >> Makefile:71: recipe for target 'all' failed >> make: *** [all] Error 2 >> >> This is weird, I think I'm not even using LLVM. >> > > This is not weird at all! GHC does not provide ARM NCG and so it is using > LLVM if you compile ARM registerised build. > But "./configure [...] --enable-unregisterised" should mean using the C backend, not LLVM, right? So this still looks strange. Also there is an explicit "-fllvm" on the failing ghc-stage1 command line. What is in your build.mk? Maybe you are using one of the build flavors that sets -fllvm explicitly? That said you can also try installing the supported version of LLVM for ghc 7.10, which is LLVM 3.5. Regards, Reid Barton -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From karel.gardas at centrum.cz Wed Sep 9 16:07:19 2015 From: karel.gardas at centrum.cz (Karel Gardas) Date: Wed, 09 Sep 2015 18:07:19 +0200 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: References: <20150907181811.GA1668@jmcf125-Acer-Arch.home> <55EDE771.9010404@centrum.cz> <20150908203157.GA1557@jmcf125-Acer-Arch.home> <55EFEC9F.6070001@centrum.cz> <20150909111818.GA1439@jmcf125-Acer-Arch.home> <55F026CC.7090001@centrum.cz> <20150909125926.GB1439@jmcf125-Acer-Arch.home> <55F032C7.9020307@centrum.cz> <20150909142114.GG1439@jmcf125-Acer-Arch.home> <55F05464.8090507@centrum.cz> Message-ID: <55F05937.9030106@centrum.cz> On 09/ 9/15 05:59 PM, Reid Barton wrote: > This is not weird at all! GHC does not provide ARM NCG and so it is > using LLVM if you compile ARM registerised build. > > > But "./configure [...] --enable-unregisterised" should mean using the C > backend, not LLVM, right? So this still looks strange. Also there is an > explicit "-fllvm" on the failing ghc-stage1 command line. 
Indeed, I've overlooked that completely and I was referring to previous info of `ghc --info' where it's clearly registerised: ,("Unregisterised","NO") ,("LLVM llc command","llc") ,("LLVM opt command","opt") ,("Project version","7.10.2") ,("Project Git commit id","0da488c4438d88c9252e0b860426b8e74b5fc9e8") ,("Booter version","7.10.1") ,("Stage","1") ,("Build platform","x86_64-unknown-linux") ,("Host platform","x86_64-unknown-linux") ,("Target platform","arm-unknown-linux") Karel From dan.doel at gmail.com Wed Sep 9 16:44:10 2015 From: dan.doel at gmail.com (Dan Doel) Date: Wed, 9 Sep 2015 12:44:10 -0400 Subject: Unlifted data types In-Reply-To: <3481E4D1-F4DD-47BA-9818-665F22928CAD@cis.upenn.edu> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> <072d804f206c47aeb49ca7d610d120e5@DB4PR30MB030.064d.mgd.msft.net> <3481E4D1-F4DD-47BA-9818-665F22928CAD@cis.upenn.edu> Message-ID: On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg wrote: > No functions (excepting `error` and friends) are truly levity polymorphic. I was talking with Ed Kmett about this yesterday, and he pointed out that this isn't true. There are a significant array of levity polymorphic functions having to do with reference types. They simply shuffle around pointers with the right calling convention, and don't really care what levity their arguments are, because they're just operating uniformly either way. So if we had: MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed Unlifted) then: takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s l a -> State# s -> (# State# s, a #) putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l a -> a -> State# s -> State# s are genuinely parametric in l. And the same is true for MutVar#, Array#, MutableArray#, etc. 
I think data type constructors are actually parametric, too (ignoring data with ! in them for the moment; the underlying constructors of those). Using a constructor just puts the pointers for the fields in the type, and matching on a constructor gives them back. They don't need to care whether their fields are lifted or not, they just preserve whatever the case is. But this: > We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it. Is not parametric polymorphism; it is ad-hoc polymorphism. It even has the defaulting step from type classes. Except the ad-hoc has been given the same notation as the genuinely parametric, so you can no longer identify the latter. (I'm not sure I'm a great fan of the ad-hoc part anyway, to be honest.) -- Dan From ekmett at gmail.com Wed Sep 9 19:30:19 2015 From: ekmett at gmail.com (Edward Kmett) Date: Wed, 9 Sep 2015 15:30:19 -0400 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> <072d804f206c47aeb49ca7d610d120e5@DB4PR30MB030.064d.mgd.msft.net> <3481E4D1-F4DD-47BA-9818-665F22928CAD@cis.upenn.edu> Message-ID: I think ultimately the two views of levity that we've been talking about diverge along the same lines as the pi vs forall discussion from your Levity polymorphism talk. I've been focused entirely on situations where forall suffices, and no distinction is needed in how you compile for both levities. Maybe could be polymorphic, using a mere forall, in the levity of the boxed argument it carries, as it doesn't care what it is: it never forces it, and pattern matching on it just gives it back.
Eq or Ord could just as easily work over anything boxed. The particular Eq _instance_ needs to care about the levity. Most of the combinators for working with Maybe do need to care about that levity, however. E.g., consider fmap in Functor: the particular instances would care, because you ultimately wind up using fmap to build 'f a' values, and those need to know how the let binding should work. There seems to be a pi at work there. Correct operational behavior would depend on the levity. But if we look at what inference should probably grab for the levity of Functor, you'd get: class Functor (l : Levity) (l' : Levity) (f :: GC l -> GC l') where fmap :: forall a b. (a :: GC l) (b :: GC l). (a -> b) -> f a -> f b Based on the notion that, given current practices, f would cause us to pick a common kind for a and b, and the results of 'f'. How and whether we decided to default to * unless annotated in various situations would drive this closer and closer to the existing Functor by default. These are indeed distinct functors with distinct operational behavior, and we could implement each of them by supplying separate instances, as the levity would take part in the instance resolution like any other kind argument. Whether we could expect an average Haskeller to be willing to do so is an entirely different matter. -Edward On Wed, Sep 9, 2015 at 12:44 PM, Dan Doel wrote: > On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg > wrote: > > No functions (excepting `error` and friends) are truly levity > polymorphic. > > I was talking with Ed Kmett about this yesterday, and he pointed out > that this isn't true. There are a significant array of levity > polymorphic functions having to do with reference types. They simply > shuffle around pointers with the right calling convention, and don't > really care what levity their arguments are, because they're just > operating uniformly either way. So if we had: > > MVar# :: forall (l :: Levity).
* -> TYPE (Boxed l) -> TYPE (Boxed > Unlifted) > > then: > > takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s > l a -> State# s -> (# State# s, a #) > putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l > a -> a -> State# s -> State# s > > are genuinely parametric in l. And the same is true for MutVar#, > Array#, MutableArray#, etc. > > I think data type constructors are actually parametric, too (ignoring > data with ! in them for the moment; the underlying constructors of > those). Using a constructor just puts the pointers for the fields in > the type, and matching on a constructor gives them back. They don't > need to care whether their fields are lifted or not, they just > preserve whatever the case is. > > But this: > > > We use levity polymorphism in the types to get GHC to use its existing > type inference to infer strictness. By the time type inference is done, we > must ensure that no levity polymorphism remains, because the code generator > won't be able to deal with it. > > Is not parametric polymorphism; it is ad-hoc polymorphism. It even has > the defaulting step from type classes. Except the ad-hoc has been > given the same notation as the genuinely parametric, so you can no > longer identify the latter. (I'm not sure I'm a great fan of the > ad-hoc part anyway, to be honest.) > > -- Dan > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From marlowsd at gmail.com Wed Sep 9 19:42:29 2015 From: marlowsd at gmail.com (Simon Marlow) Date: Wed, 9 Sep 2015 20:42:29 +0100 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> Message-ID: <55F08BA5.1020708@gmail.com> On 09/09/2015 13:28, Simon Peyton Jones wrote: > The awkward spot is the runtime system. Currently it goes to some > lengths to ensure that it never introduces an indirection for a > boxed-but-unlifted type. Simon Marlow would know exactly where. So > I suspect that we WOULD need two versions of each > (levity-polymorphic) data constructor, alas. And we'd need to know > which version to use at every call site, which means the code would > need to be levity-monomorphic. So we really would get very little > levity polymorphism ineed. I think. I *think* we're ok here. The RTS doesn't have any special machinery to avoid indirections to unlifted things that I'm aware of. Did you have a particular problem in mind? Indirections appear in various ways, but always as the result of a thunk being there in the first place - updates, selector thunks, and suspending duplicate work (during parallel evaluation) are all thunk-related. So does that mean data constructors can be levity-polymorphic? I don't see why not, but maybe I'm missing something. Cheers Simon > > All this fits nicely with Dan Doel's point about making ! a first class type constructor. 
> > Simon > > > | -----Original Message----- > | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of > | Richard Eisenberg > | Sent: 08 September 2015 14:46 > | To: ghc-devs > | Subject: Re: Unlifted data types > | > | I have put up an alternate set of proposals on > | > | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes > | > | These sidestep around `Force` and `suspend` but probably have other > | problems. They make heavy use of levity polymorphism. > | > | Back story: this all was developed in a late-evening Haskell Symposium > | session that took place in the hotel bar. It seems Edward and I walked > | away with quite different understandings of what had taken place. I've > | written up my understanding. Most likely, the Right Idea is a > | combination of this all! > | > | See what you think. > | > | Thanks! > | Richard > | > | On Sep 8, 2015, at 3:52 AM, Simon Peyton Jones > | wrote: > | > | > | And to > | > | be honest, I'm not sure we need arbitrary data types in Unlifted; > | > | Force (which would be primitive) might be enough. > | > > | > That's an interesting thought. But presumably you'd have to use > | 'suspend' (a terrible name) a lot: > | > > | > type StrictList a = Force (StrictList' a) data StrictList' a = Nil | > | > Cons !a (StrictList a) > | > > | > mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs > | = > | > mapStrict' f (suspend xs) > | > > | > mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' > | f > | > Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) > | > > | > > | > That doesn't look terribly convenient. > | > > | > | ensure that threads don't simply > | > | pass thunks between each other. But, if you have unlifted types, > | > | then you can have: > | > | > | > | data UMVar (a :: Unlifted) > | > | > | > | and then the type rules out the possibility of passing thunks > | > | through a reference (at least at the top level). > | > > | > Really? 
Presumably UMVar is a new primitive? With a family of > | operations like MVar? If so can't we just define > | > newtype UMVar a = UMVar (MVar a) > | > putUMVar :: UMVar a -> a -> IO () > | > putUMVar (UMVar v) x = x `seq` putMVar v x > | > > | > I don't see Force helping here. > | > > | > Simon > | > _______________________________________________ > | > ghc-devs mailing list > | > ghc-devs at haskell.org > | > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > | > | _______________________________________________ > | ghc-devs mailing list > | ghc-devs at haskell.org > | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From johan.tibell at gmail.com Wed Sep 9 22:22:14 2015 From: johan.tibell at gmail.com (Johan Tibell) Date: Wed, 9 Sep 2015 15:22:14 -0700 Subject: Converting unboxed sum types in StgCmm Message-ID: Hi! The original idea for implementing the backend part of the unboxed sums proposal was to convert from the core representation to the actual data representation (i.e. a tag followed by some pointer and non-pointer fields) in the unarise stg-to-stg pass. I have now realized that this won't work. The problem is that stg is too strongly typed. When we "desugar" sum types we need to convert functions receiving a value e.g. from f :: (# Bool | Char #) -> ... to f :: NonPointer {-# tag#-} -> Pointer {-# Bool or Char #-} -> ... Since stg is still typed with normal Haskell types (e.g. Bool, Char, etc), this is not possible, as we cannot represent an argument which has two different types. It seems to me that we will have to do the conversion in the stg-to-cmm pass, which is quite a bit more involved. 
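To make the flattening Johan describes concrete at the source level, here is an illustrative (and deliberately unsafe) encoding of a two-alternative sum as "tag plus one type-erased slot", in the spirit of using Any for pointer fields. FlatSum, inl, inr and elim are invented names; real unboxed sums are produced by the compiler itself, never written by hand like this.

```haskell
import GHC.Exts (Any)
import Unsafe.Coerce (unsafeCoerce)

-- Illustrative only: (# Bool | Char #) flattened to a tag plus a
-- single slot whose static type is erased to Any, mirroring the
-- untyped fields a desugaring pass would have to introduce.
data FlatSum = FlatSum !Int Any   -- tag 0 = Bool, tag 1 = Char

inl :: Bool -> FlatSum
inl b = FlatSum 0 (unsafeCoerce b)

inr :: Char -> FlatSum
inr c = FlatSum 1 (unsafeCoerce c)

-- Case analysis recovers the static type from the tag; the
-- unsafeCoerce is sound only because the tag is kept in sync.
elim :: (Bool -> r) -> (Char -> r) -> FlatSum -> r
elim f g (FlatSum tag slot)
  | tag == 0  = f (unsafeCoerce slot)
  | otherwise = g (unsafeCoerce slot)
```

This sketch sidesteps the typing problem exactly the way the thread later suggests: the slot's type carries no information beyond "this is a pointer".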
For example, StgCmmEnv.idToReg function will have to change from idToReg :: DynFlags -> NonVoid Id -> LocalReg to idToReg :: DynFlags -> NonVoid Id -> [LocalReg] to accommodate the fact that we might need more than one register to store a binder. Any ideas for a better solution? -- Johan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jmcf125 at openmailbox.org Wed Sep 9 23:33:34 2015 From: jmcf125 at openmailbox.org (jmcf125 at openmailbox.org) Date: Thu, 10 Sep 2015 00:33:34 +0100 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: References: <55EDE771.9010404@centrum.cz> <20150908203157.GA1557@jmcf125-Acer-Arch.home> <55EFEC9F.6070001@centrum.cz> <20150909111818.GA1439@jmcf125-Acer-Arch.home> <55F026CC.7090001@centrum.cz> <20150909125926.GB1439@jmcf125-Acer-Arch.home> <55F032C7.9020307@centrum.cz> <20150909142114.GG1439@jmcf125-Acer-Arch.home> <55F05464.8090507@centrum.cz> Message-ID: <20150909233333.GA31923@jmcf125-Acer-Arch.home> Okay, I tried with LLVM registerised, I've read about it and the idea sounds nice. > What is in your build.mk? Maybe you are using one of the build flavors that > sets -fllvm explicitly? Ah, so that was it. I followed Karel's blog, seems back then the BuildFlavour = quick-cross option he used didn't include LLVM, which makes sense, since there was no support. Now, the option is there, as you suspected: ifeq "$(BuildFlavour)" "quick-cross" (...) GhcStage2HcOpts = -O0 -fllvm GhcLibHcOpts = -O -fllvm (...) > > That said you can also try installing the supported version of LLVM for ghc > 7.10, which is LLVM 3.5. Good that it was still in Arch Linux repos, as llvm35. All went fine through stage 1, now ghc-stage1 works, and I mean: $ ../inplace/bin/ghc-stage1 --make HelloWorld.lhs [1 of 1] Compiling Main ( HelloWorld.lhs, HelloWorld.o ) Linking HelloWorld ... But I still got an error in the end (using make with only one core cost me a few hours...): (...) 
echo 'exec "$executablename" ${1+"$@"}' >> inplace/bin/dll-split chmod +x inplace/bin/dll-split inplace/bin/dll-split compiler/stage2/build/.depend-v-dyn.haskell "DynFlags" "Annotations ApiAnnotation Avail Bag BasicTypes Binary BooleanFormula BreakArray BufWrite Class CmdLineParser CmmType CoAxiom ConLike Coercion Config Constants CoreArity CoreFVs CoreSubst CoreSyn CoreTidy CoreUnfold CoreUtils CostCentre Ctype DataCon Demand Digraph DriverPhases DynFlags Encoding ErrUtils Exception ExtsCompat46 FamInstEnv FastFunctions FastMutInt FastString FastTypes Fingerprint FiniteMap ForeignCall Hooks HsBinds HsDecls HsDoc HsExpr HsImpExp HsLit PlaceHolder HsPat HsSyn HsTypes HsUtils HscTypes IOEnv Id IdInfo IfaceSyn IfaceType InstEnv Kind Lexeme Lexer ListSetOps Literal Maybes MkCore MkId Module MonadUtils Name NameEnv NameSet OccName OccurAnal OptCoercion OrdList Outputable PackageConfig Packages Pair Panic PatSyn PipelineMonad Platform PlatformConstants PprCore PrelNames PrelRules Pretty PrimOp RdrName Rules Serialized SrcLoc StaticFlags StringBuffer TcEvidence TcRnTypes TcType TrieMap TyCon Type TypeRep TysPrim TysWiredIn Unify UniqFM UniqSet UniqSupply Unique Util Var VarEnv VarSet Bitmap BlockId ByteCodeAsm ByteCodeInstr ByteCodeItbls CLabel Cmm CmmCallConv CmmExpr CmmInfo CmmMachOp CmmNode CmmUtils CodeGen.Platform CodeGen.Platform.ARM CodeGen.Platform.ARM64 CodeGen.Platform.NoRegs CodeGen.Platform.PPC CodeGen.Platform.PPC_Darwin CodeGen.Platform.SPARC CodeGen.Platform.X86 CodeGen.Platform.X86_64 FastBool Hoopl Hoopl.Dataflow InteractiveEvalTypes MkGraph PprCmm PprCmmDecl PprCmmExpr Reg RegClass SMRep StgCmmArgRep StgCmmClosure StgCmmEnv StgCmmLayout StgCmmMonad StgCmmProf StgCmmTicky StgCmmUtils StgSyn Stream" inplace/bin/dll-split: line 8: /home/jmcf125/ghc-raspberry-pi/ghc/inplace/lib/bin/dll-split: cannot execute binary file: Exec format error inplace/bin/dll-split: line 8: /home/jmcf125/ghc-raspberry-pi/ghc/inplace/lib/bin/dll-split: Success compiler/ghc.mk:655: 
recipe for target 'compiler/stage2/dll-split.stamp' failed make[1]: *** [compiler/stage2/dll-split.stamp] Error 126 Makefile:71: recipe for target 'all' failed make: *** [all] Error 2 OK, it did try to get to stage 2, as I thought it would. Is dll-split trying to execute a stage 2 binary? Why? What's going on? Thanks for the help so far, João Miguel From jmcf125 at openmailbox.org Wed Sep 9 23:38:12 2015 From: jmcf125 at openmailbox.org (jmcf125 at openmailbox.org) Date: Thu, 10 Sep 2015 00:38:12 +0100 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: <20150909233333.GA31923@jmcf125-Acer-Arch.home> References: <20150908203157.GA1557@jmcf125-Acer-Arch.home> <55EFEC9F.6070001@centrum.cz> <20150909111818.GA1439@jmcf125-Acer-Arch.home> <55F026CC.7090001@centrum.cz> <20150909125926.GB1439@jmcf125-Acer-Arch.home> <55F032C7.9020307@centrum.cz> <20150909142114.GG1439@jmcf125-Acer-Arch.home> <55F05464.8090507@centrum.cz> <20150909233333.GA31923@jmcf125-Acer-Arch.home> Message-ID: <20150909233806.GA25685@jmcf125-Acer-Arch.home> Now I think of it, the blog post was about getting stage 1. I did. So that's what the cross-compile option is supposed to do, right? But why does it give an error instead of just stopping in stage 1? To get both stages, what option should I use? Should I just remove mk/build.mk? Best regards, João Miguel From ezyang at mit.edu Thu Sep 10 02:30:38 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Wed, 09 Sep 2015 19:30:38 -0700 Subject: Shared data type for extension flags In-Reply-To: <1441852095-sup-7919@sabre> References: <72a513a1fc4345b0a48d9727b7b10d53@DB4PR30MB030.064d.mgd.msft.net> <1441852095-sup-7919@sabre> Message-ID: <1441852224-sup-3531@sabre> I don't think it makes very much sense to reuse bin-package-db; at least, not without renaming it at the very least (who'd expect a list of language extension flags to live in a binary package database?) We could name it something like 'ghc-types'? 
Edward Excerpts from Simon Peyton Jones's message of 2015-09-08 05:35:00 -0700: > Yes, we?d have to broaden the description of the package. I defer to Edward Yang and Duncan Coutts who have a clearer idea of the architecture in this area. > > Simon > > From: Michael Smith [mailto:michael at diglumi.com] > Sent: 02 September 2015 17:27 > To: Simon Peyton Jones; Matthew Pickering > Cc: GHC developers > Subject: Re: Shared data type for extension flags > > > The package description for that is "The GHC compiler's view of the GHC package database format", and this doesn't really have to do with the package database format. Would it be okay to put this in there anyway? > > On Wed, Sep 2, 2015, 07:33 Simon Peyton Jones > wrote: > we already have such a shared library, I think: bin-package-db. would that do? > > Simon > > From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Michael Smith > Sent: 02 September 2015 09:21 > To: Matthew Pickering > Cc: GHC developers > Subject: Re: Shared data type for extension flags > > That sounds like a good approach. Are there other things that would go nicely > in a shared package like this, in addition to the extension data type? > > On Wed, Sep 2, 2015 at 1:00 AM, Matthew Pickering > wrote: > Surely the easiest way here (including for other tooling - ie > haskell-src-exts) is to create a package which just provides this > enumeration. GHC, cabal, th, haskell-src-exts and so on then all > depend on this package rather than creating their own enumeration. > > On Wed, Sep 2, 2015 at 9:47 AM, Michael Smith > wrote: > > #10820 on Trac [1] and D1200 on Phabricator [2] discuss adding the > > capababilty > > to Template Haskell to detect which language extensions enabled. > > Unfortunately, > > since template-haskell can't depend on ghc (as ghc depends on > > template-haskell), > > it can't simply re-export the ExtensionFlag type from DynFlags to the user. 
> > > > There is a second data type encoding the list of possible language > > extensions in > > the Cabal package, in Language.Haskell.Extension [3]. But template-haskell > > doesn't already depend on Cabal, and doing so seems like it would cause > > difficulties, as the two packages can be upgraded separately. > > > > So adding this new feature to Template Haskell requires introducing a > > *third* > > data type for language extensions. It also requires enumerating this full > > list > > in two more places, to convert back and forth between the TH Extension data > > type > > and GHC's internal ExtensionFlag data type. > > > > Is there another way here? Can there be one single shared data type for this > > somehow? > > > > [1] https://ghc.haskell.org/trac/ghc/ticket/10820 > > [2] https://phabricator.haskell.org/D1200 > > [3] > > https://hackage.haskell.org/package/Cabal-1.22.4.0/docs/Language-Haskell-Extension.html > > > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > From johan.tibell at gmail.com Thu Sep 10 05:16:27 2015 From: johan.tibell at gmail.com (Johan Tibell) Date: Wed, 9 Sep 2015 22:16:27 -0700 Subject: Converting unboxed sum types in StgCmm In-Reply-To: References: Message-ID: I wonder if rewriting any aliased pointer field as Any in Stg and any non-pointer field as Word# would work. I suspect that not all non-pointer fields (e.g. Double# on 32-bit) can be represented as Word#. On Wed, Sep 9, 2015 at 3:22 PM, Johan Tibell wrote: > Hi! > > The original idea for implementing the backend part of the unboxed sums > proposal was to convert from the core representation to the actual data > representation (i.e. a tag followed by some pointer and non-pointer fields) > in the unarise stg-to-stg > > pass. > > I have now realized that this won't work. The problem is that stg is too > strongly typed. 
When we "desugar" sum types we need to convert functions > receiving a value e.g. from > > f :: (# Bool | Char #) -> ... > > to > > f :: NonPointer {-# tag#-} -> Pointer {-# Bool or Char #-} -> ... > > Since stg is still typed with normal Haskell types (e.g. Bool, Char, etc), > this is not possible, as we cannot represent an argument which has two > different types. > > It seems to me that we will have to do the conversion in the stg-to-cmm > pass, > which is quite a bit more involved. For example, StgCmmEnv.idToReg function > will have to change from > > idToReg :: DynFlags -> NonVoid Id -> LocalReg > > to > > idToReg :: DynFlags -> NonVoid Id -> [LocalReg] > > to accommodate the fact that we might need more than one register to store > a binder. > > Any ideas for a better solution? > > -- Johan > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dan.doel at gmail.com Thu Sep 10 05:24:57 2015 From: dan.doel at gmail.com (Dan Doel) Date: Thu, 10 Sep 2015 01:24:57 -0400 Subject: Converting unboxed sum types in StgCmm In-Reply-To: References: Message-ID: Some of the SSE types are too big for that even on 64 bit, I think. Like DoubleX8#. On Thu, Sep 10, 2015 at 1:16 AM, Johan Tibell wrote: > I wonder if rewriting any aliased pointer field as Any in Stg and any > non-pointer field as Word# would work. I suspect that not all non-pointer > fields (e.g. Double# on 32-bit) can be represented as Word#. > > On Wed, Sep 9, 2015 at 3:22 PM, Johan Tibell wrote: >> >> Hi! >> >> The original idea for implementing the backend part of the unboxed sums >> proposal was to convert from the core representation to the actual data >> representation (i.e. a tag followed by some pointer and non-pointer fields) >> in the unarise stg-to-stg pass. >> >> I have now realized that this won't work. The problem is that stg is too >> strongly typed. When we "desugar" sum types we need to convert functions >> receiving a value e.g. 
from >> >> f :: (# Bool | Char #) -> ... >> >> to >> >> f :: NonPointer {-# tag#-} -> Pointer {-# Bool or Char #-} -> ... >> >> Since stg is still typed with normal Haskell types (e.g. Bool, Char, etc), >> this is not possible, as we cannot represent an argument which has two >> different types. >> >> It seems to me that we will have to do the conversion in the stg-to-cmm >> pass, which is quite a bit more involved. For example, StgCmmEnv.idToReg >> function will have to change from >> >> idToReg :: DynFlags -> NonVoid Id -> LocalReg >> >> to >> >> idToReg :: DynFlags -> NonVoid Id -> [LocalReg] >> >> to accommodate the fact that we might need more than one register to store >> a binder. >> >> Any ideas for a better solution? >> >> -- Johan >> > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From malcolm.wallace at me.com Thu Sep 10 08:28:30 2015 From: malcolm.wallace at me.com (malcolm.wallace) Date: Thu, 10 Sep 2015 08:28:30 +0000 (GMT) Subject: Shared data type for extension flags Message-ID: <98e9b596-be37-4e00-9924-8bc967c0b07f@me.com> "ghc-types" sounds like a package for fancy type hackery. I would never think to find language extension flags in such a place. How about "ghc-package-db", or "ghc-language-extensions"? Regards, Malcolm On 10 Sep, 2015, at 03:30 AM, "Edward Z. Yang" wrote: I don't think it makes very much sense to reuse bin-package-db; at least, not without renaming it at the very least (who'd expect a list of language extension flags to live in a binary package database?) We could name it something like 'ghc-types'? Edward Excerpts from Simon Peyton Jones's message of 2015-09-08 05:35:00 -0700: Yes, we'd have to broaden the description of the package. I defer to Edward Yang and Duncan Coutts who have a clearer idea of the architecture in this area. 
Simon From: Michael Smith [mailto:michael at diglumi.com] Sent: 02 September 2015 17:27 To: Simon Peyton Jones; Matthew Pickering Cc: GHC developers Subject: Re: Shared data type for extension flags The package description for that is "The GHC compiler's view of the GHC package database format", and this doesn't really have to do with the package database format. Would it be okay to put this in there anyway? On Wed, Sep 2, 2015, 07:33 Simon Peyton Jones > wrote: we already have such a shared library, I think: bin-package-db. would that do? Simon From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Michael Smith Sent: 02 September 2015 09:21 To: Matthew Pickering Cc: GHC developers Subject: Re: Shared data type for extension flags That sounds like a good approach. Are there other things that would go nicely in a shared package like this, in addition to the extension data type? On Wed, Sep 2, 2015 at 1:00 AM, Matthew Pickering > wrote: Surely the easiest way here (including for other tooling - ie haskell-src-exts) is to create a package which just provides this enumeration. GHC, cabal, th, haskell-src-exts and so on then all depend on this package rather than creating their own enumeration. On Wed, Sep 2, 2015 at 9:47 AM, Michael Smith > wrote: > #10820 on Trac [1] and D1200 on Phabricator [2] discuss adding the > capababilty > to Template Haskell to detect which language extensions enabled. > Unfortunately, > since template-haskell can't depend on ghc (as ghc depends on > template-haskell), > it can't simply re-export the ExtensionFlag type from DynFlags to the user. > > There is a second data type encoding the list of possible language > extensions in > the Cabal package, in Language.Haskell.Extension [3]. But template-haskell > doesn't already depend on Cabal, and doing so seems like it would cause > difficulties, as the two packages can be upgraded separately. 
> > So adding this new feature to Template Haskell requires introducing a > *third* > data type for language extensions. It also requires enumerating this full > list > in two more places, to convert back and forth between the TH Extension data > type > and GHC's internal ExtensionFlag data type. > > Is there another way here? Can there be one single shared data type for this > somehow? > > [1] https://ghc.haskell.org/trac/ghc/ticket/10820 > [2] https://phabricator.haskell.org/D1200 > [3] > https://hackage.haskell.org/package/Cabal-1.22.4.0/docs/Language-Haskell-Extension.html > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > _______________________________________________ ghc-devs mailing list ghc-devs at haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Thu Sep 10 09:37:43 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Thu, 10 Sep 2015 09:37:43 +0000 Subject: Converting unboxed sum types in StgCmm In-Reply-To: References: Message-ID: <4cd78a703e1d4b148dbd707b89f74c59@DB4PR30MB030.064d.mgd.msft.net> The problem is that stg is too strongly typed It's not really typed, or at least only in a very half-hearted way. To be concrete I think you can just use Any for any Pointer arg. All STG needs to know, really, is which things are pointers. Detailed type info like "are you a Char or a Bool" is strictly jam; indeed never used I think. (I could be wrong but I'm pretty sure I'm not wrong in a fundamental way.) Simon From: Johan Tibell [mailto:johan.tibell at gmail.com] Sent: 09 September 2015 23:22 To: Simon Peyton Jones; Simon Marlow; ghc-devs at haskell.org Subject: Converting unboxed sum types in StgCmm Hi! 
The original idea for implementing the backend part of the unboxed sums proposal was to convert from the core representation to the actual data representation (i.e. a tag followed by some pointer and non-pointer fields) in the unarise stg-to-stg pass. I have now realized that this won't work. The problem is that stg is too strongly typed. When we "desugar" sum types we need to convert functions receiving a value e.g. from f :: (# Bool | Char #) -> ... to f :: NonPointer {-# tag#-} -> Pointer {-# Bool or Char #-} -> ... Since stg is still typed with normal Haskell types (e.g. Bool, Char, etc), this is not possible, as we cannot represent an argument which has two different types. It seems to me that we will have to do the conversion in the stg-to-cmm pass, which is quite a bit more involved. For example, StgCmmEnv.idToReg function will have to change from idToReg :: DynFlags -> NonVoid Id -> LocalReg to idToReg :: DynFlags -> NonVoid Id -> [LocalReg] to accommodate the fact that we might need more than one register to store a binder. Any ideas for a better solution? -- Johan -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Thu Sep 10 09:41:59 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Thu, 10 Sep 2015 09:41:59 +0000 Subject: Unlifted data types In-Reply-To: <55F08BA5.1020708@gmail.com> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> <55F08BA5.1020708@gmail.com> Message-ID: <88c59a8d11a7450b91439d8dd728f7e8@DB4PR30MB030.064d.mgd.msft.net> | > The awkward spot is the runtime system. Currently it goes to some | > lengths to ensure that it never introduces an indirection for a | > boxed-but-unlifted type. Simon Marlow would know exactly where. So | | I *think* we're ok here. 
The RTS doesn't have any special machinery | to avoid indirections to unlifted things that I'm aware of. Did you | have a particular problem in mind? Well I can't point to anything very specific. I just recall that in various places, if a pointer was to an Array# we would immediately, eagerly, recurse in the GC rather than add the Array# to the queue for later processing. Maybe that is no longer true. Maybe it was never true. It shouldn't be hard to find out. If true, there would be a run-time system test that returns true for boxed-but-unlifted heap objects. I think it would be worth a look because if I'm right it could have a significant impact on the design. Simon From eir at cis.upenn.edu Thu Sep 10 14:26:43 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Thu, 10 Sep 2015 10:26:43 -0400 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> <072d804f206c47aeb49ca7d610d120e5@DB4PR30MB030.064d.mgd.msft.net> <3481E4D1-F4DD-47BA-9818-665F22928CAD@cis.upenn.edu> Message-ID: These observations from Ed and Dan are quite helpful. Could one of you put them on the wiki page? I hadn't considered the possibility of truly parametric levity polymorphism. Thanks! Richard On Sep 9, 2015, at 3:30 PM, Edward Kmett wrote: > I think ultimately the two views of levity that we've been talking diverge along the same lines as the pi vs forall discussion from your Levity polymorphism talk. > > I've been focused entirely on situations where forall suffices, and no distinction is needed in how you compile for both levities. > > Maybe could be polymorphic using a mere forall in the levity of the boxed argument it carries as it doesn't care what it is, it never forces it, pattern matching on it just gives it back when you pattern match on it. 
> > Eq or Ord could just as easily work over anything boxed. The particular Eq _instance_ needs to care about the levity. > > Most of the combinators for working with Maybe do need to care about that levity however. > > e.g. consider fmap in Functor, the particular instances would care. Because you ultimately wind up using fmap to build 'f a' values and those need to know how the let binding should work. There seems to be a pi at work there. Correct operational behavior would depend on the levity. > > But if we look at what inference should probably grab for the levity of Functor: > > you'd get: > > class Functor (l : Levity) (l' : Levity) (f :: GC l -> GC l') where > fmap :: forall a b. (a :: GC l) (b :: GC l'). (a -> b) -> f a -> f b > > Based on the notion that given current practices, f would cause us to pick a common kind for a and b, and the results of 'f'. Depending on how and if we decided to default to * unless annotated in various situations would drive this closer and closer to the existing Functor by default. > > These are indeed distinct functors with distinct operational behavior, and we could implement each of them by supplying separate instances, as the levity would take part in the instance resolution like any other kind argument. > > Whether we could expect an average Haskeller to be willing to do so is an entirely different matter. > > -Edward > > > On Wed, Sep 9, 2015 at 12:44 PM, Dan Doel wrote: > On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg wrote: > > No functions (excepting `error` and friends) are truly levity polymorphic. > > I was talking with Ed Kmett about this yesterday, and he pointed out > that this isn't true. There is a significant array of levity > polymorphic functions having to do with reference types. They simply > shuffle around pointers with the right calling convention, and don't > really care what levity their arguments are, because they're just > operating uniformly either way. 
So if we had: > > MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed Unlifted) > > then: > > takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s > l a -> State# s -> (# State# s, a #) > putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l > a -> a -> State# s -> State# s > > are genuinely parametric in l. And the same is true for MutVar#, > Array#, MutableArray#, etc. > > I think data type constructors are actually parametric, too (ignoring > data with ! in them for the moment; the underlying constructors of > those). Using a constructor just puts the pointers for the fields in > the type, and matching on a constructor gives them back. They don't > need to care whether their fields are lifted or not, they just > preserve whatever the case is. > > But this: > > > We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it. > > Is not parametric polymorphism; it is ad-hoc polymorphism. It even has > the defaulting step from type classes. Except the ad-hoc has been > given the same notation as the genuinely parametric, so you can no > longer identify the latter. (I'm not sure I'm a great fan of the > ad-hoc part anyway, to be honest.) > > -- Dan > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ecrockett0 at gmail.com Thu Sep 10 15:07:59 2015 From: ecrockett0 at gmail.com (Eric Crockett) Date: Thu, 10 Sep 2015 11:07:59 -0400 Subject: more releases In-Reply-To: References: <3E39E8B5-89C2-40F6-9180-C6D73AF3926F@cis.upenn.edu> <87si6y1v30.fsf@gmail.com> <87oahlksnm.fsf@smart-cactus.org> <87si6wkdta.fsf@smart-cactus.org> <09dfe23cd20746c88beb0cfd308ef8f6@DB4PR30MB030.064d.mgd.msft.net> Message-ID: Some people had asked what the users want and about typical usage, so I'll give my perspective. I consider myself a pretty typical user of Haskell: PhD student (in theory, not languages), but still pushing the boundaries of the compiler. I've filed quite a few bugs, so I have experience with having to wait for them to get fixed. My code at various points has been littered with "see ticket #xxx for why I'm jumping through three hoops to accomplish this". As a result, I would be interested in getting builds with bugfixes. For example see the discussion on #10428: https://ghc.haskell.org/trac/ghc/ticket/10428. It's hard for a user to tell if/when a patch will be merged. I'm using 7.10.1 at the moment, but I was unsure if the patch for #10428 made it to 7.10.2. Ben: I download the GHC bindist directly from the GHC page precisely because the one on the PPA is (inevitably) ancient. Upgrading GHC (even minor releases; I just tried 7.10.2 to confirm this) is a pain because I have to spend an hour downloading and re-building all of the packages I need. However, I'd certainly be willing to do that for bugs that affect my code. Richard said, "Then a user's package library doesn't have to be recompiled when updating". If he means that I wouldn't have to do that, that's fantastic. However, I still wouldn't download every tiny release due to the 100mb download+install time to fix bugs that don't affect me (I'd only do that for bugs that *do* affect me). 
In short: I'd really like to have builds for every bug (or maybe every day/week) that I can easily download and install. On Mon, Sep 7, 2015 at 12:05 PM, Bardur Arantsson wrote: > On 09/07/2015 04:57 PM, Simon Peyton Jones wrote: > > Merging and releasing a fix to the stable branch always carries a cost: > > it might break something else. There is a real cost to merging, which > > is why we've followed the lazy strategy that Ben describes. > > > > A valid point, but the upside is that it's a very fast operation to > revert if a release is "bad"... and get that updated release into the wild. > > Regards, > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From johan.tibell at gmail.com Thu Sep 10 15:16:33 2015 From: johan.tibell at gmail.com (Johan Tibell) Date: Thu, 10 Sep 2015 08:16:33 -0700 Subject: Converting unboxed sum types in StgCmm In-Reply-To: <4cd78a703e1d4b148dbd707b89f74c59@DB4PR30MB030.064d.mgd.msft.net> References: <4cd78a703e1d4b148dbd707b89f74c59@DB4PR30MB030.064d.mgd.msft.net> Message-ID: I'll give that a try. The main use of the stg types in the stg-to-cmm pass is to call idPrimRep (which calls typePrimRep) to figure out which register type we need to use. I guess as long as I rewrite the stg types so they give me the typePrimRep I want in the end I should be fine. On Thu, Sep 10, 2015 at 2:37 AM, Simon Peyton Jones wrote: > The problem is that stg is too strongly typed > > > > It's not really typed, or at least only in a very half-hearted way. To be > concrete I think you can just use Any for any Pointer arg. All STG needs > to know, really, is which things are pointers. Detailed type info like > "are you a Char or a Bool" is strictly jam; indeed never used I think. (I > could be wrong but I'm pretty sure I'm not wrong in a fundamental way. 
> > > > SImon > > > > *From:* Johan Tibell [mailto:johan.tibell at gmail.com] > *Sent:* 09 September 2015 23:22 > *To:* Simon Peyton Jones; Simon Marlow; ghc-devs at haskell.org > *Subject:* Converting unboxed sum types in StgCmm > > > > Hi! > > > > The original idea for implementing the backend part of the unboxed sums > proposal was to convert from the core representation to the actual data > representation (i.e. a tag followed by some pointer and non-pointer fields) > in the unarise stg-to-stg > > pass. > > > > I have now realized that this won't work. The problem is that stg is too > strongly typed. When we "desugar" sum types we need to convert functions > receiving a value e.g. from > > > > f :: (# Bool | Char #) -> ... > > > > to > > > > f :: NonPointer {-# tag#-} -> Pointer {-# Bool or Char #-} -> ... > > > > Since stg is still typed with normal Haskell types (e.g. Bool, Char, etc), > this is not possible, as we cannot represent an argument which has two > different types. > > > > It seems to me that we will have to do the conversion in the stg-to-cmm > > pass, which is quite a bit more involved. For example, StgCmmEnv.idToReg > function will have to change from > > > > idToReg :: DynFlags -> NonVoid Id -> LocalReg > > > > to > > > > idToReg :: DynFlags -> NonVoid Id -> [LocalReg] > > > > to accommodate the fact that we might need more than one register to store > a binder. > > > > Any ideas for a better solution? > > > > -- Johan > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allbery.b at gmail.com Thu Sep 10 15:50:49 2015 From: allbery.b at gmail.com (Brandon Allbery) Date: Thu, 10 Sep 2015 11:50:49 -0400 Subject: Shared data type for extension flags In-Reply-To: <98e9b596-be37-4e00-9924-8bc967c0b07f@me.com> References: <98e9b596-be37-4e00-9924-8bc967c0b07f@me.com> Message-ID: On Thu, Sep 10, 2015 at 4:28 AM, malcolm.wallace wrote: > "ghc-types" sounds like a package for fancy type hackery. 
I would never > think to find language extension flags in such a place. How about > "ghc-package-db", or "ghc-language-extensions"? > ghc-integration ghc-core-data -- brandon s allbery kf8nh sine nomine associates allbery.b at gmail.com ballbery at sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net From adam at sandbergericsson.se Thu Sep 10 16:16:58 2015 From: adam at sandbergericsson.se (Adam Sandberg Eriksson) Date: Thu, 10 Sep 2015 18:16:58 +0200 Subject: Shared data type for extension flags In-Reply-To: References: <98e9b596-be37-4e00-9924-8bc967c0b07f@me.com> Message-ID: <1441901818.1705729.380033505.30AC50AA@webmail.messagingengine.com> How about ghc-datatypes? ghc-core-data seems like it would have something to do with the Core language and ghc-integration sounds like it is a layer on top of ghc. Using the -types suffix has quite some precedent [1] for packages defining shared datatypes (as does the -core suffix), but given Malcolm's comment it is perhaps not a good fit for GHC. [1]: http://hackage.haskell.org/packages/search?terms=types Adam Sandberg Eriksson On Thu, 10 Sep 2015, at 05:50 PM, Brandon Allbery wrote: > On Thu, Sep 10, 2015 at 4:28 AM, malcolm.wallace > wrote: >> "ghc-types" sounds like a package for fancy type hackery. I would >> never think to find language extension flags in such a place. How >> about "ghc-package-db", or "ghc-language-extensions"? > > ghc-integration ghc-core-data > > -- > brandon s allbery kf8nh sine nomine > associates allbery.b at gmail.com ballbery at sinenomine.net unix, openafs, > kerberos, infrastructure, xmonad http://sinenomine.net > _________________________________________________ > ghc-devs mailing list ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed...
URL: From marlowsd at gmail.com Thu Sep 10 19:59:15 2015 From: marlowsd at gmail.com (Simon Marlow) Date: Thu, 10 Sep 2015 20:59:15 +0100 Subject: Unlifted data types In-Reply-To: <88c59a8d11a7450b91439d8dd728f7e8@DB4PR30MB030.064d.mgd.msft.net> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> <55F08BA5.1020708@gmail.com> <88c59a8d11a7450b91439d8dd728f7e8@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <55F1E113.3050404@gmail.com> On 10/09/2015 10:41, Simon Peyton Jones wrote: > | > The awkward spot is the runtime system. Currently it goes to some > | > lengths to ensure that it never introduces an indirection for a > | > boxed-but-unlifted type. Simon Marlow would know exactly where. So > | > | I *think* we're ok here. The RTS doesn't have any special machinery > | to avoid indirections to unlifted things that I'm aware of. Did you > | have a particular problem in mind? > > Well I can't point to anything very specific. I just recall that in > various places, if a pointer was to an Array# we would immediately, > eagerly, recurse in the GC rather than add the Array# to the queue > for later processing. Maybe that is no longer true. Maybe it was > never true. Maybe that was an earlier variant of the GC. The current GC just treats unlifted objects like other objects in a breadth-first way, and it never introduces indirections except when reducing a selector thunk. So I think this is ok. Cheers Simon > It shouldn't be hard to find out. If true, there would be a run-time > system test that returns true for boxed-but-unlifted heap objects. > > I think it would be worth a look because if I'm right it could have a > significant impact on the design. 
> > Simon > From carter.schonwald at gmail.com Fri Sep 11 03:22:51 2015 From: carter.schonwald at gmail.com (Carter Schonwald) Date: Thu, 10 Sep 2015 23:22:51 -0400 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> <072d804f206c47aeb49ca7d610d120e5@DB4PR30MB030.064d.mgd.msft.net> <3481E4D1-F4DD-47BA-9818-665F22928CAD@cis.upenn.edu> Message-ID: Would this allow having a strict monoid instance for maybe, given the right hinting at the use site? On Wednesday, September 9, 2015, Edward Kmett wrote: > I think ultimately the two views of levity that we've been talking diverge > along the same lines as the pi vs forall discussion from your Levity > polymorphism talk. > > I've been focused entirely on situations where forall suffices, and no > distinction is needed in how you compile for both levities. > > Maybe could be polymorphic using a mere forall in the levity of the boxed > argument it carries as it doesn't care what it is, it never forces it, > pattern matching on it just gives it back when you pattern match on it. > > Eq or Ord could just as easily work over anything boxed. The particular Eq > _instance_ needs to care about the levity. > > Most of the combinators for working with Maybe do need to care about that > levity however. > > e.g. consider fmap in Functor, the particular instances would care. > Because you ultimately wind up using fmap to build 'f a' values and those > need to know how the let binding should work. There seems to be a pi at > work there. Correct operational behavior would depend on the levity. > > But if we look at what inference should probably grab for the levity of > Functor: > > you'd get: > > class Functor (l : Levity) (l' : Levity') (f :: GC l -> GC l') where > fmap :: forall a b. (a :: GC l) (b :: GC l). 
(a -> b) -> f a -> f b > > Based on the notion that given current practices, f would cause us to pick > a common kind for a and b, and the results of 'f'. Depending on how and if > we decided to default to * unless annotated in various situations would > drive this closer and closer to the existing Functor by default. > > These are indeed distinct functors with distinct operational behavior, and > we could implement each of them by supplying separate instances, as the > levity would take part in the instance resolution like any other kind > argument. > > Whether we could expect an average Haskeller to be willing to do so is an > entirely different matter. > > -Edward > > > On Wed, Sep 9, 2015 at 12:44 PM, Dan Doel > wrote: > >> On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg > > wrote: >> > No functions (excepting `error` and friends) are truly levity >> polymorphic. >> >> I was talking with Ed Kmett about this yesterday, and he pointed out >> that this isn't true. There are a significant array of levity >> polymorphic functions having to do with reference types. They simply >> shuffle around pointers with the right calling convention, and don't >> really care what levity their arguments are, because they're just >> operating uniformly either way. So if we had: >> >> MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed >> Unlifted) >> >> then: >> >> takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s >> l a -> State# s -> (# State# s, a #) >> putMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s l >> a -> a -> State# s -> State# s >> >> are genuinely parametric in l. And the same is true for MutVar#, >> Array#, MutableArray#, etc. >> >> I think data type constructors are actually parametric, too (ignoring >> data with ! in them for the moment; the underlying constructors of >> those). Using a constructor just puts the pointers for the fields in >> the type, and matching on a constructor gives them back.
They don't >> need to care whether their fields are lifted or not, they just >> preserve whatever the case is. >> >> But this: >> >> > We use levity polymorphism in the types to get GHC to use its existing >> type inference to infer strictness. By the time type inference is done, we >> must ensure that no levity polymorphism remains, because the code generator >> won't be able to deal with it. >> >> Is not parametric polymorphism; it is ad-hoc polymorphism. It even has >> the defaulting step from type classes. Except the ad-hoc has been >> given the same notation as the genuinely parametric, so you can no >> longer identify the latter. (I'm not sure I'm a great fan of the >> ad-hoc part anyway, to be honest.) >> >> -- Dan >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From roma at ro-che.info Fri Sep 11 08:28:13 2015 From: roma at ro-che.info (Roman Cheplyaka) Date: Fri, 11 Sep 2015 11:28:13 +0300 Subject: Unlifted data types In-Reply-To: References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> <072d804f206c47aeb49ca7d610d120e5@DB4PR30MB030.064d.mgd.msft.net> <3481E4D1-F4DD-47BA-9818-665F22928CAD@cis.upenn.edu> Message-ID: <55F2909D.60904@ro-che.info> On 11/09/15 06:22, Carter Schonwald wrote: > Would this allow having a strict monoid instance for maybe, given the > right hinting at the use site? That's a fantastic idea, especially if it could be generalized to Applicative functors, where the problem of "inner laziness" is pervasive. But that'd be tricky, because functions have the Lifted kind, and so <*> would have to be crazily levity-polymorphic. (Or is this not crazy?) 
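The strict-Maybe monoid that Carter asks about can already be approximated today without any levity polymorphism, by using an ordinary strict data type. A sketch with invented names (`SMaybe`, `SJust`), not anything from base:

```haskell
-- An ordinary strict variant of Maybe: the bang on SJust's field forces the
-- combined value (to WHNF) at each (<>), so no chain of thunks accumulates
-- inside the Just. That chain is the "inner laziness" being discussed.
data SMaybe a = SNothing | SJust !a
  deriving (Eq, Show)

instance Semigroup a => Semigroup (SMaybe a) where
  SNothing <> y        = y
  x        <> SNothing = x
  SJust x  <> SJust y  = SJust (x <> y)  -- forced here by the bang

instance Semigroup a => Monoid (SMaybe a) where
  mempty = SNothing
```

What levity polymorphism would add is the ability to get this behaviour from the existing `Maybe` at an unlifted instantiation, rather than duplicating the type.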
Roman -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From ben at smart-cactus.org Fri Sep 11 12:08:34 2015 From: ben at smart-cactus.org (Ben Gamari) Date: Fri, 11 Sep 2015 14:08:34 +0200 Subject: D1182: Implement improved error messages for ambiguous type variables (#10733) In-Reply-To: <201509032157.38426.jan.stolarek@p.lodz.pl> References: <6bac15f299b2494187fdc47167cae02d@DB4PR30MB030.064d.mgd.msft.net> <55E89962.4020304@gmail.com> <201509032157.38426.jan.stolarek@p.lodz.pl> Message-ID: <87mvwt9n0d.fsf@smart-cactus.org> Jan Stolarek writes: >> In general we shouldn't commit anything that breaks validate, because >> this causes problems for other developers. The right thing to do would >> be to mark it expect_broken before committing. > > Sorry for that. I was actually thinking about marking the test as > expect_broken, but then the problem would be completely hidden. I > wanted to discuss a possible solution with Simon and Edward first but > it looks like Thomas already found a workaround. > Hiding the issue isn't a problem so long as there is a ticket opened to describe the brokenness (and discuss potential solutions). Cheers, - Ben From david.feuer at gmail.com Fri Sep 11 16:49:22 2015 From: david.feuer at gmail.com (David Feuer) Date: Fri, 11 Sep 2015 12:49:22 -0400 Subject: Deriving Contravariant and Profunctor Message-ID: Would it be possible to add mechanisms to derive Contravariant and Profunctor instances? As with Functor, each algebraic datatype can only have one sensible instance of each of these.
David Feuer From ekmett at gmail.com Fri Sep 11 17:52:35 2015 From: ekmett at gmail.com (Edward Kmett) Date: Fri, 11 Sep 2015 13:52:35 -0400 Subject: Deriving Contravariant and Profunctor In-Reply-To: References: Message-ID: Actually it is trickier than you'd think. With "Functor" you can pretend that contravariance doesn't exist. With both profunctor and contravariant it is necessarily part of the puzzle. data Compose f g a = Compose (f (g a)) * are both f and g contravariant leading to a functor? * is f contravariant and g covariant leading to a contravariant functor? * is f covariant and g contravariant leading to a contravariant functor? data Wat p f a b = Wat (p (f a) b) is p a Profunctor or a Bifunctor? is f Contravariant or a Functor? We investigated adding TH code-generation for the contravariant package, and ultimately rejected it on these grounds. https://github.com/ekmett/contravariant/issues/17 -Edward On Fri, Sep 11, 2015 at 12:49 PM, David Feuer wrote: > Would it be possible to add mechanisms to derive Contravariant and > Profunctor instances? As with Functor, each algebraic datatype can > only have one sensible instance of each of these. > > David Feuer > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.feuer at gmail.com Fri Sep 11 18:22:15 2015 From: david.feuer at gmail.com (David Feuer) Date: Fri, 11 Sep 2015 14:22:15 -0400 Subject: Deriving Contravariant and Profunctor In-Reply-To: References: Message-ID: Oh, I see... you get horrible overlap problems there. Blech! I guess they'll all act the same (modulo optimized <$ and such), but GHC can't know that and will see them as forever incoherent. On Fri, Sep 11, 2015 at 1:52 PM, Edward Kmett wrote: > Actually it is trickier than you'd think. 
> > With "Functor" you can pretend that contravariance doesn't exist. > > With both profunctor and contravariant it is necessarily part of the puzzle. > > data Compose f g a = Compose (f (g a)) > > * are both f and g contravariant leading to a functor? > * is f contravariant and g covariant leading to a contravariant functor? > * is f covariant and g contravariant leading to a contravariant functor? > > data Wat p f a b = Wat (p (f a) b) > > is p a Profunctor or a Bifunctor? is f Contravariant or a Functor? > > We investigated adding TH code-generation for the contravariant package, and > ultimately rejected it on these grounds. > > https://github.com/ekmett/contravariant/issues/17 > > -Edward > > > > On Fri, Sep 11, 2015 at 12:49 PM, David Feuer wrote: >> >> Would it be possible to add mechanisms to derive Contravariant and >> Profunctor instances? As with Functor, each algebraic datatype can >> only have one sensible instance of each of these. >> >> David Feuer >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > From ekmett at gmail.com Fri Sep 11 19:06:11 2015 From: ekmett at gmail.com (Edward Kmett) Date: Fri, 11 Sep 2015 15:06:11 -0400 Subject: Deriving Contravariant and Profunctor In-Reply-To: References: Message-ID: They'd all act the same assuming any or all of the instances existed, but GHC can't backtrack and figure out which way to get there, it'll only look at the instance head. -Edward On Fri, Sep 11, 2015 at 2:22 PM, David Feuer wrote: > Oh, I see... you get horrible overlap problems there. Blech! I guess > they'll all act the same (modulo optimized <$ and such), but GHC can't > know that and will see them as forever incoherent. > > On Fri, Sep 11, 2015 at 1:52 PM, Edward Kmett wrote: > > Actually it is trickier than you'd think. > > > > With "Functor" you can pretend that contravariance doesn't exist. 
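The overlap Kmett describes — two candidate `Functor (Compose f g)` instances with identical heads — can be written out concretely. A self-contained sketch: the `Contravariant` and `Op` definitions mirror the contravariant package, but everything here is for illustration only:

```haskell
class Contravariant f where
  contramap :: (a -> b) -> f b -> f a

newtype Op r a = Op { getOp :: a -> r }  -- contravariant in a

instance Contravariant (Op r) where
  contramap h (Op g) = Op (g . h)

newtype Compose f g a = Compose { getCompose :: f (g a) }

-- covariant over covariant: one legitimate Functor instance ...
instance (Functor f, Functor g) => Functor (Compose f g) where
  fmap h (Compose x) = Compose (fmap (fmap h) x)

-- ... but contravariant over contravariant would produce a Functor with the
-- *same* instance head, which is the overlap GHC cannot resolve by looking
-- at heads alone:
--
--   instance (Contravariant f, Contravariant g) => Functor (Compose f g) where
--     fmap h (Compose x) = Compose (contramap (contramap h) x)

-- mixed variance gives a Contravariant instead of a Functor:
instance (Contravariant f, Functor g) => Contravariant (Compose f g) where
  contramap h (Compose x) = Compose (contramap (fmap h) x)
```

Since both `Functor` candidates behave the same wherever both apply, a human can pick either, but `deriving` has no principled way to choose.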
> > > > With both profunctor and contravariant it is necessarily part of the > puzzle. > > > > data Compose f g a = Compose (f (g a)) > > > > * are both f and g contravariant leading to a functor? > > * is f contravariant and g covariant leading to a contravariant functor? > > * is f covariant and g contravariant leading to a contravariant functor? > > > > data Wat p f a b = Wat (p (f a) b) > > > > is p a Profunctor or a Bifunctor? is f Contravariant or a Functor? > > > > We investigated adding TH code-generation for the contravariant package, > and > > ultimately rejected it on these grounds. > > > > https://github.com/ekmett/contravariant/issues/17 > > > > -Edward > > > > > > > > On Fri, Sep 11, 2015 at 12:49 PM, David Feuer > wrote: > >> > >> Would it be possible to add mechanisms to derive Contravariant and > >> Profunctor instances? As with Functor, each algebraic datatype can > >> only have one sensible instance of each of these. > >> > >> David Feuer > >> _______________________________________________ > >> ghc-devs mailing list > >> ghc-devs at haskell.org > >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marlowsd at gmail.com Mon Sep 14 08:03:16 2015 From: marlowsd at gmail.com (Simon Marlow) Date: Mon, 14 Sep 2015 09:03:16 +0100 Subject: Converting unboxed sum types in StgCmm In-Reply-To: <4cd78a703e1d4b148dbd707b89f74c59@DB4PR30MB030.064d.mgd.msft.net> References: <4cd78a703e1d4b148dbd707b89f74c59@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <55F67F44.5010501@gmail.com> On 10/09/2015 10:37, Simon Peyton Jones wrote: > The problem is that stg is too strongly typed > > It?s not really typed, or at least only in a very half-hearted way. To > be concrete I think you can just use Any for any Pointer arg. All STG > needs to know, really, is which things are pointers. Detailed type info > like ?are you a Char or a Bool? 
is strictly jam; indeed never used I > think. (I could be wrong but I?m pretty sure I?m not wrong in a > fundamental way. Yes, the only thing the code generator needs to do with types is convert them to PrimReps (see idPrimRep), and all GC pointer types have the same PrimRep (PtrRep). Cheers Simon > > SImon > > *From:*Johan Tibell [mailto:johan.tibell at gmail.com] > *Sent:* 09 September 2015 23:22 > *To:* Simon Peyton Jones; Simon Marlow; ghc-devs at haskell.org > *Subject:* Converting unboxed sum types in StgCmm > > Hi! > > The original idea for implementing the backend part of the unboxed sums > proposal was to convert from the core representation to the actual data > representation (i.e. a tag followed by some pointer and non-pointer > fields) in the unarise stg-to-stg > > pass. > > I have now realized that this won't work. The problem is that stg is too > strongly typed. When we "desugar" sum types we need to convert functions > receiving a value e.g. from > > f :: (# Bool | Char #) -> ... > > to > > f :: NonPointer {-# tag#-} -> Pointer {-# Bool or Char #-} -> ... > > Since stg is still typed with normal Haskell types (e.g. Bool, Char, > etc), this is not possible, as we cannot represent an argument which has > two different types. > > It seems to me that we will have to do the conversion in the stg-to-cmm > > pass, which is quite a bit more involved. For example, StgCmmEnv.idToReg > function will have to change from > > idToReg :: DynFlags -> NonVoid Id -> LocalReg > > to > > idToReg :: DynFlags -> NonVoid Id -> [LocalReg] > > to accommodate the fact that we might need more than one register to > store a binder. > > Any ideas for a better solution? 
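The `idToReg` change quoted above — one binder mapping to several registers — has a simple shape that can be sketched with stub types. `DynFlags`, `NonVoid Id` and `LocalReg` are real GHC names, but every definition below is an invented stand-in, not GHC's code:

```haskell
-- Invented stand-ins for GHC's types, just to show the shape of the change.
data UnaryType = PtrTy | WordTy deriving (Eq, Show)

data RepKind
  = UnaryRep UnaryType    -- an ordinary binder: exactly one register
  | MultiRep [UnaryType]  -- an unboxed tuple or sum: one register per slot
  deriving (Eq, Show)

newtype Id = Id RepKind
newtype LocalReg = LocalReg UnaryType deriving (Eq, Show)

-- Before: idToReg :: DynFlags -> NonVoid Id -> LocalReg
-- After:  idToReg :: DynFlags -> NonVoid Id -> [LocalReg]
-- i.e. a binder whose representation spans several slots is assigned one
-- register per slot:
idToRegs :: Id -> [LocalReg]
idToRegs (Id (UnaryRep t))  = [LocalReg t]
idToRegs (Id (MultiRep ts)) = map LocalReg ts
```

Every consumer of the old single-register result then has to be taught to thread a list through, which is why the email calls this pass "quite a bit more involved".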
> > -- Johan > From alexander at plaimi.net Mon Sep 14 11:59:28 2015 From: alexander at plaimi.net (Alexander Berntsen) Date: Mon, 14 Sep 2015 13:59:28 +0200 Subject: Proposal: Include GHC version target in libraries' description In-Reply-To: <55CDB51D.8070504@plaimi.net> References: <55C1ECCC.5080409@plaimi.net> <55CDB51D.8070504@plaimi.net> Message-ID: <55F6B6A0.3020205@plaimi.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On 14/08/15 11:30, Alexander Berntsen wrote: > -make GHC a distribution and use that field, > -use the platform feature, > -make a new field -- bundled-with, bundles, or similar, > -or use the package collection feature. > > [L]et's make a decision. I get the feeling nobody liked my original proposal of just writing it in the .cabal file, so I'm reluctant to put effort into a set of patches that will be rejected. On the other hand I see no progress here. Does nobody care? I am keen on hacking on either solution if someone would just give some sort of acknowledgement here. 
- -- Alexander alexander at plaimi.net https://secure.plaimi.net/~alexander From johan.tibell at gmail.com Mon Sep 14 13:21:28 2015 From: johan.tibell at gmail.com (Johan Tibell) Date: Mon, 14 Sep 2015 06:21:28 -0700 Subject: Converting unboxed sum types in StgCmm In-Reply-To: References: <4cd78a703e1d4b148dbd707b89f74c59@DB4PR30MB030.064d.mgd.msft.net> <55F67F44.5010501@gmail.com> Message-ID: I took a stab at this but ran into something I don't understand. For reference, the whole implementation of unboxed sums is at https://github.com/ghc/ghc/compare/master...tibbe:unboxed-sums and the implementation of unarisation is at https://github.com/ghc/ghc/compare/master...tibbe:unboxed-sums#diff-f5bc1f9e9c230db4cf882bf18368a818 . Running the compiler on the following file:

{-# LANGUAGE UnboxedSums #-}
module Test where

f :: (# Int | Char #) -> Int
f (# x | #) = x
{-# NOINLINE f #-}

g = f (# 1 | #)

Yields an error, like so: ghc-stage2: panic!
(the 'impossible' happened) (GHC version 7.11.20150912 for x86_64-apple-darwin): StgCmmEnv: variable not found ds_svq local binds for: ds_gvz ds_gvA I probably got something wrong in UnariseStg, but I can't see what. I printed this debug information to see the stg I'm rewriting: unarise [f [InlPrag=NOINLINE] :: (#|#) Int Char -> Int [GblId, Arity=1, Str=DmdType, Unf=OtherCon []] = \r srt:SRT:[0e :-> patError] [ds_svq] case ds_svq of _ [Occ=Dead] { (#_|#) x_svs [Occ=Once] -> x_svs; (#|_#) _ [Occ=Dead] -> patError "UnboxedSum.hs:5:1-15|function f"#; };, g :: Int [GblId, Str=DmdType] = \u srt:SRT:[r1 :-> f] [] let { sat_svu [Occ=Once] :: Int [LclId, Str=DmdType] = NO_CCS I#! [1#]; } in case (#_|#) [sat_svu] of sat_svv { __DEFAULT -> f sat_svv; };] unariseAlts [81 :-> [realWorld#], svq :-> [ds_gvz, ds_gvA]] UbxTup 2 wild_svr [((#_|#), [x_svs], [True], x_svs), ((#|_#), [ipv_svt], [False], patError "UnboxedSum.hs:5:1-15|function f"#)] It's ds_svg that's being complained about above. I find that a bit confusing as that variable is never used on any RHS. Some questions that might help me get there: - I added a new RepType for unboxed sums, like so: data RepType = UbxTupleRep [UnaryType] | UbxSumRep [UnaryType] | UnaryRep UnaryType Does this constructor make sense? I store the already flattened representation of the sum in here, rather than having something like [[UnaryType]] and storing each alternative. - In unariseAlts there's a bndr argument. Is that the binder of the scrutinee as a whole (e.g. the 'x' in case e of x { ... -> ... })? Any other idea what I might have gotten wrong? On Mon, Sep 14, 2015 at 1:03 AM, Simon Marlow wrote: > On 10/09/2015 10:37, Simon Peyton Jones wrote: > >> The problem is that stg is too strongly typed >> >> It?s not really typed, or at least only in a very half-hearted way. To >> be concrete I think you can just use Any for any Pointer arg. All STG >> needs to know, really, is which things are pointers. 
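The idea behind storing an already-flattened `[UnaryType]` in `UbxSumRep` can be made concrete. The flattening scheme below is a guess at one workable layout, not GHC's actual code, and `UnaryType`'s constructors are invented here:

```haskell
data UnaryType = PtrTy | WordTy deriving (Eq, Show)

data RepType
  = UbxTupleRep [UnaryType]
  | UbxSumRep   [UnaryType]  -- the flattened representation from the email
  | UnaryRep    UnaryType
  deriving (Eq, Show)

-- One plausible flattening: a tag word, then enough pointer slots and word
-- slots to cover whichever alternative needs the most of each kind. All
-- alternatives then share those slots, so the per-alternative shapes
-- ([[UnaryType]]) need not be kept around.
flattenSum :: [[UnaryType]] -> RepType
flattenSum alts =
  UbxSumRep (WordTy : replicate nPtrs PtrTy ++ replicate nWords WordTy)
  where
    most t = maximum (0 : map (length . filter (== t)) alts)
    nPtrs  = most PtrTy
    nWords = most WordTy
```

For `(# Int | Char #)` both alternatives are a single boxed pointer, so the flattened layout is just a tag word plus one shared pointer slot — matching the two-variable `[ds_gvz, ds_gvA]` expansion in the debug output above.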
Detailed type info >> like ?are you a Char or a Bool? is strictly jam; indeed never used I >> think. (I could be wrong but I?m pretty sure I?m not wrong in a >> fundamental way. >> > > Yes, the only thing the code generator needs to do with types is convert > them to PrimReps (see idPrimRep), and all GC pointer types have the same > PrimRep (PtrRep). > > Cheers > Simon > > > > >> SImon >> >> *From:*Johan Tibell [mailto:johan.tibell at gmail.com] >> *Sent:* 09 September 2015 23:22 >> *To:* Simon Peyton Jones; Simon Marlow; ghc-devs at haskell.org >> *Subject:* Converting unboxed sum types in StgCmm >> >> Hi! >> >> The original idea for implementing the backend part of the unboxed sums >> proposal was to convert from the core representation to the actual data >> representation (i.e. a tag followed by some pointer and non-pointer >> fields) in the unarise stg-to-stg >> < >> https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fgithub.com%2fghc%2fghc%2fblob%2fmaster%2fcompiler%2fsimplStg%2fUnariseStg.hs&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7cca7beffb01494517d75108d2b9652973%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=U%2bFUNsL87iEemajTnAW9SxD9N5b4%2bG8QB1q19%2fX%2bBI4%3d >> > >> pass. >> >> I have now realized that this won't work. The problem is that stg is too >> strongly typed. When we "desugar" sum types we need to convert functions >> receiving a value e.g. from >> >> f :: (# Bool | Char #) -> ... >> >> to >> >> f :: NonPointer {-# tag#-} -> Pointer {-# Bool or Char #-} -> ... >> >> Since stg is still typed with normal Haskell types (e.g. Bool, Char, >> etc), this is not possible, as we cannot represent an argument which has >> two different types. 
>> >> It seems to me that we will have to do the conversion in the stg-to-cmm >> < >> https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fgithub.com%2fghc%2fghc%2fblob%2fmaster%2fcompiler%2fcodeGen%2fStgCmm.hs&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7cca7beffb01494517d75108d2b9652973%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=aXKZ78eGNKbJ6eZkxZgyJHgsAXpgOBjg3Zvqj%2bq7pk0%3d >> > >> pass, which is quite a bit more involved. For example, StgCmmEnv.idToReg >> function will have to change from >> >> idToReg :: DynFlags -> NonVoid Id -> LocalReg >> >> to >> >> idToReg :: DynFlags -> NonVoid Id -> [LocalReg] >> >> to accommodate the fact that we might need more than one register to >> store a binder. >> >> Any ideas for a better solution? >> >> -- Johan >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From johan.tibell at gmail.com Mon Sep 14 13:23:42 2015 From: johan.tibell at gmail.com (Johan Tibell) Date: Mon, 14 Sep 2015 06:23:42 -0700 Subject: Converting unboxed sum types in StgCmm In-Reply-To: References: <4cd78a703e1d4b148dbd707b89f74c59@DB4PR30MB030.064d.mgd.msft.net> <55F67F44.5010501@gmail.com> Message-ID: Another question, in need to add something to AltType in StgSyn, would this work data AltType = PolyAlt -- Polymorphic (a type variable) | UbxTupAlt Int -- Unboxed tuple of this arity | UbxSumAlt Int -- Unboxed sum of this arity | AlgAlt TyCon -- Algebraic data type; the AltCons will be DataAlts | PrimAlt TyCon -- Primitive data type; the AltCons will be LitAlts or do I also have to capture which alternative was used here? Why do we capture the arity in *tuple* case here? On Mon, Sep 14, 2015 at 6:21 AM, Johan Tibell wrote: > I took a stab at this but ran into something I don't understand. 
For > recence, the whole implementation of unboxed sums is at > https://github.com/ghc/ghc/compare/master...tibbe:unboxed-sums and the > implementation of unarisation is at > https://github.com/ghc/ghc/compare/master...tibbe:unboxed-sums#diff-f5bc1f9e9c230db4cf882bf18368a818 > . > > Running the compiler on the following file: > > {-# LANGUAGE UnboxedSums #-} > module Test where > > f :: (# Int | Char #) -> Int > f (# x | #) = x > {-# NOINLINE f #-} > > g = f (# 1 | #) > > Yields an error, like so: > > ghc-stage2: panic! (the 'impossible' happened) > (GHC version 7.11.20150912 for x86_64-apple-darwin): > StgCmmEnv: variable not found > ds_svq > local binds for: > ds_gvz > ds_gvA > > I probably got something wrong in UnariseStg, but I can't see what. I > printed this debug information to see the stg I'm rewriting: > > unarise > [f [InlPrag=NOINLINE] :: (#|#) Int Char -> Int > [GblId, Arity=1, Str=DmdType, Unf=OtherCon []] = > \r srt:SRT:[0e :-> patError] [ds_svq] > case ds_svq of _ [Occ=Dead] { > (#_|#) x_svs [Occ=Once] -> x_svs; > (#|_#) _ [Occ=Dead] -> patError > "UnboxedSum.hs:5:1-15|function f"#; > };, > g :: Int > [GblId, Str=DmdType] = > \u srt:SRT:[r1 :-> f] [] > let { > sat_svu [Occ=Once] :: Int > [LclId, Str=DmdType] = > NO_CCS I#! [1#]; > } in > case (#_|#) [sat_svu] of sat_svv { __DEFAULT -> f sat_svv; };] > unariseAlts > [81 :-> [realWorld#], svq :-> [ds_gvz, ds_gvA]] > UbxTup 2 > wild_svr > [((#_|#), [x_svs], [True], x_svs), > ((#|_#), > [ipv_svt], > [False], > patError "UnboxedSum.hs:5:1-15|function f"#)] > > It's ds_svg that's being complained about above. I find that a bit > confusing as that variable is never used on any RHS. > > Some questions that might help me get there: > > - I added a new RepType for unboxed sums, like so: > > data RepType = UbxTupleRep [UnaryType] > | UbxSumRep [UnaryType] > | UnaryRep UnaryType > > Does this constructor make sense? 
I store the already flattened > representation of the sum in here, rather than having something like > [[UnaryType]] and storing each alternative. > - In unariseAlts there's a bndr argument. Is that the binder of the > scrutinee as a whole (e.g. the 'x' in case e of x { ... -> ... })? > > Any other idea what I might have gotten wrong? > > > On Mon, Sep 14, 2015 at 1:03 AM, Simon Marlow wrote: > >> On 10/09/2015 10:37, Simon Peyton Jones wrote: >> >>> The problem is that stg is too strongly typed >>> >>> It?s not really typed, or at least only in a very half-hearted way. To >>> be concrete I think you can just use Any for any Pointer arg. All STG >>> needs to know, really, is which things are pointers. Detailed type info >>> like ?are you a Char or a Bool? is strictly jam; indeed never used I >>> think. (I could be wrong but I?m pretty sure I?m not wrong in a >>> fundamental way. >>> >> >> Yes, the only thing the code generator needs to do with types is convert >> them to PrimReps (see idPrimRep), and all GC pointer types have the same >> PrimRep (PtrRep). >> >> Cheers >> Simon >> >> >> >> >>> SImon >>> >>> *From:*Johan Tibell [mailto:johan.tibell at gmail.com] >>> *Sent:* 09 September 2015 23:22 >>> *To:* Simon Peyton Jones; Simon Marlow; ghc-devs at haskell.org >>> *Subject:* Converting unboxed sum types in StgCmm >>> >>> Hi! >>> >>> The original idea for implementing the backend part of the unboxed sums >>> proposal was to convert from the core representation to the actual data >>> representation (i.e. a tag followed by some pointer and non-pointer >>> fields) in the unarise stg-to-stg >>> < >>> https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fgithub.com%2fghc%2fghc%2fblob%2fmaster%2fcompiler%2fsimplStg%2fUnariseStg.hs&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7cca7beffb01494517d75108d2b9652973%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=U%2bFUNsL87iEemajTnAW9SxD9N5b4%2bG8QB1q19%2fX%2bBI4%3d >>> > >>> pass. 
>>> >>> I have now realized that this won't work. The problem is that stg is too >>> strongly typed. When we "desugar" sum types we need to convert functions >>> receiving a value e.g. from >>> >>> f :: (# Bool | Char #) -> ... >>> >>> to >>> >>> f :: NonPointer {-# tag#-} -> Pointer {-# Bool or Char #-} -> ... >>> >>> Since stg is still typed with normal Haskell types (e.g. Bool, Char, >>> etc), this is not possible, as we cannot represent an argument which has >>> two different types. >>> >>> It seems to me that we will have to do the conversion in the stg-to-cmm >>> < >>> https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fgithub.com%2fghc%2fghc%2fblob%2fmaster%2fcompiler%2fcodeGen%2fStgCmm.hs&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7cca7beffb01494517d75108d2b9652973%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=aXKZ78eGNKbJ6eZkxZgyJHgsAXpgOBjg3Zvqj%2bq7pk0%3d >>> > >>> pass, which is quite a bit more involved. For example, StgCmmEnv.idToReg >>> function will have to change from >>> >>> idToReg :: DynFlags -> NonVoid Id -> LocalReg >>> >>> to >>> >>> idToReg :: DynFlags -> NonVoid Id -> [LocalReg] >>> >>> to accommodate the fact that we might need more than one register to >>> store a binder. >>> >>> Any ideas for a better solution? >>> >>> -- Johan >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From austin at well-typed.com Mon Sep 14 13:47:40 2015 From: austin at well-typed.com (Austin Seipp) Date: Mon, 14 Sep 2015 08:47:40 -0500 Subject: HEADS UP (devs, users): 8.0.1 Roadmap Message-ID: Hi *, I've returned from vacation, and last week Simon, Simon and I met up again after a long break, and talked a bit about the upcoming release. The good news is that it is going to be an exciting one! The flip side is, there's a lot of work to be done! The current plan we'd roughly like to follow is... 
- November: Fork the new `ghc-8.0` STABLE branch - At this point, `master` development will probably slow as we fix bugs. - This gives us 2 months or so until branch, from Today. - This is nice as the branch is close to the first RC. - December: First release candidate - Mid/late February: Final release. Here's our current feature roadmap (in basically the same place as all our previous pages): https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-8.0.1 As time goes on, this page will be updated to reflect Reality™ and track it as closely as possible. So keep an eye on it! It's got the roadmap (near top) and large bug list (bottom). Now, there are some things we need, so depending on who you are, please... - *Users*: please look over the bug list! If there's a bug you need fixed that isn't there, set it to the 8.0.1 milestone (updated in Trac). If this bug is critical to you, please let us know! You can bump the priority (if we disagree, or it's workaround-able, it might get bumped down). We just need a way to see what you need, so please let us know somehow! As a reminder, our regular policy is this: if a bug is NOT marked highest or high priority, it is essentially 100% the case we will not look at it. So please make sure this is accurate. Or if you can, write a patch yourself! - *Developers*: double check the roadmap list, _and if you're responsible for something, make sure it is accurate!_ There are some great things planned to land in HEAD, but we'll have to work for it. Onward! - A better LLVM backend for Tier-1 platforms - Types are kinds and kind equality - Overloaded record fields! - Enhancements to DWARF debugging - ApplicativeDo - ... and many more... Thanks everyone! -- Regards, Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/ From austin at well-typed.com Mon Sep 14 13:53:38 2015 From: austin at well-typed.com (Austin Seipp) Date: Mon, 14 Sep 2015 08:53:38 -0500 Subject: HEADS UP: Need 7.10.3?
Message-ID: Hi *, (This is an email primarily aimed at users reading this list and developers who have any interest). As some of you may know, there's currently a 7.10.3 milestone and status page on our wiki: https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.10.3 The basic summary is best captured on the above page: "We have not yet decided when, or even whether, to release GHC 7.10.3. We will do so if (but only if!) we have documented cases of "show-stoppers" in 7.10.2. Namely, cases from users where - You are unable to use 7.10.2 because of some bug - There is no reasonable workaround, so you are truly stuck - We know how to fix it - The fix is not too disruptive; i.e. does not risk introducing a raft of new bugs" That is, we're currently not fully sold on the need for a release. However, the milestone and issue page serve as a useful guide, and also make it easier to keep track of smaller, point-release worthy issues. So in the wake of the 8.0 roadmap I just sent: If you *need* 7.10.3 because the 7.10.x series has a major regression or problem you can't work around, let us know! - Find or file a bug in Trac - Make sure it's highest priority - Assign it to the 7.10.3 milestone - Follow up on this email if possible, or edit it on the status page text above - it would be nice to get some public feedback in one place about what everyone needs. Currently we have two bugs on the listed page in the 'show stopper category', possibly the same bug, which is a deal-breaker for HERMIT I believe. Knowing of anything else would be very useful. Thanks all! -- Regards, Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/ From tuncer.ayaz at gmail.com Mon Sep 14 14:15:11 2015 From: tuncer.ayaz at gmail.com (Tuncer Ayaz) Date: Mon, 14 Sep 2015 16:15:11 +0200 Subject: HEADS UP: Need 7.10.3? 
In-Reply-To: References: Message-ID: On Mon, Sep 14, 2015 at 3:53 PM, Austin Seipp wrote: > Hi *, > > (This is an email primarily aimed at users reading this list and > developers who have any interest). > > As some of you may know, there's currently a 7.10.3 milestone and > status page on our wiki: > > https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.10.3 > > The basic summary is best captured on the above page: > > "We have not yet decided when, or even whether, to release GHC > 7.10.3. We will do so if (but only if!) we have documented cases of > "show-stoppers" in 7.10.2. Namely, cases from users where > > - You are unable to use 7.10.2 because of some bug > - There is no reasonable workaround, so you are truly stuck > - We know how to fix it > - The fix is not too disruptive; i.e. does not risk introducing a > raft of new bugs" > > That is, we're currently not fully sold on the need for a release. > However, the milestone and issue page serve as a useful guide, and > also make it easier to keep track of smaller, point-release worthy > issues. > > So in the wake of the 8.0 roadmap I just sent: If you *need* 7.10.3 > because the 7.10.x series has a major regression or problem you > can't work around, let us know! > > - Find or file a bug in Trac > - Make sure it's highest priority > - Assign it to the 7.10.3 milestone > - Follow up on this email if possible, or edit it on the status page > text above - it would be nice to get some public feedback in one place > about what everyone needs. > > Currently we have two bugs on the listed page in the 'show stopper > category', possibly the same bug, which is a deal-breaker for HERMIT > I believe. Knowing of anything else would be very useful. Would tracking down and fixing some of the reported time and space regressions qualify as 7.10.3 material? 
From rrnewton at gmail.com Mon Sep 14 14:27:23 2015 From: rrnewton at gmail.com (Ryan Newton) Date: Mon, 14 Sep 2015 14:27:23 +0000 Subject: Converting unboxed sum types in StgCmm In-Reply-To: References: <4cd78a703e1d4b148dbd707b89f74c59@DB4PR30MB030.064d.mgd.msft.net> <55F67F44.5010501@gmail.com> Message-ID: > > > - > data RepType = UbxTupleRep [UnaryType] > | UbxSumRep [UnaryType] > | UnaryRep UnaryType > > Not, fully following, but ... this reptype business is orthogonal to whether you add a normal type to the STG level that models anonymous, untagged unions, right? That is, when using Any for pointer types, they could use indicative phantom types, like "Any (Union Bool Char)", even if there's not full support for doing anything useful with (Union Bool Char) by itself. Maybe the casting machinery could greenlight a cast from Any (Union Bool Char) to Bool at least? There's already the unboxed union itself, (|# #|) , but that's different than a pointer to a union of types... -------------- next part -------------- An HTML attachment was scrubbed... URL: From eir at cis.upenn.edu Mon Sep 14 14:43:44 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Mon, 14 Sep 2015 10:43:44 -0400 Subject: download all of Hackage? Message-ID: Hi devs, Is there an easy way to download (but not compile) all of Hackage? I know of the hackager package, but that's about compiling. I just want a whole big load of Haskell code to play with. I thought I could find a link on Hackage to do this, but failed. Thanks! Richard From allbery.b at gmail.com Mon Sep 14 14:49:55 2015 From: allbery.b at gmail.com (Brandon Allbery) Date: Mon, 14 Sep 2015 10:49:55 -0400 Subject: download all of Hackage? In-Reply-To: References: Message-ID: On Mon, Sep 14, 2015 at 10:43 AM, Richard Eisenberg wrote: > Is there an easy way to download (but not compile) all of Hackage? I know > of the hackager package, but that's about compiling. I just want a whole > big load of Haskell code to play with. 
I thought I could find a link on > Hackage to do this, but failed. > There's hackage-mirror, but I note it says it mirrors to S3. -- brandon s allbery kf8nh sine nomine associates allbery.b at gmail.com ballbery at sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net -------------- next part -------------- An HTML attachment was scrubbed... URL: From hvriedel at gmail.com Mon Sep 14 15:17:14 2015 From: hvriedel at gmail.com (Herbert Valerio Riedel) Date: Mon, 14 Sep 2015 17:17:14 +0200 Subject: download all of Hackage? In-Reply-To: (Richard Eisenberg's message of "Mon, 14 Sep 2015 10:43:44 -0400") References: Message-ID: <87egi1ghdx.fsf@gmail.com> On 2015-09-14 at 16:43:44 +0200, Richard Eisenberg wrote: > Is there an easy way to download (but not compile) all of Hackage? I > know of the hackager package, but that's about compiling. I just want > a whole big load of Haskell code to play with. I thought I could find > a link on Hackage to do this, but failed. It's quite easy, you can iterate through the list of package names and call 'cabal get' like e.g. (untested, but I've done this already -- you may need to protect against execution errors) for PKG in $(cabal list --simple | awk '{ print $1 }' | uniq); do cabal get $PKG;done another variant is to construct the URLs based on the output; you can also get a list of packages in JSON format via http://hackage.haskell.org/packages/.json there's many ways to accomplish what you want... From alan.zimm at gmail.com Mon Sep 14 15:19:53 2015 From: alan.zimm at gmail.com (Alan & Kim Zimmerman) Date: Mon, 14 Sep 2015 17:19:53 +0200 Subject: download all of Hackage? 
In-Reply-To: <87egi1ghdx.fsf@gmail.com> References: <87egi1ghdx.fsf@gmail.com> Message-ID: You could clone https://github.com/bitemyapp/hackage-packages Alan On Mon, Sep 14, 2015 at 5:17 PM, Herbert Valerio Riedel wrote: > On 2015-09-14 at 16:43:44 +0200, Richard Eisenberg wrote: > > Is there an easy way to download (but not compile) all of Hackage? I > > know of the hackager package, but that's about compiling. I just want > > a whole big load of Haskell code to play with. I thought I could find > > a link on Hackage to do this, but failed. > > It's quite easy, you can iterate through the list of package names and > call 'cabal get' like e.g. (untested, but I've done this already -- you > may need to protect against execution errors) > > for PKG in $(cabal list --simple | awk '{ print $1 }' | uniq); do cabal > get $PKG;done > > another variant is to construct the URLs based on the output; > > you can also get a list of packages in JSON format via > > http://hackage.haskell.org/packages/.json > > there's many ways to accomplish what you want... > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cma at bitemyapp.com Mon Sep 14 15:50:41 2015 From: cma at bitemyapp.com (Christopher Allen) Date: Mon, 14 Sep 2015 10:50:41 -0500 Subject: download all of Hackage? In-Reply-To: References: <87egi1ghdx.fsf@gmail.com> Message-ID: It's a little out of date, but I've been using that repo I made to do surveys of Haskell code and figure out how frequently things are used. I could really do with a Haskell-source-code-aware grep though. Being able to specify type, data, etc. would be really nice! 
On Mon, Sep 14, 2015 at 10:19 AM, Alan & Kim Zimmerman wrote: > You could clone https://github.com/bitemyapp/hackage-packages > > Alan > > On Mon, Sep 14, 2015 at 5:17 PM, Herbert Valerio Riedel < > hvriedel at gmail.com> wrote: > >> On 2015-09-14 at 16:43:44 +0200, Richard Eisenberg wrote: >> > Is there an easy way to download (but not compile) all of Hackage? I >> > know of the hackager package, but that's about compiling. I just want >> > a whole big load of Haskell code to play with. I thought I could find >> > a link on Hackage to do this, but failed. >> >> It's quite easy, you can iterate through the list of package names and >> call 'cabal get' like e.g. (untested, but I've done this already -- you >> may need to protect against execution errors) >> >> for PKG in $(cabal list --simple | awk '{ print $1 }' | uniq); do cabal >> get $PKG;done >> >> another variant is to construct the URLs based on the output; >> >> you can also get a list of packages in JSON format via >> >> http://hackage.haskell.org/packages/.json >> >> there's many ways to accomplish what you want... >> >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -- Chris Allen Currently working on http://haskellbook.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From eir at cis.upenn.edu Mon Sep 14 15:52:07 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Mon, 14 Sep 2015 11:52:07 -0400 Subject: download all of Hackage? In-Reply-To: <87egi1ghdx.fsf@gmail.com> References: <87egi1ghdx.fsf@gmail.com> Message-ID: That's perfect -- thanks! It's already humming away. 
On Sep 14, 2015, at 11:17 AM, Herbert Valerio Riedel wrote: > for PKG in $(cabal list --simple | awk '{ print $1 }' | uniq); do cabal get $PKG;done From eir at cis.upenn.edu Mon Sep 14 15:59:36 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Mon, 14 Sep 2015 11:59:36 -0400 Subject: Unlifted data types In-Reply-To: <55F2909D.60904@ro-che.info> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> <072d804f206c47aeb49ca7d610d120e5@DB4PR30MB030.064d.mgd.msft.net> <3481E4D1-F4DD-47BA-9818-665F22928CAD@cis.upenn.edu> <55F2909D.60904@ro-che.info> Message-ID: <954902CB-02FF-4935-B0D5-FA8CDED12C82@cis.upenn.edu> On Sep 11, 2015, at 4:28 AM, Roman Cheplyaka wrote: > On 11/09/15 06:22, Carter Schonwald wrote: >> Would this allow having a strict monoid instance for maybe, given the >> right hinting at the use site? > > That's a fantastic idea, especially if it could be generalized to > Applicative functors, where the problem of "inner laziness" is pervasive. > > But that'd be tricky, because functions have the Lifted kind, and so > <*> would have to be crazily levity-polymorphic. (Or is this not crazy?) No more crazy than other things. Right now, we have (<*>) :: forall (a :: *) (b :: *) (f :: * -> *). Applicative f => f (a -> b) -> f a -> f b Under this proposal, we would have (ignore the Boxity stuff) (<*>) :: forall (v1 :: Levity) (v2 :: Levity) (v3 :: Levity) (a :: TYPE v1) (b :: TYPE v2) (f :: forall (v4 :: Levity). TYPE v4 -> TYPE v3). Applicative f => f @'Lifted (a -> b) -> f @v1 a -> f @v2 b The higher-rank levity-polymorphism is necessary in case `a` and `b` have different levities. This may be getting wildly out-of-hand, but I don't think it's actually breaking. I would like to point out that using forall here is really quite wrong. 
As others have pointed out, levity polymorphism is ad-hoc polymorphism, not parametric. Using 'pi' would be much closer to it, but it implies the existence of more dependent types than we really need for this. Richard > > Roman > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From johan.tibell at gmail.com Mon Sep 14 16:03:24 2015 From: johan.tibell at gmail.com (Johan Tibell) Date: Mon, 14 Sep 2015 09:03:24 -0700 Subject: Converting unboxed sum types in StgCmm In-Reply-To: References: <4cd78a703e1d4b148dbd707b89f74c59@DB4PR30MB030.064d.mgd.msft.net> <55F67F44.5010501@gmail.com> Message-ID: I've given this yet some more thought. Given this simple core program: f [InlPrag=NOINLINE] :: (#|#) Int Char -> Int [GblId, Arity=1, Str=DmdType] f = \ (ds_dmE :: (#|#) Int Char) -> case ds_dmE of _ [Occ=Dead] { (#_|#) x_amy -> x_amy; (#|_#) ipv_smK -> patError @ Int "UnboxedSum.hs:5:1-15|function f"# } We will get this stg pre-unarise: unarise [f [InlPrag=NOINLINE] :: (#|#) Int Char -> Int [GblId, Arity=1, Str=DmdType, Unf=OtherCon []] = \r srt:SRT:[0e :-> patError] [ds_svm] case ds_svm of _ [Occ=Dead] { (#_|#) x_svo [Occ=Once] -> x_svo; (#|_#) _ [Occ=Dead] -> patError "UnboxedSum.hs:5:1-15|function f"#; };] What do we want it to look like afterwards? I currently have this, modeled after unboxed tuples: post-unarise: [f [InlPrag=NOINLINE] :: (#|#) Int Char -> Int [GblId, Arity=1, Str=DmdType, Unf=OtherCon []] = \r srt:SRT:[0e :-> patError] [ds_gvu ds_gvv] case (#_|#) [ds_gvu ds_gvv] of _ [Occ=Dead] { -- <-- WHAT SHOULD GO HERE? (#_|#) x_svo [Occ=Once] -> x_svo; (#|_#) _ [Occ=Dead] -> patError "UnboxedSum.hs:5:1-15|function f"#; };] Here I have performed the same rewriting of the scrutinee in the case statement as for unboxed tuples, but note that this doesn't quite work, as we don't know which data constructor to apply in "..." in case ... of.
In the case of tuples it's easy; there is only one. It seems to me that we just want to rewrite the case altogether into something that looks at the tag field of the data constructor. Also, in stg we use the same DataCon as in core, but after stg the unboxed sum case really only has one constructor (one with the union of all the fields), which makes it awkward to reuse the original DataCon. On Mon, Sep 14, 2015 at 7:27 AM, Ryan Newton wrote: > >> - >> data RepType = UbxTupleRep [UnaryType] >> | UbxSumRep [UnaryType] >> | UnaryRep UnaryType >> >> Not, fully following, but ... this reptype business is orthogonal to > whether you add a normal type to the STG level that models anonymous, > untagged unions, right? > > That is, when using Any for pointer types, they could use indicative > phantom types, like "Any (Union Bool Char)", even if there's not full > support for doing anything useful with (Union Bool Char) by itself. Maybe > the casting machinery could greenlight a cast from Any (Union Bool Char) to > Bool at least? > > There's already the unboxed union itself, (|# #|) , but that's different > than a pointer to a union of types... > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From niteria at gmail.com Mon Sep 14 17:04:29 2015 From: niteria at gmail.com (Bartosz Nitka) Date: Mon, 14 Sep 2015 18:04:29 +0100 Subject: Making compilation results deterministic (#4012) Message-ID: Hello, For the past couple of weeks I've been working on making compilation results deterministic. What I'm focusing on right now is the interface file determinism, I don't care about binaries being deterministic. I'd like to give a status update and ask for some advice, since I'm running into issues that I don't have a good way of solving. The first question anyone might ask is how did nondeterminism creep into the compiler. If we're compiling with a single thread there's no reason for the computation to proceed in non deterministic way. 
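The kind of order dependence at issue can be sketched in a few lines. This is a toy model with made-up names, not GHC's actual UniqSupply: if uniques are handed out sequentially in whatever order definitions happen to be demanded, then two sessions that force the same definitions in different orders assign different uniques to the same names.

```haskell
-- Hand out sequential "uniques" in demand order, like a much-simplified
-- unique supply. The names are invented for illustration; nothing here
-- is GHC code.
assignUniques :: [String] -> [(String, Int)]
assignUniques names = zip names [0 ..]

-- Session A demands f before g; session B (say, because an interface
-- file already existed and was loaded lazily) demands g first.
sessionA, sessionB :: [(String, Int)]
sessionA = assignUniques ["f", "g"]
sessionB = assignUniques ["g", "f"]

main :: IO ()
main = do
  print (lookup "f" sessionA)  -- Just 0
  print (lookup "f" sessionB)  -- Just 1
```

The same binder ends up with a different unique in each session, which is harmless until something downstream orders on uniques.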
I'm fairly certain that the issue originates from lazy loading of interface files. Relevant function: https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/typecheck/TcRnMonad.hs;12098c2e70b2a432f4ed675ed72b53a396cb2842$1414-1421. What happens is that if you already have an interface file for a target you're trying to build the computation will proceed differently. Why does lazy loading matter? As you load the interface file it needs to get type-checked and that means it needs to pull some Uniques from a global UniqSupply. It does that in different order resulting in different Unique assignment. As far as I can tell, lazy loading is required for performance, so I abandoned the idea of fixing it. I haven't looked at parallel compilation yet, but I'd expect it to result in different Unique assignment as well. I believe up to this point we're ok. Uniques are non-deterministic, but it shouldn't be a big deal. Uniques should be opaque enough to not affect the order of computation, for example the order of binds considered. But they aren't. Uniques are used in different ways throughout the compiler and they end up reordering things: 1) They have an `Ord` instance: https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/basicTypes/Unique.hs;12098c2e70b2a432f4ed675ed72b53a396cb2842$190-195 . So far the places it impacts the most are places that use `stronglyConnCompFromEdgedVertices`, because Unique is used as a Node key and the result depends on the order of Nodes being considered. 
Some examples: https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/simplCore/OccurAnal.hs;12b0bb6f15caa5b4b01d0330a7a8d23e3c10842c$183,646,681,846 https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/rename/RnSource.hs;12b0bb6f15caa5b4b01d0330a7a8d23e3c10842c$1365 (because Ord for Name uses Unique https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/basicTypes/Name.hs;12b0bb6f15caa5b4b01d0330a7a8d23e3c10842c$410-411 ) I've tried to see what removing it would entail and the changes would be far reaching: https://phabricator.haskell.org/P62. 2) VarEnv, NameEnv are implemented in terms of UniqFM, which is just Data.IntMap with keys being the Unique integer values. The way this bites us is that when UniqFM's get converted to a list they end up being sorted on Unique value. This problem is more widespread than the `stronglyConnCompFromEdgedVertices` issue, there's even a place where it's implicitly depended on: https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/nativeGen/RegAlloc/Liveness.hs;12b0bb6f15caa5b4b01d0330a7a8d23e3c10842c$837-842 . I've tried to fix it by making `toList` return the elements in the order of insertion (https://phabricator.haskell.org/P63), but that turned out to have significant cost. My unscientific benchmark on aeson and text showed 10% compilation time increase. It's possible it can be done in less expensive way, I've tried a couple of approaches, but all of them resulted in 10% time increase. I've also considered to split UniqFM to two different types, one that keeps the ordering, and one that can't `toList`, but I suspect that the cut will not be clean, so I haven't tried that. 
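The toList behaviour is easy to reproduce with plain Data.IntMap, the structure UniqFM wraps. The binder names below are invented for illustration: toList enumerates entries in ascending key order rather than insertion order, so any difference in unique assignment reorders everything downstream that consumes the list.

```haskell
import qualified Data.IntMap as IM

-- The same two binders, given different (made-up) uniques by two runs.
runA, runB :: IM.IntMap String
runA = IM.fromList [(1, "dEqDecl"), (2, "dEqIPBind")]
runB = IM.fromList [(1, "dEqIPBind"), (2, "dEqDecl")]

main :: IO ()
main = do
  -- toList sorts by key (the unique), so the binders come out in
  -- different orders even though both maps hold the same elements.
  print (map snd (IM.toList runA))  -- ["dEqDecl","dEqIPBind"]
  print (map snd (IM.toList runB))  -- ["dEqIPBind","dEqDecl"]
```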
In some cases we got away with ordering things by OccName where needed: ( https://phabricator.haskell.org/D1073, https://phabricator.haskell.org/D1192), but OccName's don't have to be unique in every case and if we try to make them unique that would make them longer and probably result in even greater slowdown. The instance I've recently looked at doesn't look like it can be solved by sorting by OccName. The code that triggers the problem (simplified from haskell-src-exts): data Decl l = Boring l deriving (Eq) data Binds l = BDecls l [Decl l] -- ^ An ordinary binding group | IPBinds l [IPBind l] -- ^ A binding group for implicit parameters deriving (Eq) data IPBind l = Boring2 l deriving (Eq) The end result is: 4449fe3f8368a2c47b2499a1fb033b6a $fEqBinds_$c==$Binds :: Eq l => Binds l -> Binds l -> Bool {- Arity: 1, HasNoCafRefs, Strictness: , Unfolding: (\ @ l $dEq :: Eq l -> let { $dEq1 :: Eq (Decl l) = $fEqDecl @ l $dEq } in let { $dEq2 :: Eq (IPBind l) = $fEqIPBind @ l $dEq } in \ ds :: Binds l ds1 :: Binds l -> case ds of wild { BDecls a1 a2 -> case ds1 of wild1 { BDecls b1 b2 -> case == @ l $dEq a1 b1 of wild2 { False -> False True -> $fEq[]_$c== @ (Decl l) $dEq1 a2 b2 } IPBinds ipv ipv1 -> False } IPBinds a1 a2 -> case ds1 of wild1 { BDecls ipv ipv1 -> False IPBinds b1 b2 -> case == @ l $dEq a1 b1 of wild2 { False -> False True -> $fEq[]_$c== @ (IPBind l) $dEq2 a2 b2 } } }) -} vs bb525bf8c0145a5379b3c29e8adb4b18 $fEqBinds_$c==$Binds :: Eq l => Binds l -> Binds l -> Bool {- Arity: 1, HasNoCafRefs, Strictness: , Unfolding: (\ @ l $dEq :: Eq l -> let { $dEq1 :: Eq (IPBind l) = $fEqIPBind @ l $dEq } in let { $dEq2 :: Eq (Decl l) = $fEqDecl @ l $dEq } in \ ds :: Binds l ds1 :: Binds l -> case ds of wild { BDecls a1 a2 -> case ds1 of wild1 { BDecls b1 b2 -> case == @ l $dEq a1 b1 of wild2 { False -> False True -> $fEq[]_$c== @ (Decl l) $dEq2 a2 b2 } IPBinds ipv ipv1 -> False } IPBinds a1 a2 -> case ds1 of wild1 { BDecls ipv ipv1 -> False IPBinds b1 b2 -> case == @ l 
$dEq a1 b1 of wild2 { False -> False True -> $fEq[]_$c== @ (IPBind l) $dEq1 a2 b2 } } }) -} This happens because when desugaring dictionaries we do an SCC on Uniques that ends up reordering lets ( https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/deSugar/DsBinds.hs;12b0bb6f15caa5b4b01d0330a7a8d23e3c10842c$835 ) Now all of the dictionaries have OccName of Eq. I could probably reach deeper into the term to extract enough information (in this instance, the constructor name) to get deterministic ordering, but this feels very ad-hoc and I don't expect it to scale to the whole codebase. Another problem with fixing things in this ad-hoc manner is keeping them fixed. There's nothing preventing people from introducing nondeterminism. One idea that makes it more testable is to test it with different UniqSupply allocation patterns. I've found it useful to compare against UniqSupply that starts at a big number and allocates in decreasing order. Gray codes could be used to generate non-sequential order. The reason I'm posting this is to get some ideas, because at this point I feel stuck, I don't see a good way of achieving the end goal. I hope someone with more intimate GHC knowledge can point out a wrong assumption I've made or suggest an approach I haven't thought of. Cheers, Bartosz -------------- next part -------------- An HTML attachment was scrubbed... URL: From ezyang at mit.edu Mon Sep 14 17:46:02 2015 From: ezyang at mit.edu (Edward Z. 
Yang) Date: Mon, 14 Sep 2015 10:46:02 -0700 Subject: Unlifted data types In-Reply-To: <954902CB-02FF-4935-B0D5-FA8CDED12C82@cis.upenn.edu> References: <1441353701-sup-9422@sabre> <6707b31c94d44af89ba2a90580ac46ce@DB4PR30MB030.064d.mgd.msft.net> <6e2bcecf1a284c62a656e80992e9862e@DB4PR30MB030.064d.mgd.msft.net> <0196B07B-156B-4731-B0A1-CE7A892E0680@cis.upenn.edu> <072d804f206c47aeb49ca7d610d120e5@DB4PR30MB030.064d.mgd.msft.net> <3481E4D1-F4DD-47BA-9818-665F22928CAD@cis.upenn.edu> <55F2909D.60904@ro-che.info> <954902CB-02FF-4935-B0D5-FA8CDED12C82@cis.upenn.edu> Message-ID: <1442252466-sup-305@sabre> I'm not so sure how useful an observation this is, but Dunfield had a paper at this very ICFP "Elaborating Evaluation-Order Polymorphism". He argues that polymorphism over evaluation order should be thought of as a form of intersection type. Edward Excerpts from Richard Eisenberg's message of 2015-09-14 08:59:36 -0700: > > On Sep 11, 2015, at 4:28 AM, Roman Cheplyaka wrote: > > > On 11/09/15 06:22, Carter Schonwald wrote: > >> Would this allow having a strict monoid instance for maybe, given the > >> right hinting at the use site? > > > > That's a fantastic idea, especially if it could be generalized to > > Applicative functors, where the problem of "inner laziness" is pervasive. > > > > But that'd be tricky, because functions have the Lifted kind, and so > > <*> would have to be crazily levity-polymorphic. (Or is this not crazy?) > > No more crazy than other things. Right now, we have > > (<*>) :: forall (a :: *) (b :: *) (f :: * -> *). Applicative f => f (a -> b) -> f a -> f b > > Under this proposal, we would have (ignore the Boxity stuff) > > (<*>) :: forall (v1 :: Levity) (v2 :: Levity) (v3 :: Levity) > (a :: TYPE v1) (b :: TYPE v2) (f :: forall (v4 :: Levity). TYPE v4 -> TYPE v3). > Applicative f > => f @'Lifted (a -> b) -> f @v1 a -> f @v2 b > > The higher-rank levity-polymorphism is necessary in case `a` and `b` have different levities. 
This may be getting wildly out-of-hand, but I don't think it's actually breaking. > > I would like to point out that using forall here is really quite wrong. As others have pointed out, levity polymorphism is ad-hoc polymorphism, not parametric. Using 'pi' would be much closer to it, but it implies the existence of more dependent types than we really need for this. > > Richard > > > > > Roman > > > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From rrnewton at gmail.com Mon Sep 14 18:04:09 2015 From: rrnewton at gmail.com (Ryan Newton) Date: Mon, 14 Sep 2015 14:04:09 -0400 Subject: Converting unboxed sum types in StgCmm In-Reply-To: References: <4cd78a703e1d4b148dbd707b89f74c59@DB4PR30MB030.064d.mgd.msft.net> <55F67F44.5010501@gmail.com> Message-ID: > > It seems to me that we just want to rewrite the case altogether into > something that looks at the tag field of the data constructor. Also, in stg > we use the same DataCon as in core, but after stg the unboxed sum case > really only has one constructor (one with the union of all the fields), > which makes it awkward to reuse the original DataCon. > Is there a problem with introducing a totally new datatype at this point in the compile to represent the product (tag, wordish1, ..., wordishN, ptr1 ... ptrM)? Or, if it is an anonymous product, why can't it use existing unboxed sum machinery? Also, as an architecture thing, is there a reason this shouldn't be its own stg->stg pass? (P.S. "wordish" above has a weaselly suffix because as Dan pointed out, some unboxed things are > 64 bits.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnw at newartisans.com Mon Sep 14 18:37:27 2015 From: johnw at newartisans.com (John Wiegley) Date: Mon, 14 Sep 2015 11:37:27 -0700 Subject: download all of Hackage? 
In-Reply-To: (Brandon Allbery's message of "Mon, 14 Sep 2015 10:49:55 -0400") References: Message-ID: >>>>> Brandon Allbery writes: > There's hackage-mirror, but I note it says it mirrors to S3. It mirrors into a directory as well: hackage-mirror --from="http://hackage.haskell.org" --to="/some/dir" Further, it can incrementally update very quickly. John From mike at izbicki.me Mon Sep 14 19:57:02 2015 From: mike at izbicki.me (Mike Izbicki) Date: Mon, 14 Sep 2015 12:57:02 -0700 Subject: download all of Hackage? In-Reply-To: References: Message-ID: Does anyone know approximately how much disk space all of hackage takes up when compiled? And about how long it takes to compile everything? On Mon, Sep 14, 2015 at 11:37 AM, John Wiegley wrote: >>>>>> Brandon Allbery writes: > >> There's hackage-mirror, but I note it says it mirrors to S3. > > It mirrors into a directory as well: > > hackage-mirror --from="http://hackage.haskell.org" --to="/some/dir" > > Further, it can incrementally update very quickly. > > John > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From alan.zimm at gmail.com Mon Sep 14 20:15:45 2015 From: alan.zimm at gmail.com (Alan & Kim Zimmerman) Date: Mon, 14 Sep 2015 22:15:45 +0200 Subject: download all of Hackage? In-Reply-To: References: Message-ID: You will probably not be able to compile everything. If you want a compilable subset, you should probably look at stackage Alan On Mon, Sep 14, 2015 at 9:57 PM, Mike Izbicki wrote: > Does anyone know approximately how much disk space all of hackage > takes up when compiled? And about how long it takes to compile > everything? > > On Mon, Sep 14, 2015 at 11:37 AM, John Wiegley > wrote: > >>>>>> Brandon Allbery writes: > > > >> There's hackage-mirror, but I note it says it mirrors to S3. 
> > > > It mirrors into a directory as well: > > > > hackage-mirror --from="http://hackage.haskell.org" --to="/some/dir" > > > > Further, it can incrementally update very quickly. > > > > John > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jmcf125 at openmailbox.org Mon Sep 14 20:32:37 2015 From: jmcf125 at openmailbox.org (João Miguel) Date: Mon, 14 Sep 2015 21:32:37 +0100 Subject: Cannot have GHC in ARMv6 architecture In-Reply-To: <20150909233806.GA25685@jmcf125-Acer-Arch.home> References: <55EFEC9F.6070001@centrum.cz> <20150909111818.GA1439@jmcf125-Acer-Arch.home> <55F026CC.7090001@centrum.cz> <20150909125926.GB1439@jmcf125-Acer-Arch.home> <55F032C7.9020307@centrum.cz> <20150909142114.GG1439@jmcf125-Acer-Arch.home> <55F05464.8090507@centrum.cz> <20150909233333.GA31923@jmcf125-Acer-Arch.home> <20150909233806.GA25685@jmcf125-Acer-Arch.home> Message-ID: <20150914203237.GA1701@jmcf125-Acer-Arch.home> Hello again, I still can't get the stage 2 binary, as previous messages show. Should I manually use the stage 1 compiler to do it, along with the GCC cross-compiler, etc.? Because the page https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/RaspberryPi says that "The build should go successfully all the way to stage 2". Why isn't it? Hoping you can continue to help me.
All the best, João Miguel From ndmitchell at gmail.com Mon Sep 14 21:07:14 2015 From: ndmitchell at gmail.com (Neil Mitchell) Date: Mon, 14 Sep 2015 22:07:14 +0100 Subject: Using GHC API to compile Haskell file In-Reply-To: <1441649731-sup-8699@sabre> References: <1440368677-sup-472@sabre> <1441649731-sup-8699@sabre> Message-ID: >> 1) Is there any way to do the two compilations sharing some cached >> state, e.g. loaded packages/.hi files, so each compilation goes >> faster. > > You can, using withTempSession in the GhcMonad. The external package > state will be preserved across calls here, but things put in the HPT > will get thrown out. So as far as I can tell, you are suggesting I basically do getSession in one session, grab the cache bits of the HscEnv, and inject them into the start of the next session with setSession? (withTempSession and all the other session functions just seem to be some variation on that pattern). I tried that, but even storing just hsc_EPS between sessions (which seemed like it should be something that never changes), causes weird compile failures. Is moving things like hsc_EPS between sessions supported? Or were you suggesting I do something else? >> 2) Is there any way to do the link alone through the GHC API. > > I am confused by your code. There are two ways you can do linking: > > 1. Explicitly specify all of the objects to link together. This > works even if the source files aren't available. > > 2. Run ghc --make. This does dependency analysis to figure out what > objects to link together, but since everything is already compiled, > it just links. > > Your code seems to be trying to do (1) and (2) simultaneously (you set > the mode to OneShot, but then you call load which calls into GhcMake). > > If you want to use (1), stop calling load and call 'oneShot' instead. > If you want to use (2), just reuse your working --make code. > > (BTW, how did I figure this all out? By looking at ghc/Main.hs). Thanks.
I have now started down that rabbit hole, and am currently peering around. I haven't yet succeeded, but given what I can see, it just seems like a matter of time. Thanks, Neil From lukexipd at gmail.com Mon Sep 14 22:21:02 2015 From: lukexipd at gmail.com (Luke Iannini) Date: Mon, 14 Sep 2015 15:21:02 -0700 Subject: HEADS UP: Need 7.10.3? In-Reply-To: References: Message-ID: My two showstoppers are https://ghc.haskell.org/trac/ghc/ticket/10568 and https://ghc.haskell.org/trac/ghc/ticket/10672, both of which seem to be already on track for 7.10.3, great! On Mon, Sep 14, 2015 at 7:15 AM, Tuncer Ayaz wrote: > On Mon, Sep 14, 2015 at 3:53 PM, Austin Seipp wrote: > > Hi *, > > > > (This is an email primarily aimed at users reading this list and > > developers who have any interest). > > > > As some of you may know, there's currently a 7.10.3 milestone and > > status page on our wiki: > > > > https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.10.3 > > > > The basic summary is best captured on the above page: > > > > "We have not yet decided when, or even whether, to release GHC > > 7.10.3. We will do so if (but only if!) we have documented cases of > > "show-stoppers" in 7.10.2. Namely, cases from users where > > > > - You are unable to use 7.10.2 because of some bug > > - There is no reasonable workaround, so you are truly stuck > > - We know how to fix it > > - The fix is not too disruptive; i.e. does not risk introducing a > > raft of new bugs" > > > > That is, we're currently not fully sold on the need for a release. > > However, the milestone and issue page serve as a useful guide, and > > also make it easier to keep track of smaller, point-release worthy > > issues. > > > > So in the wake of the 8.0 roadmap I just sent: If you *need* 7.10.3 > > because the 7.10.x series has a major regression or problem you > > can't work around, let us know! 
> > > > - Find or file a bug in Trac > > - Make sure it's highest priority > > - Assign it to the 7.10.3 milestone > > - Follow up on this email if possible, or edit it on the status page > > text above - it would be nice to get some public feedback in one place > > about what everyone needs. > > > > Currently we have two bugs on the listed page in the 'show stopper > > category', possibly the same bug, which is a deal-breaker for HERMIT > > I believe. Knowing of anything else would be very useful. > > Would tracking down and fixing some of the reported time and space > regressions qualify as 7.10.3 material? > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lukexipd at gmail.com Mon Sep 14 22:47:59 2015 From: lukexipd at gmail.com (Luke Iannini) Date: Mon, 14 Sep 2015 15:47:59 -0700 Subject: HEADS UP: Need 7.10.3? In-Reply-To: References: Message-ID: Oh, and I guess it was implied, but I'm hugely in favor of a release : ) #10568 breaks linking with GLFW on Mac, and #10672 breaks linking with C++ libraries on Windows (Bullet Physics, in this case). We're working on and evangelizing Haskell for game and VR dev heavily right now and those are both directly in the critical path. Cheers Luke On Mon, Sep 14, 2015 at 3:21 PM, Luke Iannini wrote: > My two showstoppers are https://ghc.haskell.org/trac/ghc/ticket/10568 and > https://ghc.haskell.org/trac/ghc/ticket/10672, both of which seem to be > already on track for 7.10.3, great! > > On Mon, Sep 14, 2015 at 7:15 AM, Tuncer Ayaz > wrote: > >> On Mon, Sep 14, 2015 at 3:53 PM, Austin Seipp wrote: >> > Hi *, >> > >> > (This is an email primarily aimed at users reading this list and >> > developers who have any interest). 
>> > >> > As some of you may know, there's currently a 7.10.3 milestone and >> > status page on our wiki: >> > >> > https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.10.3 >> > >> > The basic summary is best captured on the above page: >> > >> > "We have not yet decided when, or even whether, to release GHC >> > 7.10.3. We will do so if (but only if!) we have documented cases of >> > "show-stoppers" in 7.10.2. Namely, cases from users where >> > >> > - You are unable to use 7.10.2 because of some bug >> > - There is no reasonable workaround, so you are truly stuck >> > - We know how to fix it >> > - The fix is not too disruptive; i.e. does not risk introducing a >> > raft of new bugs" >> > >> > That is, we're currently not fully sold on the need for a release. >> > However, the milestone and issue page serve as a useful guide, and >> > also make it easier to keep track of smaller, point-release worthy >> > issues. >> > >> > So in the wake of the 8.0 roadmap I just sent: If you *need* 7.10.3 >> > because the 7.10.x series has a major regression or problem you >> > can't work around, let us know! >> > >> > - Find or file a bug in Trac >> > - Make sure it's highest priority >> > - Assign it to the 7.10.3 milestone >> > - Follow up on this email if possible, or edit it on the status page >> > text above - it would be nice to get some public feedback in one place >> > about what everyone needs. >> > >> > Currently we have two bugs on the listed page in the 'show stopper >> > category', possibly the same bug, which is a deal-breaker for HERMIT >> > I believe. Knowing of anything else would be very useful. >> >> Would tracking down and fixing some of the reported time and space >> regressions qualify as 7.10.3 material? >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From k-bx at k-bx.com Tue Sep 15 07:40:48 2015 From: k-bx at k-bx.com (Kostiantyn Rybnikov) Date: Tue, 15 Sep 2015 10:40:48 +0300 Subject: download all of Hackage? In-Reply-To: References: Message-ID: The Stack tool has an option of a docker image with everything compiled. I think they claim the whole image is 8 or 10 GB, please check (and don't forget that's with GHC and other tools inside). On 14 Sep 2015 at 22:57, "Mike Izbicki" wrote: > Does anyone know approximately how much disk space all of hackage > takes up when compiled? And about how long it takes to compile > everything? > > On Mon, Sep 14, 2015 at 11:37 AM, John Wiegley > wrote: > >>>>>> Brandon Allbery writes: > > > >> There's hackage-mirror, but I note it says it mirrors to S3. > > > > It mirrors into a directory as well: > > > > hackage-mirror --from="http://hackage.haskell.org" --to="/some/dir" > > > > Further, it can incrementally update very quickly. > > > > John > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ezyang at mit.edu Tue Sep 15 07:56:54 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Tue, 15 Sep 2015 00:56:54 -0700 Subject: Using GHC API to compile Haskell file In-Reply-To: References: <1440368677-sup-472@sabre> <1441649731-sup-8699@sabre> Message-ID: <1442303215-sup-4790@sabre> Excerpts from Neil Mitchell's message of 2015-09-14 14:07:14 -0700: > >> 1) Is there any way to do the two compilations sharing some cached > >> state, e.g. loaded packages/.hi files, so each compilation goes > >> faster. > > > > You can, using withTempSession in the GhcMonad.
The external package > > state will be preserved across calls here, but things put in the HPT > > will get thrown out. > > So as far as I can tell, you are suggesting I basically do getSession > in one session, grab the cache bits of the HscEnv, and inject them > into the start of the next session with setSession? (withTempSession > and all the other session functions just seem to be some variation on > that pattern). I tried that, but even storing just hsc_EPS between > sessions (which seemed like it should both be something that never > changes), causes weird compile failures. Is moving things like hsc_EPS > between sessions supported? Or were you suggesting I do something > else? No, something a bit different: I'm suggesting that you use this functionality to "fork" a session: so you do some work, getting a session, snapshot the session, do some more work, and then use the snapshot to rollback before the work. I haven't actually tested this, however. Edward From mail at nh2.me Tue Sep 15 08:22:33 2015 From: mail at nh2.me (=?UTF-8?Q?Niklas_Hamb=c3=bcchen?=) Date: Tue, 15 Sep 2015 10:22:33 +0200 Subject: download all of Hackage? In-Reply-To: References: <87egi1ghdx.fsf@gmail.com> Message-ID: <55F7D549.2@nh2.me> Thank you, this is exactly what I needed the other day to do a quick survey of language extensions. On 14/09/15 17:19, Alan & Kim Zimmerman wrote: > You could clone https://github.com/bitemyapp/hackage-packages From svenpanne at gmail.com Tue Sep 15 18:09:59 2015 From: svenpanne at gmail.com (Sven Panne) Date: Tue, 15 Sep 2015 20:09:59 +0200 Subject: Making compilation results deterministic (#4012) In-Reply-To: References: Message-ID: 2015-09-14 19:04 GMT+02:00 Bartosz Nitka : > [...] Uniques are non-deterministic [...] > Just out of curiosity: Why is this the case? Naively, I would assume that you can't say much about the value returned by getKey, but at least I would think that in repeated program runs, the same sequence of values would be produced. 
Well, unless somehow the values depend on pointer values, which will screw this up because of ASLR. -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomasmiedema at gmail.com Tue Sep 15 21:25:51 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Tue, 15 Sep 2015 23:25:51 +0200 Subject: utils/hp2ps/Main.c:88: possible missing break ? In-Reply-To: References: Message-ID: On Thu, May 14, 2015 at 1:40 PM, David Binderman wrote: > Hello there, > > [utils/hp2ps/Main.c:88] -> [utils/hp2ps/Main.c:91]: (warning) Variable > 'iflag' is reassigned > a value before the old one has been used. 'break;' missing? > > Source code is > > switch( *(*argv + 1) ) { > case '-': > iflag = -1; > case '+': > default: > iflag = 1; > } > > Suggest add missing break. > Fixed, thanks. https://phabricator.haskell.org/rGHC325efac29827447402ad93fe99578fd791ffb822 -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Wed Sep 16 09:43:47 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Wed, 16 Sep 2015 09:43:47 +0000 Subject: Making compilation results deterministic (#4012) In-Reply-To: References: Message-ID: You've described the problem well. Indeed I think you should make a wiki page to articulate the problem, and point to this email (perhaps among others) for more detail. It's so easy to lose track of email, and so helpful to have a wiki page that always reflects the latest understanding. You say a little bit about the problem you are trying to solve, but not enough. (You can do this on the wiki page.) For example: - You say you don't care about binaries. (Good, but why?) - Do you care about multi-threaded GHC? (I think no) - Do you care about what happens if you recompile GHC, say with different optimisation settings? (I think no) That would affect order of evaluation, and hence the order of allocation of uniques. - Do you care about recompiling the same source file with different environments; e.g. different compiler flags, changes in imported interface files? (I think no; these changes should require recompilation) So can you characterise exactly what you DO care about? It might be something like "when using GHC as a library, with a single 'session', and I recompile an unchanged source file in an unchanged environment I want to get the same result". But I think even that is wrong. The reason I'm confused is when you say "What happens is that if you already have an interface file for a target you're trying to build the computation will proceed differently". But what do you mean by "you already have an interface file"? In batch mode, we always load interface files; and with the same source and flags we almost certainly load them in the same order. So perhaps you mean in --make mode? I'll hypothesise that you mean - In --make mode, with unchanged source I want to get the same output from compiling M.hs if M's imported interface files have not changed. But even then I'm confused. Under those circumstances, why are we recompiling at all? Do you see what I mean about characterising the problem? Depending on exactly what the problem is, one "solution" might be to not use --make mode. But I think I've gone as far as I can without a clearer understanding of the problem. (I suggest responding by writing the wiki page, and sending a link.) On the UniqFM question, in a finite map, the way in which keys are compared really, really should not matter. When you turn a UniqFM into a list you may need to canonicalise the list in some way. But we shouldn't mess up finite maps themselves in service of this goal. Better to focus on the canonicalisation process; which as you point out may be hard or even impossible as things stand.
Many thanks Simon From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Bartosz Nitka Sent: 14 September 2015 18:04 To: ghc-devs at haskell.org Subject: Making compilation results deterministic (#4012) Hello, For the past couple of weeks I've been working on making compilation results deterministic. What I'm focusing on right now is the interface file determinism, I don't care about binaries being deterministic. I'd like to give a status update and ask for some advice, since I'm running into issues that I don't have a good way of solving. The first question anyone might ask is how did nondeterminism creep into the compiler. If we're compiling with a single thread there's no reason for the computation to proceed in non deterministic way. I'm fairly certain that the issue originates from lazy loading of interface files. Relevant function: https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/typecheck/TcRnMonad.hs;12098c2e70b2a432f4ed675ed72b53a396cb2842$1414-1421. What happens is that if you already have an interface file for a target you're trying to build the computation will proceed differently. Why does lazy loading matter? As you load the interface file it needs to get type-checked and that means it needs to pull some Uniques from a global UniqSupply. It does that in different order resulting in different Unique assignment. As far as I can tell, lazy loading is required for performance, so I abandoned the idea of fixing it. I haven't looked at parallel compilation yet, but I'd expect it to result in different Unique assignment as well. I believe up to this point we're ok. Uniques are non-deterministic, but it shouldn't be a big deal. Uniques should be opaque enough to not affect the order of computation, for example the order of binds considered. But they aren't. 
Uniques are used in different ways throughout the compiler and they end up reordering things: 1) They have an `Ord` instance: https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/basicTypes/Unique.hs;12098c2e70b2a432f4ed675ed72b53a396cb2842$190-195. So far the places it impacts the most are places that use `stronglyConnCompFromEdgedVertices`, because Unique is used as a Node key and the result depends on the order of Nodes being considered. Some examples: https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/simplCore/OccurAnal.hs;12b0bb6f15caa5b4b01d0330a7a8d23e3c10842c$183,646,681,846 https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/rename/RnSource.hs;12b0bb6f15caa5b4b01d0330a7a8d23e3c10842c$1365 (because Ord for Name uses Unique https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/basicTypes/Name.hs;12b0bb6f15caa5b4b01d0330a7a8d23e3c10842c$410-411) I've tried to see what removing it would entail and the changes would be far reaching: https://phabricator.haskell.org/P62. 2) VarEnv, NameEnv are implemented in terms of UniqFM, which is just Data.IntMap with keys being the Unique integer values. The way this bites us is that when UniqFM's get converted to a list they end up being sorted on Unique value. This problem is more widespread than the `stronglyConnCompFromEdgedVertices` issue, there's even a place where it's implicitly depended on: https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/nativeGen/RegAlloc/Liveness.hs;12b0bb6f15caa5b4b01d0330a7a8d23e3c10842c$837-842. I've tried to fix it by making `toList` return the elements in the order of insertion (https://phabricator.haskell.org/P63), but that turned out to have significant cost. My unscientific benchmark on aeson and text showed 10% compilation time increase. It's possible it can be done in less expensive way, I've tried a couple of approaches, but all of them resulted in 10% time increase. 
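Both effects are easy to reproduce outside GHC with a self-contained sketch using only base and containers (the strings and Int keys below are illustrative stand-ins for binders and their Uniques; this is not GHC code):

```haskell
import Data.Graph (flattenSCC, stronglyConnComp)
import qualified Data.IntMap.Strict as IntMap

-- Flatten the SCCs of an edge-free "binding group". The Int keys play
-- the role of Uniques; the payloads are identical in both calls below.
sccOrder :: [(String, Int, [Int])] -> [String]
sccOrder = concatMap flattenSCC . stronglyConnComp

main :: IO ()
main = do
  -- 1) SCC output order follows the keys, not the input order, so a
  --    different Unique assignment reorders the result:
  print (sccOrder [("f", 1, []), ("g", 2, [])])
  print (sccOrder [("f", 2, []), ("g", 1, [])])
  -- 2) IntMap (the substrate of UniqFM) lists elements in key order,
  --    so nondeterministic keys give a nondeterministic list:
  print (IntMap.elems (IntMap.fromList [(1, "f"), (2, "g")]))  -- ["f","g"]
  print (IntMap.elems (IntMap.fromList [(2, "f"), (1, "g")]))  -- ["g","f"]
```

The two `sccOrder` calls print the same names in different orders, which is the Unique-dependence being described: identical code, different key assignment, different traversal order.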
I've also considered to split UniqFM to two different types, one that keeps the ordering, and one that can't `toList`, but I suspect that the cut will not be clean, so I haven't tried that. In some cases we got away with ordering things by OccName where needed: (https://phabricator.haskell.org/D1073, https://phabricator.haskell.org/D1192), but OccName's don't have to be unique in every case and if we try to make them unique that would make them longer and probably result in even greater slowdown. The instance I've recently looked at doesn't look like it can be solved by sorting by OccName. The code that triggers the problem (simplified from haskell-src-exts): data Decl l = Boring l deriving (Eq) data Binds l = BDecls l [Decl l] -- ^ An ordinary binding group | IPBinds l [IPBind l] -- ^ A binding group for implicit parameters deriving (Eq) data IPBind l = Boring2 l deriving (Eq) The end result is: 4449fe3f8368a2c47b2499a1fb033b6a $fEqBinds_$c==$Binds :: Eq l => Binds l -> Binds l -> Bool {- Arity: 1, HasNoCafRefs, Strictness: , Unfolding: (\ @ l $dEq :: Eq l -> let { $dEq1 :: Eq (Decl l) = $fEqDecl @ l $dEq } in let { $dEq2 :: Eq (IPBind l) = $fEqIPBind @ l $dEq } in \ ds :: Binds l ds1 :: Binds l -> case ds of wild { BDecls a1 a2 -> case ds1 of wild1 { BDecls b1 b2 -> case == @ l $dEq a1 b1 of wild2 { False -> False True -> $fEq[]_$c== @ (Decl l) $dEq1 a2 b2 } IPBinds ipv ipv1 -> False } IPBinds a1 a2 -> case ds1 of wild1 { BDecls ipv ipv1 -> False IPBinds b1 b2 -> case == @ l $dEq a1 b1 of wild2 { False -> False True -> $fEq[]_$c== @ (IPBind l) $dEq2 a2 b2 } } }) -} vs bb525bf8c0145a5379b3c29e8adb4b18 $fEqBinds_$c==$Binds :: Eq l => Binds l -> Binds l -> Bool {- Arity: 1, HasNoCafRefs, Strictness: , Unfolding: (\ @ l $dEq :: Eq l -> let { $dEq1 :: Eq (IPBind l) = $fEqIPBind @ l $dEq } in let { $dEq2 :: Eq (Decl l) = $fEqDecl @ l $dEq } in \ ds :: Binds l ds1 :: Binds l -> case ds of wild { BDecls a1 a2 -> case ds1 of wild1 { BDecls b1 b2 -> case == @ l $dEq a1 b1 
of wild2 { False -> False True -> $fEq[]_$c== @ (Decl l) $dEq2 a2 b2 } IPBinds ipv ipv1 -> False } IPBinds a1 a2 -> case ds1 of wild1 { BDecls ipv ipv1 -> False IPBinds b1 b2 -> case == @ l $dEq a1 b1 of wild2 { False -> False True -> $fEq[]_$c== @ (IPBind l) $dEq1 a2 b2 } } }) -} This happens because when desugaring dictionaries we do an SCC on Uniques that ends up reordering lets (https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/deSugar/DsBinds.hs;12b0bb6f15caa5b4b01d0330a7a8d23e3c10842c$835) Now all of the dictionaries have OccName of Eq. I could probably reach deeper into the term to extract enough information (in this instance, the constructor name) to get deterministic ordering, but this feels very ad-hoc and I don't expect it to scale to the whole codebase. Another problem with fixing things in this ad-hoc manner is keeping them fixed. There's nothing preventing people from introducing nondeterminism. One idea that makes it more testable is to test it with different UniqSupply allocation patterns. I've found it useful to compare against UniqSupply that starts at a big number and allocates in decreasing order. Gray codes could be used to generate non-sequential order. The reason I'm posting this is to get some ideas, because at this point I feel stuck, I don't see a good way of achieving the end goal. I hope someone with more intimate GHC knowledge can point out a wrong assumption I've made or suggest an approach I haven't thought of. Cheers, Bartosz -------------- next part -------------- An HTML attachment was scrubbed... URL: From malcolm.wallace at me.com Wed Sep 16 09:53:04 2015 From: malcolm.wallace at me.com (Malcolm Wallace) Date: Wed, 16 Sep 2015 10:53:04 +0100 Subject: Making compilation results deterministic (#4012) In-Reply-To: References: Message-ID: On 16 Sep 2015, at 10:43, Simon Peyton Jones wrote: > I?ll hypothesise that you mean > ? 
In --make mode, with unchanged source I want to get the same output from compiling M.hs if M's imported interface files have not changed. > But even then I'm confused. Under those circumstances, why are we recompiling at all? My understanding is that currently, if you build a Haskell project from clean sources with ghc --make, then wipe all the .o/.hi files, and rebuild again from clean, with all the same flags and environment, you are unlikely to end up with identical binaries for either the .o or .hi files. This lack of binary reproducibility is a performance problem within a larger build system (of which the Haskell components are only a part): if the larger build system sees that the Haskell .hi or .o has apparently changed (even though the sources+flags have not changed at all), then many other components that depend on them may be triggered for an unnecessary rebuild. Regards, Malcolm From simonpj at microsoft.com Wed Sep 16 11:18:52 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Wed, 16 Sep 2015 11:18:52 +0000 Subject: Making compilation results deterministic (#4012) In-Reply-To: References: Message-ID: <84c030f13d764a66aa3637a377405a55@DB4PR30MB030.064d.mgd.msft.net> | My understanding is that currently, if you build a Haskell project | from clean sources with ghc --make, then wipe all the .o/.hi files, | and rebuild again from clean, with all the same flags and environment, | you are unlikely to end up with identical binaries for either the .o | or .hi files Is that right Bartosz? If that's the goal, then can we please say that explicitly on the wiki page? Let's hypothesise that it is the goal. Then I don't understand why, in a single-threaded GHC, you would get non-det results. Presumably not from lazy-interface-file-loading. After all the same things would be loaded in the same order, no?
Before we can propose a solution we need to understand the goal; and having understood the goal we need to understand why the results are non-det right now. Otherwise we risk fixing the wrong problem. Anyway, one step at a time :-) Goal first; then a concrete example that demonstrates the problem; then a hypothesis of what is going wrong... Simon From afarmer at ittc.ku.edu Wed Sep 16 14:50:10 2015 From: afarmer at ittc.ku.edu (Andrew Farmer) Date: Wed, 16 Sep 2015 07:50:10 -0700 Subject: HEADS UP: Need 7.10.3? In-Reply-To: References: Message-ID: As you mentioned, the two show stoppers for HERMIT are #10528 (specifically SPJs commit in comment:15 - see [1]) and #10829 (see D1246). The first disables inlining/rule application in the LHS of rules, the second does the same in the RHS. nofib results for the latter are on the ticket. I've set both to 7.10.3 milestone and high priority... thanks for merging them! [1] bc4b64ca5b99bff6b3d5051b57cb2bc52bd4c841 On Mon, Sep 14, 2015 at 6:53 AM, Austin Seipp wrote: > Hi *, > > (This is an email primarily aimed at users reading this list and > developers who have any interest). > > As some of you may know, there's currently a 7.10.3 milestone and > status page on our wiki: > > https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.10.3 > > The basic summary is best captured on the above page: > > "We have not yet decided when, or even whether, to release GHC 7.10.3. > We will do so if (but only if!) we have documented cases of > "show-stoppers" in 7.10.2. Namely, cases from users where > > - You are unable to use 7.10.2 because of some bug > - There is no reasonable workaround, so you are truly stuck > - We know how to fix it > - The fix is not too disruptive; i.e. does not risk introducing a > raft of new bugs" > > That is, we're currently not fully sold on the need for a release. > However, the milestone and issue page serve as a useful guide, and > also make it easier to keep track of smaller, point-release worthy > issues. 
> > So in the wake of the 8.0 roadmap I just sent: If you *need* 7.10.3 > because the 7.10.x series has a major regression or problem you can't > work around, let us know! > > - Find or file a bug in Trac > - Make sure it's highest priority > - Assign it to the 7.10.3 milestone > - Follow up on this email if possible, or edit it on the status page > text above - it would be nice to get some public feedback in one place > about what everyone needs. > > Currently we have two bugs on the listed page in the 'show stopper > category', possibly the same bug, which is a deal-breaker for HERMIT I > believe. Knowing of anything else would be very useful. > > Thanks all! > > -- > Regards, > > Austin Seipp, Haskell Consultant > Well-Typed LLP, http://www.well-typed.com/ > _______________________________________________ > Glasgow-haskell-users mailing list > Glasgow-haskell-users at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users > From hengchu.zhang+ghcdev at gmail.com Thu Sep 17 03:34:00 2015 From: hengchu.zhang+ghcdev at gmail.com (Hengchu Zhang) Date: Wed, 16 Sep 2015 23:34:00 -0400 Subject: Is it possible to build ghc with static linkage to its dependencies? Message-ID: Hi GHC Devs, I'd like to build ghc for an environment (linux x86_64) that I don't fully control, and most importantly libgmp.so on that environment is severely outdated. I was wondering if I can build ghc on another linux x86_64 machine but link in all the dependencies as static libraries instead finding them as dynamic libraries? Thank you! Best, Hengchu -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomasmiedema at gmail.com Thu Sep 17 10:18:12 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Thu, 17 Sep 2015 12:18:12 +0200 Subject: Is it possible to build ghc with static linkage to its dependencies? 
In-Reply-To: References: Message-ID: If it's just libgmp.so that is outdated, you could try uncommenting the line in build.mk.sample that says: libraries/integer-gmp_CONFIGURE_OPTS += --configure-option=--with-intree-gmp On Thu, Sep 17, 2015 at 5:34 AM, Hengchu Zhang < hengchu.zhang+ghcdev at gmail.com> wrote: > Hi GHC Devs, > > I'd like to build ghc for an environment (linux x86_64) that I don't fully > control, and most importantly libgmp.so on that environment is severely > outdated. I was wondering if I can build ghc on another linux x86_64 > machine but link in all the dependencies as static libraries instead > finding them as dynamic libraries? > > Thank you! > > Best, > Hengchu > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marlowsd at gmail.com Thu Sep 17 12:37:55 2015 From: marlowsd at gmail.com (Simon Marlow) Date: Thu, 17 Sep 2015 13:37:55 +0100 Subject: Making compilation results deterministic (#4012) In-Reply-To: <84c030f13d764a66aa3637a377405a55@DB4PR30MB030.064d.mgd.msft.net> References: <84c030f13d764a66aa3637a377405a55@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <55FAB423.6010700@gmail.com> On 16/09/2015 12:18, Simon Peyton Jones wrote: > > | My understanding is that currently, if you build a Haskell project > | from clean sources with ghc --make, then wipe all the .o/.hi files, > | and rebuild again from clean, with all the same flags and environment, > | you are unlikely to end up with identical binaries for either the .o > | or .hi files > > Is that right Bartosz? If that's the goal, then can we please say that explicitly on the wiki page? > > Let's hypothesise that it it's the goal. Then I don't understand why, in a single-threaded GHC, you would get non-det results. Presumably not from lazy-interface-file-loading. 
After all the same things would be loaded in the same order, no? Bartosz is going to write a wiki page and answer your earlier questions, but I'll try to go into a bit more detail about this point. We don't fully understand why we get non-deterministic results from a single-threaded GHC with a clean build, however there are things that will change from run to run that might influence compilation, e.g. the contents of directories on disk can change and the names of temporary files will change. We know for sure that having an old .hi file from a previous compilation causes uniques to change, because GHC reads the .hi file and assigns some uniques (yet this should clearly not affect the compilation results). Perhaps we could fix these things, but it's fragile, and furthermore we want to handle multithreaded compilation and --make, both of which make it much harder. So we concluded that it was probably futile to aim for fully-deterministic compilation by making the uniques the same every time, and instead we should try to make compilation deterministic in the face of non-deterministic uniques. This also turns out to be really hard, hence Bartosz' long email about the problems and the things he tried. We don't currently have a good way to reproduce the problem from a completely clean build, however it's easy to reproduce by doing two builds and leaving the .hi files from the first build in place while removing the .o files. You could also reproduce it easily by randomizing the order that uniques are generated in some way. 
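To make the unique-order point concrete, here is a toy model (just a sketch, not GHC's actual UniqFM type) of how output derived from a unique-keyed map changes when nothing but the uniques change between runs:

```haskell
import qualified Data.IntMap.Strict as IM

-- Toy stand-in for a UniqFM: binders keyed by their uniques. The same
-- three binders receive different uniques on two otherwise identical runs.
runA, runB :: IM.IntMap String
runA = IM.fromList [(1, "x"), (2, "y"), (3, "z")]
runB = IM.fromList [(3, "x"), (1, "y"), (2, "z")]

-- Anything that serialises the elements in key order (as a ufmToList-style
-- function does) sees a different sequence on each run, even though both
-- maps contain exactly the same binders.
elemsInKeyOrder :: IM.IntMap String -> [String]
elemsInKeyOrder = map snd . IM.toList
```

Here `elemsInKeyOrder runA` is `["x","y","z"]` while `elemsInKeyOrder runB` is `["y","z","x"]`: same program, different uniques, different ordering in whatever gets written out.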
Cheers Simon From niteria at gmail.com Thu Sep 17 15:16:35 2015 From: niteria at gmail.com (Bartosz Nitka) Date: Thu, 17 Sep 2015 16:16:35 +0100 Subject: Making compilation results deterministic (#4012) In-Reply-To: <84c030f13d764a66aa3637a377405a55@DB4PR30MB030.064d.mgd.msft.net> References: <84c030f13d764a66aa3637a377405a55@DB4PR30MB030.064d.mgd.msft.net> Message-ID: 2015-09-16 12:18 GMT+01:00 Simon Peyton Jones : > > | My understanding is that currently, if you build a Haskell project > | from clean sources with ghc --make, then wipe all the .o/.hi files, > | and rebuild again from clean, with all the same flags and environment, > | you are unlikely to end up with identical binaries for either the .o > | or .hi files > > Is that right Bartosz? If that's the goal, then can we please say that > explicitly on the wiki page? > > Let's hypothesise that it's the goal. Then I don't understand why, in a > single-threaded GHC, you would get non-det results. Presumably not from > lazy-interface-file-loading. After all the same things would be loaded in > the same order, no? > > Before we can propose a solution we need to understand the goal; and > having understood the goal we need to understand why the results are > non-det right now. Otherwise we risk fixing the wrong problem. > > Anyway, one step at a time :-) Goal first; then a concrete example that > demonstrates the problem; then a hypothesis of what is going wrong... There are two related goals here that I'm trying to tackle at the same time: build speed and binary file compatibility. I've created a wiki and added a concrete example of how the same environment leads to different interface files: https://ghc.haskell.org/trac/ghc/wiki/DeterministicBuilds#Aconcreteexample. Simon Marlow has already explained what goes wrong there and the difference is that the first build doesn't have any old interface files that the second build from the same sources does.
This happens often in practice in production environments when people switch branches. Another way to trigger it is to remove all the .o files and leave .hi files before rebuilding. | Depending on exactly what the problem is, one "solution" might be to not use --make mode. This is not specific to the --make mode. I've answered other questions on the wiki. | Anyway, one step at a time :-) Goal first; then a concrete example that demonstrates the problem; then a hypothesis of what is going wrong... I'm sorry, I unloaded everything I've learned about this in one post without a clear narrative. Cheers -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Thu Sep 17 16:30:32 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Thu, 17 Sep 2015 16:30:32 +0000 Subject: Making compilation results deterministic (#4012) In-Reply-To: <55FAB423.6010700@gmail.com> References: <84c030f13d764a66aa3637a377405a55@DB4PR30MB030.064d.mgd.msft.net> <55FAB423.6010700@gmail.com> Message-ID: <030584d33b6043acb62bca0e3f58bd31@DB4PR30MB030.064d.mgd.msft.net> | We don't currently have a good way to reproduce the problem from a | completely clean build, however it's easy to reproduce by doing two | builds and leaving the .hi files from the first build in place while | removing the .o files Indeed. But is determinism in the face of .hi file changes Part of the Goal? | So we concluded that it was probably futile to aim for | fully-deterministic compilation by making the uniques the same every | time, and instead we should try to make compilation deterministic in the | face of non-deterministic uniques. Well that is bound to be fragile too. IF the uniques could be made deterministic, for the use-cases that are part of The Goal, then it would be a simple solution. Determinism in the face of non-det uniques. I suggest NOT monkeying with UniqFM or Data.Map. Rather, I think you should focus on every use of ufmToList.
There are a lot of these, and many of them will be totally fine to be non-deterministic. On a case-by-case basis (fragile) I guess you can find some canonical ordering. That might be hard in the case, say, of identical OccNames with different uniques (e.g. ds_343, ds_324). I suppose that in every case we'd have to look for some other way to canonicalise. Ugh. Still, one step at a time. Wiki page would be good. Simon From svenpanne at gmail.com Thu Sep 17 16:46:01 2015 From: svenpanne at gmail.com (Sven Panne) Date: Thu, 17 Sep 2015 18:46:01 +0200 Subject: HEADS UP: Need 7.10.3? In-Reply-To: References: Message-ID: Building Haddock documentation on Windows for larger packages (e.g. OpenGLRaw) is broken in 7.10.2, similar to linking: The reason is once again the silly Windows command line length limitation, so we need response files here, too. Haddock 2.16.1 already has support for this, but this seems to be broken (probably https://github.com/haskell/haddock/commit/9affe8f6b3a9b07367c8c14162aecea8b15856a6 is missing), see the corresponding check in cabal ( https://github.com/haskell/cabal/blob/master/Cabal/Distribution/Simple/Haddock.hs#L470 ). So in a nutshell: We would need a new Haddock release (bundled with GHC 7.10.3) and a new cabal release with support for Haddock response files (in cabal's HEAD, but not yet released). Would this be possible? -------------- next part -------------- An HTML attachment was scrubbed... URL: From austin at well-typed.com Thu Sep 17 23:03:24 2015 From: austin at well-typed.com (Austin Seipp) Date: Thu, 17 Sep 2015 18:03:24 -0500 Subject: GHC Weekly News - 2015/09/17 Message-ID: (This post is available online at https://ghc.haskell.org/trac/ghc/blog/weekly20150917) Hi *, Welcome to the latest entry in the GHC Weekly News. It's been a little while, but here we are! And your author has finally returned from his 8-week sabbatical, too! So without any further ado, let's get going...
== 8.0.1 release roadmap == So `HEAD` has been steaming along pretty smoothly for the past few months now. After talking with Simon last week, we decided that the best course of action would be to release 8.0.1 (a super-major release) sometime around late February, which were the plans for 7.10 (modulo a few weeks due to the FTP debates). The current schedule is roughly: - November: Fork the new `ghc-8.0` STABLE branch - At this point, `master` development will probably slow as we fix bugs. - This gives us 2 months or so until branch, from Today. - This is nice as the branch is close to the first RC. - December: First release candidate - Mid/late February: Final release. "'''Why call it 8.0.1?'''", you ask? Because we have a lot of excellent features planned! I'm particularly partial to Richard's work for merging types and kinds (Phab:D808). But there's a lot of other stuff. For all the nitty gritty details, be sure to check [wiki:Status/GHC-8.0.1 8.0.1 status page] to keep track of everything - it will be our prime source of information and coordination. And be sure to [https://mail.haskell.org/pipermail/ghc-devs/2015-September/009952.html read my email to `ghc-devs`] for more info. === ... and a 7.10.3 release perhaps? === On top of this, we've been wondering if another release in the 7.10 branch should be done. Ben did the release shortly after I left, and for the most part looked pretty great. But there have been some snags, as usual. So we're asking: [https://mail.haskell.org/pipermail/ghc-devs/2015-September/009953.html who needs GHC 7.10.3?] We'd really like to know of any major showstoppers you've found with 7.10 that are preventing you from using it. Especially if you're stuck or there's no clear workaround. Currently, we're still not 100% committed to this course of action (since the release will take time away from other things). However, we'll keep the polls open for a while - so ''please'' get in touch with us if you need it! 
(Be sure to read my email for specific information.) == List chatter == (Over the past two weeks) - Bartosz Nitka writes to `ghc-devs` about the ongoing work to try and fix deterministic compilation in GHC (the dreaded ticket #4012). There's a very detailed breakdown of the current problems and issues in play, with responses from others - https://mail.haskell.org/pipermail/ghc-devs/2015-September/009964.html - Richard Eisenberg wants to know - how can I download all of `Hackage` to play with it? GHC developers are surely interested in this, so they can find regressions quickly - https://mail.haskell.org/pipermail/ghc-devs/2015-September/009956.html - I wrote to the list about the upcoming tentative 7.10.3 plans, as I mentioned above. https://mail.haskell.org/pipermail/ghc-devs/2015-September/009953.html - I ''also'' wrote to the list about the tentative 8.0.1 plans, too. https://mail.haskell.org/pipermail/ghc-devs/2015-September/009952.html - Johan Tibell asks about his ongoing work for implementing unboxed sum types - in particular, converting unboxed sum types in `StgCmm`. https://mail.haskell.org/pipermail/ghc-devs/2015-September/009926.html - Ryan Scott wrote a proposal for the automatic derivation of `Lift` through GHC's deriving mechanism, specifically for `template-haskell` users. The response was positive and the code is going through review now (Phab:D1168). https://mail.haskell.org/pipermail/ghc-devs/2015-September/009838.html - Andrew Gibiansky writes in with his own proposal for a new "Argument Do" syntax - a change which would allow `do` to appear in positions without `($)` or parentheses, essentially changing the parser to insert parens as needed. The code is up at Phabricator for brave souls (Phab:D1219). https://mail.haskell.org/pipermail/ghc-devs/2015-September/009821.html - Edward Yang started a monstrous thread after some discussions at ICFP about a future for ''unlifted'' data types in GHC.
These currently exist as special magic, but the proposals included would allow users to declare their own types as unlifted, and make unlifted values more flexible (allowing `newtype` for example). See wiki:UnliftedDataTypes and Edward's thread for more. https://mail.haskell.org/pipermail/ghc-devs/2015-September/009799.html == Noteworthy commits == (Over the past several weeks) - Commit 374457809de343f409fbeea0a885877947a133a2 - '''Injective Type Families''' - Commit 8ecf6d8f7dfee9e5b1844cd196f83f00f3b6b879 - '''Applicative Do notation''' - Commit 6740d70d95cb81cea3859ff847afc61ec439db4f - Use IP-based CallStack in `error` and `undefined` - Commit 43eb1dc52a4d3cbba9617f5a26177b8251d84b6a - Show `MINIMAL` complete definition in GHCi's `:info` - Commit 296bc70b5ff6c853f2782e9ec5aa47a52110345e - Use a response file for linker command line arguments - Commit 4356dacb4a2ae29dfbd7126b25b72d89bb9db1b0 - Forbid annotations when Safe Haskell is enabled - Commit 7b211b4e5a38efca437d76ea442495370da7cc9a - Upgrade GCC/binutils to 5.2.0 release for Windows (i386/amd64) == Closed tickets == (Over the past two weeks) #10834, #10830, #10047, #9943, #1851, #1477, #8229, #8926, #8614, #10777, #8596, #10788, #9500, #9087, #10157, #10866, #10806, #10836, #10849, #10869, #10682, #10863, #10880, #10883, #10787, #8552, #10884, #7305, #5757, #9389, #8689, #10105, #8168, #9925, #10305, #4438, #9710, #10889, #10885, #10825, #10821, #10790, #10781, #9855, #9912, #10033, #9782, #10035, #9976, #10847, and #10865. -- Regards, Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/ From ezyang at mit.edu Fri Sep 18 03:33:23 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Thu, 17 Sep 2015 20:33:23 -0700 Subject: Renaming InstalledPackageId Message-ID: <1442545148-sup-5662@sabre> Hello friends, During discussions with many people about Nix-like Cabal, it has emerged that InstalledPackageId is /really/ bad name. 
Consider: the commonly accepted definition of an InstalledPackageId in Nix is that it is morally a hash of all the inputs to compilation: the source code, the dependencies of the package, and the build configuration. However, a Cabal package can have *multiple* components (e.g. the library, the test suite, etc), each of which has its own 'build-depends' field. The concept of the "dependencies of a package" is simply not well-defined! The "simplification" that Cabal has adopted for a long time is to say that the installed package ID always refers to the library component of a package. [1] But the name InstalledPackageId has caused countless misunderstandings about how dependency resolution is done, even in Cabal's code. [2] I propose that we change the name of InstalledPackageId. The new name should have the following properties: 1. It should not say anything about packages, because it is not well-defined for a package, e.g. it should be something like "ComponentId". 2. It should not say anything about installation, because it is well-defined even before a package is built. 3. It should somehow communicate that it is a hash of the transitive source code (e.g. including dependencies) as well as build parameters. SPJ likes "SourceHash" because it's evocative of this (though I don't like it as much); there may also be a Nix-y term like "Derivation" we can use here. My proposed new name is "ComponentBuildHash", however I am open to other suggestions. I might also be convinced by "InstalledComponentId" (which runs aground (2) but is fairly similar to the old name, and gains points for familiarity.) However, I would like to hear your comments: have a better name? Think this is unnecessary? Please let me know. Edward P.S.
With Backpack, the ComponentBuildHash won't even be the primary key into the installed package (to be renamed to a component/unit) database, because a single ComponentBuildHash can be rebuilt multiple times with different instantiations of its holes. So GHC will have some identifier, which we will probably continue to call the 'UnitKey', which is the ComponentBuildHash (entirely Cabal generated) plus extra information about how holes are instantiated (entirely GHC generated). [1] Except when it doesn't: cabal-install currently merges all the dependencies of all components that are being built for a package together and treats that as the sum total dependencies of the package. This causes problems when the test suite for text depends on a testing library which in turn depends on text, c.f. https://github.com/haskell/cabal/issues/960 [2] Here are some bugs caused by confusion of package dependencies versus component dependency: https://github.com/haskell/cabal/issues/2802 Specify components when configuring, not building https://github.com/haskell/cabal/issues/2623 `-j` should build package components in parallel https://github.com/haskell/cabal/issues/1893 Use per-component cabal_macros.h https://github.com/haskell/cabal/issues/1575 Do dependency resolution on a per component basis https://github.com/haskell/cabal/issues/1768 The "benchmark" target dependencies conflict with "executable" targets https://github.com/haskell/cabal/issues/960 Can't build with --enable-tests in presence of circular dependencies From dxld at darkboxed.org Fri Sep 18 04:59:46 2015 From: dxld at darkboxed.org (Daniel =?iso-8859-1?Q?Gr=F6ber?=) Date: Fri, 18 Sep 2015 06:59:46 +0200 Subject: How to get types including constraints out of TypecheckedModule Message-ID: <20150918045945.GA13469@grml> Hi ghc-devs, I have a question and some code to ponder on for you all: In ghc-mod we have this very useful command to get the type of something in the middle of a module, i.e. not a toplevel binder. 
Essentially we just obtain a TypecheckedModule and traverse the tm_typechecked_source and then extract the Types from that in various places. This all works nice and well but we have one problem, namely the constraints are missing from all types. Now I'm sure there's a good reason GHC doesn't keep those inline in the syntax tree but my question is how do I get them? I've created a testcase that demonstrates the problem: $ git clone https://gist.github.com/DanielG/1101b8273f945ba14184 $ cd 1101b8273f945ba14184 $ ghc -package ghc -package ghc-paths GhcTestcase.hs $ ./GhcTestcase Type: a Even though the type of the binder I'm looking at has the type `Num a => a`. --Daniel -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: Digital signature URL: From marlowsd at gmail.com Fri Sep 18 08:20:43 2015 From: marlowsd at gmail.com (Simon Marlow) Date: Fri, 18 Sep 2015 09:20:43 +0100 Subject: Making compilation results deterministic (#4012) In-Reply-To: <030584d33b6043acb62bca0e3f58bd31@DB4PR30MB030.064d.mgd.msft.net> References: <84c030f13d764a66aa3637a377405a55@DB4PR30MB030.064d.mgd.msft.net> <55FAB423.6010700@gmail.com> <030584d33b6043acb62bca0e3f58bd31@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <55FBC95B.9070505@gmail.com> On 17/09/2015 17:30, Simon Peyton Jones wrote: > > | We don't currently have a good way to reproduce the problem from a > | completely clean build, however it's easy to reproduce by doing two > | builds and leaving the .hi files from the first build in place while > | removing the .o files > > Indeed. But is determinism in the face of .hi file changes Part of the Goal? Hopefully the wiki should clarify this. I've just edited it a bit to expand on the motivation and a few other details.
https://ghc.haskell.org/trac/ghc/wiki/DeterministicBuilds Cheers Simon From simonpj at microsoft.com Fri Sep 18 09:22:09 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Fri, 18 Sep 2015 09:22:09 +0000 Subject: How to get types including constraints out of TypecheckedModule In-Reply-To: <20150918045945.GA13469@grml> References: <20150918045945.GA13469@grml> Message-ID: I have not looked in detail, but I'm confident that all the info you want is there. For let/where/top-level bindings, the polymorphic binders you want are in the 'AbsBinds' construct. | AbsBinds { -- Binds abstraction; TRANSLATION abs_tvs :: [TyVar], abs_ev_vars :: [EvVar], -- ^ Includes equality constraints -- | AbsBinds only gets used when idL = idR after renaming, -- but these need to be idL's for the collect... code in HsUtil -- to have the right type abs_exports :: [ABExport idL], -- | Evidence bindings -- Why a list? See TcInstDcls -- Note [Typechecking plan for instance declarations] abs_ev_binds :: [TcEvBinds], -- | Typechecked user bindings abs_binds :: LHsBinds idL } The info you want is in the abs_exports field: data ABExport id = ABE { abe_poly :: id -- ^ Any INLINE pragmas is attached to this Id , abe_mono :: id , abe_wrap :: HsWrapper -- ^ See Note [AbsBinds wrappers] -- Shape: (forall abs_tvs. abs_ev_vars => abe_mono) ~ abe_poly , abe_prags :: TcSpecPrags -- ^ SPECIALISE pragmas This pairs the "polymorphic" and "monomorphic" versions of the bound Ids. You'll find the monomorphic one in the bindings in the abs_binds field; and you'll find the very same binder in the abe_mono field of one of the ABExport records. Then the corresponding abe_poly Id is the polymorphic one, the one with type foo :: forall a. Num a => a I hope this is of some help. If so, would you like to update the Commentary (on the GHC wiki) to add the information you wish had been there in the first place? Thanks! 
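The traversal described above can be sketched roughly as follows. This is only a sketch against the GHC-7.10-era API: the module and field names follow the AbsBinds/ABExport definitions quoted in the message, but the helper itself (absBindsPairs) is a made-up name and the code has not been tested against a GHC build.

```haskell
-- Sketch: pair each monomorphic binder with its polymorphic counterpart.
-- The polymorphic Id is the one whose type carries the constraints,
-- e.g. foo :: forall a. Num a => a.
import HsBinds
import Var (Id)

absBindsPairs :: HsBind Id -> [(Id, Id)]
absBindsPairs b = case b of
  AbsBinds { abs_exports = exports } ->
    [ (abe_mono e, abe_poly e) | e <- exports ]  -- mono binder, poly binder
  _ -> []
```

Given such pairs, a tool like ghc-mod could look up the monomorphic binder it found in the syntax tree and report the type of the corresponding abe_poly Id instead.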
Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Daniel | Gr?ber | Sent: 18 September 2015 06:00 | To: ghc-devs at haskell.org | Subject: How to get types including constraints out of TypecheckedModule | | Hi ghc-devs, | | I have a question and some code to ponder on for you all: In ghc-mod we | have | this very useful command to get the type of something in the middle of a | module, | i.e. not a toplevel binder. Essentially we just optain a TypecheckedModule | and | traverse the tm_typechecked_source and then extract the Types from that in | various places. | | This all works nice and well but we have one problem, namely the | contraints are | missing from all types. Now I'm sure there's a good reason GHC doesn't | keep | those inline in the syntax tree but my question is how do I get them? | | I've created a testcase that demonstrates the problem: | | $ git clone | https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fgist.gith | ub.com%2fDanielG%2f1101b8273f945ba14184&data=01%7c01%7csimonpj%40064d.mgd. | microsoft.com%7c07dcc820e45245b16c6108d2bfe60ca0%7c72f988bf86f141af91ab2d7 | cd011db47%7c1&sdata=40WgQQpssauSfbsTPpDVzhEwwhBHUAwEAGf8%2fADzxxg%3d | $ cd 1101b8273f945ba14184 | $ ghc -package ghc -package ghc-paths GhcTestcase.hs | $ ./GhcTestcase | Type: a | | Even though the type of the binder I'm looking at has the type `Num a => | a`. | | --Daniel From simonpj at microsoft.com Fri Sep 18 09:22:12 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Fri, 18 Sep 2015 09:22:12 +0000 Subject: Renaming InstalledPackageId In-Reply-To: <1442545148-sup-5662@sabre> References: <1442545148-sup-5662@sabre> Message-ID: <2485471570b745b0aa294804f9f5fae6@DB4PR30MB030.064d.mgd.msft.net> Good idea! | suggestions. I might also be convinced by "InstalledComponentId" (which I dislike the "installed" part for the reasons you say. 
And I would really love to have some indication that this is a hash or MD5 sum of the entire INPUT to the compilation process (source code, flags...). That's why I liked ComponentSourceHash; the "source" stresses input; the "hash" connotes the idea of summarising all the source code transitively. Simon | -----Original Message----- | From: cabal-devel [mailto:cabal-devel-bounces at haskell.org] On Behalf Of | Edward Z. Yang | Sent: 18 September 2015 04:33 | To: cabal-devel; ghc-devs | Subject: Renaming InstalledPackageId | | Hello friends, | | During discussions with many people about Nix-like Cabal, it has emerged | that InstalledPackageId is /really/ bad name. Consider: the commonly | accepted definition of an InstalledPackageId in Nix is that it is | morally a hash of all the inputs to compilation: the source code, the | dependencies of the package, and the build configuration. However, a | Cabal package can have *multiple* components (e.g. the library, the test | suite, etc), each of which has its own 'build-depends' field. The | concept of the "dependencies of a package" is simply not well-defined! | | The "simplification" that Cabal has adopted for a long time is to say | that the installed package ID always refers to the library component of | a package. [1] But the name InstalledPackageId has caused countless | misunderstandings about how dependency resolution is done, even in Cabal's | code. [2] | | I propose that we change the name of InstalledPackageId. The new | name should have the following properties: | | 1. It should not say anything about packages, because it is not | well-defined for a package, e.g. it should be something like | "ComponentId". | | 2. It should not say anything about installation, because it is | well-defined even before a package is even built. | | 3. It should some how communicate that it is a hash of the transitive | source code (e.g. including dependencies) as well as build parameters. 
| SPJ likes "SourceHash" because it's evocative of this (though I don't | like it as much); there may also be a Nix-y term like "Derivation" we | can use here. | | My proposed new name is "ComponentBuildHash", however I am open to other | suggestions. I might also be convinced by "InstalledComponentId" (which | runs aground (2) but is fairly similar to the old name, and gains points | for familiarity.) However, I would like to hear your comments: have a | better name? Think this is unnecessary? Please let me know. | | Edward | | P.S. With Backpack, the ComponentBuildHash won't even be the primary key | into the installed package (to be renamed to a component/unit) database, | because a single ComponentBuildHash can be rebuilt multiple times with | different instantiations of its holes. So GHC will have some | identifier, which we will probably continue to call the 'UnitKey', which | is the ComponentBuildHash (entirely Cabal generated) plus extra | information about how holes are instantiated (entirely GHC generated). | | [1] Except when it doesn't: cabal-install currently merges all the | dependencies | of all components that are being built for a package together and treats | that as the sum total dependencies of the package. This causes problems | when the test suite for text depends on a testing library which in turn | depends on text, c.f. 
| https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fgithub.co | m%2fhaskell%2fcabal%2fissues%2f960&data=01%7c01%7csimonpj%40064d.mgd.micro | soft.com%7cee169cf81c354b8200d008d2bfd9ec59%7c72f988bf86f141af91ab2d7cd011 | db47%7c1&sdata=3sQjxpWQccNyIR43FYonKTAQ3ENahciXLKtauzLKyXk%3d | | [2] Here are some bugs caused by confusion of package dependencies | versus component dependency: | https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fgithub.co | m%2fhaskell%2fcabal%2fissues%2f2802&data=01%7c01%7csimonpj%40064d.mgd.micr | osoft.com%7cee169cf81c354b8200d008d2bfd9ec59%7c72f988bf86f141af91ab2d7cd01 | 1db47%7c1&sdata=hcGUnJdwYz9GkKILhc8u9qgQrxTkcGUAqpd6VgW7k5I%3d Specify | components when configuring, not building | https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fgithub.co | m%2fhaskell%2fcabal%2fissues%2f2623&data=01%7c01%7csimonpj%40064d.mgd.micr | osoft.com%7cee169cf81c354b8200d008d2bfd9ec59%7c72f988bf86f141af91ab2d7cd01 | 1db47%7c1&sdata=QcllkUWL%2bTakINFCnty%2f30UsFDtb6y4NaNHa72Sy28Y%3d `-j` | should build package components in parallel | https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fgithub.co | m%2fhaskell%2fcabal%2fissues%2f1893&data=01%7c01%7csimonpj%40064d.mgd.micr | osoft.com%7cee169cf81c354b8200d008d2bfd9ec59%7c72f988bf86f141af91ab2d7cd01 | 1db47%7c1&sdata=NnRigIN%2fI%2fVrFgpjhZr7TaavIIdom1JBldlw4wlvpyA%3d Use | per-component cabal_macros.h | https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fgithub.co | m%2fhaskell%2fcabal%2fissues%2f1575&data=01%7c01%7csimonpj%40064d.mgd.micr | osoft.com%7cee169cf81c354b8200d008d2bfd9ec59%7c72f988bf86f141af91ab2d7cd01 | 1db47%7c1&sdata=xakCm5et0uCGy9TxCoL5GfosGQfGdytzFXGX92Tqe3o%3d Do | dependency resolution on a per component basis | https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fgithub.co | m%2fhaskell%2fcabal%2fissues%2f1768&data=01%7c01%7csimonpj%40064d.mgd.micr | 
osoft.com%7cee169cf81c354b8200d008d2bfd9ec59%7c72f988bf86f141af91ab2d7cd01 | 1db47%7c1&sdata=GCMmO3OZbl9RXswxeVd2d5%2bTu9eQ7DqhNyqwePvg9x0%3d The | "benchmark" target dependencies conflict with "executable" targets | https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fgithub.co | m%2fhaskell%2fcabal%2fissues%2f960&data=01%7c01%7csimonpj%40064d.mgd.micro | soft.com%7cee169cf81c354b8200d008d2bfd9ec59%7c72f988bf86f141af91ab2d7cd011 | db47%7c1&sdata=3sQjxpWQccNyIR43FYonKTAQ3ENahciXLKtauzLKyXk%3d Can't build | with --enable-tests in presence of circular dependencies | _______________________________________________ | cabal-devel mailing list | cabal-devel at haskell.org | https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fmail.haske | ll.org%2fcgi-bin%2fmailman%2flistinfo%2fcabal- | devel&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7cee169cf81c354b8200 | d008d2bfd9ec59%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=0k73G6r0wXJ5qA | HcrbUqpUjKaWvLuEXR%2fQAwF2XwH6c%3d From ben at smart-cactus.org Fri Sep 18 15:21:36 2015 From: ben at smart-cactus.org (Ben Gamari) Date: Fri, 18 Sep 2015 17:21:36 +0200 Subject: Haskell Error Messages In-Reply-To: <2ece273850d841c9ada011cb18e3ad9c@DB4PR30MB030.064d.mgd.msft.net> References: <2ece273850d841c9ada011cb18e3ad9c@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <87lhc392in.fsf@smart-cactus.org> Simon Peyton Jones writes: > Is there currently any planned work around making the haskell error > messages able to support something like the ones in IDRIS, as shown in > David Christianson's talk "A Pretty printer that says what it means" > at HIW? > > Not that I know of, but it would be a Good Thing. > I have been interested in this issue myself and have the beginning of a proposal on how we might be able to do this reasonably painlessly here [1]. At one point I had the beginnings of a proper Wiki page describing the proposal but sadly Trac/Firefox ate it and I've not had the time to attempt a rewrite. 
Cheers, - Ben [1] https://ghc.haskell.org/trac/ghc/ticket/8809#comment:3 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 472 bytes Desc: not available URL: From alan.zimm at gmail.com Fri Sep 18 15:25:08 2015 From: alan.zimm at gmail.com (Alan & Kim Zimmerman) Date: Fri, 18 Sep 2015 17:25:08 +0200 Subject: Haskell Error Messages In-Reply-To: <87lhc392in.fsf@smart-cactus.org> References: <2ece273850d841c9ada011cb18e3ad9c@DB4PR30MB030.064d.mgd.msft.net> <87lhc392in.fsf@smart-cactus.org> Message-ID: That's great. My interest is more around having it than doing it, so I would be delighted if it received some attention. Alan On Fri, Sep 18, 2015 at 5:21 PM, Ben Gamari wrote: > Simon Peyton Jones writes: > > > Is there currently any planned work around making the haskell error > > messages able to support something like the ones in IDRIS, as shown in > > David Christianson's talk "A Pretty printer that says what it means" > > at HIW? > > > > Not that I know of, but it would be a Good Thing. > > > I have been interested in this issue myself and have the beginning of a > proposal on how we might be able to do this reasonably painlessly here > [1]. At one point I had the beginnings of a proper Wiki page describing > the proposal but sadly Trac/Firefox ate it and I've not had the time to > attempt a rewrite. > > Cheers, > > - Ben > > > [1] https://ghc.haskell.org/trac/ghc/ticket/8809#comment:3 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthewtpickering at gmail.com Sat Sep 19 21:27:35 2015 From: matthewtpickering at gmail.com (Matthew Pickering) Date: Sat, 19 Sep 2015 22:27:35 +0100 Subject: Proposal: Associating pattern synonyms with types Message-ID: Dear devs, There has been a bit of discussion on #10653 about extending exports to allow users to associate pattern synonyms with types (just like ordinary data constructors). 
There is a section on the wiki which describes the problem and the proposed behaviour. https://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms#Associatingsynonymswithtypes Any comments would be appreciated before I start the implementation. Please read all the way to the bottom past the examples as there are a few sections of clarification which might answer your question. Matt From simonpj at microsoft.com Tue Sep 22 12:43:43 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Tue, 22 Sep 2015 12:43:43 +0000 Subject: Converting unboxed sum types in StgCmm In-Reply-To: References: <4cd78a703e1d4b148dbd707b89f74c59@DB4PR30MB030.064d.mgd.msft.net> <55F67F44.5010501@gmail.com> Message-ID: Johan Sorry I've been buried. Let's fix a time for a Skype call if you'd like to chat about this stuff. Quick response to the below. I think that afterwards we want it to look like this: post-unarise f = \r [ ds1::Int# ds2::Ptr ] case ds1 of 0# -> 1# -> ds2 is the thing that contains either an Int or a char; ds1 is the tag that distinguishes them. Simon From: Johan Tibell [mailto:johan.tibell at gmail.com] Sent: 14 September 2015 17:03 To: Ryan Newton; Simon Peyton Jones Cc: Simon Marlow; ghc-devs at haskell.org Subject: Re: Converting unboxed sum types in StgCmm I've given this yet some more thought. Given this simple core program: f [InlPrag=NOINLINE] :: (#|#) Int Char -> Int [GblId, Arity=1, Str=DmdType] f = \ (ds_dmE :: (#|#) Int Char) -> case ds_dmE of _ [Occ=Dead] { (#_|#) x_amy -> x_amy; (#|_#) ipv_smK -> patError @ Int "UnboxedSum.hs:5:1-15|function f"# } We will get this stg pre-unarise: unarise [f [InlPrag=NOINLINE] :: (#|#) Int Char -> Int [GblId, Arity=1, Str=DmdType, Unf=OtherCon []] = \r srt:SRT:[0e :-> patError] [ds_svm] case ds_svm of _ [Occ=Dead] { (#_|#) x_svo [Occ=Once] -> x_svo; (#|_#) _ [Occ=Dead] -> patError "UnboxedSum.hs:5:1-15|function f"#; };] What do we want it to look like afterwards?
I currently have this, modeled after unboxed tuples:

post-unarise:

    [f [InlPrag=NOINLINE] :: (#|#) Int Char -> Int
     [GblId, Arity=1, Str=DmdType, Unf=OtherCon []] =
         \r srt:SRT:[0e :-> patError] [ds_gvu ds_gvv]
             case (#_|#) [ds_gvu ds_gvv] of _ [Occ=Dead] {  -- <-- WHAT SHOULD GO HERE?
               (#_|#) x_svo [Occ=Once] -> x_svo;
               (#|_#) _ [Occ=Dead] -> patError "UnboxedSum.hs:5:1-15|function f"#;
             };]

Here I have performed the same rewriting of the scrutinee in the case statement as for unboxed tuples, but note that this doesn't quite work, as we don't know which data constructor to apply in "..." in case ... of. In the case of tuples it's easy; there is only one. It seems to me that we just want to rewrite the case altogether into something that looks at the tag field of the data constructor. Also, in stg we use the same DataCon as in core, but after stg the unboxed sum case really only has one constructor (one with the union of all the fields), which makes it awkward to reuse the original DataCon.

On Mon, Sep 14, 2015 at 7:27 AM, Ryan Newton wrote:

  * data RepType = UbxTupleRep [UnaryType]
                 | UbxSumRep [UnaryType]
                 | UnaryRep UnaryType

Not fully following, but ... this reptype business is orthogonal to whether you add a normal type to the STG level that models anonymous, untagged unions, right? That is, when using Any for pointer types, they could use indicative phantom types, like "Any (Union Bool Char)", even if there's not full support for doing anything useful with (Union Bool Char) by itself. Maybe the casting machinery could greenlight a cast from Any (Union Bool Char) to Bool at least? There's already the unboxed union itself, (|# #|), but that's different than a pointer to a union of types...
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From johan.tibell at gmail.com Tue Sep 22 12:46:42 2015 From: johan.tibell at gmail.com (Johan Tibell) Date: Tue, 22 Sep 2015 10:16:42 -0230 Subject: Converting unboxed sum types in StgCmm In-Reply-To: References: <4cd78a703e1d4b148dbd707b89f74c59@DB4PR30MB030.064d.mgd.msft.net> <55F67F44.5010501@gmail.com> Message-ID: Yup, I think I have it figured out. Will just need to find the time to write the remaining code. On Tue, Sep 22, 2015 at 10:13 AM, Simon Peyton Jones wrote: > Johan > > > > Sorry I?ve been buried. Let?s fix a time for a Skype call if you?d like > to chat about this stuff. > > > > Quick response to the below. I think that afterwards we want it to look > like this: > > > > post-unarise > > > > f = \r [ ds1::Int# ds2::Ptr ] > > case ds1 of > > 0# -> > > 1# -> > > > > ds2 is the thing that contains either an Int or a char; ds1 is the tag > that distinguishes htem. > > > > Simon > > > > *From:* Johan Tibell [mailto:johan.tibell at gmail.com] > *Sent:* 14 September 2015 17:03 > *To:* Ryan Newton; Simon Peyton Jones > *Cc:* Simon Marlow; ghc-devs at haskell.org > *Subject:* Re: Converting unboxed sum types in StgCmm > > > > I've given this a yet some more thought. Given this simple core program: > > > > f [InlPrag=NOINLINE] :: (#|#) Int Char -> Int > > [GblId, Arity=1, Str=DmdType] > > f = > > \ (ds_dmE :: (#|#) Int Char) -> > > case ds_dmE of _ [Occ=Dead] { > > (#_|#) x_amy -> x_amy; > > (#|_#) ipv_smK -> patError @ Int "UnboxedSum.hs:5:1-15|function f"# > > } > > > > We will get this stg pre-unarise: > > > > unarise > > [f [InlPrag=NOINLINE] :: (#|#) Int Char -> Int > > [GblId, Arity=1, Str=DmdType, Unf=OtherCon []] = > > \r srt:SRT:[0e :-> patError] [ds_svm] > > case ds_svm of _ [Occ=Dead] { > > (#_|#) x_svo [Occ=Once] -> x_svo; > > (#|_#) _ [Occ=Dead] -> patError > "UnboxedSum.hs:5:1-15|function f"#; > > };] > > > > What do we want it to look like afterwards? 
I currently, have this, > modeled after unboxed tuples: > > > > post-unarise: > > [f [InlPrag=NOINLINE] :: (#|#) Int Char -> Int > > [GblId, Arity=1, Str=DmdType, Unf=OtherCon []] = > > \r srt:SRT:[0e :-> patError] [ds_gvu ds_gvv] > > case (#_|#) [ds_gvu ds_gvv] of _ [Occ=Dead] { -- <-- WHAT > SHOULD GO HERE? > > (#_|#) x_svo [Occ=Once] -> x_svo; > > (#|_#) _ [Occ=Dead] -> patError > "UnboxedSum.hs:5:1-15|function f"#; > > };] > > > > Here I have performed the same rewriting of the scrutinee in the case > statement as for unboxed tuples, but note that this doesn't quite work, as > we don't know which data constructor to apply in "..." in case ... of. In > the case of tuples it's easy; there is only one. > > > > It seems to me that we just want to rewrite the case altogether into > something that looks at the tag field of the data constructor. Also, in stg > we use the same DataCon as in core, but after stg the unboxed sum case > really only has one constructor (one with the union of all the fields), > which makes it awkward to reuse the original DataCon. > > > > > > > > On Mon, Sep 14, 2015 at 7:27 AM, Ryan Newton wrote: > > > - > data RepType = UbxTupleRep [UnaryType] > | UbxSumRep [UnaryType] > | UnaryRep UnaryType > > Not, fully following, but ... this reptype business is orthogonal to > whether you add a normal type to the STG level that models anonymous, > untagged unions, right? > > > > That is, when using Any for pointer types, they could use indicative > phantom types, like "Any (Union Bool Char)", even if there's not full > support for doing anything useful with (Union Bool Char) by itself. Maybe > the casting machinery could greenlight a cast from Any (Union Bool Char) to > Bool at least? > > > > There's already the unboxed union itself, (|# #|) , but that's different > than a pointer to a union of types... > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mike at izbicki.me Tue Sep 22 15:11:58 2015 From: mike at izbicki.me (Mike Izbicki) Date: Tue, 22 Sep 2015 08:11:58 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: Thanks to everyone who helped me on this project! I've released the final result on github at https://github.com/mikeizbicki/HerbiePlugin#herbie-ghc-plugin On Mon, Sep 7, 2015 at 1:26 PM, Mike Izbicki wrote: > I have another question :) This one relates to Andrew Farmer's answer > a while back on how to build dictionaries given a Concrete type. > Everything I have works when I use my own numeric hierarchy, but when > I use the Prelude's numeric hierarchy, GHC can't find the `Num Float` > instance (or any other builtin instance). > > I created the following function (based on HERMIT's buildDictionary > function) to build my dictionaries (for GHC 7.10.1): > > -- | Given a function name and concrete type, get the needed dictionary. > getDictConcrete :: ModGuts -> String -> Type -> CoreM (Maybe (Expr CoreBndr)) > getDictConcrete guts opstr t = trace ("getDictConcrete "++opstr) $ do > hscenv <- getHscEnv > dflags <- getDynFlags > eps <- liftIO $ hscEPS hscenv > let (opname,ParentIs classname) = getNameParent guts opstr > classType = mkTyConTy $ case lookupNameEnv (eps_PTE eps) classname of > Just (ATyCon t) -> t > Just (AnId _) -> error "loopupNameEnv AnId" > Just (AConLike _) -> error "loopupNameEnv AConLike" > Just (ACoAxiom _) -> error "loopupNameEnv ACoAxiom" > Nothing -> error "getNameParent gutsEnv Nothing" > > dictType = mkAppTy classType t > dictVar = mkGlobalVar > VanillaId > (mkSystemName (mkUnique 'z' 1337) (mkVarOcc $ > "magicDictionaryName")) > dictType > vanillaIdInfo > > bnds <- runTcM guts $ do > loc <- getCtLoc $ GivenOrigin UnkSkol > let nonC = mkNonCanonical $ CtWanted > { ctev_pred = dictType > , ctev_evar = dictVar > , ctev_loc = loc > } > wCs = mkSimpleWC [nonC] > (x, evBinds) <- solveWantedsTcM wCs > bnds <- 
initDsTc $ dsEvBinds evBinds > > liftIO $ do > putStrLn $ "dictType="++showSDoc dflags (ppr dictType) > putStrLn $ "dictVar="++showSDoc dflags (ppr dictVar) > > putStrLn $ "nonC="++showSDoc dflags (ppr nonC) > putStrLn $ "wCs="++showSDoc dflags (ppr wCs) > putStrLn $ "bnds="++showSDoc dflags (ppr bnds) > putStrLn $ "x="++showSDoc dflags (ppr x) > > return bnds > > case bnds of > [NonRec _ dict] -> return $ Just dict > otherwise -> return Nothing > > > When I use my own numeric class hierarchy, this works great! But when > I use the Prelude numeric hierarchy, this doesn't work for some > reason. In particular, if I pass `+` as the operation I want a > dictionary for on the type `Float`, then the function returns > `Nothing` with the following output: > > getDictConcrete + > dictType=Num Float > dictVar=magicDictionaryName_zlz > nonC=[W] magicDictionaryName_zlz :: Num Float (CNonCanonical) > wCs=WC {wc_simple = > [W] magicDictionaryName_zlz :: Num Float (CNonCanonical)} > bnds=[] > x=WC {wc_simple = > [W] magicDictionaryName_zlz :: Num Float (CNonCanonical)} > > > If I change the `solveWantedTcMs` function to `simplifyInteractive`, > then GHC panics with the following message: > > Top level: > No instance for (GHC.Num.Num GHC.Types.Float) arising from UnkSkol > > Why doesn't the TcM monad know about the `Num Float` instance? > > On Fri, Sep 4, 2015 at 9:18 PM, ?mer Sinan A?acan wrote: >> Typo: "You're parsing your code" I mean "You're passing your code" >> >> 2015-09-05 0:16 GMT-04:00 ?mer Sinan A?acan : >>> Hi Mike, >>> >>> I'll try to hack an example for you some time tomorrow(I'm returning from ICFP >>> and have some long flights ahead of me). >>> >>> But in the meantime, here's a working Core code, generated by GHC: >>> >>> f_rjH :: forall a_alz. 
Ord a_alz => a_alz -> Bool >>> f_rjH = >>> \ (@ a_aCH) ($dOrd_aCI :: Ord a_aCH) (eta_B1 :: a_aCH) -> >>> == @ a_aCH (GHC.Classes.$p1Ord @ a_aCH $dOrd_aCI) eta_B1 eta_B1 >>> >>> You can clearly see here how Eq dictionary is selected from Ord >>> dicitonary($dOrd_aCI in the example), it's just an application of selector to >>> type and dictionary, that's all. >>> >>> This is generated from this code: >>> >>> {-# NOINLINE f #-} >>> f :: Ord a => a -> Bool >>> f x = x == x >>> >>> Compile it with this: >>> >>> ghc --make -fforce-recomp -O0 -ddump-simpl -ddump-to-file Main.hs >>> -dsuppress-idinfo >>> >>>> Can anyone help me figure this out? Is there any chance this is a bug in how >>>> GHC parses Core? >>> >>> This seems unlikely, because GHC doesn't have a Core parser and there's no Core >>> parsing going on here, you're parsing your Code in the form of AST(CoreExpr, >>> CoreProgram etc. defined in CoreSyn.hs). Did you mean something else and am I >>> misunderstanding? >>> >>> 2015-09-04 19:39 GMT-04:00 Mike Izbicki : >>>> I'm still having trouble creating Core code that can extract >>>> superclass dictionaries from a given dictionary. I suspect the >>>> problem is that I don't actually understand what the Core code to do >>>> this is supposed to look like. I keep getting the errors mentioned >>>> above when I try what I think should work. >>>> >>>> Can anyone help me figure this out? Is there any chance this is a bug >>>> in how GHC parses Core? >>>> >>>> On Tue, Aug 25, 2015 at 9:24 PM, Mike Izbicki wrote: >>>>> The purpose of the plugin is to automatically improve the numerical >>>>> stability of Haskell code. It is supposed to identify numeric >>>>> expressions, then use Herbie (https://github.com/uwplse/herbie) to >>>>> generate a numerically stable version, then rewrite the numerically >>>>> stable version back into the code. The first two steps were really >>>>> easy. It's the last step of inserting back into the code that I'm >>>>> having tons of trouble with. 
Core is a lot more complicated than I >>>>> thought :) >>>>> >>>>> I'm not sure what you mean by the CoreExpr representation? Here's the >>>>> output of the pretty printer you gave: >>>>> App (App (App (App (Var Id{+,r2T,ForAllTy TyVar{a} (FunTy (TyConApp >>>>> Num [TyVarTy TyVar{a}]) (FunTy (TyVarTy TyVar{a}) (FunTy (TyVarTy >>>>> TyVar{a}) (TyVarTy TyVar{a})))),VanillaId,Info{0,SpecInfo [] >>>>> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >>>>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>>>> Nothing, inl_act = AlwaysActive, inl_rule = >>>>> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >>>>> {strd = Lazy, absd = Use Many Used},0}}) (Type (TyVarTy TyVar{a}))) >>>>> (App (Var Id{$p1Fractional,rh3,ForAllTy TyVar{a} (FunTy (TyConApp >>>>> Fractional [TyVarTy TyVar{a}]) (TyConApp Num [TyVarTy >>>>> TyVar{a}])),ClassOpId ,Info{1,SpecInfo [BuiltinRule {ru_name = >>>>> "Class op $p1Fractional", ru_fn = $p1Fractional, ru_nargs = 2, ru_try >>>>> = }] ,NoUnfolding,NoCafRefs,NoOneShotInfo,InlinePragma >>>>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>>>> Nothing, inl_act = AlwaysActive, inl_rule = >>>>> FunLike},NoOccInfo,StrictSig (DmdType [JD {strd = Str (SProd >>>>> [Str HeadStr,Lazy,Lazy,Lazy]), absd = Use Many (UProd [Use Many >>>>> Used,Abs,Abs,Abs])}] (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many >>>>> Used},0}}) (App (Var Id{$p1Floating,rh2,ForAllTy TyVar{a} (FunTy >>>>> (TyConApp Floating [TyVarTy TyVar{a}]) (TyConApp Fractional [TyVarTy >>>>> TyVar{a}])),ClassOpId ,Info{1,SpecInfo [BuiltinRule {ru_name = >>>>> "Class op $p1Floating", ru_fn = $p1Floating, ru_nargs = 2, ru_try = >>>>> }] ,NoUnfolding,NoCafRefs,NoOneShotInfo,InlinePragma >>>>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>>>> Nothing, inl_act = AlwaysActive, inl_rule = >>>>> FunLike},NoOccInfo,StrictSig (DmdType [JD {strd = Str (SProd >>>>> [Str 
HeadStr,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy]), >>>>> absd = Use Many (UProd [Use Many >>>>> Used,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs])}] >>>>> (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many Used},0}}) (Var >>>>> Id{$dFloating,aBM,TyConApp Floating [TyVarTy >>>>> TyVar{a}],VanillaId,Info{0,SpecInfo [] >>>>> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >>>>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>>>> Nothing, inl_act = AlwaysActive, inl_rule = >>>>> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >>>>> {strd = Lazy, absd = Use Many Used},0}})))) (Var Id{x1,anU,TyVarTy >>>>> TyVar{a},VanillaId,Info{0,SpecInfo [] >>>>> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >>>>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>>>> Nothing, inl_act = AlwaysActive, inl_rule = >>>>> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >>>>> {strd = Lazy, absd = Use Many Used},0}})) (Var Id{x1,anU,TyVarTy >>>>> TyVar{a},VanillaId,Info{0,SpecInfo [] >>>>> ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma >>>>> {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = >>>>> Nothing, inl_act = AlwaysActive, inl_rule = >>>>> FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD >>>>> {strd = Lazy, absd = Use Many Used},0}}) >>>>> >>>>> You can find my pretty printer (and all the other code for the plugin) >>>>> at: https://github.com/mikeizbicki/herbie-haskell/blob/master/src/Herbie.hs#L627 >>>>> >>>>> The function getDictMap >>>>> (https://github.com/mikeizbicki/herbie-haskell/blob/master/src/Herbie.hs#L171) >>>>> is where I'm constructing the dictionaries that are getting inserted >>>>> back into the Core. >>>>> >>>>> On Tue, Aug 25, 2015 at 7:17 PM, ?mer Sinan A?acan wrote: >>>>>> It seems like in your App syntax you're having a non-function in function >>>>>> position. 
You can see this by looking at what failing function >>>>>> (splitFunTy_maybe) is doing: >>>>>> >>>>>> splitFunTy_maybe :: Type -> Maybe (Type, Type) >>>>>> -- ^ Attempts to extract the argument and result types from a type >>>>>> ... (definition is not important) ... >>>>>> >>>>>> Then it's used like this at the error site: >>>>>> >>>>>> (arg_ty, res_ty) = expectJust "cpeBody:collect_args" $ >>>>>> splitFunTy_maybe fun_ty >>>>>> >>>>>> In your case this function is returning Nothing and then exceptJust is >>>>>> signalling the panic. >>>>>> >>>>>> Your code looked correct to me, I don't see any problems with that. Maybe you're >>>>>> using something wrong as selectors. Could you paste CoreExpr representation of >>>>>> your program? >>>>>> >>>>>> It may also be the case that the panic is caused by something else, maybe your >>>>>> syntax is invalidating some assumptions/invariants in GHC but it's not >>>>>> immediately checked etc. Working at the Core level is frustrating at times. >>>>>> >>>>>> Can I ask what kind of plugin are you working on? >>>>>> >>>>>> (Btw, how did you generate this representation of AST? Did you write it >>>>>> manually? If you have a pretty-printer, would you mind sharing it?) >>>>>> >>>>>> 2015-08-25 18:50 GMT-04:00 Mike Izbicki : >>>>>>> Thanks ?mer! >>>>>>> >>>>>>> I'm able to get dictionaries for the superclasses of a class now, but >>>>>>> I get an error whenever I try to get a dictionary for a >>>>>>> super-superclass. Here's the Haskell expression I'm working with: >>>>>>> >>>>>>> test1 :: Floating a => a -> a >>>>>>> test1 x1 = x1+x1 >>>>>>> >>>>>>> The original core is: >>>>>>> >>>>>>> + @ a $dNum_aJu x1 x1 >>>>>>> >>>>>>> But my plugin is replacing it with the core: >>>>>>> >>>>>>> + @ a ($p1Fractional ($p1Floating $dFloating_aJq)) x1 x1 >>>>>>> >>>>>>> The only difference is the way I'm getting the Num dictionary. 
The >>>>>>> corresponding AST (annotated with variable names and types) is: >>>>>>> >>>>>>> App >>>>>>> (App >>>>>>> (App >>>>>>> (App >>>>>>> (Var +::forall a. Num a => a -> a -> a) >>>>>>> (Type a) >>>>>>> ) >>>>>>> (App >>>>>>> (Var $p1Fractional::forall a. Fractional a => Num a) >>>>>>> (App >>>>>>> (Var $p1Floating::forall a. Floating a => Fractional a) >>>>>>> (Var $dFloating_aJq::Floating a) >>>>>>> ) >>>>>>> ) >>>>>>> ) >>>>>>> (Var x1::'a') >>>>>>> ) >>>>>>> (Var x1::'a') >>>>>>> >>>>>>> When I insert, GHC gives the following error: >>>>>>> >>>>>>> ghc: panic! (the 'impossible' happened) >>>>>>> (GHC version 7.10.1 for x86_64-unknown-linux): >>>>>>> expectJust cpeBody:collect_args >>>>>>> >>>>>>> What am I doing wrong with extracting these super-superclass >>>>>>> dictionaries? I've looked up the code for cpeBody in GHC, but I can't >>>>>>> figure out what it's trying to do, so I'm not sure why it's failing on >>>>>>> my core. >>>>>>> >>>>>>> On Mon, Aug 24, 2015 at 7:10 PM, ?mer Sinan A?acan wrote: >>>>>>>> Mike, here's a piece of code that may be helpful to you: >>>>>>>> >>>>>>>> https://github.com/osa1/sc-plugin/blob/master/src/Supercompilation/Show.hs >>>>>>>> >>>>>>>> Copy this module to your plugin, it doesn't have any dependencies other than >>>>>>>> ghc itself. When your plugin is initialized, update `dynFlags_ref` with your >>>>>>>> DynFlags as first thing to do. Then use Show instance to print AST directly. >>>>>>>> >>>>>>>> Horrible hack, but very useful for learning purposes. In fact, I don't know how >>>>>>>> else we can learn what Core is generated for a given code, and reverse-engineer >>>>>>>> to figure out details. >>>>>>>> >>>>>>>> Hope it helps. >>>>>>>> >>>>>>>> 2015-08-24 21:59 GMT-04:00 ?mer Sinan A?acan : >>>>>>>>>> Lets say I'm running the plugin on a function with signature `Floating a => a >>>>>>>>>> -> a`, then the plugin has access to the `Floating` dictionary for the type. 
>>>>>>>>>> But if I want to add two numbers together, I need the `Num` dictionary. I >>>>>>>>>> know I should have access to `Num` since it's a superclass of `Floating`. >>>>>>>>>> How can I get access to these superclass dictionaries? >>>>>>>>> >>>>>>>>> I don't have a working code for this but this should get you started: >>>>>>>>> >>>>>>>>> let ord_dictionary :: Id = ... >>>>>>>>> ord_class :: Class = ... >>>>>>>>> in >>>>>>>>> mkApps (Var (head (classSCSels ord_class))) [Var ord_dictionary] >>>>>>>>> >>>>>>>>> I don't know how to get Class for Ord. I do `head` here because in the case of >>>>>>>>> Ord we only have one superclass so `classSCSels` should have one Id. Then I >>>>>>>>> apply ord_dictionary to this selector and it should return dictionary for Eq. >>>>>>>>> >>>>>>>>> I assumed you already have ord_dictionary, it should be passed to your function >>>>>>>>> already if you had `(Ord a) => ` in your function. >>>>>>>>> >>>>>>>>> >>>>>>>>> Now I realized you asked for getting Num from Floating. I think you should >>>>>>>>> follow a similar path except you need two applications, first to get Fractional >>>>>>>>> from Floating and second to get Num from Fractional: >>>>>>>>> >>>>>>>>> mkApps (Var (head (classSCSels fractional_class))) >>>>>>>>> [mkApps (Var (head (classSCSels floating_class))) >>>>>>>>> [Var floating_dictionary]] >>>>>>>>> >>>>>>>>> Return value should be a Num dictionary. >>>> _______________________________________________ >>>> ghc-devs mailing list >>>> ghc-devs at haskell.org >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From ben at well-typed.com Wed Sep 23 08:02:56 2015 From: ben at well-typed.com (Ben Gamari) Date: Wed, 23 Sep 2015 10:02:56 +0200 Subject: [RFC] Moving user's guide to ReStructuredText Message-ID: <87pp197ebz.fsf@smart-cactus.org> Hello all! Last year there was a brief thread [1] on this list discussing the choice of markup language used for GHC's users guide. 
At this point DocBook is showing signs of age,

* The format's documentation is ancient and isn't terribly approachable.

* Writing XML by hand is a terrible, terrible experience. For this reason a shocking fraction of the users' guide isn't even valid DocBook (although it is still accepted by the tools)

* The tooling surrounding the format is challenging to bring up on non-Linux platforms

* Getting even a simple image displayed consistently in the PDF and HTML output is an exercise in futility [2]

There are a few alternatives that we could switch to,

* Markdown: While ubiquitous, its syntax isn't nearly expressive enough to accommodate the users guide.

* asciidoc: This was the front-runner in [1]. Unfortunately, when I tried to use it in anger on the users guide things pretty quickly started to come apart. The syntax is sadly not very composable: tasks like nesting a code block inside a list become fragile and quite unreadable due to the need for continuation characters (as delimited blocks like code blocks must begin at column 0).

  Despite this I did manage to get much of the way through an asciidoc-ification of the users guide [3] but only through a great deal of manual fixing. While asciidoc does strive to map one-to-one onto DocBook, my experience is that the converse is not true; a conversion to asciidoc requires that we drop some of the finer distinctions between code-like inline elements. For an example of the continuation character issue, see [4].

* ReStructuredText: this was a close second place in the thread and has a fairly wide user base. The primary implementation, Sphinx, is used by Python, MathJAX, LLVM, Ubuntu, Ceph, Blender, and others. The syntax is fairly similar to Markdown and is at least as expressive as asciidoc.

I have converted the entire users guide to ReStructuredText with a modified Pandoc.
While some tweaking is still certainly necessary, the output from the mostly-mechanical conversion looks quite good,

* HTML (using a modified version of LLVM's theme), http://smart-cactus.org/~ben/ghc-user-manual/html/index.html

* PDF produced by xetex (used to get convenient Unicode support), http://smart-cactus.org/~ben/ghc-user-manual/xetex/GHCUsersGuide.pdf

* ePub (I know nothing about this format) http://smart-cactus.org/~ben/ghc-user-manual/epub/GHCUsersGuide.epub

* Even Github's rendering of the source looks reasonably good, https://github.com/bgamari/ghc/blob/doc-rst/docs/users_guide/ghci.rst

Of course, there are a few annoyances: the doctree construct doesn't quite work how one might expect, requiring one to split up files a bit more than one might like. Like asciidoc, there is no good way to express nested inlines, so we still lose some of the expressiveness of DocBook. Another nice advantage here is that Trac has native support [5] for rendering RST which could come in handy when pasting between documents.

At this point we are leaning towards going with ReStructuredText: the tooling is much better than DocBook's, the format is reasonably easy to grok, and it is expressive enough to accommodate the majority of the users guide unmodified. However, we would love to hear what others think. Are there any formats we have overlooked? Are there any objections to going ahead with this?

If we want to move forward with ReStructuredText I think we will want to move quickly. While the conversion is mostly automated, there is some amount of manual fiddling necessary to get things formatted nicely. There are a few open Differentials that would need to be amended after the change but I would be happy to help authors through this if necessary.
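To give a concrete feel for the difference, here is a hand-written sketch (illustrative only, not taken from the actual converted sources) of how a flag description reads in each format:

```
DocBook today:

    <varlistentry>
      <term><option>-fwarn-tabs</option></term>
      <listitem>
        <para>Warn if there are tabs in the source file.</para>
      </listitem>
    </varlistentry>

The same entry in ReStructuredText (a definition list with an
inline literal as the term):

    ``-fwarn-tabs``
        Warn if there are tabs in the source file.
```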
Cheers, - Ben [1] https://mail.haskell.org/pipermail/ghc-devs/2014-October/006599.html [2] https://ghc.haskell.org/trac/ghc/ticket/10416 [3] https://github.com/bgamari/ghc/blob/asciidoc/docs/users_guide/ [4] https://github.com/bgamari/ghc/blame/asciidoc/docs/users_guide/ghci.asciidoc#L2162 [5] http://trac.edgewall.org/wiki/WikiRestructuredText -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 472 bytes Desc: not available URL: From hvr at gnu.org Wed Sep 23 11:17:22 2015 From: hvr at gnu.org (Herbert Valerio Riedel) Date: Wed, 23 Sep 2015 13:17:22 +0200 Subject: ANN: CfN for new Haskell Prime language committee Message-ID: <87k2rh8jwd.fsf@gnu.org> Dear Haskell Community, In short, it's time to assemble a new Haskell Prime language committee. Please refer to the CfN at https://mail.haskell.org/pipermail/haskell-prime/2015-September/003936.html for more details. Cheers, hvr -- PGP fingerprint: 427C B69A AC9D 00F2 A43C AF1C BA3C BA3F FE22 B574 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From eir at cis.upenn.edu Wed Sep 23 12:03:08 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Wed, 23 Sep 2015 08:03:08 -0400 Subject: [RFC] Moving user's guide to ReStructuredText In-Reply-To: <87pp197ebz.fsf@smart-cactus.org> References: <87pp197ebz.fsf@smart-cactus.org> Message-ID: <824967DC-EABB-4FC9-AA75-7FF9E7B62B50@cis.upenn.edu> I know nothing about these different formats. But I dislike what we have now, and I have confidence that you *do* know something about these formats. So I'm very happy to take your recommendation on this. +1 Thanks! Richard On Sep 23, 2015, at 4:02 AM, Ben Gamari wrote: > Hello all! > > Last year there was a brief thread [1] on this list discussing the > choice of markup language used for GHC's users guide. 
At this point > DocBook is showing signs of age, > > * The format's documentation is ancient and isn't terribly approachable. > > * Writing XML by hand is a terrible, terrible experience. For this > reason a shocking fraction of the users' guide isn't even valid > DocBook (although is still accepted by the tools) > > * The tooling surrounding the format is challenging to bring up on > non-Linux platforms > > * Getting even a simple image displayed consistently in the PDF and > HTML output is an exercise in futility [2] > > There are a few alternatives that we could switch to, > > * Markdown: While ubiquitous, its syntax isn't nearly expressive > enough to accomodate the users guide. > > * asciidoc: This was the front-runner in [1]. Unfortunately, when I > tried to use it in anger on the users guide things pretty quickly > fell apart start to come apart. The syntax is sadly not very > composable: tasks like nesting a code block inside a list becomes > fragile and quite unreadable due to the need for continuation > characters (as delimited blocks like code blocks must begin at > column 0). > > Despite this I did manage to get much of the way through an > asciidoc-ification of the users guide [2] but only through a great > deal of manual fixing. While asciidoc does strive to map one-to-one > onto DocBook, my experience is that the converse is not true; a > conversion to asciidoc require that we drop some of the finer > distinctions between code-like inline elements. For an example of > the continuation character issue, see [3]. > > * ReStructuredText: this was a close second-place in the thread and > has a fairly wide user base. The primary implementation, Sphinx, is > used by Python, MathJAX, LLVM, Ubuntu, Ceph, Blender, and others. > The syntax is fairly similar to Markdown and is at least as > expressive as asciidoc. > > I have converted the entire users guide to ReStructuredText with a > modified Pandoc. 
While some tweaking is still certainly necessary > the output from the most-mechanical conversion looks quite good, > > * HTML (using a modified version of LLVM's theme), > > http://smart-cactus.org/~ben/ghc-user-manual/html/index.html > > * PDF produced by xetex (used to get convenient Unicode support), > > http://smart-cactus.org/~ben/ghc-user-manual/xetex/GHCUsersGuide.pdf > > * ePub (I know nothing about this format) > > http://smart-cactus.org/~ben/ghc-user-manual/epub/GHCUsersGuide.epub > > * Even Github's rendering of the source looks reasonable good, > > https://github.com/bgamari/ghc/blob/doc-rst/docs/users_guide/ghci.rst > > Of course, there are a few annoyances: the doctree construct doesn't > quite work how one might expect, requiring one to split up files a > bit more than one might like. Like asciidoc, there is no good way to > express nested inlines, so we still lose some of the expressiveness > of DocBook. > > Another nice advantage here is that Trac has native support [5] for > rendering RST which could come in handy when pasting between > documents. > > At this point we are leaning towards going with ReStructuredText: the > tooling is much better the DocBook, the format reasonably easy to grok > and expressive enough to accomodate the majority of the users guide > unmodified. > > However, we would love to hear what others think. Are there any formats > we have overlooked? Are there any objections to going ahead with this? > > If we want to move forward with ReStructuredText I think we will want to > move quickly. While the conversion is mostly automated, there is some > amount of manual fiddling necessary to get things formatted nicely. > There are a few open Differentials that would need to be amended after > the change but I would be happy to help authors through this if > necessary. 
> > Cheers, > > - Ben > > > [1] https://mail.haskell.org/pipermail/ghc-devs/2014-October/006599.html > [2] https://ghc.haskell.org/trac/ghc/ticket/10416 > [3] https://github.com/bgamari/ghc/blob/asciidoc/docs/users_guide/ > [4] https://github.com/bgamari/ghc/blame/asciidoc/docs/users_guide/ghci.asciidoc#L2162 > [5] http://trac.edgewall.org/wiki/WikiRestructuredText > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From ben at well-typed.com Wed Sep 23 12:09:42 2015 From: ben at well-typed.com (Ben Gamari) Date: Wed, 23 Sep 2015 14:09:42 +0200 Subject: [RFC] Moving user's guide to ReStructuredText In-Reply-To: <87pp197ebz.fsf@smart-cactus.org> References: <87pp197ebz.fsf@smart-cactus.org> Message-ID: <87mvwd72wp.fsf@smart-cactus.org> Ben Gamari writes: > Hello all! > I have collected the details of this proposal into a Wiki page[1]. Feel free to leave thoughts there as well as this thread. Cheers, - Ben [1] https://ghc.haskell.org/trac/ghc/wiki/UsersGuide/MoveFromDocBook -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 472 bytes Desc: not available URL: From harold_carr at hotmail.com Wed Sep 23 14:32:18 2015 From: harold_carr at hotmail.com (harold_carr at hotmail.com) Date: Wed, 23 Sep 2015 07:32:18 -0700 Subject: [RFC] Moving user's guide to ReStructuredText Message-ID: You might consider the Racket documentation ecosystem: http://pkg-build.racket-lang.org/doc/scribble/index.html or http://pkg-build.racket-lang.org/doc/pollen/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cma at bitemyapp.com Wed Sep 23 16:57:38 2015 From: cma at bitemyapp.com (Christopher Allen) Date: Wed, 23 Sep 2015 11:57:38 -0500 Subject: [RFC] Moving user's guide to ReStructuredText In-Reply-To: References: Message-ID: The GHC docs in general could use some love. The navigation isn't that nice. On Wed, Sep 23, 2015 at 9:32 AM, wrote: > You might consider the Racket documentation ecosystem: > > http://pkg-build.racket-lang.org/doc/scribble/index.html > > or > > http://pkg-build.racket-lang.org/doc/pollen/ > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -- Chris Allen Currently working on http://haskellbook.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From dluposchainsky at googlemail.com Wed Sep 23 17:04:56 2015 From: dluposchainsky at googlemail.com (David Luposchainsky) Date: Wed, 23 Sep 2015 19:04:56 +0200 Subject: [RFC] Moving user's guide to ReStructuredText In-Reply-To: References: Message-ID: <5602DBB8.5040903@gmail.com> I really like the idea of having a readable raw format that renders nice. Editing is awful, yes, but we don't have to do it that often. But what matters much more to me, and in fact most GHC users, is searchability. I don't want to enter the user's guide via Google, and grepping through HTML is not very fruitful. A markdown-ish format gives us a lot of our usual tools to work with the manual. Huge +1 from me. David From iavor.diatchki at gmail.com Wed Sep 23 17:32:15 2015 From: iavor.diatchki at gmail.com (Iavor Diatchki) Date: Wed, 23 Sep 2015 10:32:15 -0700 Subject: [RFC] Moving user's guide to ReStructuredText In-Reply-To: <87mvwd72wp.fsf@smart-cactus.org> References: <87pp197ebz.fsf@smart-cactus.org> <87mvwd72wp.fsf@smart-cactus.org> Message-ID: Hello, thanks for doing this Ben! +1 for me too. Like others, I don't like editing XML by hand. 
As to which other format to use, I trust your judgement---I've used a bunch of them, and in my mind they are all fairly similar, and I always have to look up how exactly things work anyway, so whatever seems reasonable. -Iavor On Wed, Sep 23, 2015 at 5:09 AM, Ben Gamari wrote: > Ben Gamari writes: > > > Hello all! > > > I have collected the details of this proposal into a Wiki page[1]. Feel > free to leave thoughts there as well as this thread. > > Cheers, > > - Ben > > > [1] https://ghc.haskell.org/trac/ghc/wiki/UsersGuide/MoveFromDocBook > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From drquigl at tycho.nsa.gov Thu Sep 24 14:22:58 2015 From: drquigl at tycho.nsa.gov (drquigl) Date: Thu, 24 Sep 2015 10:22:58 -0400 Subject: Attempting to Create a semi-official Haskell layer for Open Embedded/Yocto Message-ID: <56040742.60508@tycho.nsa.gov> Hello Everyone, I had asked about cross compilers about 5 months back and other priorities have put that work on hold. However I am once again tasked with trying to get Haskell to build properly within an embedded environment we use. I found the additional pages on the Haskell build system and its architecture and everything it uses and read through all of them. I was wondering if anyone would be interested in helping me work through the problems I'm having with the build of Haskell in Open Embedded. I want to use the most up to date release of Haskell and if we need to push any changes to the build system to facilitate getting this working for everyone I'm happy to do so. I can set up a repository on github with all the appropriate sub modules to make it easy for someone to check out and start trying to build GHC. 
If anyone is interested in tackling this problem with me I'm happy to arrange a phone call or some other form of meeting to go over the specifics of OE and fill that person or persons in on what I hope to accomplish. If you are interested please let me know and we can try to work on getting better build support for Embedded toolchains. Dave From simonpj at microsoft.com Thu Sep 24 16:17:23 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Thu, 24 Sep 2015 16:17:23 +0000 Subject: [GHC] #10487: DeriveGeneric breaks when the same data name is used in different modules In-Reply-To: <066.aff5943af5188901ea5a65928e89ef02@haskell.org> References: <051.f823b016355f38c348182e98da9d4ae1@haskell.org> <066.aff5943af5188901ea5a65928e89ef02@haskell.org> Message-ID: <9fb5a64c837d42c4818024f7d35a08f2@DB4PR30MB030.064d.mgd.msft.net> Can someone fill in the regression-test test-case on the ticket? I assume there is one?? Simon | -----Original Message----- | From: ghc-tickets [mailto:ghc-tickets-bounces at haskell.org] On Behalf Of | GHC | Sent: 24 September 2015 08:51 | Cc: ghc-tickets at haskell.org | Subject: Re: [GHC] #10487: DeriveGeneric breaks when the same data name is | used in different modules | | #10487: DeriveGeneric breaks when the same data name is used in different | modules | -------------------------------------+------------------------------------ | - | Reporter: andreas.abel | Owner: osa1 | Type: bug | Status: closed | Priority: highest | Milestone: 8.0.1 | Component: Compiler | Version: 7.10.1 | Resolution: fixed | Keywords: | Operating System: Unknown/Multiple | Architecture: | | Unknown/Multiple | Type of failure: None/Unknown | Test Case: | Blocked By: | Blocking: | Related Tickets: | Differential Revisions: | Phab:D1081 | -------------------------------------+------------------------------------ | - | Changes (by ezyang): | | * status: new => closed | * resolution: => fixed | | | Comment: | | Pushed. I assume we aren't backporting to 7.10? 
| | -- | Ticket URL: | | GHC | | The Glasgow Haskell Compiler | _______________________________________________ | ghc-tickets mailing list | ghc-tickets at haskell.org | https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fmail.haske | ll.org%2fcgi-bin%2fmailman%2flistinfo%2fghc- | tickets&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7c82714a5e33ae4f38 | b09908d2c4b539d7%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=Y08eOOpKx37o | TYsnGW2m1pvQGW31Ssq%2fwwBPAwt3nUo%3d From simonpj at microsoft.com Thu Sep 24 16:17:23 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Thu, 24 Sep 2015 16:17:23 +0000 Subject: [GHC] #10904: C finalizer may be called on re-used memory In-Reply-To: <061.092d9ef95af1d0eef55cdad3f3059208@haskell.org> References: <046.82d7a624ac5d38e3fd81c14129ab8c8d@haskell.org> <061.092d9ef95af1d0eef55cdad3f3059208@haskell.org> Message-ID: <90bdf4f39b094ddaaa0e31cef2850ce2@DB4PR30MB030.064d.mgd.msft.net> Is there a test case for this? "Test Case" is not filled in Simon | -----Original Message----- | From: ghc-tickets [mailto:ghc-tickets-bounces at haskell.org] On Behalf Of | GHC | Sent: 24 September 2015 10:00 | Cc: ghc-tickets at haskell.org | Subject: Re: [GHC] #10904: C finalizer may be called on re-used memory | | #10904: C finalizer may be called on re-used memory | -------------------------------------+------------------------------------ | - | Reporter: bherzog | Owner: | Type: bug | Status: closed | Priority: normal | Milestone: 7.10.3 | Component: Runtime System | Version: 7.4.1 | Resolution: fixed | Keywords: | Operating System: Linux | Architecture: | | Unknown/Multiple | Type of failure: Runtime crash | Test Case: | Blocked By: | Blocking: | Related Tickets: | Differential Revisions: | Phab:D1275 | -------------------------------------+------------------------------------ | - | Changes (by bgamari): | | * status: merge => closed | * resolution: => fixed | | | -- | Ticket URL: | | GHC | | The Glasgow Haskell Compiler | 
_______________________________________________ | ghc-tickets mailing list | ghc-tickets at haskell.org | https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fmail.haske | ll.org%2fcgi-bin%2fmailman%2flistinfo%2fghc- | tickets&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7c1ba2f315ae054316 | 9de308d2c4bee103%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=7pPc9z4f3OrC | 3ZFCoItJBxNlDXH5w41tXqTTCGw06kQ%3d From omeragacan at gmail.com Thu Sep 24 16:47:18 2015 From: omeragacan at gmail.com (Ömer Sinan Ağacan) Date: Thu, 24 Sep 2015 12:47:18 -0400 Subject: [GHC] #10487: DeriveGeneric breaks when the same data name is used in different modules In-Reply-To: <9fb5a64c837d42c4818024f7d35a08f2@DB4PR30MB030.064d.mgd.msft.net> References: <051.f823b016355f38c348182e98da9d4ae1@haskell.org> <066.aff5943af5188901ea5a65928e89ef02@haskell.org> <9fb5a64c837d42c4818024f7d35a08f2@DB4PR30MB030.064d.mgd.msft.net> Message-ID: Done. It'd be best if we could add a test case that uses multiple packages, but as far as I can see the current test runner doesn't support this setup. 2015-09-24 12:17 GMT-04:00 Simon Peyton Jones : > Can someone fill in the regression-test test-case on the ticket? I assume there is one?? 
> > Simon > > | -----Original Message----- > | From: ghc-tickets [mailto:ghc-tickets-bounces at haskell.org] On Behalf Of > | GHC > | Sent: 24 September 2015 08:51 > | Cc: ghc-tickets at haskell.org > | Subject: Re: [GHC] #10487: DeriveGeneric breaks when the same data name is > | used in different modules > | > | #10487: DeriveGeneric breaks when the same data name is used in different > | modules > | -------------------------------------+------------------------------------ > | - > | Reporter: andreas.abel | Owner: osa1 > | Type: bug | Status: closed > | Priority: highest | Milestone: 8.0.1 > | Component: Compiler | Version: 7.10.1 > | Resolution: fixed | Keywords: > | Operating System: Unknown/Multiple | Architecture: > | | Unknown/Multiple > | Type of failure: None/Unknown | Test Case: > | Blocked By: | Blocking: > | Related Tickets: | Differential Revisions: > | Phab:D1081 > | -------------------------------------+------------------------------------ > | - > | Changes (by ezyang): > | > | * status: new => closed > | * resolution: => fixed > | > | > | Comment: > | > | Pushed. I assume we aren't backporting to 7.10? 
> | > | -- > | Ticket URL: > | | ll.org%2ftrac%2fghc%2fticket%2f10487%23comment%3a19&data=01%7c01%7csimonpj > | %40064d.mgd.microsoft.com%7c82714a5e33ae4f38b09908d2c4b539d7%7c72f988bf86f > | 141af91ab2d7cd011db47%7c1&sdata=CtjXUTaEpVHihCDcvgifJ%2fTSw30niJxDDFsFqE3m > | ykY%3d> > | GHC > | | ll.org%2fghc%2f&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7c82714a5e > | 33ae4f38b09908d2c4b539d7%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=76x0 > | GO1GY8YfHiI7vNNS7U%2b9XTkUVU72nnq76N4V87o%3d> > | The Glasgow Haskell Compiler > | _______________________________________________ > | ghc-tickets mailing list > | ghc-tickets at haskell.org > | https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fmail.haske > | ll.org%2fcgi-bin%2fmailman%2flistinfo%2fghc- > | tickets&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7c82714a5e33ae4f38 > | b09908d2c4b539d7%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=Y08eOOpKx37o > | TYsnGW2m1pvQGW31Ssq%2fwwBPAwt3nUo%3d > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From dxld at darkboxed.org Fri Sep 25 05:40:18 2015 From: dxld at darkboxed.org (Daniel =?iso-8859-1?Q?Gr=F6ber?=) Date: Fri, 25 Sep 2015 07:40:18 +0200 Subject: How to get types including constraints out of TypecheckedModule In-Reply-To: References: <20150918045945.GA13469@grml> Message-ID: <20150925054017.GA7907@grml> (whoops, didn't send this to the list) On Fri, Sep 18, 2015 at 09:22:09AM +0000, Simon Peyton Jones wrote: > I have not looked in detail, but I'm confident that all the info you > want is there. For let/where/top-level bindings, the polymorphic > binders you want are in the 'AbsBinds' construct. > > | AbsBinds { -- Binds abstraction; TRANSLATION > abs_tvs :: [TyVar], > abs_ev_vars :: [EvVar], -- ^ Includes equality constraints > > -- | AbsBinds only gets used when idL = idR after renaming, > -- but these need to be idL's for the collect... 
code in HsUtil > -- to have the right type > abs_exports :: [ABExport idL], > > -- | Evidence bindings > -- Why a list? See TcInstDcls > -- Note [Typechecking plan for instance declarations] > abs_ev_binds :: [TcEvBinds], > > -- | Typechecked user bindings > abs_binds :: LHsBinds idL > } Awesome, that's exactly what I was looking for. I noticed the constructor before, but it looked like that's just for type declarations and not for inferred types coming from the typechecker. > The info you want is in the abs_exports field: > > data ABExport id > = ABE { abe_poly :: id -- ^ Any INLINE pragmas is attached to this Id > , abe_mono :: id > , abe_wrap :: HsWrapper -- ^ See Note [AbsBinds wrappers] > -- Shape: (forall abs_tvs. abs_ev_vars => abe_mono) ~ abe_poly > , abe_prags :: TcSpecPrags -- ^ SPECIALISE pragmas > > This pairs the "polymorphic" and "monomorphic" versions of the bound > Ids. You'll find the monomorphic one in the bindings in the > abs_binds field; and you'll find the very same binder in the abe_mono > field of one of the ABExport records. Then the corresponding abe_poly > Id is the polymorphic one, the one with type > foo :: forall a. Num a => a What if I have a binder: (foo, bar) = (show, show) I can now get the polymorphic types of foo and bar respectively, but I'm not sure how I'm meant to zip up the monomorphic type of `pat_rhs_ty`, i.e. the type of the whole rhs, with the polymorphic types in `abs_exports`. I mean, I could just match up the type variables and pull the constraints in by hand, but it just seems like there already ought to be a way to do this somewhere? Alternatively, my strategy would be to build a map from TyVar to constraint and then go over the monomorphic types: for each TyVar that's mentioned in a monomorphic type, pull in the relevant constraints and apply them to the monomorphic type. Does that sound sensible at all? 
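The strategy sketched above can be modelled with hypothetical, simplified stand-ins for GHC's types (toy `TyVar`/`Constraint` representations, not the real GHC API):

```haskell
-- Toy model of the strategy described above (all names hypothetical):
-- represent a constraint as a (class name, type variable) pair, then
-- re-attach to a monomorphic type exactly those constraints whose
-- variable occurs among its free TyVars.
type TyVar      = String
type Constraint = (String, TyVar)   -- e.g. ("Show", "a")

relevantConstraints :: [Constraint] -> [TyVar] -> [Constraint]
relevantConstraints cs freeVars =
  [ c | c@(_, tv) <- cs, tv `elem` freeVars ]

main :: IO ()
main =
  -- With abs_ev_vars [Show a, Show b], foo's monomorphic type mentions
  -- only 'a', so only the Show constraint on 'a' is pulled back in.
  print (relevantConstraints [("Show", "a"), ("Show", "b")] ["a"])
```

This glosses over equality constraints and evidence, which is exactly the part the question is unsure about.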
I have no idea how equality constraints and all that fancy stuff are encoded, so I hope that doesn't need any special handling ;) > I hope this is of some help. If so, would you like to update the > Commentary (on the GHC wiki) to add the information you wish had been > there in the first place? Thanks! Very much so, thanks :) I'd love to put this in the commentary somewhere, but looking at https://ghc.haskell.org/trac/ghc/wiki/Commentary I can't actually find anywhere this would fit in, and it just seems like an awfully specific thing to put there. Maybe it would be better to improve the comments instead, for the next person that goes looking through the source trying to figure out where the types end up. For example, the comment on AbsBinds could mention that this is where the typechecker stores inferred constraints, or something like that. --Daniel -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: Digital signature URL: From simonpj at microsoft.com Fri Sep 25 07:42:29 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Fri, 25 Sep 2015 07:42:29 +0000 Subject: How to get types including constraints out of TypecheckedModule In-Reply-To: <20150925043133.GA5754@grml> References: <20150918045945.GA13469@grml> <20150925043133.GA5754@grml> Message-ID: <68b03b3244ea43afb51933c4819c3ad1@DB4PR30MB030.064d.mgd.msft.net> | What if I have a binder: | | (foo, bar) = (show, show) | | I can now get the polymorphic types of foo and bar respectively but | how do I'm not sure how I'm meant to zip up the monomorphic type of | `pat_rhs_ty`, i.e. the type of the whole rhs, with the polymorphic | types in `abs_exports`. | | I mean I could just match up the type variables and pull the | constraints in by hand but it just seems like there already ought to | be a way to do this somewhere? I'm sorry I don't understand the question. 
For the above pattern binding you'll get something like AbsBinds { abe_tvs = [a,b] abe_ev_vars = [ d1 :: Show a, d2 :: Show b ] abe_exports = [ ABE { abe_poly = foo :: forall a. Show a => a->String , abe_mono = foo_m :: a->String , abe_wrap = ... } ABE { abe_poly :: bar :: forall b. Show b => b->String abe_mono = bar_m :: b->String ... } ] abe_binds = { (foo_m, bar_m) = (show a d1, show b d2) } } Simon | | Alternatively my strategy to do this would be to build a map from | TyVar to the constraint and then going over the monomorphic types and | for each TyVar that's mentioned in the monomorphic type pull in the | relevant constraint and apply those to the monomorphic type. Does that | sound sensible at all? I have no idea how equality constraints and all | that fancy stuff are encoded so I hope that doesn't need any special | handling ;) | | > I hope this is of some help. If so, would you like to update the | > Commentary (on the GHC wiki) to add the information you wish had | been | > there in the first place? Thanks! | | Very much so, thanks :) | | I'd love to put this in the commentary somewhere but looking at | https://ghc.haskell.org/trac/ghc/wiki/Commentary I can't actually find | anywhere this would fit in and it just seems like an awefully specific | thing to put there. Maybe it would be better to improve the comments | instead for the next person that goes looking thoguth the source | trying to figure out where the types end up. | | For example the comment on AbsBinds could mention that this is where | the typechecker stores inferred constrainst or something like that. | | --Daniel From david.feuer at gmail.com Fri Sep 25 16:06:22 2015 From: david.feuer at gmail.com (David Feuer) Date: Fri, 25 Sep 2015 12:06:22 -0400 Subject: MIN_VERSION macros Message-ID: Cabal defines MIN_VERSION_* macros that allow CPP in a Haskell source file to get information about the versions of the packages that module is being compiled against. 
Unfortunately, these macros are not available when not compiling with cabal, so packages must either 1. Insist on cabal compilation. This is not very friendly to developers. 2. Make "pessimistic" assumptions, assuming that all the packages are old. This makes it annoying to test new features while also leading to compilation or run-time failures when packages have removed it changed features. 3. Attempt to guess the version based on the GHC version. This works reasonably well for base, ghc-prim, containers, etc., but not so well/at all for others. Would there be some way to get GHC itself to provide these macros to all modules that request CPP? David Feuer -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at seidel.io Fri Sep 25 16:18:31 2015 From: eric at seidel.io (Eric Seidel) Date: Fri, 25 Sep 2015 09:18:31 -0700 Subject: MIN_VERSION macros In-Reply-To: References: Message-ID: <1443197911.1703862.393595953.081AB0FF@webmail.messagingengine.com> I've been meaning to ask about this as well. It also forces tools like ghc-mod and hdevtools to be cabal-aware, which is an unnecessary source of complexity IMO. GHC certainly has enough information to generate these macros, as it knows which packages (and versions) it's compiling against. I think it's just a matter of adding the logic. I would love to see the MIN_VERSION macros moved to GHC. Eric On Fri, Sep 25, 2015, at 09:06, David Feuer wrote: > Cabal defines MIN_VERSION_* macros that allow CPP in a Haskell source > file > to get information about the versions of the packages that module is > being > compiled against. Unfortunately, these macros are not available when not > compiling with cabal, so packages must either > > 1. Insist on cabal compilation. This is not very friendly to developers. > 2. Make "pessimistic" assumptions, assuming that all the packages are > old. 
> This makes it annoying to test new features while also leading to > compilation or run-time failures when packages have removed it changed > features. > 3. Attempt to guess the version based on the GHC version. This works > reasonably well for base, ghc-prim, containers, etc., but not so well/at > all for others. > > Would there be some way to get GHC itself to provide these macros to all > modules that request CPP? > > David Feuer > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From eir at cis.upenn.edu Fri Sep 25 18:48:52 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Fri, 25 Sep 2015 14:48:52 -0400 Subject: MIN_VERSION macros In-Reply-To: <1443197911.1703862.393595953.081AB0FF@webmail.messagingengine.com> References: <1443197911.1703862.393595953.081AB0FF@webmail.messagingengine.com> Message-ID: I've run into this issue, too. Post a feature request! I can't imagine it's too hard to implement. Richard On Sep 25, 2015, at 12:18 PM, Eric Seidel wrote: > I've been meaning to ask about this as well. It also forces tools like > ghc-mod and hdevtools to be cabal-aware, which is an unnecessary source > of complexity IMO. > > GHC certainly has enough information to generate these macros, as it > knows which packages (and versions) it's compiling against. I think it's > just a matter of adding the logic. > > I would love to see the MIN_VERSION macros moved to GHC. > > Eric > > On Fri, Sep 25, 2015, at 09:06, David Feuer wrote: >> Cabal defines MIN_VERSION_* macros that allow CPP in a Haskell source >> file >> to get information about the versions of the packages that module is >> being >> compiled against. Unfortunately, these macros are not available when not >> compiling with cabal, so packages must either >> >> 1. Insist on cabal compilation. This is not very friendly to developers. >> 2. 
Make "pessimistic" assumptions, assuming that all the packages are >> old. >> This makes it annoying to test new features while also leading to >> compilation or run-time failures when packages have removed it changed >> features. >> 3. Attempt to guess the version based on the GHC version. This works >> reasonably well for base, ghc-prim, containers, etc., but not so well/at >> all for others. >> >> Would there be some way to get GHC itself to provide these macros to all >> modules that request CPP? >> >> David Feuer >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From hvriedel at gmail.com Fri Sep 25 19:06:58 2015 From: hvriedel at gmail.com (Herbert Valerio Riedel) Date: Fri, 25 Sep 2015 21:06:58 +0200 Subject: MIN_VERSION macros In-Reply-To: (Richard Eisenberg's message of "Fri, 25 Sep 2015 14:48:52 -0400") References: <1443197911.1703862.393595953.081AB0FF@webmail.messagingengine.com> Message-ID: <87bncqgvxp.fsf@gmail.com> On 2015-09-25 at 20:48:52 +0200, Richard Eisenberg wrote: > I've run into this issue, too. Post a feature request! I can't imagine > it's too hard to implement. For the current external CPP we'll probably have to create a temporary file with the definitions (just like cabal does) to -include (as afaik you can't easily pass function macros via `-D` to cpp). However, Anthony has mentioned he's working on an embeddable CPP impl over at https://www.reddit.com/r/haskell/comments/3m1wcs/call_for_nominations_haskell_prime_language/cvbp5ym which would allow to keep the currently in-scope MIN_VERSION_...() definitions in-memory w/o a temporary file. 
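For concreteness, here is a minimal sketch of using such a function-like macro, together with the kind of hand-rolled fallback a package might use today when no build tool has defined it (the fallback and the version numbers are illustrative only, not what cabal actually emits):

```haskell
{-# LANGUAGE CPP #-}

-- If no build tool has defined MIN_VERSION_base (the situation discussed
-- in this thread), fall back to optimistically assuming a recent base.
-- A tool defining the real macro would then need to cope with this guard.
#ifndef MIN_VERSION_base
#define MIN_VERSION_base(x,y,z) 1
#endif

main :: IO ()
main =
#if MIN_VERSION_base(4,8,0)
  putStrLn "base >= 4.8"
#else
  putStrLn "base < 4.8"
#endif
```

Compiled outside cabal, the fallback fires and this prints "base >= 4.8"; under cabal, the `#ifndef` skips the fallback and the real macro (derived from the actual base version) decides the branch.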
Finally, Cabal would have to be adapted to wrap the definitions with `#ifndef`s (or omit those altogether when calling a recent enough GHC) From rwbarton at gmail.com Fri Sep 25 19:36:48 2015 From: rwbarton at gmail.com (Reid Barton) Date: Fri, 25 Sep 2015 15:36:48 -0400 Subject: MIN_VERSION macros In-Reply-To: <1443197911.1703862.393595953.081AB0FF@webmail.messagingengine.com> References: <1443197911.1703862.393595953.081AB0FF@webmail.messagingengine.com> Message-ID: On Fri, Sep 25, 2015 at 12:18 PM, Eric Seidel wrote: > I've been meaning to ask about this as well. It also forces tools like > ghc-mod and hdevtools to be cabal-aware, which is an unnecessary source > of complexity IMO. > This would certainly be nice, but... GHC certainly has enough information to generate these macros, as it > knows which packages (and versions) it's compiling against. > It knows at some point, but it doesn't necessarily know before parsing the module, at which point it is too late. I can have two versions of a package A, and two other packages B and C that depend on different versions of A, and depending on whether a module M uses package B or package C, M will see different versions of package A automatically. This is all slightly magical, and I have to say I don't entirely understand how GHC decides which versions to expose in general, but that's how GHC works today and it's quite convenient. GHC could provide MIN_VERSION_* macros for packages that have had their versions specified with -package or similar flags (which is how Cabal invokes GHC). That would go only a small way towards the original goals though. (Also, I wonder how MIN_VERSION_* fits into a Backpack world...) Regards, Reid Barton -------------- next part -------------- An HTML attachment was scrubbed... URL: From ezyang at mit.edu Fri Sep 25 20:09:58 2015 From: ezyang at mit.edu (Edward Z. 
Yang) Date: Fri, 25 Sep 2015 13:09:58 -0700 Subject: MIN_VERSION macros In-Reply-To: References: <1443197911.1703862.393595953.081AB0FF@webmail.messagingengine.com> Message-ID: <1443211530-sup-302@sabre> Excerpts from Reid Barton's message of 2015-09-25 12:36:48 -0700: > It knows at some point, but it doesn't necessarily know before parsing the > module, at which point it is too late. I can have two versions of a package > A, and two other packages B and C that depend on different versions of A, > and depending on whether a module M uses package B or package C, M will see > different versions of package A automatically. This is all slightly > magical, and I have to say I don't entirely understand how GHC decides > which versions to expose in general, but that's how GHC works today and > it's quite convenient. Well, this is half true. The main "problem" is that GHC is actually a lot more flexible than Cabal's model allows: Cabal enforces that for any package, there is only one version of it in a program. But GHC can link any combination of packages (in GHC 7.8, it could link one instance of a package per package name and version; in GHC 7.10, it can link arbitrary instances together as long as they have distinct version names.) But I don't think this is a problem... > GHC could provide MIN_VERSION_* macros for packages that have had their > versions specified with -package or similar flags (which is how Cabal > invokes GHC). That would go only a small way towards the original goals > though. This is exactly what the MIN_VERSION_* macros should do, and you can generalize it to work even without -package: you get macros for EXPOSED packages which are available for import. This says *nothing* about the transitive dependencies of the packages you're depending on, but it's more reasonable to have "one package, one version" invariant, because having multiple versions of the package exposed would cause a module name to be ambiguous (and unusable.) > (Also, I wonder how MIN_VERSION_* fits into a Backpack world...) We have to support version bounds for BC, so... as well as BC can be :) Edward From rwbarton at gmail.com Fri Sep 25 22:15:09 2015 From: rwbarton at gmail.com (Reid Barton) Date: Fri, 25 Sep 2015 18:15:09 -0400 Subject: MIN_VERSION macros In-Reply-To: <1443211530-sup-302@sabre> References: <1443197911.1703862.393595953.081AB0FF@webmail.messagingengine.com> <1443211530-sup-302@sabre> Message-ID: On Fri, Sep 25, 2015 at 4:09 PM, Edward Z. Yang wrote: > Excerpts from Reid Barton's message of 2015-09-25 12:36:48 -0700: > > GHC could provide MIN_VERSION_* macros for packages that have had their > > versions specified with -package or similar flags (which is how Cabal > > invokes GHC). That would go only a small way towards the original goals > > though. > > This is exactly what the MIN_VERSION_* macros should do, and you can > generalize it to work even without -package: you get macros for EXPOSED > packages which are available for import. This says *nothing* about > the transitive dependencies of the packages you're depending on, but > it's more reasonable to have "one package, one version" invariant, > because having multiple versions of the package exposed would cause > a module name to be ambiguous (and unusable.) Oh, I see. I had always assumed that GHC had some kind of solver to try to pick compatible versions of packages, but having done some experiments, I see that it always picks the newest exposed version of each direct dependency. So we can indeed define MIN_VERSION_* macros in accordance with the newest exposed version of each package. There are still some edge cases, notably: if package foo reexports the contents of some modules from package bar, and the API of these modules changes between two versions of package bar, then you cannot reliably use MIN_VERSION_bar to detect these API changes in a module that imports the reexports from package foo (since the newest installed foo might not be built against the newest installed bar). In the more restrictive Cabal model, you can reliably do this of course. So it could break in an existing project. However this kind of situation (where the API of a package depends on the version of its dependencies) should hopefully be fairly rare in practice. Regards, Reid Barton -------------- next part -------------- An HTML attachment was scrubbed... URL: From hvr at gnu.org Mon Sep 28 07:07:15 2015 From: hvr at gnu.org (Herbert Valerio Riedel) Date: Mon, 28 Sep 2015 09:07:15 +0200 Subject: Status of deprecation/warning features (was: Monad of no `return` Proposal (MRP): Moving `return` out of `Monad`) In-Reply-To: (David Feuer's message of "Sun, 27 Sep 2015 22:56:13 -0400") References: <87io6zmr2x.fsf@gnu.org> <56086C23.2060101@gmail.com> <8F1DB6DE-DFC3-4EAD-AD2C-00A95A50A028@cis.upenn.edu> Message-ID: <87si5zvx7g.fsf_-_@gnu.org> On 2015-09-28 at 04:56:13 +0200, David Feuer wrote: [...] > That's an excellent idea, and I think it makes sense to offer it at the > module level as well as the class level. Just change DEPRECATED to REMOVED > when it's actually removed. Speaking of such, has the deprecated export > proposal made any headway? As far as the in-flight library proposals are concerned, I plan instead to have hardwired, specific warnings implemented in the style of the AMP warnings, as it's crucial to have them available for the GHC 8.0 release so that the foundations for the migration plans are in place. We can generalize those features later on (maybe even in time for GHC 8.0 if we're lucky). ---- As for the more general facilities, the short answer is: they're stalled! Volunteers wanted! There's the specification over at - https://ghc.haskell.org/trac/ghc/wiki/Design/MethodDeprecations and there are 3 somewhat connected/related warning-feature tickets which ought to be co-designed to avoid obstructing each other (if there's any risk there). - https://ghc.haskell.org/trac/ghc/ticket/10071 - https://ghc.haskell.org/trac/ghc/ticket/2119 - https://ghc.haskell.org/trac/ghc/ticket/4879 For #10071 there's a very modest start at implementing this, which just modifies the parser but doesn't go much farther (see the wip/T10071 branch). There's an old patch for #4879 which still works over at - https://phabricator.haskell.org/D638 Cheers, hvr From chrisaupshaw at gmail.com Mon Sep 28 20:54:59 2015 From: chrisaupshaw at gmail.com (Chris Upshaw) Date: Mon, 28 Sep 2015 20:54:59 +0000 Subject: [GHC] #10913: deprecate and then remove -fwarn-hi-shadowing Message-ID: It seems like we lost the code that -fwarn-hi-shadowing switched on during the addition of hierarchical modules in 5.0. And while the hierarchical system makes the most common way for .hi files to get shadowed impossible, you can still get .hi file shadowing if you use the -i flag on multiple folders. Do we care? It is a potential source of very hard-to-find bugs, but it is also kind of a pathological case with how the modern module system works. -------------- next part -------------- An HTML attachment was scrubbed... URL:
> (Also, I wonder how MIN_VERSION_* fits into a Backpack world...) We have to support version bounds for BC, so... as well as BC can be :) Edward From rwbarton at gmail.com Fri Sep 25 22:15:09 2015 From: rwbarton at gmail.com (Reid Barton) Date: Fri, 25 Sep 2015 18:15:09 -0400 Subject: MIN_VERSION macros In-Reply-To: <1443211530-sup-302@sabre> References: <1443197911.1703862.393595953.081AB0FF@webmail.messagingengine.com> <1443211530-sup-302@sabre> Message-ID: On Fri, Sep 25, 2015 at 4:09 PM, Edward Z. Yang wrote: > Excerpts from Reid Barton's message of 2015-09-25 12:36:48 -0700: > > GHC could provide MIN_VERSION_* macros for packages that have had their > > versions specified with -package or similar flags (which is how Cabal > > invokes GHC). That would go only a small way towards the original goals > > though. > > This is exactly what the MIN_VERSION_* macros should do, and you can > generalize it to work even without -package: you get macros for EXPOSED > packages which are available for import. This says *nothing* about > the transitive dependencies of the packages you're depending on, but > it's more reasonable to have "one package, one version" invariant, > because having multiple versions of the package exposed would cause > a module name to be ambiguous (and unusable.) Oh, I see. I had always assumed that GHC had some kind of solver to try to pick compatible versions of packages, but having done some experiments, I see that it always picks the newest exposed version of each direct dependency. So we can indeed define MIN_VERSION_* macros in accordance with the newest exposed version of each package. 
There are still some edge cases, notably: if package foo reexports the contents of some modules from package bar, and the API of these modules changes between two versions of package bar, then you cannot reliably use MIN_VERSION_bar to detect these API changes in a module that imports the reexports from package foo (since the newest installed foo might not be built against the newest installed bar). In the more restrictive Cabal model, you can reliably do this of course. So it could break in an existing project. However this kind of situation (where the API of a package depends on the version of its dependencies) should hopefully be fairly rare in practice. Regards, Reid Barton -------------- next part -------------- An HTML attachment was scrubbed... URL: From hvr at gnu.org Mon Sep 28 07:07:15 2015 From: hvr at gnu.org (Herbert Valerio Riedel) Date: Mon, 28 Sep 2015 09:07:15 +0200 Subject: Status of deprecation/warning features (was: Monad of no `return` Proposal (MRP): Moving `return` out of `Monad`) In-Reply-To: (David Feuer's message of "Sun, 27 Sep 2015 22:56:13 -0400") References: <87io6zmr2x.fsf@gnu.org> <56086C23.2060101@gmail.com> <8F1DB6DE-DFC3-4EAD-AD2C-00A95A50A028@cis.upenn.edu> Message-ID: <87si5zvx7g.fsf_-_@gnu.org> On 2015-09-28 at 04:56:13 +0200, David Feuer wrote: [...] > That's an excellent idea, and I think it makes sense to offer it at the > module level as well as the class level. Just change DEPRECATED to REMOVED > when it's actually removed. Speaking of such, has the deprecated export > proposal made any headway? As far as the in-flight library proposals are concerned, I plan to rather have hardwired specific warnings implemented in the style of the AMP warnings as it's crucial to have them available for the GHC 8.0 release to have the foundations for the migration plans in place. We can generalize those feature later-on (maybe even in time for GHC 8.0 if we're lucky). 
---- As for the more general facilities the short answer is: They're stalled! Volunteers wanted! There's the specification over at - https://ghc.haskell.org/trac/ghc/wiki/Design/MethodDeprecations and there are 3 somewhat connected/related warning-feature tickets which ought to be co-designed to avoid obstructing each other (if there's any risk there). - https://ghc.haskell.org/trac/ghc/ticket/10071 - https://ghc.haskell.org/trac/ghc/ticket/2119 - https://ghc.haskell.org/trac/ghc/ticket/4879 For #10071 there's a very modest start at implementing this, which just modifies the parser but doesn't go much farther (see wip/T10071 branch) There's an old patch for #4879 which still works over at - https://phabricator.haskell.org/D638 Cheers, hvr From chrisaupshaw at gmail.com Mon Sep 28 20:54:59 2015 From: chrisaupshaw at gmail.com (Chris Upshaw) Date: Mon, 28 Sep 2015 20:54:59 +0000 Subject: [GHC] #10913: deprecate and then remove -fwarn-hi-shadowing Message-ID: It seems like we lost the code that -fwarn-hi-shadow switched during the add of hierarchical modules in 5.0. And while the hierarchical system makes the common way .hi files could get shadowed impossible, you can still get .hi file shadowing if you use the -i flag on multiple folders. Do we care? It is a potential source of very hard to find bugs, but it is also kind of a pathological case with how the modern module system works. -------------- next part -------------- An HTML attachment was scrubbed... URL: