From alain.odea at gmail.com Sat Aug 1 15:49:33 2015
From: alain.odea at gmail.com (Alain O'Dea)
Date: Sat, 1 Aug 2015 13:19:33 -0230
Subject: Building GHC master with GHC 7.6
In-Reply-To: <87615da49i.fsf@smart-cactus.org>
References: <87615da49i.fsf@smart-cactus.org>
Message-ID:

On 21 July 2015 at 13:31, Ben Gamari wrote:

>
> Hello everyone,
>
> Earlier today I merged a clean-up [1] to the master branch which
> removed some #ifdefs which ensured that the tree could be built with GHC
> 7.6
>
> Thomas Miedema correctly pointed out that some nightly builders may not
> be on GHC 7.8 yet. Do any of the nightly builders fall in this category?
> If so, would it be possible to upgrade?
>
> Cheers,
>
> - Ben
>
>
> [1] https://phabricator.haskell.org/D904

Hi Ben,

Thomas notified me and I've upgraded the bootstrap GHC on my SmartOS
builders to 7.8.4. The next nightly build runs on SmartOS should run as
expected.

Best,
Alain
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From ben at smart-cactus.org Sun Aug 2 09:41:29 2015
From: ben at smart-cactus.org (Ben Gamari)
Date: Sun, 02 Aug 2015 11:41:29 +0200
Subject: Building GHC master with GHC 7.6
In-Reply-To:
References: <87615da49i.fsf@smart-cactus.org>
Message-ID: <87h9oixc1i.fsf@smart-cactus.org>

Alain O'Dea writes:

> On 21 July 2015 at 13:31, Ben Gamari wrote:
>
>>
>> Hello everyone,
>>
>> Earlier today I merged a clean-up [1] to the master branch which
>> removed some #ifdefs which ensured that the tree could be built with GHC
>> 7.6
>>
>> Thomas Miedema correctly pointed out that some nightly builders may not
>> be on GHC 7.8 yet. Do any of the nightly builders fall in this category?
>> If so, would it be possible to upgrade?
>>
>> Cheers,
>>
>> - Ben
>>
>>
>> [1] https://phabricator.haskell.org/D904
>
>
> Hi Ben,
>
> Thomas notified me and I've upgraded the bootstrap GHC on my SmartOS
> builders to 7.8.4. The next nightly build runs on SmartOS should run as
> expected.
>
Great, thanks!
Cheers, - Ben -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 472 bytes Desc: not available URL: From bergey at teallabs.org Sun Aug 2 16:58:25 2015 From: bergey at teallabs.org (Daniel Bergey) Date: Sun, 02 Aug 2015 16:58:25 +0000 Subject: Typechecker / OverloadedStrings question 7.8 vs. 7.10 In-Reply-To: <02d94e2366ab49aaaaf079fc9fb729a6@DB4PR30MB030.064d.mgd.msft.net> References: <841f2790eb8b4bb99541e241c0009924@DB4PR30MB030.064d.mgd.msft.net> <87y4hxce7t.fsf@chladni.home> <02d94e2366ab49aaaaf079fc9fb729a6@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <87bnepd3v2.fsf@chladni.home> On 2015-07-31 at 08:59, Simon Peyton Jones wrote: > Daniel Bergey wrote: > | How hard would it be to give a different error message instead of "No > | instance ..." when the type variable is ambiguous? I always find this > | error slightly misleading, since it seems to me that there are > | multiple valid instances, not that there is "no instance". > > What would you like it to say? I think it likely we could make it say that! Great! I'd like it to say "Multiple instances for ..." or "No unique instance for ...". I have a slight preference for the former. Daniel From rwbarton at gmail.com Sun Aug 2 23:17:41 2015 From: rwbarton at gmail.com (Reid Barton) Date: Sun, 2 Aug 2015 19:17:41 -0400 Subject: Typechecker / OverloadedStrings question 7.8 vs. 7.10 In-Reply-To: <87bnepd3v2.fsf@chladni.home> References: <841f2790eb8b4bb99541e241c0009924@DB4PR30MB030.064d.mgd.msft.net> <87y4hxce7t.fsf@chladni.home> <02d94e2366ab49aaaaf079fc9fb729a6@DB4PR30MB030.064d.mgd.msft.net> <87bnepd3v2.fsf@chladni.home> Message-ID: On Sun, Aug 2, 2015 at 12:58 PM, Daniel Bergey wrote: > On 2015-07-31 at 08:59, Simon Peyton Jones wrote: > > Daniel Bergey wrote: > > | How hard would it be to give a different error message instead of "No > > | instance ..." when the type variable is ambiguous? 
I always find this > > | error slightly misleading, since it seems to me that there are > > | multiple valid instances, not that there is "no instance". > > > > What would you like it to say? I think it likely we could make it say > that! > > Great! I'd like it to say "Multiple instances for ..." or "No unique > instance for ...". I have a slight preference for the former. > It may be worth noting that the existing error message is actually technically correct, in the sense that what would be needed for the program to compile is exactly an instance of the form "instance Foldable t where ...". Then the compiler would know that the ambiguity in the type variable t0 doesn't matter. It doesn't make any difference whether there are zero, one, or multiple instances of Foldable for more specific types. (Except in that if there is at least one such instance, then there can't also be an "instance Foldable t" assuming that OverlappingInstances is not enabled.) Once you understand this, the error message makes perfect sense. But it is often confusing to beginners. "Multiple instances for (C t)" seems bad because there might not be any instances for C at all. "No unique instance for (C t)" is better most of the time, but it doesn't exactly get to the core of the issue, since there could be just one instance of C, for a specific type, and then it is no better than "No instance for (C t)". If I were to explain the situation, I would say "there is no single instance (C t) that applies for every type t", but it seems a bit wordy for a compiler error... Regards, Reid Barton -------------- next part -------------- An HTML attachment was scrubbed... URL: From rf at rufflewind.com Mon Aug 3 04:43:24 2015 From: rf at rufflewind.com (Phil Ruffwind) Date: Mon, 3 Aug 2015 00:43:24 -0400 Subject: Typechecker / OverloadedStrings question 7.8 vs. 
7.10
In-Reply-To:
References: <841f2790eb8b4bb99541e241c0009924@DB4PR30MB030.064d.mgd.msft.net> <87y4hxce7t.fsf@chladni.home> <02d94e2366ab49aaaaf079fc9fb729a6@DB4PR30MB030.064d.mgd.msft.net> <87bnepd3v2.fsf@chladni.home>
Message-ID:

I think the error message could be made clearer simply by emphasizing the
type ambiguity over the lack of instances.

    Ambiguous type variable 't0' arising from a use of
      elem :: a -> t0 a -> Bool
    caused by the lack of an instance 'Data.String.IsString (t0 Char)'
    Either add a type annotation to dictate what 't0' should be
    based on one of the potential instances:
      instance Foldable (Either a) -- Defined in ‘Data.Foldable’
      instance Foldable Data.Proxy.Proxy -- Defined in ‘Data.Foldable’
      instance GHC.Arr.Ix i => Foldable (GHC.Arr.Array i)
        -- Defined in ‘Data.Foldable’
      (...plus three others)
    or define the required instance 'Data.String.IsString (t0 Char)'.
From eir at cis.upenn.edu Mon Aug 3 13:26:27 2015
From: eir at cis.upenn.edu (Richard Eisenberg)
Date: Mon, 3 Aug 2015 09:26:27 -0400
Subject: can't validate on Mac
Message-ID:

Hi devs,

In an (almost) clean validate on my MacOS 10.8 machine, I see this:

{{{
rts/posix/OSMem.c: In function 'my_mmap':

rts/posix/OSMem.c:109:15: error:
     error: variable 'flags' set but not used [-Werror=unused-but-set-variable]
         int prot, flags;
                   ^

rts/posix/OSMem.c:109:9: error:
     error: variable 'prot' set but not used [-Werror=unused-but-set-variable]
         int prot, flags;
             ^
cc1: all warnings being treated as errors
}}}

Help?

Thanks,
Richard
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From ggreif at gmail.com Mon Aug 3 14:28:51 2015
From: ggreif at gmail.com (Gabor Greif)
Date: Mon, 3 Aug 2015 16:28:51 +0200
Subject: can't validate on Mac
In-Reply-To:
References:
Message-ID:

What about this little patch?
diff --git a/rts/posix/OSMem.c b/rts/posix/OSMem.c index 125ae10..edb240a 100644 --- a/rts/posix/OSMem.c +++ b/rts/posix/OSMem.c @@ -145,6 +145,7 @@ my_mmap (void *addr, W_ size, int operation) kern_return_t err = 0; ret = addr; + (void)prot; if(operation & MEM_RESERVE) { On 8/3/15, Richard Eisenberg wrote: > Hi devs, > > In a (almost) clean validate on my MacOS 10.8 machine, I see this: > > {{{ > rts/posix/OSMem.c: In function 'my_mmap': > > rts/posix/OSMem.c:109:15: error: > error: variable 'flags' set but not used > [-Werror=unused-but-set-variable] > int prot, flags; > ^ > > rts/posix/OSMem.c:109:9: error: > error: variable 'prot' set but not used > [-Werror=unused-but-set-variable] > int prot, flags; > ^ > cc1: all warnings being treated as errors > }}} > > Help? > > Thanks, > Richard From bergey at teallabs.org Mon Aug 3 16:45:34 2015 From: bergey at teallabs.org (Daniel Bergey) Date: Mon, 03 Aug 2015 16:45:34 +0000 Subject: Typechecker / OverloadedStrings question 7.8 vs. 7.10 In-Reply-To: References: <841f2790eb8b4bb99541e241c0009924@DB4PR30MB030.064d.mgd.msft.net> <87y4hxce7t.fsf@chladni.home> <02d94e2366ab49aaaaf079fc9fb729a6@DB4PR30MB030.064d.mgd.msft.net> <87bnepd3v2.fsf@chladni.home> Message-ID: <87614wcocx.fsf@chladni.home> On 2015-08-02 at 23:17, Reid Barton wrote: > It may be worth noting that the existing error message is actually > technically > correct, in the sense that what would be needed for the program to compile > is exactly an instance of the form "instance Foldable t where ...". Then the > compiler would know that the ambiguity in the type variable t0 doesn't > matter. > It doesn't make any difference whether there are zero, one, or multiple > instances > of Foldable for more specific types. (Except in that if there is at least > one > such instance, then there can't also be an "instance Foldable t" assuming > that OverlappingInstances is not enabled.) Once you understand this, the > error > message makes perfect sense. 
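[Editor's sketch] Reid's point quoted above can be illustrated with a toy class. The class `Describe`, its method, and the value `general` are hypothetical names invented for this illustration; they do not come from base or from the thread. Under that assumption, a fully general instance lets the compiler discharge the constraint even though the type variable is never determined:

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

-- Hypothetical class, invented for this illustration only.
class Describe a where
  describe :: Maybe a -> String

-- A fully general instance: it applies to every type `a`.
instance Describe a where
  describe _ = "some value"

-- `Nothing :: Maybe a0` leaves `a0` undetermined, but the constraint
-- `Describe a0` is solved by the general instance, so the ambiguity
-- does not matter and this definition is accepted.
general :: String
general = describe Nothing

main :: IO ()
main = putStrLn general
```

If the general instance were replaced by a single specific one, say `instance Describe Int`, the call `describe Nothing` would be rejected as ambiguous even though exactly one instance exists, which is the situation the thread finds confusing.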
I thought GHC would infer the type when only one instance is in scope, at least in some cases, like IsString. But I could well be wrong about that. > But it is often confusing to beginners. I think it is beginners who are most affected by the wording of error messages. With time, the errors are familiar - I know how I fixed the last dozen similar errors, and can fix the next one the same way. But for beginners, the messages serve as explanations of what is wrong, and they ought to be worded to make sense to beginners. > "Multiple instances for (C t)" seems bad because there might not be any > instances for C at all. My initial question was whether GHC can give a different message when there are multiple instances than when there is none. I appreciate your point that these are not so different, but that's an insight that helps me today, not a Haskell newcomer I was several years ago. (Though GHC today lists several matching instances, which is a great improvement over the behavior in 7.4 when I was learning this.) > "No unique instance for (C t)" is better most of > the time, > but it doesn't exactly get to the core of the issue, since there could be > just one > instance of C, for a specific type, and then it is no better than "No > instance for > (C t)". If I were to explain the situation, I would say "there is no single > instance > (C t) that applies for every type t", but it seems a bit wordy for a > compiler error... > > Regards, > Reid Barton From bergey at teallabs.org Mon Aug 3 16:47:25 2015 From: bergey at teallabs.org (Daniel Bergey) Date: Mon, 03 Aug 2015 16:47:25 +0000 Subject: Typechecker / OverloadedStrings question 7.8 vs. 
7.10
In-Reply-To:
References: <841f2790eb8b4bb99541e241c0009924@DB4PR30MB030.064d.mgd.msft.net> <87y4hxce7t.fsf@chladni.home> <02d94e2366ab49aaaaf079fc9fb729a6@DB4PR30MB030.064d.mgd.msft.net> <87bnepd3v2.fsf@chladni.home>
Message-ID: <87zj28b9pe.fsf@chladni.home>

On 2015-08-03 at 04:43, Phil Ruffwind wrote:
> I think the error message could be made clearer simply by emphasizing the
> type ambiguity over the lack of instances.
>
>     Ambiguous type variable 't0' arising from a use of
>       elem :: a -> t0 a -> Bool
>     caused by the lack of an instance 'Data.String.IsString (t0 Char)'
>     Either add a type annotation to dictate what 't0' should be
>     based on one of the potential instances:
>       instance Foldable (Either a) -- Defined in ‘Data.Foldable’
>       instance Foldable Data.Proxy.Proxy -- Defined in ‘Data.Foldable’
>       instance GHC.Arr.Ix i => Foldable (GHC.Arr.Array i)
>         -- Defined in ‘Data.Foldable’
>       (...plus three others)
>     or define the required instance 'Data.String.IsString (t0 Char)'.

Yes, I think that message would be fine.
From allbery.b at gmail.com Mon Aug 3 16:47:44 2015
From: allbery.b at gmail.com (Brandon Allbery)
Date: Mon, 3 Aug 2015 12:47:44 -0400
Subject: Typechecker / OverloadedStrings question 7.8 vs. 7.10
In-Reply-To: <87614wcocx.fsf@chladni.home>
References: <841f2790eb8b4bb99541e241c0009924@DB4PR30MB030.064d.mgd.msft.net> <87y4hxce7t.fsf@chladni.home> <02d94e2366ab49aaaaf079fc9fb729a6@DB4PR30MB030.064d.mgd.msft.net> <87bnepd3v2.fsf@chladni.home> <87614wcocx.fsf@chladni.home>
Message-ID:

On Mon, Aug 3, 2015 at 12:45 PM, Daniel Bergey wrote:

> I thought GHC would infer the type when only one instance is in scope,
> at least in some cases, like IsString. But I could well be wrong about
> that.
>

Typeclasses are open-world; this is not a safe assumption, since instances
are global and an instance added elsewhere at some point in the future
could therefore break your program.
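[Editor's sketch] Since GHC will not pick an instance merely because it is the only one in scope, the usual fix for the `elem` case discussed in this thread is a type annotation. The following is a minimal sketch, assuming GHC 7.10 or later (where `elem` works over any `Foldable`); the name `annotated` is made up for the example:

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

-- With OverloadedStrings, the literal "abc" has type `IsString s => s`,
-- and `elem` has type `(Foldable t, Eq a) => a -> t a -> Bool`.
-- In an expression such as
--
--   'a' `elem` "abc"
--
-- nothing determines the container type `t`, so GHC reports an
-- ambiguous type variable. Annotating the literal fixes `t` to lists.
annotated :: Bool
annotated = 'a' `elem` ("abc" :: String)

main :: IO ()
main = print annotated
```

Annotating the literal is one option; giving the surrounding function a signature that mentions `String` would presumably work just as well.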
-- brandon s allbery kf8nh sine nomine associates allbery.b at gmail.com ballbery at sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net -------------- next part -------------- An HTML attachment was scrubbed... URL: From iavor.diatchki at gmail.com Mon Aug 3 17:15:05 2015 From: iavor.diatchki at gmail.com (Iavor Diatchki) Date: Mon, 3 Aug 2015 10:15:05 -0700 Subject: Typechecker / OverloadedStrings question 7.8 vs. 7.10 In-Reply-To: References: <841f2790eb8b4bb99541e241c0009924@DB4PR30MB030.064d.mgd.msft.net> <87y4hxce7t.fsf@chladni.home> <02d94e2366ab49aaaaf079fc9fb729a6@DB4PR30MB030.064d.mgd.msft.net> <87bnepd3v2.fsf@chladni.home> <87614wcocx.fsf@chladni.home> Message-ID: Hello, what Reid says is exactly right---the issue is not really about what instances are present, the problem is that GHC can't determine how to instantiate `t0`. Perhaps a more direct way to describe this is as follows: Failed to infer type `t0` while solving constraint `Data.String.IsString (t0 Char)` arising from the use of: elem :: a -> t0 a -> Bool -Iavor On Mon, Aug 3, 2015 at 9:47 AM, Brandon Allbery wrote: > On Mon, Aug 3, 2015 at 12:45 PM, Daniel Bergey > wrote: > >> I thought GHC would infer the type when only one instance is in scope, >> at least in some cases, like IsString. But I could well be wrong about >> that. >> > > Typeclasses are open-world; this is not a safe assumption, since instances > are global and an instance added elsewhere at some point in the future > could therefore break your program. > > -- > brandon s allbery kf8nh sine nomine > associates > allbery.b at gmail.com > ballbery at sinenomine.net > unix, openafs, kerberos, infrastructure, xmonad > http://sinenomine.net > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From rwbarton at gmail.com Mon Aug 3 17:19:56 2015
From: rwbarton at gmail.com (Reid Barton)
Date: Mon, 3 Aug 2015 13:19:56 -0400
Subject: Typechecker / OverloadedStrings question 7.8 vs. 7.10
In-Reply-To:
References: <841f2790eb8b4bb99541e241c0009924@DB4PR30MB030.064d.mgd.msft.net> <87y4hxce7t.fsf@chladni.home> <02d94e2366ab49aaaaf079fc9fb729a6@DB4PR30MB030.064d.mgd.msft.net> <87bnepd3v2.fsf@chladni.home>
Message-ID:

On Mon, Aug 3, 2015 at 12:43 AM, Phil Ruffwind wrote:

> I think the error message could be made clearer simply by emphasizing the
> type ambiguity over the lack of instances.
>
>     Ambiguous type variable 't0' arising from a use of
>       elem :: a -> t0 a -> Bool
>     caused by the lack of an instance 'Data.String.IsString (t0 Char)'
>     Either add a type annotation to dictate what 't0' should be
>     based on one of the potential instances:
>       instance Foldable (Either a) -- Defined in ‘Data.Foldable’
>       instance Foldable Data.Proxy.Proxy -- Defined in ‘Data.Foldable’
>       instance GHC.Arr.Ix i => Foldable (GHC.Arr.Array i)
>         -- Defined in ‘Data.Foldable’
>       (...plus three others)
>     or define the required instance 'Data.String.IsString (t0 Char)'.
>

I like this style of error message since it points to the most likely fix
first. If there are no "potential instances" (instances for specializations
of the type we need an instance for) in scope, then we can produce the old
"No instance for C t0" error, which suggests that the user write (or import)
such an instance. If there is at least one "potential instance" in scope,
then (assuming that the user wants to keep their existing instances, and not
use overlapping instances) they in fact must specify the type variable
somehow. The only case that may still cause confusion is when there is
exactly one "potential instance" in scope. Then the user is likely to wonder
why the type is ambiguous.
It might help to phrase the error message text in a way that implies that the list of instances it displays is not necessarily exhaustive. Regards, Reid Barton -------------- next part -------------- An HTML attachment was scrubbed... URL: From rf at rufflewind.com Mon Aug 3 19:12:59 2015 From: rf at rufflewind.com (Phil Ruffwind) Date: Mon, 3 Aug 2015 15:12:59 -0400 Subject: Typechecker / OverloadedStrings question 7.8 vs. 7.10 In-Reply-To: References: <841f2790eb8b4bb99541e241c0009924@DB4PR30MB030.064d.mgd.msft.net> <87y4hxce7t.fsf@chladni.home> <02d94e2366ab49aaaaf079fc9fb729a6@DB4PR30MB030.064d.mgd.msft.net> <87bnepd3v2.fsf@chladni.home> Message-ID: Like this? Either use a type annotation to specify what 't0' should be based on these potential instance(s): instance Foo Bar -- Defined in 'Foo.Bar' ... and possibly more from other modules that the compiler has not yet encountered or define the required instance 'Foo t0' Not sure how best to present this. To explain this properly it's going to take several lines :\ --- Some other more general suggestions: it'd be nice to have - a unique tag for each GHC error, like 'ambiguous-type-variable' to improve searchability of error messages from GHC. The tag would also remain constant while the message may change over time. - a wiki that documents all the GHC errors. Not merely beginner-level advice, but also explanations of what causes them in all its gory details (so discussions like this could be pasted into that, for example). There is a stub already on https://wiki.haskell.org/GHC/Error_messages but it looks largely abandoned :( From simonpj at microsoft.com Mon Aug 3 20:08:42 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Mon, 3 Aug 2015 20:08:42 +0000 Subject: Typechecker / OverloadedStrings question 7.8 vs. 
7.10
In-Reply-To: <87zj28b9pe.fsf@chladni.home>
References: <841f2790eb8b4bb99541e241c0009924@DB4PR30MB030.064d.mgd.msft.net> <87y4hxce7t.fsf@chladni.home> <02d94e2366ab49aaaaf079fc9fb729a6@DB4PR30MB030.064d.mgd.msft.net> <87bnepd3v2.fsf@chladni.home> <87zj28b9pe.fsf@chladni.home>
Message-ID: <6c32ecf2503f41258d468ce2496cc3a8@DB4PR30MB030.064d.mgd.msft.net>

Would someone feel able to open a Trac ticket summarising this thread (as
well as pointing to it), and making a proposal?

Thanks

Simon

| -----Original Message-----
| From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Daniel
| Bergey
| Sent: 03 August 2015 17:47
| To: Phil Ruffwind; Reid Barton
| Cc: ghc-devs
| Subject: Re: Typechecker / OverloadedStrings question 7.8 vs. 7.10
|
| On 2015-08-03 at 04:43, Phil Ruffwind wrote:
| > I think the error message could be made clearer simply by emphasizing
| > the type ambiguity over the lack of instances.
| >
| >     Ambiguous type variable 't0' arising from a use of
| >       elem :: a -> t0 a -> Bool
| >     caused by the lack of an instance 'Data.String.IsString (t0 Char)'
| >     Either add a type annotation to dictate what 't0' should be
| >     based on one of the potential instances:
| >       instance Foldable (Either a) -- Defined in ‘Data.Foldable’
| >       instance Foldable Data.Proxy.Proxy -- Defined in ‘Data.Foldable’
| >       instance GHC.Arr.Ix i => Foldable (GHC.Arr.Array i)
| >         -- Defined in ‘Data.Foldable’
| >       (...plus three others)
| >     or define the required instance 'Data.String.IsString (t0 Char)'.
|
| Yes, I think that message would be fine.
| _______________________________________________ | ghc-devs mailing list | ghc-devs at haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From eir at cis.upenn.edu Mon Aug 3 22:25:43 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Mon, 3 Aug 2015 18:25:43 -0400 Subject: can't validate on Mac In-Reply-To: References: Message-ID: I'm sure that would work, but I'm worried this is a symptom of a deeper problem. Does anyone know what's going on here? This is holding up two patches of mine. Thanks! Richard On Aug 3, 2015, at 10:28 AM, Gabor Greif wrote: > what about this loittle patch? > > diff --git a/rts/posix/OSMem.c b/rts/posix/OSMem.c > index 125ae10..edb240a 100644 > --- a/rts/posix/OSMem.c > +++ b/rts/posix/OSMem.c > @@ -145,6 +145,7 @@ my_mmap (void *addr, W_ size, int operation) > > kern_return_t err = 0; > ret = addr; > + (void)prot; > > if(operation & MEM_RESERVE) > { > > > On 8/3/15, Richard Eisenberg wrote: >> Hi devs, >> >> In a (almost) clean validate on my MacOS 10.8 machine, I see this: >> >> {{{ >> rts/posix/OSMem.c: In function 'my_mmap': >> >> rts/posix/OSMem.c:109:15: error: >> error: variable 'flags' set but not used >> [-Werror=unused-but-set-variable] >> int prot, flags; >> ^ >> >> rts/posix/OSMem.c:109:9: error: >> error: variable 'prot' set but not used >> [-Werror=unused-but-set-variable] >> int prot, flags; >> ^ >> cc1: all warnings being treated as errors >> }}} >> >> Help? >> >> Thanks, >> Richard From rf at rufflewind.com Tue Aug 4 01:04:20 2015 From: rf at rufflewind.com (Phil Ruffwind) Date: Mon, 3 Aug 2015 21:04:20 -0400 Subject: Typechecker / OverloadedStrings question 7.8 vs. 
7.10
In-Reply-To: <6c32ecf2503f41258d468ce2496cc3a8@DB4PR30MB030.064d.mgd.msft.net>
References: <841f2790eb8b4bb99541e241c0009924@DB4PR30MB030.064d.mgd.msft.net> <87y4hxce7t.fsf@chladni.home> <02d94e2366ab49aaaaf079fc9fb729a6@DB4PR30MB030.064d.mgd.msft.net> <87bnepd3v2.fsf@chladni.home> <87zj28b9pe.fsf@chladni.home> <6c32ecf2503f41258d468ce2496cc3a8@DB4PR30MB030.064d.mgd.msft.net>
Message-ID:

> Would someone feel able to open a Trac ticket summarising this thread (as well as pointing to it), and making a proposal?

Done: https://ghc.haskell.org/trac/ghc/ticket/10733
From alexander at plaimi.net Wed Aug 5 11:00:28 2015
From: alexander at plaimi.net (Alexander Berntsen)
Date: Wed, 5 Aug 2015 13:00:28 +0200
Subject: Proposal: Include GHC version target in libraries' description
Message-ID: <55C1ECCC.5080409@plaimi.net>

Problem:
A frequent use case for especially base is browsing the haddocks online
on hackage (I do this myself sometimes). A frequent associated pain with
this is remembering which base version corresponds to which GHC version
(I feel this myself sometimes).

Current solution:
Consult the boot library version chart[0]. An additional step -- where
there needn't be one -- may annoy the user. Furthermore, if they don't
know about the chart then they're essentially lost; which usually leads
to them consulting another user -- and now we've wasted a little time
for *two* of our users for no real good reason. I conjecture that this
happens quite often. Which means our culmination of GHC-related time
waste eventually adds up quite a bit, detracting from the overall
experience.

Proposed improvement:
Add "[library]-[library version] is bundled with ghc-[GHC version]." in
the description string of [library].cabal. This immediately clears up
the confusion when looking at [library]'s hackage page.

[0]
--
Alexander
alexander at plaimi.net
https://secure.plaimi.net/~alexander
From svenpanne at gmail.com Wed Aug 5 14:57:08 2015
From: svenpanne at gmail.com (Sven Panne)
Date: Wed, 5 Aug 2015 16:57:08 +0200
Subject: Strange Changelog.md formatting on Hackage
Message-ID:

The formatting of
https://hackage.haskell.org/package/StateVar-1.1.0.1/changelog is garbled,
while the corresponding GitHub page
https://github.com/haskell-opengl/StateVar/blob/master/CHANGELOG.md looks
OK. Can somebody give me a hint why this happens?
https://hackage.haskell.org/package/lens-4.12.3/changelog e.g. looks nice,
but the markdown seems to be similar.

Cheers,
S.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From simonpj at microsoft.com Wed Aug 5 16:02:14 2015
From: simonpj at microsoft.com (Simon Peyton Jones)
Date: Wed, 5 Aug 2015 16:02:14 +0000
Subject: arc patch
Message-ID:

Friends

I wanted to build a Phab ticket, so I tried

arc patch D1069

but it failed, as below. What do I do now?

Thanks

Simon

simonpj at cam-05-unx:~/code/HEAD-5$ arc patch D1069
You have untracked files in this working copy.
Working copy: /home/simonpj/code/HEAD-5/ Untracked files in working copy: Foo compiler/basicTypes/T7287.stderr foo libraries/integer-gmp2/GNUmakefile libraries/integer-gmp2/ghc.mk spj-patch testsuite/tests/deriving/should_fail/T2604.hs testsuite/tests/deriving/should_fail/T2604.stderr testsuite/tests/deriving/should_fail/T5863a.hs testsuite/tests/deriving/should_fail/T5863a.stderr testsuite/tests/deriving/should_fail/T7800.hs testsuite/tests/deriving/should_fail/T7800.stderr testsuite/tests/typecheck/should_compile/T9999.hs typeable-msg Since you don't have '.gitignore' rules for these files and have not listed them in '.git/info/exclude', you may have forgotten to 'git add' them to your commit. Do you want to add these files to the commit? [y/N] N N Created and checked out branch arcpatch-D1069. Exception ERR-CONDUIT-CALL: API Method "differential.query" does not define these parameters: 'arcanistProjects'. (Run with --trace for a full exception trace.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomasmiedema at gmail.com Wed Aug 5 16:09:45 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Wed, 5 Aug 2015 18:09:45 +0200 Subject: arc patch In-Reply-To: References: Message-ID: Try running 'arc update' anytime you get such kind of error. Austin upgrades GHC's Phabricator instance every now and then. Sometimes this requires also an update to `arc` for things to work again. On Wed, Aug 5, 2015 at 6:02 PM, Simon Peyton Jones wrote: > Friends > > I wanted to build a Phab ticket, so I tried > > arc patch D1069 > > but it failed, as below. What do I do now? > > Thanks > > Simon > > simonpj at cam-05-unx:~/code/HEAD-5$ arc patch D1069 > > You have untracked files in this working copy. 
> > > > Working copy: /home/simonpj/code/HEAD-5/ > > > > Untracked files in working copy: > > Foo > > compiler/basicTypes/T7287.stderr > > foo > > libraries/integer-gmp2/GNUmakefile > > libraries/integer-gmp2/ghc.mk > > spj-patch > > testsuite/tests/deriving/should_fail/T2604.hs > > testsuite/tests/deriving/should_fail/T2604.stderr > > testsuite/tests/deriving/should_fail/T5863a.hs > > testsuite/tests/deriving/should_fail/T5863a.stderr > > testsuite/tests/deriving/should_fail/T7800.hs > > testsuite/tests/deriving/should_fail/T7800.stderr > > testsuite/tests/typecheck/should_compile/T9999.hs > > typeable-msg > > > > Since you don't have '.gitignore' rules for these files and have not listed > > them in '.git/info/exclude', you may have forgotten to 'git add' them to > your > > commit. > > > > > > Do you want to add these files to the commit? [y/N] N > > N > > > > Created and checked out branch arcpatch-D1069. > > Exception > > ERR-CONDUIT-CALL: API Method "differential.query" does not define these > parameters: 'arcanistProjects'. > > (Run with --trace for a full exception trace.) > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rwbarton at gmail.com Wed Aug 5 16:13:11 2015 From: rwbarton at gmail.com (Reid Barton) Date: Wed, 5 Aug 2015 12:13:11 -0400 Subject: arc patch In-Reply-To: References: Message-ID: It's actually "arc upgrade". I added a mention of this command to the wiki: https://ghc.haskell.org/trac/ghc/wiki/Phabricator#HelpImgettingastrangeerrorwhenrunningarcthatIdidntgetyesterday Regards, Reid Barton On Wed, Aug 5, 2015 at 12:09 PM, Thomas Miedema wrote: > Try running 'arc update' anytime you get such kind of error. > > Austin upgrades GHC's Phabricator instance every now and then. Sometimes > this requires also an update to `arc` for things to work again. 
> > On Wed, Aug 5, 2015 at 6:02 PM, Simon Peyton Jones > wrote: > >> Friends >> >> I wanted to build a Phab ticket, so I tried >> >> arc patch D1069 >> >> but it failed, as below. What do I do now? >> >> Thanks >> >> Simon >> >> simonpj at cam-05-unx:~/code/HEAD-5$ arc patch D1069 >> >> You have untracked files in this working copy. >> >> >> >> Working copy: /home/simonpj/code/HEAD-5/ >> >> >> >> Untracked files in working copy: >> >> Foo >> >> compiler/basicTypes/T7287.stderr >> >> foo >> >> libraries/integer-gmp2/GNUmakefile >> >> libraries/integer-gmp2/ghc.mk >> >> spj-patch >> >> testsuite/tests/deriving/should_fail/T2604.hs >> >> testsuite/tests/deriving/should_fail/T2604.stderr >> >> testsuite/tests/deriving/should_fail/T5863a.hs >> >> testsuite/tests/deriving/should_fail/T5863a.stderr >> >> testsuite/tests/deriving/should_fail/T7800.hs >> >> testsuite/tests/deriving/should_fail/T7800.stderr >> >> testsuite/tests/typecheck/should_compile/T9999.hs >> >> typeable-msg >> >> >> >> Since you don't have '.gitignore' rules for these files and have not >> listed >> >> them in '.git/info/exclude', you may have forgotten to 'git add' them to >> your >> >> commit. >> >> >> >> >> >> Do you want to add these files to the commit? [y/N] N >> >> N >> >> >> >> Created and checked out branch arcpatch-D1069. >> >> Exception >> >> ERR-CONDUIT-CALL: API Method "differential.query" does not define these >> parameters: 'arcanistProjects'. >> >> (Run with --trace for a full exception trace.) >> >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >> > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alan.zimm at gmail.com Wed Aug 5 16:13:56 2015 From: alan.zimm at gmail.com (Alan & Kim Zimmerman) Date: Wed, 5 Aug 2015 18:13:56 +0200 Subject: arc patch In-Reply-To: References: Message-ID: Is your arc fully up to date? Running "arc upgrade" will do it if not. Alan On Wed, Aug 5, 2015 at 6:02 PM, Simon Peyton Jones wrote: > Friends > > I wanted to build a Phab ticket, so I tried > > arc patch D1069 > > but it failed, as below. What do I do now? > > Thanks > > Simon > > simonpj at cam-05-unx:~/code/HEAD-5$ arc patch D1069 > > You have untracked files in this working copy. > > > > Working copy: /home/simonpj/code/HEAD-5/ > > > > Untracked files in working copy: > > Foo > > compiler/basicTypes/T7287.stderr > > foo > > libraries/integer-gmp2/GNUmakefile > > libraries/integer-gmp2/ghc.mk > > spj-patch > > testsuite/tests/deriving/should_fail/T2604.hs > > testsuite/tests/deriving/should_fail/T2604.stderr > > testsuite/tests/deriving/should_fail/T5863a.hs > > testsuite/tests/deriving/should_fail/T5863a.stderr > > testsuite/tests/deriving/should_fail/T7800.hs > > testsuite/tests/deriving/should_fail/T7800.stderr > > testsuite/tests/typecheck/should_compile/T9999.hs > > typeable-msg > > > > Since you don't have '.gitignore' rules for these files and have not listed > > them in '.git/info/exclude', you may have forgotten to 'git add' them to > your > > commit. > > > > > > Do you want to add these files to the commit? [y/N] N > > N > > > > Created and checked out branch arcpatch-D1069. > > Exception > > ERR-CONDUIT-CALL: API Method "differential.query" does not define these > parameters: 'arcanistProjects'. > > (Run with --trace for a full exception trace.) > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From simonpj at microsoft.com Wed Aug 5 16:21:49 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Wed, 5 Aug 2015 16:21:49 +0000 Subject: arc patch In-Reply-To: References: Message-ID: <7d4da7ade9ca440086ec22fbc2ff2511@DB4PR30MB030.064d.mgd.msft.net> Thanks that worked. Now it fails in a new way! "Cherry pick failed". What now? Simon simonpj at cam-05-unx:~/code/HEAD-5$ arc patch D1069 arc patch D1069 You have untracked files in this working copy. Working copy: /home/simonpj/code/HEAD-5/ Untracked changes in working copy: (To ignore these changes, add them to ".git/info/exclude".) D1069-patch Foo compiler/basicTypes/T7287.stderr foo libraries/integer-gmp2/GNUmakefile libraries/integer-gmp2/ghc.mk spj-patch testsuite/tests/deriving/should_fail/T2604.hs testsuite/tests/deriving/should_fail/T2604.stderr testsuite/tests/deriving/should_fail/T5863a.hs testsuite/tests/deriving/should_fail/T5863a.stderr testsuite/tests/deriving/should_fail/T7800.hs testsuite/tests/deriving/should_fail/T7800.stderr testsuite/tests/typecheck/should_compile/T9999.hs typeable-msg Ignore these untracked files and continue? [y/N] y y Branch name arcpatch-D1069 already exists; trying a new name. Branch name arcpatch-D1069_1 already exists; trying a new name. Created and checked out branch arcpatch-D1069_2. Created and checked out branch arcpatch-D1033. Checking patch utils/haddock... Checking patch testsuite/tests/driver/T4437.hs... Checking patch testsuite/tests/deSugar/should_run/all.T... Checking patch testsuite/tests/deSugar/should_run/DsStrictData.stdout... Checking patch testsuite/tests/deSugar/should_run/DsStrictData.hs... Checking patch docs/users_guide/glasgow_exts.xml... Checking patch docs/users_guide/flags.xml... Checking patch compiler/vectorise/Vectorise/Generic/PData.hs... Checking patch compiler/typecheck/TcTyClsDecls.hs... Checking patch compiler/typecheck/TcSplice.hs... Checking patch compiler/typecheck/TcRnDriver.hs...
Checking patch compiler/typecheck/TcExpr.hs... Checking patch compiler/prelude/TysWiredIn.hs... Checking patch compiler/parser/RdrHsSyn.hs... Checking patch compiler/parser/Parser.y... Checking patch compiler/main/DynFlags.hs... Checking patch compiler/iface/TcIface.hs... Checking patch compiler/iface/MkIface.hs... Checking patch compiler/iface/BuildTyCl.hs... Checking patch compiler/hsSyn/HsTypes.hs... Checking patch compiler/hsSyn/Convert.hs... Checking patch compiler/deSugar/DsMeta.hs... Checking patch compiler/basicTypes/MkId.hs... Checking patch compiler/basicTypes/DataCon.hs... warning: unable to rmdir utils/haddock: Directory not empty Applied patch utils/haddock cleanly. Applied patch testsuite/tests/driver/T4437.hs cleanly. Applied patch testsuite/tests/deSugar/should_run/all.T cleanly. Applied patch testsuite/tests/deSugar/should_run/DsStrictData.stdout cleanly. Applied patch testsuite/tests/deSugar/should_run/DsStrictData.hs cleanly. Applied patch docs/users_guide/glasgow_exts.xml cleanly. Applied patch docs/users_guide/flags.xml cleanly. Applied patch compiler/vectorise/Vectorise/Generic/PData.hs cleanly. Applied patch compiler/typecheck/TcTyClsDecls.hs cleanly. Applied patch compiler/typecheck/TcSplice.hs cleanly. Applied patch compiler/typecheck/TcRnDriver.hs cleanly. Applied patch compiler/typecheck/TcExpr.hs cleanly. Applied patch compiler/prelude/TysWiredIn.hs cleanly. Applied patch compiler/parser/RdrHsSyn.hs cleanly. Applied patch compiler/parser/Parser.y cleanly. Applied patch compiler/main/DynFlags.hs cleanly. Applied patch compiler/iface/TcIface.hs cleanly. Applied patch compiler/iface/MkIface.hs cleanly. Applied patch compiler/iface/BuildTyCl.hs cleanly. Applied patch compiler/hsSyn/HsTypes.hs cleanly. Applied patch compiler/hsSyn/Convert.hs cleanly. Applied patch compiler/deSugar/DsMeta.hs cleanly. Applied patch compiler/basicTypes/MkId.hs cleanly. Applied patch compiler/basicTypes/DataCon.hs cleanly. 
Submodule 'libffi-tarballs' () registered for path 'libffi-tarballs' Submodule 'libraries/Cabal' () registered for path 'libraries/Cabal' Submodule 'libraries/Win32' () registered for path 'libraries/Win32' Submodule 'libraries/array' () registered for path 'libraries/array' Submodule 'libraries/binary' () registered for path 'libraries/binary' Submodule 'libraries/bytestring' () registered for path 'libraries/bytestring' Submodule 'libraries/containers' () registered for path 'libraries/containers' Submodule 'libraries/deepseq' () registered for path 'libraries/deepseq' Submodule 'libraries/directory' () registered for path 'libraries/directory' Submodule 'libraries/dph' () registered for path 'libraries/dph' Submodule 'libraries/filepath' () registered for path 'libraries/filepath' Submodule 'libraries/haskeline' () registered for path 'libraries/haskeline' Submodule 'libraries/hoopl' () registered for path 'libraries/hoopl' Submodule 'libraries/hpc' () registered for path 'libraries/hpc' Submodule 'libraries/parallel' () registered for path 'libraries/parallel' Submodule 'libraries/pretty' () registered for path 'libraries/pretty' Submodule 'libraries/primitive' () registered for path 'libraries/primitive' Submodule 'libraries/process' () registered for path 'libraries/process' Submodule 'libraries/random' () registered for path 'libraries/random' Submodule 'libraries/stm' () registered for path 'libraries/stm' Submodule 'libraries/terminfo' () registered for path 'libraries/terminfo' Submodule 'libraries/time' () registered for path 'libraries/time' Submodule 'libraries/transformers' () registered for path 'libraries/transformers' Submodule 'libraries/unix' () registered for path 'libraries/unix' Submodule 'libraries/vector' () registered for path 'libraries/vector' Submodule 'libraries/xhtml' () registered for path 'libraries/xhtml' Submodule 'nofib' () registered for path 'nofib' Submodule 'utils/haddock' () registered for path 'utils/haddock' Submodule 
'utils/hsc2hs' () registered for path 'utils/hsc2hs' Submodule path 'libraries/array': checked out '604afd531aba4a96b066f6e59a08813107a9eed3' Submodule path 'libraries/parallel': checked out 'e4e4228ba94178cf31b97fe81b94bff3de6fce03' Submodule path 'utils/haddock': checked out '5eb0785cde60997f072c3bdfefaf8c389c96d42e' Cherry Pick Failed! Exception Command failed with error #1! COMMAND git cherry-pick 'arcpatch-D1033' STDOUT # On branch arcpatch-D1069_2 # Changes not staged for commit: # (use "git add ..." to update what will be committed) # (use "git checkout -- ..." to discard changes in working directory) # # modified: libraries/array (new commits) # modified: libraries/parallel (new commits) # modified: utils/haddock (new commits) # # Untracked files: # (use "git add ..." to include in what will be committed) # # D1069-patch # Foo # compiler/basicTypes/T7287.stderr # foo # libraries/integer-gmp2/ # spj-patch # testsuite/tests/deriving/should_fail/T2604.hs # testsuite/tests/deriving/should_fail/T2604.stderr # testsuite/tests/deriving/should_fail/T5863a.hs # testsuite/tests/deriving/should_fail/T5863a.stderr # testsuite/tests/deriving/should_fail/T7800.hs # testsuite/tests/deriving/should_fail/T7800.stderr # testsuite/tests/typecheck/should_compile/T9999.hs # typeable-msg no changes added to commit (use "git add" and/or "git commit -a") STDERR The previous cherry-pick is now empty, possibly due to conflict resolution. If you wish to commit it anyway, use: git commit --allow-empty Otherwise, please use 'git reset' (Run with `--trace` for a full exception trace.) 49simonpj at cam-05-unx:~/code/HEAD-5$ From: Reid Barton [mailto:rwbarton at gmail.com] Sent: 05 August 2015 17:13 To: Thomas Miedema Cc: Simon Peyton Jones; ghc-devs Subject: Re: arc patch It's actually "arc upgrade". 
I added a mention of this command to the wiki: https://ghc.haskell.org/trac/ghc/wiki/Phabricator#HelpImgettingastrangeerrorwhenrunningarcthatIdidntgetyesterday Regards, Reid Barton On Wed, Aug 5, 2015 at 12:09 PM, Thomas Miedema > wrote: Try running 'arc update' anytime you get such kind of error. Austin upgrades GHC's Phabricator instance every now and then. Sometimes this requires also an update to `arc` for things to work again. On Wed, Aug 5, 2015 at 6:02 PM, Simon Peyton Jones > wrote: Friends I wanted to build a Phab ticket, so I tried arc patch D1069 but it failed, as below. What do I do now? Thanks Simon simonpj at cam-05-unx:~/code/HEAD-5$ arc patch D1069 You have untracked files in this working copy. Working copy: /home/simonpj/code/HEAD-5/ Untracked files in working copy: Foo compiler/basicTypes/T7287.stderr foo libraries/integer-gmp2/GNUmakefile libraries/integer-gmp2/ghc.mk spj-patch testsuite/tests/deriving/should_fail/T2604.hs testsuite/tests/deriving/should_fail/T2604.stderr testsuite/tests/deriving/should_fail/T5863a.hs testsuite/tests/deriving/should_fail/T5863a.stderr testsuite/tests/deriving/should_fail/T7800.hs testsuite/tests/deriving/should_fail/T7800.stderr testsuite/tests/typecheck/should_compile/T9999.hs typeable-msg Since you don't have '.gitignore' rules for these files and have not listed them in '.git/info/exclude', you may have forgotten to 'git add' them to your commit. Do you want to add these files to the commit? [y/N] N N Created and checked out branch arcpatch-D1069. Exception ERR-CONDUIT-CALL: API Method "differential.query" does not define these parameters: 'arcanistProjects'. (Run with --trace for a full exception trace.) 
_______________________________________________ ghc-devs mailing list ghc-devs at haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs at haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... URL: From ezyang at mit.edu Wed Aug 5 18:06:59 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Wed, 05 Aug 2015 11:06:59 -0700 Subject: arc diff problems Message-ID: <1438797888-sup-9@sabre> Hello friends, I too am having Arcanist problems (my libphutil and arcanist are at the latest head). Here's my error: [ezyang at hs01 ghc-quick]$ arc diff HEAD~ You have untracked files in this working copy. Working copy: /home/hs01/ezyang/ghc-quick/ Untracked changes in working copy: (To ignore these changes, add them to ".git/info/exclude".) testsuite/tests/driver/T9938 testsuite/tests/driver/T9938B testsuite/tests/driver/dynamicToo/dynamicToo001/d testsuite/tests/driver/dynamicToo/dynamicToo001/s testsuite/tests/driver/dynamicToo/dynamicToo005/A005.dyn_o-boot Ignore these untracked files and continue? [y/N] y Exception Field "testPlan" occurs twice in commit message! (Run with `--trace` for a full exception trace.) Here is the commit message in question: commit 80ef08619c315e35e439e50724afc5d3b3203895 Author: Edward Z. Yang Date: Fri Jul 24 15:13:49 2015 -0700 Unify hsig and hs-boot; add preliminary "hs-boot" merging. This patch drops the file level distinction between hs-boot and hsig; we figure out which one we are compiling based on whether or not there is a corresponding hs file lying around. To make the "import A" syntax continue to work for bare hs-boot files, we also introduce hs-boot merging, which takes an A.hi-boot and converts it to an A.hi when there is no A.hs file in scope. 
This will be generalized in Backpack to merge multiple A.hi files together; which means we can jettison the "load multiple interface files" functionality. This works automatically for --make, but for one-shot compilation we need a new mode: ghc --merge-requirements A will generate an A.hi/A.o from a local A.hi-boot file; Backpack will extend this mechanism further. Has Haddock submodule update to deal with change in msHsFilePath behavior. - This commit drops support for the hsig extension. Can we support it? It's annoying because the finder code is written with the assumption that where there's an hs-boot file, there's always an hs file too. To support hsig, you'd have to probe two locations. Easier to just not support it. - #10333 affects us, modifying an hs-boot still doesn't trigger recomp. - See compiler/main/Finder.hs: this diff is very skeevy, but it seems to work. - This code cunningly doesn't drop hs-boot files from the "drop hs-boot files" module graph, if they don't have a corresponding hs file. I have no idea if this actually is useful. Signed-off-by: Edward Z. Yang Test Plan: validate Reviewers: simonpj, austin, bgamari, spinda Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1098 It doesn't work even if I delete "Test Plan". Edward From adam at sandbergericsson.se Wed Aug 5 18:19:35 2015 From: adam at sandbergericsson.se (Adam Sandberg Eriksson) Date: Wed, 05 Aug 2015 20:19:35 +0200 Subject: arc patch In-Reply-To: <7d4da7ade9ca440086ec22fbc2ff2511@DB4PR30MB030.064d.mgd.msft.net> References: <7d4da7ade9ca440086ec22fbc2ff2511@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <1438798775.3063348.348688489.5D0D5342@webmail.messagingengine.com> Hi, I rebased D1069 on master from this morning. arc patch D1069 works for me. It seems there were some strange interactions with it depending on another patch that has already been merged. 
You might need to run `git submodule sync` and `git submodule update` after patching to update the haddock submodule. --Adam On Wed, 5 Aug 2015, at 06:21 PM, Simon Peyton Jones wrote: > Thanks that worked. Now it fails in a new way! "Cherry pick failed". > > What now? > > Simon > > simonpj at cam-05-unx:~/code/HEAD-5$ arc patch D1069 > arc patch D1069 > You have untracked files in this working copy. > > Working copy: /home/simonpj/code/HEAD-5/ > > Untracked changes in working copy: > (To ignore these changes, add them to ".git/info/exclude".) > D1069-patch > Foo > compiler/basicTypes/T7287.stderr > foo > libraries/integer-gmp2/GNUmakefile > libraries/integer-gmp2/ghc.mk > spj-patch > testsuite/tests/deriving/should_fail/T2604.hs > testsuite/tests/deriving/should_fail/T2604.stderr > testsuite/tests/deriving/should_fail/T5863a.hs > testsuite/tests/deriving/should_fail/T5863a.stderr > testsuite/tests/deriving/should_fail/T7800.hs > testsuite/tests/deriving/should_fail/T7800.stderr > testsuite/tests/typecheck/should_compile/T9999.hs > typeable-msg > > Ignore these untracked files and continue? [y/N] y > y > > Branch name arcpatch-D1069 already exists; trying a new name. > Branch name arcpatch-D1069_1 already exists; trying a new name. > Created and checked out branch arcpatch-D1069_2. > Created and checked out branch arcpatch-D1033. > Checking patch utils/haddock... > Checking patch testsuite/tests/driver/T4437.hs... > Checking patch testsuite/tests/deSugar/should_run/all.T... > Checking patch > testsuite/tests/deSugar/should_run/DsStrictData.stdout... > Checking patch testsuite/tests/deSugar/should_run/DsStrictData.hs... > Checking patch docs/users_guide/glasgow_exts.xml... > Checking patch docs/users_guide/flags.xml... > Checking patch compiler/vectorise/Vectorise/Generic/PData.hs... > Checking patch compiler/typecheck/TcTyClsDecls.hs... > Checking patch compiler/typecheck/TcSplice.hs... > Checking patch compiler/typecheck/TcRnDriver.hs...
> Checking patch compiler/typecheck/TcExpr.hs... > Checking patch compiler/prelude/TysWiredIn.hs... > Checking patch compiler/parser/RdrHsSyn.hs... > Checking patch compiler/parser/Parser.y... > Checking patch compiler/main/DynFlags.hs... > Checking patch compiler/iface/TcIface.hs... > Checking patch compiler/iface/MkIface.hs... > Checking patch compiler/iface/BuildTyCl.hs... > Checking patch compiler/hsSyn/HsTypes.hs... > Checking patch compiler/hsSyn/Convert.hs... > Checking patch compiler/deSugar/DsMeta.hs... > Checking patch compiler/basicTypes/MkId.hs... > Checking patch compiler/basicTypes/DataCon.hs... > warning: unable to rmdir utils/haddock: Directory not empty > Applied patch utils/haddock cleanly. > Applied patch testsuite/tests/driver/T4437.hs cleanly. > Applied patch testsuite/tests/deSugar/should_run/all.T cleanly. > Applied patch testsuite/tests/deSugar/should_run/DsStrictData.stdout > cleanly. > Applied patch testsuite/tests/deSugar/should_run/DsStrictData.hs > cleanly. > Applied patch docs/users_guide/glasgow_exts.xml cleanly. > Applied patch docs/users_guide/flags.xml cleanly. > Applied patch compiler/vectorise/Vectorise/Generic/PData.hs cleanly. > Applied patch compiler/typecheck/TcTyClsDecls.hs cleanly. > Applied patch compiler/typecheck/TcSplice.hs cleanly. > Applied patch compiler/typecheck/TcRnDriver.hs cleanly. > Applied patch compiler/typecheck/TcExpr.hs cleanly. > Applied patch compiler/prelude/TysWiredIn.hs cleanly. > Applied patch compiler/parser/RdrHsSyn.hs cleanly. > Applied patch compiler/parser/Parser.y cleanly. > Applied patch compiler/main/DynFlags.hs cleanly. > Applied patch compiler/iface/TcIface.hs cleanly. > Applied patch compiler/iface/MkIface.hs cleanly. > Applied patch compiler/iface/BuildTyCl.hs cleanly. > Applied patch compiler/hsSyn/HsTypes.hs cleanly. > Applied patch compiler/hsSyn/Convert.hs cleanly. > Applied patch compiler/deSugar/DsMeta.hs cleanly. > Applied patch compiler/basicTypes/MkId.hs cleanly. 
> Applied patch compiler/basicTypes/DataCon.hs cleanly. > Submodule 'libffi-tarballs' () registered for path 'libffi-tarballs' > Submodule 'libraries/Cabal' () registered for path 'libraries/Cabal' > Submodule 'libraries/Win32' () registered for path 'libraries/Win32' > Submodule 'libraries/array' () registered for path 'libraries/array' > Submodule 'libraries/binary' () registered for path 'libraries/binary' > Submodule 'libraries/bytestring' () registered for path > 'libraries/bytestring' > Submodule 'libraries/containers' () registered for path > 'libraries/containers' > Submodule 'libraries/deepseq' () registered for path > 'libraries/deepseq' > Submodule 'libraries/directory' () registered for path > 'libraries/directory' > Submodule 'libraries/dph' () registered for path 'libraries/dph' > Submodule 'libraries/filepath' () registered for path > 'libraries/filepath' > Submodule 'libraries/haskeline' () registered for path > 'libraries/haskeline' > Submodule 'libraries/hoopl' () registered for path 'libraries/hoopl' > Submodule 'libraries/hpc' () registered for path 'libraries/hpc' > Submodule 'libraries/parallel' () registered for path > 'libraries/parallel' > Submodule 'libraries/pretty' () registered for path 'libraries/pretty' > Submodule 'libraries/primitive' () registered for path > 'libraries/primitive' > Submodule 'libraries/process' () registered for path > 'libraries/process' > Submodule 'libraries/random' () registered for path 'libraries/random' > Submodule 'libraries/stm' () registered for path 'libraries/stm' > Submodule 'libraries/terminfo' () registered for path > 'libraries/terminfo' > Submodule 'libraries/time' () registered for path 'libraries/time' > Submodule 'libraries/transformers' () registered for path > 'libraries/transformers' > Submodule 'libraries/unix' () registered for path 'libraries/unix' > Submodule 'libraries/vector' () registered for path 'libraries/vector' > Submodule 'libraries/xhtml' () registered for path 'libraries/xhtml' 
> Submodule 'nofib' () registered for path 'nofib' > Submodule 'utils/haddock' () registered for path 'utils/haddock' > Submodule 'utils/hsc2hs' () registered for path 'utils/hsc2hs' > Submodule path 'libraries/array': checked out > '604afd531aba4a96b066f6e59a08813107a9eed3' > Submodule path 'libraries/parallel': checked out > 'e4e4228ba94178cf31b97fe81b94bff3de6fce03' > Submodule path 'utils/haddock': checked out > '5eb0785cde60997f072c3bdfefaf8c389c96d42e' > > Cherry Pick Failed! > Exception > Command failed with error #1! > COMMAND > git cherry-pick 'arcpatch-D1033' > > STDOUT > # On branch arcpatch-D1069_2 > # Changes not staged for commit: > #(use "git add ..." to update what will be committed) > #(use "git checkout -- ..." to discard changes in working > #directory) > # > #modified:   libraries/array (new commits) > #modified:   libraries/parallel (new commits) > #modified:   utils/haddock (new commits) > # > # Untracked files: > #(use "git add ..." to include in what will be committed) > # > #D1069-patch > #Foo > #compiler/basicTypes/T7287.stderr > #foo > #libraries/integer-gmp2/ > #spj-patch > #testsuite/tests/deriving/should_fail/T2604.hs > #testsuite/tests/deriving/should_fail/T2604.stderr > #testsuite/tests/deriving/should_fail/T5863a.hs > #testsuite/tests/deriving/should_fail/T5863a.stderr > #testsuite/tests/deriving/should_fail/T7800.hs > #testsuite/tests/deriving/should_fail/T7800.stderr > #testsuite/tests/typecheck/should_compile/T9999.hs > #typeable-msg > no changes added to commit (use "git add" and/or "git commit -a") > > > STDERR > The previous cherry-pick is now empty, possibly due to conflict > resolution. > If you wish to commit it anyway, use: > > git commit --allow-empty > > Otherwise, please use 'git reset' > > (Run with `--trace` for a full exception trace.)
> 49simonpj at cam-05-unx:~/code/HEAD-5$ > > *From:* Reid Barton [mailto:rwbarton at gmail.com] > > *Sent:* 05 August 2015 17:13 *To:* Thomas Miedema *Cc:* Simon Peyton > Jones; ghc-devs *Subject:* Re: arc patch > > It's actually "arc upgrade". I added a mention of this command to the > wiki: > https://ghc.haskell.org/trac/ghc/wiki/Phabricator#HelpImgettingastrangeerrorwhenrunningarcthatIdidntgetyesterday > Regards, > Reid Barton > > On Wed, Aug 5, 2015 at 12:09 PM, Thomas Miedema > wrote: >> Try running 'arc update' anytime you get such kind of error. >> >> Austin upgrades GHC's Phabricator instance every now and then. >> Sometimes this requires also an update to `arc` for things to >> work again. >> >> On Wed, Aug 5, 2015 at 6:02 PM, Simon Peyton Jones >> wrote: >>> Friends >>> I wanted to build a Phab ticket, so I tried >>> arc patch D1069 >>> but it failed, as below. What do I do now? >>> Thanks >>> Simon >>> simonpj at cam-05-unx:~/code/HEAD-5$ arc patch D1069 >>> You have untracked files in this working copy. >>> >>> Working copy: /home/simonpj/code/HEAD-5/ >>> >>> Untracked files in working copy: >>> Foo >>> compiler/basicTypes/T7287.stderr >>> foo >>> libraries/integer-gmp2/GNUmakefile >>> libraries/integer-gmp2/ghc.mk >>> spj-patch >>> testsuite/tests/deriving/should_fail/T2604.hs >>> testsuite/tests/deriving/should_fail/T2604.stderr >>> testsuite/tests/deriving/should_fail/T5863a.hs >>> testsuite/tests/deriving/should_fail/T5863a.stderr >>> testsuite/tests/deriving/should_fail/T7800.hs >>> testsuite/tests/deriving/should_fail/T7800.stderr >>> testsuite/tests/typecheck/should_compile/T9999.hs >>> typeable-msg >>> >>> Since you don't have '.gitignore' rules for these files and have not >>> listed >>> them in '.git/info/exclude', you may have forgotten to 'git add' >>> them to your >>> commit. >>> >>> >>> Do you want to add these files to the commit? [y/N] N >>> N >>> >>> Created and checked out branch arcpatch-D1069. 
>>> Exception >>> ERR-CONDUIT-CALL: API Method "differential.query" does not define >>> these parameters: 'arcanistProjects'. >>> (Run with --trace for a full exception trace.) >>> >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs at haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >> >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > _________________________________________________ > ghc-devs mailing list ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... URL: From ezyang at mit.edu Wed Aug 5 19:53:47 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Wed, 05 Aug 2015 12:53:47 -0700 Subject: arc diff problems In-Reply-To: <1438797888-sup-9@sabre> References: <1438797888-sup-9@sabre> Message-ID: <1438804394-sup-1440@sabre> OK, I've resolved the problem; the problem was that I had copypasted an updated commit message into the Phabricator description field which contained special fields, which caused problems for Phabricator. Edward Excerpts from Edward Z. Yang's message of 2015-08-05 11:06:59 -0700: > Hello friends, > > I too am having Arcanist problems (my libphutil and arcanist > are at the latest head). Here's my error: > > [ezyang at hs01 ghc-quick]$ arc diff HEAD~ > You have untracked files in this working copy. > > Working copy: /home/hs01/ezyang/ghc-quick/ > > Untracked changes in working copy: > (To ignore these changes, add them to ".git/info/exclude".) > testsuite/tests/driver/T9938 > testsuite/tests/driver/T9938B > testsuite/tests/driver/dynamicToo/dynamicToo001/d > testsuite/tests/driver/dynamicToo/dynamicToo001/s > testsuite/tests/driver/dynamicToo/dynamicToo005/A005.dyn_o-boot > > Ignore these untracked files and continue? 
[y/N] y > > Exception > Field "testPlan" occurs twice in commit message! > (Run with `--trace` for a full exception trace.) > > Here is the commit message in question: > > commit 80ef08619c315e35e439e50724afc5d3b3203895 > Author: Edward Z. Yang > Date: Fri Jul 24 15:13:49 2015 -0700 > > Unify hsig and hs-boot; add preliminary "hs-boot" merging. > > This patch drops the file level distinction between hs-boot and hsig; > we figure out which one we are compiling based on whether or not there > is a corresponding hs file lying around. > > To make the "import A" syntax continue to work for bare hs-boot > files, we also introduce hs-boot merging, which takes an A.hi-boot > and converts it to an A.hi when there is no A.hs file in scope. > This will be generalized in Backpack to merge multiple A.hi files together; > which means we can jettison the "load multiple interface files" functionality. > > This works automatically for --make, but for one-shot compilation > we need a new mode: ghc --merge-requirements A will generate an A.hi/A.o > from a local A.hi-boot file; Backpack will extend this mechanism further. > > Has Haddock submodule update to deal with change in msHsFilePath behavior. > > - This commit drops support for the hsig extension. Can > we support it? It's annoying because the finder code is > written with the assumption that where there's an hs-boot > file, there's always an hs file too. To support hsig, you'd > have to probe two locations. Easier to just not support it. > > - #10333 affects us, modifying an hs-boot still doesn't trigger > recomp. > > - See compiler/main/Finder.hs: this diff is very skeevy, but > it seems to work. > > - This code cunningly doesn't drop hs-boot files from the > "drop hs-boot files" module graph, if they don't have a > corresponding hs file. I have no idea if this actually is useful. > > Signed-off-by: Edward Z. 
Yang > > Test Plan: validate > > Reviewers: simonpj, austin, bgamari, spinda > > Subscribers: thomie > > Differential Revision: https://phabricator.haskell.org/D1098 > > It doesn't work even if I delete "Test Plan". > > Edward From ezyang at mit.edu Wed Aug 5 20:02:51 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Wed, 05 Aug 2015 13:02:51 -0700 Subject: arc diff problems In-Reply-To: <1438804394-sup-1440@sabre> References: <1438797888-sup-9@sabre> <1438804394-sup-1440@sabre> Message-ID: <1438804964-sup-1757@sabre> And here's the existing bug report: https://secure.phabricator.com/T6050 Excerpts from Edward Z. Yang's message of 2015-08-05 12:53:47 -0700: > OK, I've resolved the problem; the problem was that I had > copypasted an updated commit message into the Phabricator > description field which contained special fields, which caused > problems for Phabricator. > > Edward > > Excerpts from Edward Z. Yang's message of 2015-08-05 11:06:59 -0700: > > Hello friends, > > > > I too am having Arcanist problems (my libphutil and arcanist > > are at the latest head). Here's my error: > > > > [ezyang at hs01 ghc-quick]$ arc diff HEAD~ > > You have untracked files in this working copy. > > > > Working copy: /home/hs01/ezyang/ghc-quick/ > > > > Untracked changes in working copy: > > (To ignore these changes, add them to ".git/info/exclude".) > > testsuite/tests/driver/T9938 > > testsuite/tests/driver/T9938B > > testsuite/tests/driver/dynamicToo/dynamicToo001/d > > testsuite/tests/driver/dynamicToo/dynamicToo001/s > > testsuite/tests/driver/dynamicToo/dynamicToo005/A005.dyn_o-boot > > > > Ignore these untracked files and continue? [y/N] y > > > > Exception > > Field "testPlan" occurs twice in commit message! > > (Run with `--trace` for a full exception trace.) > > > > Here is the commit message in question: > > > > commit 80ef08619c315e35e439e50724afc5d3b3203895 > > Author: Edward Z. 
Yang > > Date: Fri Jul 24 15:13:49 2015 -0700 > > > > Unify hsig and hs-boot; add preliminary "hs-boot" merging. > > > > This patch drops the file level distinction between hs-boot and hsig; > > we figure out which one we are compiling based on whether or not there > > is a corresponding hs file lying around. > > > > To make the "import A" syntax continue to work for bare hs-boot > > files, we also introduce hs-boot merging, which takes an A.hi-boot > > and converts it to an A.hi when there is no A.hs file in scope. > > This will be generalized in Backpack to merge multiple A.hi files together; > > which means we can jettison the "load multiple interface files" functionality. > > > > This works automatically for --make, but for one-shot compilation > > we need a new mode: ghc --merge-requirements A will generate an A.hi/A.o > > from a local A.hi-boot file; Backpack will extend this mechanism further. > > > > Has Haddock submodule update to deal with change in msHsFilePath behavior. > > > > - This commit drops support for the hsig extension. Can > > we support it? It's annoying because the finder code is > > written with the assumption that where there's an hs-boot > > file, there's always an hs file too. To support hsig, you'd > > have to probe two locations. Easier to just not support it. > > > > - #10333 affects us, modifying an hs-boot still doesn't trigger > > recomp. > > > > - See compiler/main/Finder.hs: this diff is very skeevy, but > > it seems to work. > > > > - This code cunningly doesn't drop hs-boot files from the > > "drop hs-boot files" module graph, if they don't have a > > corresponding hs file. I have no idea if this actually is useful. > > > > Signed-off-by: Edward Z. Yang > > > > Test Plan: validate > > > > Reviewers: simonpj, austin, bgamari, spinda > > > > Subscribers: thomie > > > > Differential Revision: https://phabricator.haskell.org/D1098 > > > > It doesn't work even if I delete "Test Plan". 
> >
> > Edward

From mail at joachim-breitner.de Thu Aug 6 07:48:32 2015
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Thu, 06 Aug 2015 09:48:32 +0200
Subject: Strange Changelog.md formatting on Hackage
In-Reply-To:
References:
Message-ID: <1438847312.8278.5.camel@joachim-breitner.de>

Dear Sven,

On Wednesday, 05.08.2015, at 16:57 +0200, Sven Panne wrote:
> The formatting of
> https://hackage.haskell.org/package/StateVar-1.1.0.1/changelog is
> garbled, while the corresponding GitHub page
> https://github.com/haskell-opengl/StateVar/blob/master/CHANGELOG.md
> looks OK. Can somebody give me a hint why this happens?
> https://hackage.haskell.org/package/lens-4.12.3/changelog e.g. looks
> nice, but the markdown seems to be similar.

one difference is that StateVar's changelog has CRLF-terminated lines,
while lens's does not. This is likely a bug in hackage-server; you
might want to open an issue there.

Also, it is technically off-topic on ghc-dev; haskell-cafe might have
been more suited for this question.

Greetings,
Joachim

--
Joachim “nomeata” Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL:

From mail at joachim-breitner.de Thu Aug 6 08:08:17 2015
From: mail at joachim-breitner.de (Joachim Breitner)
Date: Thu, 06 Aug 2015 10:08:17 +0200
Subject: Recent performance regressions
Message-ID: <1438848497.8278.16.camel@joachim-breitner.de>

Dear Developers,

yesterday, there were three commits showing up at
https://perf.haskell.org/ghc/:

https://perf.haskell.org/ghc/#revision/60297486fddd5c9695e2263c2ae46fa90f0feb9e
60297486fddd5c9695e2263c2ae46fa90f0feb9e
Author: Ben Gamari

    Drop custom mapM impl for []

    nofib/allocs/cryptarithm2     + 64.55%
    nofib/allocs/k-nucleotide     -  5.03%
    nofib/time/fannkuch-redux     -  3.62%

Did someone forget to run nofib here, or is this serious drop in
performance in one case expected and something we condone? In that
case, it should have been clearly noted in the commit message!

https://perf.haskell.org/ghc/#revision/b12dba7829742de98a483645142c7962b9dd9f3f
b12dba7829742de98a483645142c7962b9dd9f3f
Author: RyanGlScott

    Make Exception datatypes into newtypes

    nofib/time/cryptarithm1       +  7.22%

If you look at the graph, this is quite reproducible:
https://perf.haskell.org/ghc/#graph/nofib/time/cryptarithm1;hl=b12dba7829742de98a483645142c7962b9dd9f3f
but I have no idea how that change could have that effect. Maybe some
weird cache effect due to binary sizes or different code layout? It
would be great to understand that!

https://perf.haskell.org/ghc/#revision/22bbc1cf209d44b8bb8897ae7a35f9ebaf411b10
commit 22bbc1cf209d44b8bb8897ae7a35f9ebaf411b10
Author: Takano Akio

    Make sure that `all`, `any`, `and`, and `or` fuse (#9848)

    nofib/allocs/circsim          -  4.02%
    nofib/allocs/multiplier       -  8.70%

At least I do not only have bad news...

Greetings,
Joachim

--
Joachim “nomeata” Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de •
GPG-Key: 0xF0FBF51F Debian Developer: nomeata at debian.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From svenpanne at gmail.com Thu Aug 6 10:04:04 2015 From: svenpanne at gmail.com (Sven Panne) Date: Thu, 6 Aug 2015 12:04:04 +0200 Subject: Strange Changelog.md formatting on Hackage In-Reply-To: <1438847312.8278.5.camel@joachim-breitner.de> References: <1438847312.8278.5.camel@joachim-breitner.de> Message-ID: 2015-08-06 9:48 GMT+02:00 Joachim Breitner : > Dear Sven, > > On Wednesday, 05.08.2015 at 16:57 +0200, Sven Panne wrote: > > The formatting of > > https://hackage.haskell.org/package/StateVar-1.1.0.1/changelog is > > garbled, while the corresponding GitHub page > > https://github.com/haskell-opengl/StateVar/blob/master/CHANGELOG.md > > looks OK. Can somebody give me a hint why this happens? > > https://hackage.haskell.org/package/lens-4.12.3/changelog e.g. looks > > nice, but the markdown seems to be similar. > > One difference is that StateVar's changelog has CRLF-terminated lines, > while lens's does not. This is likely a bug in hackage-server; you > might want to open an issue there. > That's a likely explanation, because I did the 'cabal sdist' on Windows. It's a bit funny that nobody noticed that so far; Windows seems to be highly under-represented among Haskell developers compared to Linux/Mac (I regularly switch). :-/ I've opened https://github.com/haskell/hackage-server/issues/402; I wasn't even aware of that GitHub project. > Also, it is technically off-topic on ghc-dev; haskell-cafe might have > been more suited for this question. > Granted, but to me the distinction between haskell/haskell-cafe/ghc-users/ghc-dev is always a bit blurry and IMHO there are too many fragmented lists, especially given that more and more tools are involved in the Haskell ecosystem, so this will get worse. 
But that's just my personal view.... :-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Thu Aug 6 10:22:49 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Thu, 6 Aug 2015 10:22:49 +0000 Subject: Master is not validating Message-ID: | Indeed it appears that there is trouble on the `master` branch at the | moment. I'll be looking into this. I find it difficult to interpret Phab results. Eg https://phabricator.haskell.org/B5172 says both "failed" and "passed". Also the build logs displayed eg here https://phabricator.haskell.org/harbormaster/build/5394/?l=0 are truncated so you can't see what the exact failure was. But from the latter I'm guessing that the problem is: - I added a new warning - And I think validate uses -Wall and fails if any warning happens (which seems a bit brutal) The warnings are, I think legitimate. But I certainly don't want them to break the build. How to fix? * Remove -fwarn-all-missed-specialisations from -Wall. * Or add -fno-warn-all-missed-specialisations to validate I don't have a strong opinion. Can you do one or t'other? -fwarn-all-missed-specialisations says that there are overloaded functions being called without being specialised. That's quite interesting, and it's the kind of thing you'd like to see with "all warnings". But it's also perfectly reasonable to say "that's fine, I don't care that overloaded functions are being called". Are there any other warnings that -Wall does not include? That is, do we expect -Wall to mean "all" or just "nearly all"? I have no strong opinion. Apologies for causing this fuss. Simon | -----Original Message----- | From: noreply at phabricator.haskell.org | [mailto:noreply at phabricator.haskell.org] | Sent: 06 August 2015 10:56 | To: Simon Peyton Jones | Subject: [Differential] [Requested Changes To] D1133: Make derived | names deterministic | | bgamari requested changes to this revision. | bgamari added a comment. 
| This revision now requires changes to proceed. | | Indeed it appears that there is trouble on the `master` branch at the | moment. I'll be looking into this. | | The idea here looks reasonable, but what effect does this have on the | size of the build artifacts (e.g. object and interface files). It | seems like the names may grow substantially with this change. Could | you characterize this? | | Also, I would really like to see some haddocks on the top-level | bindings that are introduced here. | | | REPOSITORY | rGHC Glasgow Haskell Compiler | | REVISION DETAIL | https://phabricator.haskell.org/D1133 | | EMAIL PREFERENCES | https://phabricator.haskell.org/settings/panel/emailpreferences/ | | To: niteria, simonmar, simonpj, austin, bgamari | Cc: thomie From thomasmiedema at gmail.com Thu Aug 6 11:00:53 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Thu, 6 Aug 2015 13:00:53 +0200 Subject: Master is not validating In-Reply-To: References: Message-ID: Simon, your two most recent commits are not passing validate, see https://phabricator.haskell.org/diffusion/GHC/history The first commit fails with a type error, though it seems you fixed this in the second commit. The second commit fails with: *** unexpected pass for T8968-3(normal) *** unexpected pass for T8968-1(normal) When master is not passing validate, patch submissions to Phabricator will not pass validate either. When master is passing validate again, click 'Restart build' on the Phabricator Build you were investigating, and the results should be much clearer. Thomas On Thu, Aug 6, 2015 at 12:22 PM, Simon Peyton Jones wrote: > | Indeed it appears that there is trouble on the `master` branch at the > | moment. I'll be looking into this. > > I find it difficult to interpret Phab results. Eg > https://phabricator.haskell.org/B5172 > says both "failed" and "passed". 
> > Also the build logs displayed eg here > https://phabricator.haskell.org/harbormaster/build/5394/?l=0 > are truncated so you can't see what the exact failure was. > > But from the latter I'm guessing that the problem is: > - I added a new warning > - And I think validate uses -Wall and fails if any warning happens > (which seems a bit brutal) > > The warnings are, I think legitimate. But I certainly don't want them to > break the build. How to fix? > * Remove -fwarn-all-missed-specialisations from -Wall. > * Or add -fno-warn-all-missed-specialisations to validate > > I don't have a strong opinion. Can you do one or t'other? > > -fwarn-all-missed-specialisations says that there are overloaded functions > being called without being specialised. That's quite interesting, and it's > the kind of thing you'd like to see with "all warnings". But it's also > perfectly reasonable to say "that's fine, I don't care that overloaded > functions are being called". > > Are there any other warnings that -Wall does not include? That is, do we > expect -Wall to mean "all" or just "nearly all"? > > I have no strong opinion. > > Apologies for causing this fuss. > > Simon > > | -----Original Message----- > | From: noreply at phabricator.haskell.org > | [mailto:noreply at phabricator.haskell.org] > | Sent: 06 August 2015 10:56 > | To: Simon Peyton Jones > | Subject: [Differential] [Requested Changes To] D1133: Make derived > | names deterministic > | > | bgamari requested changes to this revision. > | bgamari added a comment. > | This revision now requires changes to proceed. > | > | Indeed it appears that there is trouble on the `master` branch at the > | moment. I'll be looking into this. > | > | The idea here looks reasonable, but what effect does this have on the > | size of the build artifacts (e.g. object and interface files). It > | seems like the names may grow substantially with this change. Could > | you characterize this? 
> | > | Also, I would really like to see some haddocks on the top-level > | bindings that are introduced here. > | > | > | REPOSITORY > | rGHC Glasgow Haskell Compiler > | > | REVISION DETAIL > | https://phabricator.haskell.org/D1133 > | > | EMAIL PREFERENCES > | https://phabricator.haskell.org/settings/panel/emailpreferences/ > | > | To: niteria, simonmar, simonpj, austin, bgamari > | Cc: thomie > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomasmiedema at gmail.com Thu Aug 6 11:06:14 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Thu, 6 Aug 2015 13:06:14 +0200 Subject: Master is not validating In-Reply-To: References: Message-ID: > your two most recent commits are not passing validate, see > https://phabricator.haskell.org/diffusion/GHC/history > That link was apparently missing a trailing slash. This is the correct link: https://phabricator.haskell.org/diffusion/GHC/history/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Fri Aug 7 22:38:27 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Fri, 7 Aug 2015 22:38:27 +0000 Subject: Simon away Message-ID: <9a261447298d4267a2c4d2029a3d36fb@DB4PR30MB030.064d.mgd.msft.net> Friends, I'm going to be on holiday for 2 weeks. Back in action on 24 August. Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at izbicki.me Fri Aug 7 22:40:30 2015 From: mike at izbicki.me (Mike Izbicki) Date: Fri, 7 Aug 2015 15:40:30 -0700 Subject: question about GHC API on GHC plugin Message-ID: I'm trying to write a GHC plugin. The purpose of the plugin is to provide Haskell bindings to Herbie. 
Herbie (https://github.com/uwplse/herbie) is a program that takes a mathematical statement as input, and gives you a numerically stable formula to compute it as output. The plugin is supposed to automate this process for Haskell programs. I can convert the core expressions into a format for Herbie just fine. Where I'm having trouble is converting the output from Herbie back into core. Given a string that represents a numeric operator (e.g. "log" or "+"), I can get that converted into a Name that matches the Name of the version of that operator in scope at the location. But in order to create an Expr, I need to convert the Name into a Var. All the functions that I can find for this (e.g. mkGlobalVar) also require the type of the variable. But I can't find a way to figure out the Type given a Name. How can I do this? From mike at izbicki.me Sat Aug 8 01:33:03 2015 From: mike at izbicki.me (Mike Izbicki) Date: Fri, 7 Aug 2015 18:33:03 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: Message-ID:

I've figured out a hack that partially works. If the Name of the function I'm trying to call is stored in `name`, then the following code puts its type in `t`:

```
hscenv <- getHscEnv
t <- liftIO $ do
    eps <- hscEPS hscenv
    -- eps_PTE maps Names to TyThings; fromJust fails if `name` is not loaded yet
    let i = fromJust $ lookupNameEnv (eps_PTE eps) name
    -- convert the TyThing to an Id before asking for its type
    return (varType (tyThingId i))
```

This, however, only works if the function has actually been used in the module before. That makes it more fragile than I'd like.

Also, once I have the type `t`, I'm having trouble creating the appropriate dictionary to pass as the first argument in core. I can get the name of the class constraint doing something like:

```
let (ClassPred c _) = classifyPredType . snd . splitAppTy . fst . splitAppTy $ dropForAlls t
```

But I can't find a function for creating a dictionary (of type Var) given a Class.

On Fri, Aug 7, 2015 at 3:40 PM, Mike Izbicki wrote: > I'm trying to write a GHC plugin. The purpose of the plugin is to > provide Haskell bindings to Herbie. 
Herbie > (https://github.com/uwplse/herbie) is a program that takes a > mathematical statement as input, and gives you a numerically stable > formula to compute it as output. The plugin is supposed to automate > this process for Haskell programs. > > I can convert the core expressions into a format for Herbie just fine. > Where I'm having trouble is converting the output from Herbie back > into core. Given a string that represents a numeric operator (e.g. > "log" or "+"), I can get that converted into a Name that matches the > Name of the version of that operator in scope at the location. But in > order to create an Expr, I need to convert the Name into a Var. All > the functions that I can find for this (e.g. mkGlobalVar) also require > the type of the variable. But I can't find a way to figure out the > Type given a Name. How can I do this? From ezyang at mit.edu Sat Aug 8 06:20:39 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Fri, 07 Aug 2015 23:20:39 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: Message-ID: <1439014742-sup-2126@sabre> Hello Mike, Give importDecl from LoadIface a try, or maybe tcLookupGlobal if you're in TcM. Edward Excerpts from Mike Izbicki's message of 2015-08-07 15:40:30 -0700: > I'm trying to write a GHC plugin. The purpose of the plugin is to > provide Haskell bindings to Herbie. Herbie > (https://github.com/uwplse/herbie) is a program that takes a > mathematical statement as input, and gives you a numerically stable > formula to compute it as output. The plugin is supposed to automate > this process for Haskell programs. > > I can convert the core expressions into a format for Herbie just fine. > Where I'm having trouble is converting the output from Herbie back > into core. Given a string that represents a numeric operator (e.g. > "log" or "+"), I can get that converted into a Name that matches the > Name of the version of that operator in scope at the location. 
But in > order to create an Expr, I need to convert the Name into a Var. All > the functions that I can find for this (e.g. mkGlobalVar) also require > the type of the variable. But I can't find a way to figure out the > Type given a Name. How can I do this? From eir at cis.upenn.edu Sat Aug 8 18:23:31 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Sat, 8 Aug 2015 14:23:31 -0400 Subject: Linking issue on Windows: #10672 Message-ID: <8839B562-9A83-4137-A0D6-5DC15137C9C6@cis.upenn.edu> Hi devs, Bug #10672 was posted two weeks ago and has not received any response. It concerns linking issues on Windows with Template Haskell. This is very far out of my depth, but I don't recognize the name of the poster (lukexi), and so it would be great to let that person know we've received the report and have at least triaged it. Who are the Windows task force? I can't seem to find that listed on https://ghc.haskell.org/trac/ghc/wiki/TeamGHC, but perhaps there's a better place to look. If you have any idea of what's going on, would you mind at least commenting on the bug? Thanks! Richard Template Haskell czar but linking buffoon and Windows ignoramus From johnw at newartisans.com Sat Aug 8 19:35:25 2015 From: johnw at newartisans.com (John Wiegley) Date: Sat, 08 Aug 2015 12:35:25 -0700 Subject: Linking issue on Windows: #10672 In-Reply-To: <8839B562-9A83-4137-A0D6-5DC15137C9C6@cis.upenn.edu> (Richard Eisenberg's message of "Sat, 8 Aug 2015 14:23:31 -0400") References: <8839B562-9A83-4137-A0D6-5DC15137C9C6@cis.upenn.edu> Message-ID: >>>>> Richard Eisenberg writes: > If you have any idea of what's going on, would you mind at least commenting > on the bug? I would also like to know who knows what about which linking platforms we support. 
John From johnrbowman at gmail.com Sun Aug 9 02:56:00 2015 From: johnrbowman at gmail.com (Jack Bowman) Date: Sat, 8 Aug 2015 22:56:00 -0400 Subject: Type checker error messages question Message-ID: Hi all, I'm new to the GHC codebase and am looking to contribute. I started trying to implement https://ghc.haskell.org/trac/ghc/ticket/9173 , "better type error messages". I'm having some difficulty and was hoping someone more experienced could point me in the right direction. Part of the proposed change is that the "inferred:" line lists both the expression ("Just 5") and its type ("Maybe a0"). That's good because it's more clear that way. Currently those are in different parts of the error message. The function misMatchMsg (TcErrors.hs) creates the text "couldn't match expected type...". The text "In the second argument of..." is created by funAppCtxt (TcExpr.hs) during type checking. Bringing these together seems tricky. When we're generating a message like "couldn't match expected type..." I don't see how to access the expression text, like the "LHsExpr Name"s available to funAppCtxt. Is there an easy way to get that? We have the constraint and the two types, but I don't think those include this info. It seems fundamental to the error messages that the context ("in the _ argument..., in the expression ...") is built up as *text* while we do type checking, which means it's hard to customize when building the final error message. Please let me know if I'm missing something. There's a lot of type checker state that I might be misunderstanding. Thanks,Jack -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dedgrant at gmail.com Sun Aug 9 17:52:11 2015 From: dedgrant at gmail.com (Darren Grant) Date: Sun, 9 Aug 2015 10:52:11 -0700 Subject: ghc-devs Digest, Vol 144, Issue 13 In-Reply-To: References: Message-ID: The Windows Task Force roster is here: https://ghc.haskell.org/trac/ghc/wiki/WindowsTaskForce A couple months ago I was looking into some of the linker issues (one manifestation of which was failure to link in mingw under TH phases as in #10672), but I was sadly unable to make much headway. I'm still interested in cracking this as Windows is my primary development platform, but new leads are required. Eventually, I may just seek a ground-up understanding of linkers in Windows and mingw. If anyone else has leads, please let me know if I can help. Cheers, Darren On Sun, Aug 9, 2015 at 5:00 AM, wrote: > Send ghc-devs mailing list submissions to > ghc-devs at haskell.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > or, via email, send a message with subject or body 'help' to > ghc-devs-request at haskell.org > > You can reach the person managing the list at > ghc-devs-owner at haskell.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of ghc-devs digest..." > > > Today's Topics: > > 1. Linking issue on Windows: #10672 (Richard Eisenberg) > 2. Re: Linking issue on Windows: #10672 (John Wiegley) > 3. Type checker error messages question (Jack Bowman) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Sat, 8 Aug 2015 14:23:31 -0400 > From: Richard Eisenberg > To: ghc-devs Devs > Subject: Linking issue on Windows: #10672 > Message-ID: <8839B562-9A83-4137-A0D6-5DC15137C9C6 at cis.upenn.edu> > Content-Type: text/plain; charset=us-ascii > > Hi devs, > > Bug #10672 was posted two weeks ago and has not received any response. It > concerns linking issues on Windows with Template Haskell. 
This is very far > out of my depth, but I don't recognize the name of the poster (lukexi), and > so it would be great to let that person know we've received the report and > have at least triaged it. Who are the Windows task force? I can't seem to > find that listed on https://ghc.haskell.org/trac/ghc/wiki/TeamGHC, but > perhaps there's a better place to look. > > If you have any idea of what's going on, would you mind at least > commenting on the bug? > > Thanks! > Richard > Template Haskell czar but linking buffoon and Windows ignoramus > > ------------------------------ > > Message: 2 > Date: Sat, 08 Aug 2015 12:35:25 -0700 > From: "John Wiegley" > To: Richard Eisenberg > Cc: ghc-devs Devs > Subject: Re: Linking issue on Windows: #10672 > Message-ID: > Content-Type: text/plain > > >>>>> Richard Eisenberg writes: > > > If you have any idea of what's going on, would you mind at least > commenting > > on the bug? > > I would also like to know who knows what about which linking platforms we > support. > > John > > > ------------------------------ > > Message: 3 > Date: Sat, 8 Aug 2015 22:56:00 -0400 > From: Jack Bowman > To: ghc-devs at haskell.org > Subject: Type checker error messages question > Message-ID: > < > CACcVNgUgbw9WLocN7dFM5Wewh8WbG4d81MwNmZQhcSedoSJEVQ at mail.gmail.com> > Content-Type: text/plain; charset="utf-8" > > Hi all, > I'm new to the GHC codebase and am looking to contribute. I started trying > to implement https://ghc.haskell.org/trac/ghc/ticket/9173 , "better type > error messages". I'm having some difficulty and was hoping someone more > experienced could point me in the right direction. > Part of the proposed change is that the "inferred:" line lists both the > expression ("Just 5") and its type ("Maybe a0"). That's good because it's > more clear that way. Currently those are in different parts of the error > message. The function misMatchMsg (TcErrors.hs) creates the text "couldn't > match expected type...". 
The text "In the second argument of..." is created > by funAppCtxt (TcExpr.hs) during type checking. Bringing these together > seems tricky. When we're generating a message like "couldn't match expected > type..." I don't see how to access the expression text, like the "LHsExpr > Name"s available to funAppCtxt. Is there an easy way to get that? We have > the constraint and the two types, but I don't think those include this > info. > It seems fundamental to the error messages that the context ("in the _ > argument..., in the expression ...") is built up as *text* while we do type > checking, which means it's hard to customize when building the final error > message. > Please let me know if I'm missing something. There's a lot of type checker > state that I might be misunderstanding. > Thanks,Jack > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < > http://mail.haskell.org/pipermail/ghc-devs/attachments/20150808/500a1dcd/attachment-0001.html > > > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > ------------------------------ > > End of ghc-devs Digest, Vol 144, Issue 13 > ***************************************** > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eir at cis.upenn.edu Sun Aug 9 20:10:46 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Sun, 9 Aug 2015 16:10:46 -0400 Subject: Type checker error messages question In-Reply-To: References: Message-ID: <33BB517C-9EF7-4E14-BA3C-365124E21623@cis.upenn.edu> Hi Jack, You're spot on here. And, looking through #9173, I wouldn't expect this to be an easy change, precisely for the reason you articulate here. You could change the TypeEqOrigin constructor of CtOrigin (in TcRnTypes.hs) to also carry the expression, except that sometimes even that will be hard. 
In unrelated work, I needed to do precisely this, and needed to use something like (Maybe (HsExpr TcId)). In any case, this isn't newcomer material, I'm afraid. You're welcome to keep trying, but I think your energy is better spent elsewhere for a few bugs and then to return here with a little more experience working with GHC. If you want type-checker related bugs, you could always look through the first section of my pet page of such things: https://ghc.haskell.org/trac/ghc/wiki/Status/RAE-Tickets But even things listed there as "easy" mean that they may be easy for someone who knows GHC well, but not so much for a newcomer. Thanks for contributing! Richard On Aug 8, 2015, at 10:56 PM, Jack Bowman wrote: > Hi all, > I'm new to the GHC codebase and am looking to contribute. I started trying to implement https://ghc.haskell.org/trac/ghc/ticket/9173 , "better type error messages". I'm having some difficulty and was hoping someone more experienced could point me in the right direction. > Part of the proposed change is that the "inferred:" line lists both the expression ("Just 5") and its type ("Maybe a0"). That's good because it's more clear that way. Currently those are in different parts of the error message. The function misMatchMsg (TcErrors.hs) creates the text "couldn't match expected type...". The text "In the second argument of..." is created by funAppCtxt (TcExpr.hs) during type checking. Bringing these together seems tricky. When we're generating a message like "couldn't match expected type..." I don't see how to access the expression text, like the "LHsExpr Name"s available to funAppCtxt. Is there an easy way to get that? We have the constraint and the two types, but I don't think those include this info. > It seems fundamental to the error messages that the context ("in the _ argument..., in the expression ...") is built up as *text* while we do type checking, which means it's hard to customize when building the final error message. 
> Please let me know if I'm missing something. There's a lot of type checker state that I might be misunderstanding. > Thanks, Jack > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthewtpickering at gmail.com Mon Aug 10 23:10:36 2015 From: matthewtpickering at gmail.com (Matthew Pickering) Date: Tue, 11 Aug 2015 01:10:36 +0200 Subject: Record syntax for pattern synonyms Message-ID:

I was looking at implementing #8582 but before I got too far I thought it best to clear up a few design points.

A summary can be found below and a more fleshed out version with some examples can be found on the wiki page[1].

My main question is about how best to deal with record updates. Say that Foo is a record pattern synonym; then how would we expect the following program to behave?

```
foo a@Foo{..} = a {bar = baz}
```

Then say that `pattern Foo{bar} = Just bar`, how should the following two programs behave? Is this partiality any different to that caused by ordinary use of pattern synonyms? (At least partiality in patterns is warned about, but how comprehensive is the coverage?)

```
foo :: Maybe a -> Maybe Int
foo x = x {bar = 5}

-- error as `bar` uniquely determines that we need Foo
bar = Nothing {bar = 5}
```

Abandoning record updates seems to make record syntax for pattern synonyms far less useful and more confusing to users. Is this design how others have imagined it? I have CCed Gergő, who originally implemented the extension and created #8582.

Matt

----

Unidirectional patterns
* Provide the same ability to match as normal records (RecordWildcards etc)
* Provide selector functions

Bidirectional patterns
* Provide the constructor which can be used as normal record constructors

Record Updates - unclear
* Generalise update syntax to arbitrary expressions?
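
For readers who want to experiment with the questions above, here is a minimal sketch of the tuple-backed `Foo` synonym mentioned on the wiki page. It assumes a later GHC (8.0 or newer) in which record pattern synonyms and record updates through them are available, which was not the case when this message was written. The names `foo'`, `setBar` and `main` are illustrative only and do not come from the thread.

```haskell
{-# LANGUAGE PatternSynonyms #-}
module Main where

-- Record pattern synonym over plain pairs (shape taken from the wiki example).
-- It is implicitly bidirectional, so `Foo` also works as a constructor.
pattern Foo :: a -> b -> (a, b)
pattern Foo{bar, baz} = (bar, baz)

-- Variant of the program from the message, written with an explicit field
-- pattern instead of RecordWildCards and with both fields at the same type.
foo' :: (b, b) -> (b, b)
foo' a@Foo{baz = b} = a {bar = b}

-- Record update through the synonym: only the `bar` component changes.
setBar :: a -> (a, b) -> (a, b)
setBar v p = p {bar = v}

main :: IO ()
main = do
  print (bar (1 :: Int, 'x'))          -- selector generated by the synonym
  print (foo' (1 :: Int, 2))           -- copies `baz` into `bar`
  print (setBar 'y' ('x', True))       -- update on an ordinary pair
  print (Foo {bar = 'a', baz = False}) -- record-style construction
```

Whether the partial case (`pattern Foo{bar} = Just bar`) should behave the same way is exactly the open design point raised in the message, so the sketch sticks to the total, tuple-based synonym.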
[1]: https://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms#RecordPatternSynonyms From eir at cis.upenn.edu Tue Aug 11 12:26:00 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Tue, 11 Aug 2015 08:26:00 -0400 Subject: Record syntax for pattern synonyms In-Reply-To: References: Message-ID: On Aug 10, 2015, at 7:10 PM, Matthew Pickering wrote: > > My main question is about how best to deal with record updates. Say > that Foo is a record pattern synonym then how would we expect the > following program to behave? > > ``` > foo a@Foo{..} = a {bar = baz} > ``` > I'm assuming `pattern Foo{bar, baz} = (bar, baz)` from the wiki page, without any further pattern type signature. This example then looks straightforward to me -- I feel I'm missing the subtlety. `foo` would get the type `(a,b) -> (b,b)` and would be roughly equivalent to `foo a@(bar, baz) = case a of (_, baz2) -> (baz, baz2)`. The case statement and baz2 are necessary just to provide a predictable desugaring of record updates; handwritten code should clearly be more succinct. > Then say that `pattern Foo{bar} = Just bar`, how should the following > two programs behave? Is this partiality any different to that caused > by ordinary use of pattern synonyms? Did you mean "pattern synonyms" --> "record updates"? > (At least partiality in patterns > is warned but how comprehensive is the coverage?) > > ``` > foo :: Maybe a -> Maybe Int > foo x = x {bar = 5} This would desugar to `foo x = case x of Just _ -> Just 5`. I'm not sure about pattern exhaustiveness warnings, but I would expect such a record update to be partial. The partiality of record updates has been surprising in the past, but I don't think adding pattern synonyms to the mix should change that. > > -- error as `quux` uniquely determines that we need Foo > quux = Nothing {bar = 5} This (renamed example) does not compile, as GHC would parse this as a record-creation expression using the Nothing constructor, which does not have record fields. 
On the other hand > quux = (Nothing) {bar = 5} *would* compile. (This is no different than record updates today.) It would desugar to `quux = case Nothing of Just _ -> Just 5` and `quux` would have type `Maybe Int` (assuming `5 :: Int`). Evaluating `quux` would clearly fail. > ``` > > Abandoning record updates seems to make record syntax for pattern > synonyms far less useful and confusing to users. Is this design how > others have imagined it? I have cced Gergő who originally implemented > the extension and created #8582. I would like to keep record updates for the same reasons you appear to. I will warn that they are quite hard to work with, though! About 220 lines of dense code (including comments) are necessary to type-check regular old record updates. This isn't to scare you off, but to have you suitably forewarned and forearmed. > > Matt > > ---- > > Unidirectional patterns > * Provide the same ability to match as normal records (RecordWildcards etc) > * Provide selector functions > > Bidirectional patterns > * Provide the constructor which can be used as normal record constructors > > Record Updates - unclear > * Generalise update syntax to arbitrary expressions? What do you mean here? Without checking, I assumed that the x in `x { ... }` had to be a variable. But this is wrong! See 3.15.3 of the Haskell 2010 report (https://www.haskell.org/onlinereport/haskell2010/haskellch3.html#x8-490003.15). So I think it's already generalized. Many thanks for taking this on! 
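
The two desugarings described in this reply can be sketched in plain Haskell, without any pattern synonyms, to make the partiality concrete. This is only an illustration of the `case` expressions quoted above, not what GHC actually generates; the names `fooDesugared` and `quuxDesugared` are made up for the example.

```haskell
module Main where

import Control.Exception (SomeException, evaluate, try)

-- Hand-written stand-in for `foo x = x {bar = 5}` under
-- `pattern Foo{bar} = Just bar`: partial, there is no branch for Nothing.
fooDesugared :: Maybe a -> Maybe Int
fooDesugared x = case x of
  Just _ -> Just 5

-- Hand-written stand-in for `quux = (Nothing) {bar = 5}`:
-- it type-checks, but the only branch can never match.
quuxDesugared :: Maybe Int
quuxDesugared = case (Nothing :: Maybe ()) of
  Just _ -> Just 5

main :: IO ()
main = do
  -- The update succeeds when the scrutinee is a Just.
  print (fooDesugared (Just 'c'))
  -- Forcing quuxDesugared raises a pattern-match failure, caught here.
  r <- try (evaluate quuxDesugared) :: IO (Either SomeException (Maybe Int))
  putStrLn (either (const "quux: pattern match failure") show r)
```

GHC may report incomplete-pattern warnings for both definitions, which matches the expectation stated above that such an update is partial.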
Richard > > > [1]: https://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms#RecordPatternSynonyms > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From gergo at erdi.hu Tue Aug 11 13:05:49 2015 From: gergo at erdi.hu (Dr. ÉRDI Gergő) Date: Tue, 11 Aug 2015 21:05:49 +0800 Subject: Record syntax for pattern synonyms In-Reply-To: References: Message-ID: Record field updates via patsyns look very weird to me (and, as just a user, I would find them unexpected). Can't we do just matchers and builders for now, and add field updaters as a second step, if there's consensus that it's a Good Idea? Bye, Gergo On 11 Aug 2015 07:11, "Matthew Pickering" wrote: > I was looking at implementing #8582 but before I got too far I thought > it best to clear up a few design points. > > A summary can be found below and a more fleshed out version with some > examples can be found on the wiki page[1]. > > My main question is about how best to deal with record updates. Say > that Foo is a record pattern synonym then how would we expect the > following program to behave? > > ``` > foo a@Foo{..} = a {bar = baz} > ``` > > Then say that `pattern Foo{bar} = Just bar`, how should the following > two programs behave? Is this partiality any different to that caused > by ordinary use of pattern synonyms? (At least partiality in patterns > is warned but how comprehensive is the coverage?) > > ``` > foo :: Maybe a -> Maybe Int > foo x = x {bar = 5} > > -- error as `bar` uniquely determines that we need Foo > bar = Nothing {bar = 5} > ``` > > Abandoning record updates seems to make record syntax for pattern > synonyms far less useful and confusing to users. Is this design how > others have imagined it? I have cced Gergő who originally implemented > the extension and created #8582. 
>
> Matt
>
> ----
>
> Unidirectional patterns
> * Provide the same ability to match as normal records (RecordWildcards etc)
> * Provide selector functions
>
> Bidirectional patterns
> * Provide the constructor which can be used as normal record constructors
>
> Record Updates - unclear
> * Generalise update syntax to arbitrary expressions?
>
>
> [1]:
> https://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms#RecordPatternSynonyms
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From eir at cis.upenn.edu Tue Aug 11 13:11:54 2015
From: eir at cis.upenn.edu (Richard Eisenberg)
Date: Tue, 11 Aug 2015 09:11:54 -0400
Subject: Record syntax for pattern synonyms
In-Reply-To:
References:
Message-ID: <7178A68B-E256-4F70-A006-26569097A0E5@cis.upenn.edu>

I think debating the overall idea before implementing is a great idea.

Here's my reason why I like these: it allows a library designer to change an internal representation of a previously-concrete datatype while providing backward compatibility. Since the datatype had been exporting its constructors, clients might have used record update. Now the author can change the datatype and add pattern synonyms to keep the interface constant. Without the feature proposed here, such a change would be impossible.

Furthermore, record update syntax is awfully convenient, and may be attractive to new libraries with abstract types. I haven't tried to do it, but I imagine you could do some cool lens-like constructs with proper (ab)use of this feature.

Richard

On Aug 11, 2015, at 9:05 AM, Dr. ÉRDI Gergő wrote:

> Record field updates via patsyns looks very weird to me (and, as just a user, it would be unexpected). Can't we do just matchers and builders for now, and add field updaters as a second step, if there's consensus that it's a Good Idea?
>
> Bye,
> Gergo
>
> On 11 Aug 2015 07:11, "Matthew Pickering" wrote:
> I was looking at implementing #8582 but before I got too far I thought
> it best to clear up a few design points.
>
> A summary can be found below and a more fleshed out version with some
> examples can be found on the wiki page[1].
>
> My main question is about how best to deal with record updates. Say
> that Foo is a record pattern synonym then how would we expect the
> following program to behave?
>
> ```
> foo a@Foo{..} = a {bar = baz}
> ```
>
> Then say that `pattern Foo{bar} = Just bar`, how should the following
> two programs behave? Is this partiality any different to that caused
> by ordinary use of pattern synonyms? (At least partiality in patterns
> is warned but how comprehensive is the coverage?)
>
> ```
> foo :: Maybe a -> Maybe Int
> foo x = x {bar = 5}
>
> -- error as `bar` unique determines that we need Foo
> bar = Nothing {bar = 5}
> ```
>
> Abandoning record updates seems to make record syntax for pattern
> synonyms far less useful and confusing to users. Is this design how
> others have imagined it? I have cced Gergő who originally implemented
> the extension and created #8582.
>
> Matt
>
> ----
>
> Unidirectional patterns
> * Provide the same ability to match as normal records (RecordWildcards etc)
> * Provide selector functions
>
> Bidirectional patterns
> * Provide the constructor which can be used as normal record constructors
>
> Record Updates - unclear
> * Generalise update syntax to arbitrary expressions?
> > > [1]: https://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms#RecordPatternSynonyms > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... URL: From allbery.b at gmail.com Tue Aug 11 13:42:23 2015 From: allbery.b at gmail.com (Brandon Allbery) Date: Tue, 11 Aug 2015 09:42:23 -0400 Subject: Record syntax for pattern synonyms In-Reply-To: <7178A68B-E256-4F70-A006-26569097A0E5@cis.upenn.edu> References: <7178A68B-E256-4F70-A006-26569097A0E5@cis.upenn.edu> Message-ID: On Tue, Aug 11, 2015 at 9:11 AM, Richard Eisenberg wrote: > I haven't tried to do it, but I imagine you could do some cool lens-like > constructs with proper (ab)use of this feature. Seems likely given that generalizing record update was the original impetus for lenses. :) -- brandon s allbery kf8nh sine nomine associates allbery.b at gmail.com ballbery at sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net -------------- next part -------------- An HTML attachment was scrubbed... URL: From lonetiger at gmail.com Tue Aug 11 19:43:34 2015 From: lonetiger at gmail.com (lonetiger at gmail.com) Date: Tue, 11 Aug 2015 21:43:34 +0200 Subject: Linker questions Message-ID: <55ca5066.0785c20a.59f3.5fe4@mx.google.com> Hi *, I had a few questions about the linker I was hoping someone can help me with, I'm pretty new to it so any insights would be appreciated. 

1) Has to do with checkProddableBlock and #10672 and #10563

static void checkProddableBlock (ObjectCode *oc, void *addr, size_t size )
{
   ProddableBlock* pb;

   for (pb = oc->proddables; pb != NULL; pb = pb->next) {
      char* s = (char*)(pb->start);
      char* e = s + pb->size;
      char* a = (char*)addr;
      if (a >= s && (a+size) <= e) return;
   }
   barf("checkProddableBlock: invalid fixup in runtime linker: %p", addr);
}

From what I have found, these errors seem to happen because oc->proddables is initially NULL, so the for loop is skipped. From what I can tell, this function checks whether there is a "proddable" that contains the supplied address and size. So if there are no proddables to begin with, shouldn't this check just be skipped, and the caller not use this ObjectCode, instead of erroring out?

2) The second question is about static int ocGetNames_PEi386 ( ObjectCode* oc )
I am getting a test failure because it claims that the .eh_frame section is misaligned. This comes from this code:

if (kind != SECTIONKIND_OTHER && end >= start) {
    if ((((size_t)(start)) % 4) != 0) {
        errorBelch("Misaligned section %s: %p", (char*)secname, start);
        stgFree(secname);
        return 0;
    }

Where start is defined as:

start = ((UChar*)(oc->image)) + sectab_i->PointerToRawData;

and oc->image is a memory location received from allocateImageAndTrampolines.

In the case of my test failure, the .eh_frame section seems to begin at 0x30F. Since oc->image will always be 4-byte aligned (so it doesn't really matter in the check), the error is raised because PointerToRawData isn't aligned to 4.

So my question is: would it not be better to determine the alignment just from the alignment flag in the PE section header, instead of checking the load address (which is always going to be aligned to 4) plus the file pointer to the first page of the section within the COFF file? Like objdump and dumpbin do?

e.g.

  9 .eh_frame     00000038  00000000  00000000  0000030f  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA

Is the output from objdump which correctly determines the alignment from the section. From what I understand from the PE specification the on disk address doesn't have to be aligned by 4:

"For object files, the value *should* be aligned on a 4-byte boundary for best performance."

I'm wondering if we are incorrectly erroring out here, instead of using the section and making sure we pad it to the alignment boundary.

Regards,
Tamar
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From matthewtpickering at gmail.com Tue Aug 11 21:26:38 2015
From: matthewtpickering at gmail.com (Matthew Pickering)
Date: Tue, 11 Aug 2015 23:26:38 +0200
Subject: Record syntax for pattern synonyms
In-Reply-To:
References:
Message-ID:

Thank you for your comments Richard.

> I'm assuming `pattern Foo{bar, baz} = (bar, baz)` from the wiki page, without any further pattern type signature. This example then looks straightforward to me -- I feel I'm missing the subtlety. `foo` would get the type `(a,b) -> (b,b)` and would be roughly equivalent to `foo a@(bar, baz) = case a of (_, baz2) -> (baz, baz2)`. The case statement and baz2 is necessary just to provide a predictable desugaring of record updates; handwritten code should clearly be more succinct.

This is how I imagined it to work.

> This would desugar to `foo x = case x of Just _ -> Just 5`. I'm not sure about pattern exhaustiveness warnings, but I would expect such a record update to be partial. The partiality of record updates has been surprising in the past, but I don't think adding pattern synonyms to the mix should change that.

Yes, I agree.

> I would like to keep record updates for the same reasons you appear to. I will warn that they are quite hard to work with, though! About 220 lines of dense code (including comments) are necessary to type-check regular old record updates.
This isn't to scare you off, but to have you suitably forewarned and forearmed. I consider myself warned! > What do you mean here? Without checking, I assumed that the x in `x { ... }` had to be a variable. But this is wrong! See 3.15.3 of the Haskell 2010 report (https://www.haskell.org/onlinereport/haskell2010/haskellch3.html#x8-490003.15). So I think it's already generalized. Good news. This should simplify the implementation. > > Many thanks for taking this on! > Richard > From matthewtpickering at gmail.com Wed Aug 12 15:34:23 2015 From: matthewtpickering at gmail.com (Matthew Pickering) Date: Wed, 12 Aug 2015 17:34:23 +0200 Subject: [GHC] #9868: ghc: panic! Dynamic linker not initialised In-Reply-To: <061.f170b9a853efaaee4125090d86cd0913@haskell.org> References: <046.5f3f9e960e287bc6e82a07e7ce17a8b6@haskell.org> <061.f170b9a853efaaee4125090d86cd0913@haskell.org> Message-ID: I can't reproduce this with 7.10.2. On Wed, Aug 12, 2015 at 4:54 PM, GHC wrote: > #9868: ghc: panic! Dynamic linker not initialised > -------------------------------------+------------------------------------- > Reporter: Jamedjo | Owner: > Type: bug | Status: infoneeded > Priority: normal | Milestone: > Component: Compiler | Version: 7.8.3 > Resolution: | Keywords: > Operating System: MacOS X | Architecture: x86_64 > Type of failure: Compile-time | (amd64) > crash | Test Case: > Blocked By: | Blocking: > Related Tickets: | Differential Revisions: > -------------------------------------+------------------------------------- > Changes (by bgamari): > > * status: new => infoneeded > > > Comment: > > I'm not really sure what we can do with this bug as is as there isn't > enough information here to reproduce the crash. Moreover, it looks two of > the issues reported here are distinct from the panic which apparently > provoked the ticket. > > Could those affected by this see if they can reproduce the issue on GHC > 7.10.2 and if so provide a detailed set of steps to reproduce the issue? 
> > -- > Ticket URL: > GHC > The Glasgow Haskell Compiler From eir at cis.upenn.edu Fri Aug 14 02:47:36 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Thu, 13 Aug 2015 22:47:36 -0400 Subject: Linker questions In-Reply-To: <55ca5066.0785c20a.59f3.5fe4@mx.google.com> References: <55ca5066.0785c20a.59f3.5fe4@mx.google.com> Message-ID: <93CD0804-6CAA-4F50-B8D4-62AA0376BA57@cis.upenn.edu> Hi Tamar, I haven't a clue about any of this. But I didn't want your detailed email to go without any response. Perhaps agitate a bit on #ghc at freenode to get some attention? Also, be aware that many people are on holiday right now, and so responses may be slower than at other times. Sorry I can't be more helpful, but I definitely appreciate your looking into this! Richard On Aug 11, 2015, at 3:43 PM, lonetiger at gmail.com wrote: > Hi *, > > I had a few questions about the linker I was hoping someone can help me with, > I'm pretty new to it so any insights would be appreciated. > > 1) Has to do with checkProddableBlock and #10672 and #10563 > > static void checkProddableBlock (ObjectCode *oc, void *addr, size_t size ) > { > ProddableBlock* pb; > > for (pb = oc->proddables; pb != NULL; pb = pb->next) { > char* s = (char*)(pb->start); > char* e = s + pb->size; > char* a = (char*)addr; > if (a >= s && (a+size) <= e) return; > } > barf("checkProddableBlock: invalid fixup in runtime linker: %p", addr); > } > > From what I have found, these errors seem to happen because oc->proddables is initially NULL, > the for loop is skipped. From what I can tell, this function is checking if there's a "proddable" > that fits within the supplied address and size. So if there is no proddables to begin with, should this > check just not be skipped and the callee of this call not use this ObjectCode instead of erroring out? 
> > 2) The second question is about static int ocGetNames_PEi386 ( ObjectCode* oc ) > I am getting a test failure because it is claiming that .eh_frame section is misaligned. > This comes from this code: > > if (kind != SECTIONKIND_OTHER && end >= start) { > if ((((size_t)(start)) % 4) != 0) { > errorBelch("Misaligned section %s: %p", (char*)secname, start); > stgFree(secname); > return 0; > } > > Where start is defined as: > > start = ((UChar*)(oc->image)) + sectab_i->PointerToRawData; > and oc->image is a memory location received by allocateImageAndTrampolines. > > In the case of my test failure it is because the .eh_frame section seems to begin at 0x30F > since oc->image will always be 4 aligned (so it doesn't really matter in the check) it gives that error because PointerToRawData isn't aligned by 4. > > So my question is would it not be better just to check the alignment flag in the PE section header instead of checking the load address (which is always going to aligned to 4?) and The file pointer to > the first page of the section within the COFF file to determine the alignment? Like objdump and dumpbin do? > > e.g. > > 9 .eh_frame 00000038 00000000 00000000 0000030f 2**2 > CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA > > Is the output from objdump which correctly determines the alignment from the section. From what I understand from the PE specification > the on disk address doesn't have to be aligned by 4: > > "For object files, the value *should* be aligned on a 4-byte boundary for best performance." > > I'm wondering if we are incorrectly erroring out here, instead of using the section and making sure we pad it to the alignment boundary. > > Regards, > Tamar > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From lonetiger at gmail.com Fri Aug 14 03:31:18 2015
From: lonetiger at gmail.com (lonetiger at gmail.com)
Date: Fri, 14 Aug 2015 05:31:18 +0200
Subject: Linker questions
In-Reply-To: <93CD0804-6CAA-4F50-B8D4-62AA0376BA57@cis.upenn.edu>
References: <55ca5066.0785c20a.59f3.5fe4@mx.google.com> <93CD0804-6CAA-4F50-B8D4-62AA0376BA57@cis.upenn.edu>
Message-ID: <55cd6106.847bc20a.1cd5.018d@mx.google.com>

Hi Richard,

Thanks for the reply, I completely forgot that most people were probably on holidays. I'll try the IRC channel as well.

Cheers,
Tamar

From: Richard Eisenberg
Sent: Friday, August 14, 2015 04:46
To: lonetiger at gmail.com
Cc: GHC
Subject: Re: Linker questions

Hi Tamar,

I haven't a clue about any of this. But I didn't want your detailed email to go without any response. Perhaps agitate a bit on #ghc at freenode to get some attention? Also, be aware that many people are on holiday right now, and so responses may be slower than at other times.

Sorry I can't be more helpful, but I definitely appreciate your looking into this!

Richard

On Aug 11, 2015, at 3:43 PM, lonetiger at gmail.com wrote:

Hi *,

I had a few questions about the linker I was hoping someone can help me with, I'm pretty new to it so any insights would be appreciated.

1) Has to do with checkProddableBlock and #10672 and #10563

static void checkProddableBlock (ObjectCode *oc, void *addr, size_t size )
{
   ProddableBlock* pb;

   for (pb = oc->proddables; pb != NULL; pb = pb->next) {
      char* s = (char*)(pb->start);
      char* e = s + pb->size;
      char* a = (char*)addr;
      if (a >= s && (a+size) <= e) return;
   }
   barf("checkProddableBlock: invalid fixup in runtime linker: %p", addr);
}

From what I have found, these errors seem to happen because oc->proddables is initially NULL, the for loop is skipped. From what I can tell, this function is checking if there's a "proddable" that fits within the supplied address and size. So if there is no proddables to begin with, should this check just not be skipped and the callee of this call not use this ObjectCode instead of erroring out?

2) The second question is about static int ocGetNames_PEi386 ( ObjectCode* oc )
I am getting a test failure because it is claiming that .eh_frame section is misaligned. This comes from this code:

if (kind != SECTIONKIND_OTHER && end >= start) {
    if ((((size_t)(start)) % 4) != 0) {
        errorBelch("Misaligned section %s: %p", (char*)secname, start);
        stgFree(secname);
        return 0;
    }

Where start is defined as:

start = ((UChar*)(oc->image)) + sectab_i->PointerToRawData;

and oc->image is a memory location received by allocateImageAndTrampolines.

In the case of my test failure it is because the .eh_frame section seems to begin at 0x30F since oc->image will always be 4 aligned (so it doesn't really matter in the check) it gives that error because PointerToRawData isn't aligned by 4.

So my question is would it not be better just to check the alignment flag in the PE section header instead of checking the load address (which is always going to aligned to 4?) and The file pointer to the first page of the section within the COFF file to determine the alignment? Like objdump and dumpbin do?

e.g.

  9 .eh_frame     00000038  00000000  00000000  0000030f  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA

Is the output from objdump which correctly determines the alignment from the section. From what I understand from the PE specification the on disk address doesn't have to be aligned by 4:

"For object files, the value *should* be aligned on a 4-byte boundary for best performance."

I'm wondering if we are incorrectly erroring out here, instead of using the section and making sure we pad it to the alignment boundary.
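To make the alignment-flag idea concrete, here is a rough sketch in Haskell (it is not the RTS linker's C code) of decoding the requested alignment from a COFF section header's Characteristics field and rounding an address up to that boundary. The function names and the sample Characteristics value `0x40300040` are assumptions chosen for illustration; the bit layout is that of the `IMAGE_SCN_ALIGN_*` constants in the PE/COFF specification.

```haskell
module Main where

import Data.Bits (shiftL, shiftR, (.&.))
import Data.Word (Word32)

-- The IMAGE_SCN_ALIGN_* flags occupy bits 20..23 of Characteristics.
alignMask :: Word32
alignMask = 0x00F00000

-- Decode the requested alignment in bytes. Encoded values 1..14 stand
-- for 1, 2, 4, ..., 8192 bytes; anything else is treated as "no flag".
sectionAlignment :: Word32 -> Maybe Int
sectionAlignment characteristics
  | code >= 1 && code <= 14 = Just (1 `shiftL` (code - 1))
  | otherwise               = Nothing
  where
    code :: Int
    code = fromIntegral ((characteristics .&. alignMask) `shiftR` 20)

-- Round an address or offset up to the next multiple of the alignment.
roundUpTo :: Int -> Int -> Int
roundUpTo align addr = ((addr + align - 1) `div` align) * align

main :: IO ()
main = do
  -- Hypothetical value: initialised data, readable, 4-byte alignment.
  print (sectionAlignment 0x40300040)  -- Just 4
  -- The raw-data offset 0x30F from the objdump output, padded to 4 bytes.
  print (roundUpTo 4 0x30F)            -- 784
```

Under these assumptions the 4-byte alignment that objdump reports as `2**2` comes out of the header flag alone, independently of where the raw data happens to sit in the file.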

Regards,
Tamar
_______________________________________________
ghc-devs mailing list
ghc-devs at haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mle+hs at mega-nerd.com Fri Aug 14 07:59:02 2015
From: mle+hs at mega-nerd.com (Erik de Castro Lopo)
Date: Fri, 14 Aug 2015 17:59:02 +1000
Subject: Forcing a linking error?
Message-ID: <20150814175902.6607035ff2d9876d95ee38e0@mega-nerd.com>

Dear ghc-devs,

There is a commonly used library which has at least one function that, when compiled into a program, requires the threaded run time system. Without the threaded runtime, the program just hangs.

One kludgy solution to this problem is to have the function check for Control.Concurrent.rtsSupportsBoundThreads being true and throw an error if it's not. However, it would be much nicer if this could be turned into a link time error.

Anyone have any ideas how this might be done?

Cheers,
Eri
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

From alexander at plaimi.net Fri Aug 14 09:30:05 2015
From: alexander at plaimi.net (Alexander Berntsen)
Date: Fri, 14 Aug 2015 11:30:05 +0200
Subject: Proposal: Include GHC version target in libraries' description
In-Reply-To: <55C1ECCC.5080409@plaimi.net>
References: <55C1ECCC.5080409@plaimi.net>
Message-ID: <55CDB51D.8070504@plaimi.net>

On 05/08/15 13:00, Alexander Berntsen wrote:
> Proposed improvement: Add "[library]-[library version] is bundled
> with ghc-[GHC version]." in the description string of
> [library].cabal. This immediately clears up the confusion when
> looking at [library]'s hackage page.

Following a discussion with Herbert & Duncan, it would seem Hackage/.cabal may be a better place to solve this. I don't want this information to be lost on IRC, so I'd like to summarise here on the list.

We can:

- make GHC a distribution and use that field,
- use the platform feature,
- make a new field -- bundled-with, bundles, or similar,
- or use the package collection feature.

All of these have upsides and downsides. But we should choose one and be done with it. So let's make a decision. I like the third option, but I value resolving it quickly more than getting my preferred colour on the bikeshed.

--
Alexander
alexander at plaimi.net
https://secure.plaimi.net/~alexander

From omeragacan at gmail.com Fri Aug 14 10:51:49 2015
From: omeragacan at gmail.com (Ömer Sinan Ağacan)
Date: Fri, 14 Aug 2015 06:51:49 -0400
Subject: Forcing a linking error?
In-Reply-To: <20150814175902.6607035ff2d9876d95ee38e0@mega-nerd.com>
References: <20150814175902.6607035ff2d9876d95ee38e0@mega-nerd.com>
Message-ID:

Here's an example that fails with a link time error when -threaded is not used:

➜ rts_test ghc --make Main.hs
[1 of 1] Compiling Main             ( Main.hs, Main.o )
Linking Main ...
Main.o: In function `rn4_info':
(.text+0x26): undefined reference to `wakeUpRts'
collect2: error: ld returned 1 exit status

With -threaded it works:

➜ rts_test ghc --make Main.hs -threaded
Linking Main ...

Code:

➜ rts_test cat Main.hs
{-# LANGUAGE ForeignFunctionInterface #-}

module Main where

foreign import ccall "wakeUpRts" wakeUpRts :: IO ()

main :: IO ()
main = return ()

Basically, I found a function in the GHC RTS that is only defined when THREADED_RTS is defined, and referred to it in my program.

2015-08-14 3:59 GMT-04:00 Erik de Castro Lopo :
> Dear ghc-devs,
>
> There is a commonly used library which has at least one function
> that when compiled into a program, requires the threaded run time
> system. Without the threaded runtime, the program just hangs.
>
> One kludgy solution to this problem is to have the function check
> for Control.Concurrent.rtsSupportsBoundThreads being true and
> throwing an error if its not. However, it would be much nicer if
> this could be turned into a link time error.
>
> Anyone have any ideas how this might be done?
>
> Cheers,
> Eri
> --
> ----------------------------------------------------------------------
> Erik de Castro Lopo
> http://www.mega-nerd.com/
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

From luka.rahne at gmail.com Fri Aug 14 17:20:47 2015
From: luka.rahne at gmail.com (Luka Rahne)
Date: Fri, 14 Aug 2015 19:20:47 +0200
Subject: problem running on arm
Message-ID:

Hello everyone,

I am new here. I have built a cross-compiler for RedPitaya (http://redpitaya.com), but now I am unable to run hello world (main = putStrLn "hello world").

Running with +RTS -Gg -RTS prints out a bunch of data. What I think is going on is that the GC consumes all memory.

Here is one output on the device (using a larger heap just takes more time and produces longer output):

https://gist.github.com/ra1u/ab7ab0c23b86436a09ae

In qemu everything works fine.

Here is the GHC verbose output for the compilation:
https://gist.github.com/ra1u/ed376dc81ea21cd5a66b

If somebody has some pointers to share I would be very happy.

best regards, Luka Rahne

From ezyang at mit.edu Fri Aug 14 17:51:32 2015
From: ezyang at mit.edu (Edward Z. Yang)
Date: Fri, 14 Aug 2015 10:51:32 -0700
Subject: Forcing a linking error?
In-Reply-To:
References: <20150814175902.6607035ff2d9876d95ee38e0@mega-nerd.com>
Message-ID: <1439574618-sup-8381@sabre>

Omer, this ticket may be of interest to you:

https://ghc.haskell.org/trac/ghc/ticket/7790

Edward

Excerpts from Ömer Sinan Ağacan's message of 2015-08-14 03:51:49 -0700:
> Here's an example that fails with a link time error when -threaded is not used:
>
> ➜ rts_test ghc --make Main.hs
> [1 of 1] Compiling Main             ( Main.hs, Main.o )
> Linking Main ...
> Main.o: In function `rn4_info':
> (.text+0x26): undefined reference to `wakeUpRts'
> collect2: error: ld returned 1 exit status
>
> With -threaded it works:
>
> ➜ rts_test ghc --make Main.hs -threaded
> Linking Main ...
>
> Code:
>
> ➜ rts_test cat Main.hs
> {-# LANGUAGE ForeignFunctionInterface #-}
>
> module Main where
>
> foreign import ccall "wakeUpRts" wakeUpRts :: IO ()
>
> main :: IO ()
> main = return ()
>
> What I did is basically I found a function in GHC RTS that is only defined when
> THREADED_RTS is defined and referred to it in my program.
>
> 2015-08-14 3:59 GMT-04:00 Erik de Castro Lopo :
> > Dear ghc-devs,
> >
> > There is a commonly used library which has at least one function
> > that when compiled into a program, requires the threaded run time
> > system. Without the threaded runtime, the program just hangs.
> >
> > One kludgy solution to this problem is to have the function check
> > for Control.Concurrent.rtsSupportsBoundThreads being true and
> > throwing an error if its not. However, it would be much nicer if
> > this could be turned into a link time error.
> >
> > Anyone have any ideas how this might be done?
> >
> > Cheers,
> > Eri
> > --
> > ----------------------------------------------------------------------
> > Erik de Castro Lopo
> > http://www.mega-nerd.com/
> > _______________________________________________
> > ghc-devs mailing list
> > ghc-devs at haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

From luka.rahne at gmail.com Sat Aug 15 18:45:38 2015
From: luka.rahne at gmail.com (Luka Rahne)
Date: Sat, 15 Aug 2015 20:45:38 +0200
Subject: problem running on arm
In-Reply-To:
References:
Message-ID:

It turns out that the issue is due to locale settings: an improper or missing locale setting simply freezes the runtime.

A temporary solution is to use the ASCII marshalling functions:

myPrintStrLn str = withCAString str c_myprint

Basic stuff now works. I will have to figure out how to fix this locale issue. A binding for the library will be ready soon.

On 14 August 2015 at 19:20, Luka Rahne wrote:
> Hello everyone I am new here and I have build crosscompiler for
> RedPitaya (http://redpitaya.com), but now i am unable to run hello
> world. (main = putStrLn "hello world")
>
> Running with +RTS -Gg -RTS prints out bunch of data.
> What I think is going on is that GC consumes all memory.
>
> Here is one output on device. (using larger heap just takes more time
> and produces longer output)
>
> https://gist.github.com/ra1u/ab7ab0c23b86436a09ae
>
> In qemu everything is working fine.
>
> here is ghc verbose output for compilation
> https://gist.github.com/ra1u/ed376dc81ea21cd5a66b
>
> If somebody has some pointers to share i would be very happy.
>
> best regards, Luka Rahne

From omefire at yahoo.fr Sat Aug 15 20:40:35 2015
From: omefire at yahoo.fr (Omar Mefire)
Date: Sat, 15 Aug 2015 20:40:35 +0000 (UTC)
Subject: How to use pprTrace ?
Message-ID: <1315651121.5341657.1439671235791.JavaMail.yahoo@mail.yahoo.com> Hi all,I'm trying to step through some ghc code.I am trying to use pprTrace ( for the first time ) and I keep getting an error when I use it :I've added it to the file ghc/Main.hs and the resulting code is this : ? ? ? ?let argv1' = map (mkGeneralLocated "on the commandline") argv1? ? ? ?(argv2, staticFlagWarnings) <- pprTrace "argv1 prime" (ppr argv1') $ parseStaticFlags argv1' I want to examine the value of argv1'.?After my modification, I go into the ghc/ folder and run : 'make'But doing this leads to an error when I try to run the program : ? ? ? ghc-stage2: panic! (the 'impossible' happened)? ? ? (GHC version 7.11.20150810 for x86_64-unknown-linux): Static flags have not been initialised!? ? ? ? ? Please call GHC.parseStaticFlags early enough. What am I doing wrong ??Omar Mefire,? -------------- next part -------------- An HTML attachment was scrubbed... URL: From ezyang at mit.edu Sat Aug 15 20:45:41 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Sat, 15 Aug 2015 13:45:41 -0700 Subject: How to use pprTrace ? In-Reply-To: <1315651121.5341657.1439671235791.JavaMail.yahoo@mail.yahoo.com> References: <1315651121.5341657.1439671235791.JavaMail.yahoo@mail.yahoo.com> Message-ID: <1439671510-sup-754@sabre> It is just as the message says: you can't use pprTrace too early in the execution of a GHC program. It looks like you're trying to print something pretty simple, so why not try a plain old Debug.Trace trace? Edward Excerpts from Omar Mefire's message of 2015-08-15 13:40:35 -0700: > Hi all,I'm trying to step through some ghc code.I am trying to use pprTrace ( for the first time ) and I keep getting an error when I use it :I've added it to the file ghc/Main.hs and the resulting code is this : > ? ? ? ?let argv1' = map (mkGeneralLocated "on the commandline") argv1? ? ? 
?(argv2, staticFlagWarnings) <- pprTrace "argv1 prime" (ppr argv1') $ parseStaticFlags argv1' > I want to examine the value of argv1'.?After my modification, I go into the ghc/ folder and run : 'make'But doing this leads to an error when I try to run the program : > ? ? ? ghc-stage2: panic! (the 'impossible' happened)? ? ? (GHC version 7.11.20150810 for x86_64-unknown-linux): Static flags have not been initialised!? ? ? ? ? Please call GHC.parseStaticFlags early enough. > What am I doing wrong ??Omar Mefire,? From omefire at yahoo.fr Sat Aug 15 21:56:35 2015 From: omefire at yahoo.fr (Omar Mefire) Date: Sat, 15 Aug 2015 21:56:35 +0000 (UTC) Subject: How to use pprTrace ? In-Reply-To: <1439671510-sup-754@sabre> References: <1439671510-sup-754@sabre> Message-ID: <495715435.5378915.1439675795273.JavaMail.yahoo@mail.yahoo.com> Thanks,It's pretty simple what I wanna print, I'm trying to familiarize myself with debugging GHC so I can get a general feel for how the whole thing works. Trying to use Debug.Trace.trace or Debug.Trace.traceStack, I now get the following error : ? ? ? GHC [stage 1] compiler/stage2/build/Module.o-boot? ? ? ghc-stage1: panic! (the 'impossible' happened)? ? ? (GHC version 7.11.20150810 for x86_64-unknown-linux): v_unsafeGlobalDynFlags: not initialised ?Omar Mefire, Le Samedi 15 ao?t 2015 13h45, Edward Z. Yang a ?crit : It is just as the message says: you can't use pprTrace too early in the execution of a GHC program.? It looks like you're trying to print something pretty simple, so why not try a plain old Debug.Trace trace? Edward Excerpts from Omar Mefire's message of 2015-08-15 13:40:35 -0700: > Hi all,I'm trying to step through some ghc code.I am trying to use pprTrace ( for the first time ) and I keep getting an error when I use it :I've added it to the file ghc/Main.hs and the resulting code is this : > ? ? ? ?let argv1' = map (mkGeneralLocated "on the commandline") argv1? ? ? 
>     (argv2, staticFlagWarnings) <- pprTrace "argv1 prime" (ppr argv1') $ parseStaticFlags argv1'
> I want to examine the value of argv1'. After my modification, I go into the ghc/ folder and run 'make'. But doing this leads to an error when I try to run the program:
>     ghc-stage2: panic! (the 'impossible' happened)
>       (GHC version 7.11.20150810 for x86_64-unknown-linux):
>             Static flags have not been initialised!
>             Please call GHC.parseStaticFlags early enough.
> What am I doing wrong?
> Omar Mefire

From ezyang at mit.edu  Sat Aug 15 22:05:12 2015
From: ezyang at mit.edu (Edward Z. Yang)
Date: Sat, 15 Aug 2015 15:05:12 -0700
Subject: How to use pprTrace ?
In-Reply-To: <495715435.5378915.1439675795273.JavaMail.yahoo@mail.yahoo.com>
References: <1439671510-sup-754@sabre> <495715435.5378915.1439675795273.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <1439676205-sup-4514@sabre>

If you are really just using trace, and using it on a plain set of
strings from the command line, this shouldn't happen. Hard to say
more without a diff.

Why not experiment with tracing GHC somewhere other than ghc/Main.hs,
or at least after the code that sets up the global dynamic flags is
all done? If you add trace statements to the type checker, you can
expect pprTrace to work fine. As is, you're in a lot of pain for not
very much benefit.

Edward

Excerpts from Omar Mefire's message of 2015-08-15 14:56:35 -0700:
> Thanks,
> It's pretty simple what I wanna print. I'm trying to familiarize myself with debugging GHC so I can get a general feel for how the whole thing works.
> Trying to use Debug.Trace.trace or Debug.Trace.traceStack, I now get the following error:
>     GHC [stage 1] compiler/stage2/build/Module.o-boot
>     ghc-stage1: panic! (the 'impossible' happened)
(GHC version 7.11.20150810 for x86_64-unknown-linux): v_unsafeGlobalDynFlags: not initialised > ?Omar Mefire, > > > Le Samedi 15 ao?t 2015 13h45, Edward Z. Yang a ?crit : > > > It is just as the message says: you can't use pprTrace too early in the > execution of a GHC program.? It looks like you're trying to print > something pretty simple, so why not try a plain old Debug.Trace trace? > > Edward > > Excerpts from Omar Mefire's message of 2015-08-15 13:40:35 -0700: > > Hi all,I'm trying to step through some ghc code.I am trying to use pprTrace ( for the first time ) and I keep getting an error when I use it :I've added it to the file ghc/Main.hs and the resulting code is this : > > ? ? ? ?let argv1' = map (mkGeneralLocated "on the commandline") argv1? ? ? ?(argv2, staticFlagWarnings) <- pprTrace "argv1 prime" (ppr argv1') $ parseStaticFlags argv1' > > I want to examine the value of argv1'.?After my modification, I go into the ghc/ folder and run : 'make'But doing this leads to an error when I try to run the program : > > ? ? ? ghc-stage2: panic! (the 'impossible' happened)? ? ? (GHC version 7.11.20150810 for x86_64-unknown-linux): Static flags have not been initialised!? ? ? ? ? Please call GHC.parseStaticFlags early enough. > > What am I doing wrong ??Omar Mefire,? > From omefire at yahoo.fr Sat Aug 15 22:48:40 2015 From: omefire at yahoo.fr (Omar Mefire) Date: Sat, 15 Aug 2015 22:48:40 +0000 (UTC) Subject: How to use pprTrace ? In-Reply-To: <1439676205-sup-4514@sabre> References: <1439676205-sup-4514@sabre> Message-ID: <797923744.5442576.1439678920934.JavaMail.yahoo@mail.yahoo.com> Thanks Edward !I'll heed your advice.?Omar Mefire,? Le Samedi 15 ao?t 2015 15h05, Edward Z. Yang a ?crit : If you are really just using trace, and using it on a plain set of strings from the command line, this shouldn't happen. Hard to say more without a diff. 
Why not experiment with tracing GHC in NOT ghc/Main.hs, or at least after the code that sets up the global dynamic flags is all done?? If you add trace statements to the type checker, you can expect pprTrace to work fine.? As is, you're in a lot of pain for not very much benefit. Edward Excerpts from Omar Mefire's message of 2015-08-15 14:56:35 -0700: > Thanks,It's pretty simple what I wanna print, I'm trying to familiarize myself with debugging GHC so I can get a general feel for how the whole thing works. > Trying to use Debug.Trace.trace or Debug.Trace.traceStack, I now get the following error : > ? ? ? GHC [stage 1] compiler/stage2/build/Module.o-boot? ? ? ghc-stage1: panic! (the 'impossible' happened)? ? ? (GHC version 7.11.20150810 for x86_64-unknown-linux): v_unsafeGlobalDynFlags: not initialised > ?Omar Mefire, > > >? ? ? Le Samedi 15 ao?t 2015 13h45, Edward Z. Yang a ?crit : >? ? > >? It is just as the message says: you can't use pprTrace too early in the > execution of a GHC program.? It looks like you're trying to print > something pretty simple, so why not try a plain old Debug.Trace trace? > > Edward > > Excerpts from Omar Mefire's message of 2015-08-15 13:40:35 -0700: > > Hi all,I'm trying to step through some ghc code.I am trying to use pprTrace ( for the first time ) and I keep getting an error when I use it :I've added it to the file ghc/Main.hs and the resulting code is this : > > ? ? ? ?let argv1' = map (mkGeneralLocated "on the commandline") argv1? ? ? ?(argv2, staticFlagWarnings) <- pprTrace "argv1 prime" (ppr argv1') $ parseStaticFlags argv1' > > I want to examine the value of argv1'.?After my modification, I go into the ghc/ folder and run : 'make'But doing this leads to an error when I try to run the program : > > ? ? ? ghc-stage2: panic! (the 'impossible' happened)? ? ? (GHC version 7.11.20150810 for x86_64-unknown-linux): Static flags have not been initialised!? ? ? ? ? Please call GHC.parseStaticFlags early enough. 
> > What am I doing wrong?
> > Omar Mefire
>

From ggreif at gmail.com  Sun Aug 16 10:36:17 2015
From: ggreif at gmail.com (Gabor Greif)
Date: Sun, 16 Aug 2015 07:36:17 -0300
Subject: problem running on arm
In-Reply-To: 
References: 
Message-ID: 

This is bug https://ghc.haskell.org/trac/ghc/ticket/7695 and I've been bitten by it on an iconv-starved embedded system (PowerPC) for which I cross-compiled "Hello World" years ago.

Cheers,

Gabor

On Saturday, 15 August 2015, Luka Rahne wrote:

> It turns out that the issue is due to some locale settings: having an
> improper locale setting, or none at all, just freezes the runtime. A
> temporary solution is to use the ASCII marshalling functions:
>
>     myPrintStrLn str = withCAString str c_myprint
>
> Basic stuff now works. I will have to figure out how to fix this
> locale issue. Bindings for the library will be ready soon.
>
> On 14 August 2015 at 19:20, Luka Rahne wrote:
> > Hello everyone, I am new here and I have built a cross-compiler for
> > RedPitaya (http://redpitaya.com), but now I am unable to run hello
> > world. (main = putStrLn "hello world")
> >
> > Running with +RTS -Gg -RTS prints out a bunch of data.
> > What I think is going on is that GC consumes all memory.
> >
> > Here is one output on the device. (Using a larger heap just takes more
> > time and produces longer output.)
> >
> > https://gist.github.com/ra1u/ab7ab0c23b86436a09ae
> >
> > In qemu everything is working fine.
> >
> > Here is the GHC verbose output for compilation:
> > https://gist.github.com/ra1u/ed376dc81ea21cd5a66b
> >
> > If somebody has some pointers to share I would be very happy.
> >
> > best regards, Luka Rahne
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From aeyerstaylor11 at gmail.com Sun Aug 16 11:47:02 2015 From: aeyerstaylor11 at gmail.com (Alexander Eyers-Taylor) Date: Sun, 16 Aug 2015 12:47:02 +0100 Subject: A question about roles. Message-ID: <55D07836.1000506@gmail.com> Hello I have noticed in looking at some core that GADT type constructors are often applied with a representational role. These constructors are explicitly marked as nominal. Is this information just ignored at a Core level or is this invalid core? Looking at the code a see that we if we downgrade a TyConAppCo we unconditionally change it to a representational role after changing its children. I think this is where it is introduced. Alex ET From eir at cis.upenn.edu Sun Aug 16 12:34:45 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Sun, 16 Aug 2015 08:34:45 -0400 Subject: A question about roles. In-Reply-To: <55D07836.1000506@gmail.com> References: <55D07836.1000506@gmail.com> Message-ID: <1AB06DE1-7782-4AFD-8D18-2DD4F15726EA@cis.upenn.edu> Hi Alex, Do you have a concrete example? With the -dcore-lint flag, the Core is checked, including all the roles. Thanks, Richard On Aug 16, 2015, at 7:47 AM, Alexander Eyers-Taylor wrote: > Hello > > I have noticed in looking at some core that GADT type constructors are often applied with a representational role. These constructors are explicitly marked as nominal. > > Is this information just ignored at a Core level or is this invalid core? > > Looking at the code a see that we if we downgrade a TyConAppCo we unconditionally change it to a representational role after changing its children. I think this is where it is introduced. > > Alex ET > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From aeyerstaylor11 at gmail.com Sun Aug 16 14:12:59 2015 From: aeyerstaylor11 at gmail.com (Alexander Eyers-Taylor) Date: Sun, 16 Aug 2015 15:12:59 +0100 Subject: A question about roles. 
In-Reply-To: <1AB06DE1-7782-4AFD-8D18-2DD4F15726EA@cis.upenn.edu>
References: <55D07836.1000506@gmail.com> <1AB06DE1-7782-4AFD-8D18-2DD4F15726EA@cis.upenn.edu>
Message-ID: <55D09A6B.2080802@gmail.com>

Hello Richard

The code at the bottom ends up compiling to core with the following cast

    (ParNat (dt1_dor ; (Flip (dt2_dos ; Sym dt_doq))_N ; Nat.TFCo:R:Flip[0]))_R

but ParNat is nominal so (I think) shouldn't have a representational cast. The cast is safe as we can just make it nominal and then add Sub but it feels invalid.

This doesn't actually create a bug in any program and it may just be a misunderstanding on my part about roles and the differences between roles in Coercible and in the core language.

-dcore-lint makes no changes.

    {-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies, RoleAnnotations #-}
    module Nat where

    data Nat = Z | S Nat

    data Parity = Even | Odd

    type family Flip (x :: Parity) :: Parity where
      Flip Even = Odd
      Flip Odd = Even

    type role ParNat nominal
    data ParNat :: Parity -> * where
      PZ :: ParNat Even
      PS :: (x ~ Flip y, y ~ Flip x) => ParNat x -> ParNat (Flip x)

    halve :: ParNat Even -> Nat
    halve PZ = Z
    halve (PS a) = helper a
      where
        helper :: ParNat Odd -> Nat
        helper (PS b) = S (halve b)

On 16/08/15 13:34, Richard Eisenberg wrote:
> Hi Alex,
>
> Do you have a concrete example? With the -dcore-lint flag, the Core is checked, including all the roles.
>
> Thanks,
> Richard
>
> On Aug 16, 2015, at 7:47 AM, Alexander Eyers-Taylor wrote:
>
>> Hello
>>
>> I have noticed in looking at some core that GADT type constructors are often applied with a representational role. These constructors are explicitly marked as nominal.
>>
>> Is this information just ignored at a Core level or is this invalid core?
>>
>> Looking at the code I see that if we downgrade a TyConAppCo we unconditionally change it to a representational role after changing its children. I think this is where it is introduced.
>> >> Alex ET >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From aeyerstaylor11 at gmail.com Sun Aug 16 14:19:34 2015 From: aeyerstaylor11 at gmail.com (Alexander Eyers-Taylor) Date: Sun, 16 Aug 2015 15:19:34 +0100 Subject: A question about roles. In-Reply-To: <1AB06DE1-7782-4AFD-8D18-2DD4F15726EA@cis.upenn.edu> References: <55D07836.1000506@gmail.com> <1AB06DE1-7782-4AFD-8D18-2DD4F15726EA@cis.upenn.edu> Message-ID: <55D09BF6.2000701@gmail.com> Hello I have reread the comment about TyConAppCo and I now understand how it works and this is all how it should work. I was just slightly confused to start with. The coercion was representational but the arguments were nominal. Thanks Alex On 16/08/15 13:34, Richard Eisenberg wrote: > Hi Alex, > > Do you have a concrete example? With the -dcore-lint flag, the Core is checked, including all the roles. > > Thanks, > Richard > > On Aug 16, 2015, at 7:47 AM, Alexander Eyers-Taylor wrote: > >> Hello >> >> I have noticed in looking at some core that GADT type constructors are often applied with a representational role. These constructors are explicitly marked as nominal. >> >> Is this information just ignored at a Core level or is this invalid core? >> >> Looking at the code a see that we if we downgrade a TyConAppCo we unconditionally change it to a representational role after changing its children. I think this is where it is introduced. >> >> Alex ET >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From eir at cis.upenn.edu Sun Aug 16 17:01:15 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Sun, 16 Aug 2015 13:01:15 -0400 Subject: A question about roles. 
In-Reply-To: <55D09BF6.2000701@gmail.com>
References: <55D07836.1000506@gmail.com> <1AB06DE1-7782-4AFD-8D18-2DD4F15726EA@cis.upenn.edu> <55D09BF6.2000701@gmail.com>
Message-ID: 

If you're trying to understand core, check out https://github.com/ghc/ghc/blob/master/docs/core-spec/core-spec.pdf which has all the typing rules. It might be helpful.

But your analysis below is spot on.

Richard

On Aug 16, 2015, at 10:19 AM, Alexander Eyers-Taylor wrote:

> Hello
>
> I have reread the comment about TyConAppCo and I now understand how it works and this is all how it should work. I was just slightly confused to start with. The coercion was representational but the arguments were nominal.
>
> Thanks
>
> Alex
>
> On 16/08/15 13:34, Richard Eisenberg wrote:
>> Hi Alex,
>>
>> Do you have a concrete example? With the -dcore-lint flag, the Core is checked, including all the roles.
>>
>> Thanks,
>> Richard
>>
>> On Aug 16, 2015, at 7:47 AM, Alexander Eyers-Taylor wrote:
>>
>>> Hello
>>>
>>> I have noticed in looking at some core that GADT type constructors are often applied with a representational role. These constructors are explicitly marked as nominal.
>>>
>>> Is this information just ignored at a Core level or is this invalid core?
>>>
>>> Looking at the code I see that if we downgrade a TyConAppCo we unconditionally change it to a representational role after changing its children. I think this is where it is introduced.
>>>
>>> Alex ET
>

From mwanikibusiness at gmail.com  Sun Aug 16 21:08:40 2015
From: mwanikibusiness at gmail.com (Njagi Mwaniki)
Date: Mon, 17 Aug 2015 00:08:40 +0300
Subject: getting addDependFile from a hi file
Message-ID: <55D0FBD8.4000702@gmail.com>

Hello,

I have a case in which I wish to parse a hi file to extract the addDependentFile section created by Template Haskell.

I found the function `readBinIface :: CheckHiWay -> TraceBinIFaceReading -> FilePath -> TcRnIf a b ModIface`
https://downloads.haskell.org/~ghc/7.10.2/docs/html/libraries/ghc-7.10.2/BinIface.html#v:readBinIface

I want to use the ModIface type to extract the mi_usages where I believe I will find the list of dependent files.

The issue is that I can't extract a value of type ModIface due to the strict dependence on Env of values that are related to this. I don't understand what Env is, and despite reading the docs extensively I can't find a function explaining how to work with the Env value or how to generate one. The closest thing I found was TcRnMonad.getTopEnv, but even that is cryptic. Could I get some help?

All in all, I just want to get the addDependFile section from a hi file.

From ezyang at mit.edu  Sun Aug 16 21:21:29 2015
From: ezyang at mit.edu (Edward Z. Yang)
Date: Sun, 16 Aug 2015 14:21:29 -0700
Subject: getting addDependFile from a hi file
In-Reply-To: <55D0FBD8.4000702@gmail.com>
References: <55D0FBD8.4000702@gmail.com>
Message-ID: <1439759875-sup-9794@sabre>

It is a bit hard to parse your question, but it sounds like
you are trying to figure out how to run a value in the 'TcRnIf'
monad? In that case, initIfaceCheck will be sufficient for
your needs. Assuming that you're operating in the Ghc monad,
you should do something like:

    do hsc_env <- getSession
       iface <- liftIO $ initIfaceCheck hsc_env (readBinIface ...)
       ...

You might also consider using 'readIface' instead of 'readBinIface'.

Edward

Excerpts from Njagi Mwaniki's message of 2015-08-16 14:08:40 -0700:
> Hello,
>
> I have a case in which I wish to parse a hi file to extract the
> addDependentFile section created by template haskell.
> > I found the function `readBinIface :: CheckHiWay -> TraceBinIFaceReading > -> FilePath -> TcRnIf a b ModIface` > https://downloads.haskell.org/~ghc/7.10.2/docs/html/libraries/ghc-7.10.2/BinIface.html#v:readBinIface > > I want to use the ModIface type to extract the mi_usages where I believe > I will find the list of dependent files. > > The issue is that I can't extract a value of type ModIface due to the > strict dependence on Env of values that are related to this. I don't > understand what env is and despite reading about the docs extensively I > can't find a function explaining how to work with the env value or how > to generate one. The closest thing I found was in TcRnMonad.getTopEnv > but even that is cryptic. Could I get some help. > > All in all, I just want to get the addDependFile section from a hi file. From mike at izbicki.me Mon Aug 17 23:12:17 2015 From: mike at izbicki.me (Mike Izbicki) Date: Mon, 17 Aug 2015 16:12:17 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: <1439014742-sup-2126@sabre> References: <1439014742-sup-2126@sabre> Message-ID: I'm not sure how either of those two functions can help me. The problem is that given an operator (e.g. `+`), I don't know the name of the dictionary that needs to be passed in as the first argument to the operator. I could probably hard code these names, but then the plugin wouldn't be able to work with alternative preludes. On Fri, Aug 7, 2015 at 11:20 PM, Edward Z. Yang wrote: > Hello Mike, > > Give importDecl from LoadIface a try, or maybe tcLookupGlobal if > you're in TcM. > > Edward > > Excerpts from Mike Izbicki's message of 2015-08-07 15:40:30 -0700: >> I'm trying to write a GHC plugin. The purpose of the plugin is to >> provide Haskell bindings to Herbie. Herbie >> (https://github.com/uwplse/herbie) is a program that takes a >> mathematical statement as input, and gives you a numerically stable >> formula to compute it as output. 
The plugin is supposed to automate >> this process for Haskell programs. >> >> I can convert the core expressions into a format for Herbie just fine. >> Where I'm having trouble is converting the output from Herbie back >> into core. Given a string that represents a numeric operator (e.g. >> "log" or "+"), I can get that converted into a Name that matches the >> Name of the version of that operator in scope at the location. But in >> order to create an Expr, I need to convert the Name into a Var. All >> the functions that I can find for this (e.g. mkGlobalVar) also require >> the type of the variable. But I can't find a way to figure out the >> Type given a Name. How can I do this? From afarmer at ittc.ku.edu Mon Aug 17 23:21:16 2015 From: afarmer at ittc.ku.edu (Andrew Farmer) Date: Mon, 17 Aug 2015 16:21:16 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: HERMIT has some code for building dictionaries for a given predicate type (by invoking the typechecker functions that do this): https://github.com/ku-fpg/hermit/blob/master/src/HERMIT/Dictionary/GHC.hs#L223 The functions to run TcM computations inside CoreM are here: https://github.com/ku-fpg/hermit/blob/master/src/HERMIT/Monad.hs#L242 and https://github.com/ku-fpg/hermit/blob/master/src/HERMIT/GHC/Typechecker.hs#L47 Perhaps that will help get you started? I would like to push these interfaces back into the GHC API at some point, but just haven't done it yet. HTH Andrew On Mon, Aug 17, 2015 at 4:12 PM, Mike Izbicki wrote: > I'm not sure how either of those two functions can help me. The > problem is that given an operator (e.g. `+`), I don't know the name of > the dictionary that needs to be passed in as the first argument to the > operator. I could probably hard code these names, but then the plugin > wouldn't be able to work with alternative preludes. > > On Fri, Aug 7, 2015 at 11:20 PM, Edward Z. 
Yang wrote: >> Hello Mike, >> >> Give importDecl from LoadIface a try, or maybe tcLookupGlobal if >> you're in TcM. >> >> Edward >> >> Excerpts from Mike Izbicki's message of 2015-08-07 15:40:30 -0700: >>> I'm trying to write a GHC plugin. The purpose of the plugin is to >>> provide Haskell bindings to Herbie. Herbie >>> (https://github.com/uwplse/herbie) is a program that takes a >>> mathematical statement as input, and gives you a numerically stable >>> formula to compute it as output. The plugin is supposed to automate >>> this process for Haskell programs. >>> >>> I can convert the core expressions into a format for Herbie just fine. >>> Where I'm having trouble is converting the output from Herbie back >>> into core. Given a string that represents a numeric operator (e.g. >>> "log" or "+"), I can get that converted into a Name that matches the >>> Name of the version of that operator in scope at the location. But in >>> order to create an Expr, I need to convert the Name into a Var. All >>> the functions that I can find for this (e.g. mkGlobalVar) also require >>> the type of the variable. But I can't find a way to figure out the >>> Type given a Name. How can I do this? > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From oleg.grenrus at iki.fi Wed Aug 19 14:46:56 2015 From: oleg.grenrus at iki.fi (Oleg Grenrus) Date: Wed, 19 Aug 2015 17:46:56 +0300 Subject: Compiling cabal with GHC HEAD Message-ID: <7784B3F6-1EAF-4D6F-A770-BA594F39BD3B@iki.fi> I tried to fix compilation of Cabal using Cabal HEAD. It?s trivial patch: https://github.com/phadej/cabal/commit/525e0680505c74f42a321e55b357a27222790628 but it breaks build on every other released GHC: https://travis-ci.org/phadej/cabal/builds/76288656 ? 

The original issue GHC 7.11 complained about was:

    Distribution/Client/Types.hs:71:10: error:
        Illegal instance declaration for
          ‘PackageFixedDeps InstalledPackageInfo’
          (All instance types must be of the form (T t1 ... tn)
           where T is not a synonym.
           Use TypeSynonymInstances if you want to disable this.)
        In the instance declaration for
          ‘PackageFixedDeps InstalledPackageInfo’

So I had to add TypeSynonymInstances and FlexibleInstances.

I also had to change the import of InstalledPackageInfo(exposed) in the Haddock module.

At this point I'm really confused. I cannot find the "InstalledPackageInfo_" symbol anywhere. Can someone explain what happens?

From ezyang at mit.edu  Wed Aug 19 16:19:47 2015
From: ezyang at mit.edu (Edward Z. Yang)
Date: Wed, 19 Aug 2015 09:19:47 -0700
Subject: Compiling cabal with GHC HEAD
In-Reply-To: <7784B3F6-1EAF-4D6F-A770-BA594F39BD3B@iki.fi>
References: <7784B3F6-1EAF-4D6F-A770-BA594F39BD3B@iki.fi>
Message-ID: <1440001103-sup-6417@sabre>

Oh, this is irritating. The problem is we recently updated Cabal the
library to get rid of InstalledPackageInfo_ (so there is only
InstalledPackageInfo now) but it looks like in some situations
cabal-install can be compiled with an old version of Cabal (as is
happening to you).

I suppose an appropriate remedy is to bump the Cabal library
dependency in cabal-install so we don't attempt to use the old Cabal;
alternately we could use a preprocessor macro to make it work in both
cases.

Edward

Excerpts from Oleg Grenrus's message of 2015-08-19 07:46:56 -0700:
> I tried to fix compilation of Cabal using Cabal HEAD. It's a trivial patch:
>
> https://github.com/phadej/cabal/commit/525e0680505c74f42a321e55b357a27222790628
>
> but it breaks the build on every other released GHC:
>
> https://travis-ci.org/phadej/cabal/builds/76288656
>
> > The original issue GHC-7.11 complained was: > > Distribution/Client/Types.hs:71:10: error: > Illegal instance declaration for > ?PackageFixedDeps InstalledPackageInfo? > (All instance types must be of the form (T t1 ... tn) > where T is not a synonym. > Use TypeSynonymInstances if you want to disable this.) > In the instance declaration for > ?PackageFixedDeps InstalledPackageInfo? > > So I had to add TypeSynonymInstances and FlexibleInstances > > And also had to change import of InstalledPackageInfo(exposed) in Haddock module. > > ? > > At this point I?m really confused. I cannot find ?InstalledPackageInfo_? symbol anywhere. Can someone explain what happens? From ezyang at mit.edu Wed Aug 19 23:28:14 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Wed, 19 Aug 2015 16:28:14 -0700 Subject: Linker questions In-Reply-To: <55ca5066.0785c20a.59f3.5fe4@mx.google.com> References: <55ca5066.0785c20a.59f3.5fe4@mx.google.com> Message-ID: <1440025180-sup-6159@sabre> Excerpts from lonetiger's message of 2015-08-11 12:43:34 -0700: > 1) Has to do with checkProddableBlock and #10672 and #10563 > > static void checkProddableBlock (ObjectCode *oc, void *addr, size_t size ) > { > ProddableBlock* pb; > > for (pb = oc->proddables; pb != NULL; pb = pb->next) { > char* s = (char*)(pb->start); > char* e = s + pb->size; > char* a = (char*)addr; > if (a >= s && (a+size) <= e) return; > } > barf("checkProddableBlock: invalid fixup in runtime linker: %p", addr); > } > > From what I have found, these errors seem to happen because oc->proddables is initially NULL, > the for loop is skipped. From what I can tell, this function is checking if there's a "proddable" > that fits within the supplied address and size. So if there is no proddables to begin with, should this > check just not be skipped and the callee of this call not use this ObjectCode instead of erroring out? 

Relocating objects consists of iterating over a list of "relocations",
each of which essentially says, "please modify the word of memory at
addr to point to the actual location of some symbol."

The essential effect is that GHC is going to scribble over some memory
that the object told it to. So it's a /really really/ good idea to make
sure that we aren't scribbling over something random, like some GHC
structures. checkProddableBlock ensures that the memory location to
be relocated ACTUALLY resides in the object code we are loading.

If we put it this way, it's pretty obvious what the bug has to be:
we are processing a relocation for some code that we didn't actually
make a proddable block for. This can happen if we didn't understand
the section.

I've updated #10672 and #10563 accordingly.

> 2) The second question is about static int ocGetNames_PEi386 ( ObjectCode* oc )
> I am getting a test failure because it is claiming that the .eh_frame section is misaligned.
> This comes from this code:
>
>     if (kind != SECTIONKIND_OTHER && end >= start) {
>         if ((((size_t)(start)) % 4) != 0) {
>             errorBelch("Misaligned section %s: %p", (char*)secname, start);
>             stgFree(secname);
>             return 0;
>         }
>
> Where start is defined as:
>
>     start = ((UChar*)(oc->image)) + sectab_i->PointerToRawData;
>
> and oc->image is a memory location received by allocateImageAndTrampolines.
>
> In the case of my test failure it is because the .eh_frame section seems to begin at 0x30F.
> Since oc->image will always be 4-byte aligned (so it doesn't really matter in the check), it gives that error because PointerToRawData isn't aligned to 4.
>
> So my question is: would it not be better just to check the alignment flag in the PE section header, instead of checking the load address (which is always going to be aligned to 4?) and the file pointer to
> the first page of the section within the COFF file, to determine the alignment? Like objdump and dumpbin do?
>
> e.g.
> > 9 .eh_frame 00000038 00000000 00000000 0000030f 2**2 > CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA > > Is the output from objdump which correctly determines the alignment from the section. From what I understand from the PE specification > the on disk address doesn't have to be aligned by 4: > > "For object files, the value *should* be aligned on a 4-byte boundary for best performance." > > I'm wondering if we are incorrectly erroring out here, instead of using the section and making sure we pad it to the alignment boundary. It should be fine to make the code more flexible to accept arbitrary alignments. However, I would expect you would have to make some code to make this work. If you are interested in doing this, make sure you add tests to the test suite which specifically construct objects with sections which are not 4-byte aligned. Please also feel free to open a bug to track this work. Thanks, Edward From lonetiger at gmail.com Wed Aug 19 23:53:30 2015 From: lonetiger at gmail.com (Phyx) Date: Thu, 20 Aug 2015 01:53:30 +0200 Subject: Linker questions In-Reply-To: <1440025180-sup-6159@sabre> References: <55ca5066.0785c20a.59f3.5fe4@mx.google.com> <1440025180-sup-6159@sabre> Message-ID: Hi Edward, Thanks for the information, it really helped make it more clear to me what's going on. I would ideally like to get these validate errors on Windows down to 0 (without marking them as expected fail). So I will probably make a ticket for this. Cheers, Tamar On Thu, Aug 20, 2015 at 1:28 AM, Edward Z. 
Yang wrote: > Excerpts from lonetiger's message of 2015-08-11 12:43:34 -0700: > > 1) Has to do with checkProddableBlock and #10672 and #10563 > > > > static void checkProddableBlock (ObjectCode *oc, void *addr, size_t size > ) > > { > > ProddableBlock* pb; > > > > for (pb = oc->proddables; pb != NULL; pb = pb->next) { > > char* s = (char*)(pb->start); > > char* e = s + pb->size; > > char* a = (char*)addr; > > if (a >= s && (a+size) <= e) return; > > } > > barf("checkProddableBlock: invalid fixup in runtime linker: %p", > addr); > > } > > > > From what I have found, these errors seem to happen because > oc->proddables is initially NULL, > > the for loop is skipped. From what I can tell, this function is checking > if there's a "proddable" > > that fits within the supplied address and size. So if there is no > proddables to begin with, should this > > check just not be skipped and the callee of this call not use this > ObjectCode instead of erroring out? > > Relocating objects consists of iterating over a list of "relocations", > which essentially says, "please modify the word of memory at addr to > point to the actual location of some symbol." > > The essential effect is that GHC is going to scribble over some memory > that the object told it to. So it's a /really really/ idea to make sure > that we aren't scribbling over something random, like some GHC > structures. checkProddableBlock ensures that the memory location to > be relocated ACTUALLY resides in the object code we are loading. > > If we put it this way, it's pretty obvious what the bug has to be: > we are processing a relocation for some code that we didn't actually > make a proddable block for. This can happen if we didn't understand > the section. > > I've updated #10672 and #10563 accordingly. > > > 2) The second question is about static int ocGetNames_PEi386 ( > ObjectCode* oc ) > > I am getting a test failure because it is claiming that .eh_frame > section is misaligned. 
> > This comes from this code:
> >
> > if (kind != SECTIONKIND_OTHER && end >= start) {
> >     if ((((size_t)(start)) % 4) != 0) {
> >         errorBelch("Misaligned section %s: %p", (char*)secname, start);
> >         stgFree(secname);
> >         return 0;
> >     }
> >
> > Where start is defined as:
> >
> > start = ((UChar*)(oc->image)) + sectab_i->PointerToRawData;
> > and oc->image is a memory location received by allocateImageAndTrampolines.
> >
> > In the case of my test failure it is because the .eh_frame section seems
> > to begin at 0x30F. Since oc->image will always be 4 aligned (so it doesn't
> > really matter in the check) it gives that error because PointerToRawData
> > isn't aligned by 4.
> >
> > So my question is would it not be better just to check the alignment flag
> > in the PE section header instead of checking the load address (which is
> > always going to be aligned to 4?) and the file pointer to the first page of
> > the section within the COFF file to determine the alignment? Like objdump
> > and dumpbin do?
> >
> > e.g.
> >
> >   9 .eh_frame     00000038  00000000  00000000  0000030f  2**2
> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
> >
> > Is the output from objdump which correctly determines the alignment from
> > the section. From what I understand from the PE specification
> > the on disk address doesn't have to be aligned by 4:
> >
> > "For object files, the value *should* be aligned on a 4-byte boundary for
> > best performance."
> >
> > I'm wondering if we are incorrectly erroring out here, instead of using
> > the section and making sure we pad it to the alignment boundary.
>
> It should be fine to make the code more flexible to accept arbitrary
> alignments. However, I would expect you would have to make some code
> to make this work.
>
> If you are interested in doing this, make sure you add tests to the test
> suite which specifically construct objects with sections which are not
> 4-byte aligned. Please also feel free to open a bug to track this work.
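
For readers following the linker discussion, here is a minimal standalone C sketch of the two checks being debated. It is not the RTS code. The first function mirrors the containment test in checkProddableBlock but returns a result instead of calling barf. The second decodes the alignment that a COFF section header requests, assuming the PE/COFF layout in which bits 20-23 of the Characteristics field hold an alignment code n meaning 2^(n-1) bytes. The names in_proddable_block, coff_section_alignment, SCN_ALIGN_MASK and SCN_ALIGN_SHIFT are hypothetical and do not exist in GHC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the RTS ProddableBlock list; only the fields
 * needed by the containment check are modelled. */
typedef struct ProddableBlock_ {
    void *start;
    size_t size;
    struct ProddableBlock_ *next;
} ProddableBlock;

/* Returns 1 if [addr, addr+size) lies entirely inside one registered block,
 * 0 otherwise. An empty list rejects every address, which is the situation
 * described in the thread when oc->proddables is NULL. */
static int in_proddable_block(const ProddableBlock *blocks,
                              const void *addr, size_t size)
{
    assert(addr != NULL);
    const char *a = (const char *)addr;
    for (const ProddableBlock *pb = blocks; pb != NULL; pb = pb->next) {
        const char *s = (const char *)pb->start;
        const char *e = s + pb->size;
        if (a >= s && a + size <= e) return 1;
    }
    return 0;
}

/* Assumed position of the alignment code inside the section header's
 * Characteristics word (bits 20-23). */
#define SCN_ALIGN_MASK  0x00F00000u
#define SCN_ALIGN_SHIFT 20

/* Decodes the requested alignment in bytes: code n (1..14) means 2^(n-1).
 * Returns 0 when the header specifies no alignment or a reserved code,
 * leaving the choice of default to the caller. */
static uint32_t coff_section_alignment(uint32_t characteristics)
{
    uint32_t code = (characteristics & SCN_ALIGN_MASK) >> SCN_ALIGN_SHIFT;
    if (code == 0 || code > 14) return 0;
    return (uint32_t)1 << (code - 1);
}
```

For the .eh_frame header quoted above, whose alignment objdump reports as 2**2, this decoding would give 4 even though the file offset is 0x30f. A loader taking this route would presumably have to copy or pad the section to that boundary itself rather than relocate it in place.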
> > Thanks, > Edward > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomasmiedema at gmail.com Thu Aug 20 23:03:54 2015 From: thomasmiedema at gmail.com (Thomas Miedema) Date: Fri, 21 Aug 2015 01:03:54 +0200 Subject: Deleting sync-all In-Reply-To: References: Message-ID: sync-all has been deleted On Tue, Jul 21, 2015 at 12:45 PM, Thomas Miedema wrote: > Hello ghc-devs, > > I would like to delete the file sync-all > from the GHC > repository. It should not have been necessary to use it for about a year > now. > > Please speak up if you want those 1000 lines of buggy Perl a.k.a. sync-all to stay for some reason, or if you have questions about a certain git submodules workflow. > > > The source code (./boot no longer suggests it) and the wiki are already sync-all free, except for a few historical pages. > > Discussion period: 1 month. > > > Thomas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at izbicki.me Fri Aug 21 00:05:22 2015 From: mike at izbicki.me (Mike Izbicki) Date: Thu, 20 Aug 2015 17:05:22 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: I'm pretty sure the `buildDictionary` function doesn't do what I need. AFAICT, you pass it a `Var` which contains a dictionary, and it tells you what is in that dictionary. What I need is a function with type `Var -> Var` where the first `Var` contains a function, and the output `Var` is the dictionary. For example, given the expression: log (a1+a2) In core, this might look like: log @ Float $fFloatingFloat (+ @ Float $fNumFloat a1 a2) I want to mechanically construct the core code above. When doing so, each function within a type class has an extra argument, which is the dictionary for that type class. `log` no longer takes one parameter; in core, it takes two. 
I'm having trouble figuring out how to get the appropriate dictionary to pass as the "dictionary parameter" to these functions. On Mon, Aug 17, 2015 at 4:21 PM, Andrew Farmer wrote: > HERMIT has some code for building dictionaries for a given predicate > type (by invoking the typechecker functions that do this): > > https://github.com/ku-fpg/hermit/blob/master/src/HERMIT/Dictionary/GHC.hs#L223 > > The functions to run TcM computations inside CoreM are here: > > https://github.com/ku-fpg/hermit/blob/master/src/HERMIT/Monad.hs#L242 > and > https://github.com/ku-fpg/hermit/blob/master/src/HERMIT/GHC/Typechecker.hs#L47 > > Perhaps that will help get you started? > > I would like to push these interfaces back into the GHC API at some > point, but just haven't done it yet. > > HTH > Andrew > > On Mon, Aug 17, 2015 at 4:12 PM, Mike Izbicki wrote: >> I'm not sure how either of those two functions can help me. The >> problem is that given an operator (e.g. `+`), I don't know the name of >> the dictionary that needs to be passed in as the first argument to the >> operator. I could probably hard code these names, but then the plugin >> wouldn't be able to work with alternative preludes. >> >> On Fri, Aug 7, 2015 at 11:20 PM, Edward Z. Yang wrote: >>> Hello Mike, >>> >>> Give importDecl from LoadIface a try, or maybe tcLookupGlobal if >>> you're in TcM. >>> >>> Edward >>> >>> Excerpts from Mike Izbicki's message of 2015-08-07 15:40:30 -0700: >>>> I'm trying to write a GHC plugin. The purpose of the plugin is to >>>> provide Haskell bindings to Herbie. Herbie >>>> (https://github.com/uwplse/herbie) is a program that takes a >>>> mathematical statement as input, and gives you a numerically stable >>>> formula to compute it as output. The plugin is supposed to automate >>>> this process for Haskell programs. >>>> >>>> I can convert the core expressions into a format for Herbie just fine. >>>> Where I'm having trouble is converting the output from Herbie back >>>> into core. 
Given a string that represents a numeric operator (e.g. >>>> "log" or "+"), I can get that converted into a Name that matches the >>>> Name of the version of that operator in scope at the location. But in >>>> order to create an Expr, I need to convert the Name into a Var. All >>>> the functions that I can find for this (e.g. mkGlobalVar) also require >>>> the type of the variable. But I can't find a way to figure out the >>>> Type given a Name. How can I do this? >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> From ekmett at gmail.com Fri Aug 21 00:29:34 2015 From: ekmett at gmail.com (Edward Kmett) Date: Thu, 20 Aug 2015 20:29:34 -0400 Subject: ArrayArrays Message-ID: Would it be possible to add unsafe primops to add Array# and SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# entries are all directly unlifted avoiding a level of indirection for the containing structure is amazing, but I can only currently use it if my leaf level data can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be able to have the ability to put SmallArray# a stuff down at the leaves to hold lifted contents. I accept fully that if I name the wrong type when I go to access one of the fields it'll lie to me, but I suppose it'd do that if i tried to use one of the members that held a nested ArrayArray# as a ByteArray# anyways, so it isn't like there is a safety story preventing this. I've been hunting for ways to try to kill the indirection problems I get with Haskell and mutable structures, and I could shoehorn a number of them into ArrayArrays if this worked. Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection compared to c/java and this could reduce that pain to just 1 level of unnecessary indirection. -Edward -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From afarmer at ittc.ku.edu Fri Aug 21 00:55:54 2015 From: afarmer at ittc.ku.edu (Andrew Farmer) Date: Thu, 20 Aug 2015 17:55:54 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: The `buildDictionary` function takes a Var with a dictionary type, and builds the expression which implements that dictionary. For instance, you might create a new Var: x :: Num Float and pass that to buildDictionary. It will return: (x, [NonRec x $fNumFloat]) which you could blindly turn into: let x = $fNumFloat in x or you could do what buildDictionaryT (a bit further down in the same module), and spot that case and just return $fNumFloat directly. (The list can have more than one element in the case that dictionaries are built in terms of other dictionaries.) Thus, you've built a dictionary expression of type Num Float. As I understand it, you want to pass something 'log' and get back the dictionary argument. You'll need to choose a type (like Float), but once that is done, it should be easy to use buildDictionary to build the dictionary arguments... just take apart the type of 'log @ Float', make a new Var with the argument type, build a dictionary expression, and apply it. On Thu, Aug 20, 2015 at 5:05 PM, Mike Izbicki wrote: > I'm pretty sure the `buildDictionary` function doesn't do what I need. > AFAICT, you pass it a `Var` which contains a dictionary, and it tells > you what is in that dictionary. What I need is a function with type > `Var -> Var` where the first `Var` contains a function, and the output > `Var` is the dictionary. > > For example, given the expression: > > log (a1+a2) > > In core, this might look like: > > log @ Float $fFloatingFloat (+ @ Float $fNumFloat a1 a2) > > I want to mechanically construct the core code above. When doing so, > each function within a type class has an extra argument, which is the > dictionary for that type class. 
`log` no longer takes one parameter; > in core, it takes two. I'm having trouble figuring out how to get the > appropriate dictionary to pass as the "dictionary parameter" to these > functions. > > On Mon, Aug 17, 2015 at 4:21 PM, Andrew Farmer wrote: >> HERMIT has some code for building dictionaries for a given predicate >> type (by invoking the typechecker functions that do this): >> >> https://github.com/ku-fpg/hermit/blob/master/src/HERMIT/Dictionary/GHC.hs#L223 >> >> The functions to run TcM computations inside CoreM are here: >> >> https://github.com/ku-fpg/hermit/blob/master/src/HERMIT/Monad.hs#L242 >> and >> https://github.com/ku-fpg/hermit/blob/master/src/HERMIT/GHC/Typechecker.hs#L47 >> >> Perhaps that will help get you started? >> >> I would like to push these interfaces back into the GHC API at some >> point, but just haven't done it yet. >> >> HTH >> Andrew >> >> On Mon, Aug 17, 2015 at 4:12 PM, Mike Izbicki wrote: >>> I'm not sure how either of those two functions can help me. The >>> problem is that given an operator (e.g. `+`), I don't know the name of >>> the dictionary that needs to be passed in as the first argument to the >>> operator. I could probably hard code these names, but then the plugin >>> wouldn't be able to work with alternative preludes. >>> >>> On Fri, Aug 7, 2015 at 11:20 PM, Edward Z. Yang wrote: >>>> Hello Mike, >>>> >>>> Give importDecl from LoadIface a try, or maybe tcLookupGlobal if >>>> you're in TcM. >>>> >>>> Edward >>>> >>>> Excerpts from Mike Izbicki's message of 2015-08-07 15:40:30 -0700: >>>>> I'm trying to write a GHC plugin. The purpose of the plugin is to >>>>> provide Haskell bindings to Herbie. Herbie >>>>> (https://github.com/uwplse/herbie) is a program that takes a >>>>> mathematical statement as input, and gives you a numerically stable >>>>> formula to compute it as output. The plugin is supposed to automate >>>>> this process for Haskell programs. 
>>>>> >>>>> I can convert the core expressions into a format for Herbie just fine. >>>>> Where I'm having trouble is converting the output from Herbie back >>>>> into core. Given a string that represents a numeric operator (e.g. >>>>> "log" or "+"), I can get that converted into a Name that matches the >>>>> Name of the version of that operator in scope at the location. But in >>>>> order to create an Expr, I need to convert the Name into a Var. All >>>>> the functions that I can find for this (e.g. mkGlobalVar) also require >>>>> the type of the variable. But I can't find a way to figure out the >>>>> Type given a Name. How can I do this? >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs at haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> > From chak at cse.unsw.edu.au Fri Aug 21 01:01:25 2015 From: chak at cse.unsw.edu.au (Manuel M T Chakravarty) Date: Fri, 21 Aug 2015 11:01:25 +1000 Subject: ArrayArrays In-Reply-To: References: Message-ID: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> That's an interesting idea. Manuel > Edward Kmett : > > Would it be possible to add unsafe primops to add Array# and SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# entries are all directly unlifted avoiding a level of indirection for the containing structure is amazing, but I can only currently use it if my leaf level data can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be able to have the ability to put SmallArray# a stuff down at the leaves to hold lifted contents. > > I accept fully that if I name the wrong type when I go to access one of the fields it'll lie to me, but I suppose it'd do that if i tried to use one of the members that held a nested ArrayArray# as a ByteArray# anyways, so it isn't like there is a safety story preventing this.
>
> I've been hunting for ways to try to kill the indirection problems I get with Haskell and mutable structures, and I could shoehorn a number of them into ArrayArrays if this worked.
>
> Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection compared to c/java and this could reduce that pain to just 1 level of unnecessary indirection.
>
> -Edward
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

From ekmett at gmail.com Fri Aug 21 04:25:08 2015
From: ekmett at gmail.com (Edward Kmett)
Date: Fri, 21 Aug 2015 00:25:08 -0400
Subject: ArrayArrays
In-Reply-To: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au>
References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au>
Message-ID:

When (ab)using them for this purpose, SmallArrayArray's would be very handy as well.

Consider right now if I have something like an order-maintenance structure I have:

data Upper s = Upper
  {-# UNPACK #-} !(MutableByteArray s)
  {-# UNPACK #-} !(MutVar s (Upper s))
  {-# UNPACK #-} !(MutVar s (Upper s))

data Lower s = Lower
  {-# UNPACK #-} !(MutVar s (Upper s))
  {-# UNPACK #-} !(MutableByteArray s)
  {-# UNPACK #-} !(MutVar s (Lower s))
  {-# UNPACK #-} !(MutVar s (Lower s))

The former contains, logically, a mutable integer and two pointers, one for forward and one for backwards. The latter is basically the same thing with a mutable reference up pointing at the structure above.

On the heap this is an object that points to a structure for the bytearray, and points to another structure for each mutvar which each point to the other 'Upper' structure. So there is a level of indirection smeared over everything.

So this is a pair of doubly linked lists with an upward link from the structure below to the structure above.
Converted into ArrayArray#s I'd get data Upper s = Upper (MutableArrayArray# s) w/ the first slot being a pointer to a MutableByteArray#, and the next 2 slots pointing to the previous and next previous objects, represented just as their MutableArrayArray#s. I can use sameMutableArrayArray# on these for object identity, which lets me check for the ends of the lists by tying things back on themselves. and below that data Lower s = Lower (MutableArrayArray# s) is similar, with an extra MutableArrayArray slot pointing up to an upper structure. I can then write a handful of combinators for getting out the slots in question, while it has gained a level of indirection between the wrapper to put it in * and the MutableArrayArray# s in #, that one can be basically erased by ghc. Unlike before I don't have several separate objects on the heap for each thing. I only have 2 now. The MutableArrayArray# for the object itself, and the MutableByteArray# that it references to carry around the mutable int. The only pain points are 1.) the aforementioned limitation that currently prevents me from stuffing normal boxed data through a SmallArray or Array into an ArrayArray leaving me in a little ghetto disconnected from the rest of Haskell, and 2.) the lack of SmallArrayArray's, which could let us avoid the card marking overhead. These objects are all small, 3-4 pointers wide. Card marking doesn't help. Alternately I could just try to do really evil things and convert the whole mess to SmallArrays and then figure out how to unsafeCoerce my way to glory, stuffing the #'d references to the other arrays directly into the SmallArray as slots, removing the limitation we see here by aping the MutableArrayArray# s API, but that gets really really dangerous! I'm pretty much willing to sacrifice almost anything on the altar of speed here, but I'd like to be able to let the GC move them and collect them which rules out simpler Ptr and Addr based solutions. 
-Edward On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty < chak at cse.unsw.edu.au> wrote: > That's an interesting idea. > > Manuel > > > Edward Kmett : > > > > Would it be possible to add unsafe primops to add Array# and SmallArray# > entries to an ArrayArray#? The fact that the ArrayArray# entries are all > directly unlifted avoiding a level of indirection for the containing > structure is amazing, but I can only currently use it if my leaf level data > can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be > able to have the ability to put SmallArray# a stuff down at the leaves to > hold lifted contents. > > > > I accept fully that if I name the wrong type when I go to access one of > the fields it'll lie to me, but I suppose it'd do that if i tried to use > one of the members that held a nested ArrayArray# as a ByteArray# anyways, > so it isn't like there is a safety story preventing this. > > > > I've been hunting for ways to try to kill the indirection problems I get > with Haskell and mutable structures, and I could shoehorn a number of them > into ArrayArrays if this worked. > > > > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection > compared to c/java and this could reduce that pain to just 1 level of > unnecessary indirection. > > > > -Edward > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael at snoyman.com Fri Aug 21 05:05:53 2015 From: michael at snoyman.com (Michael Snoyman) Date: Fri, 21 Aug 2015 08:05:53 +0300 Subject: Building on Windows Message-ID: I'm trying to test a patch I wrote for Windows builds[1]. I'm following the preparation guide[2], but my configure step fails[3] with config.log contents[4]. Note that I'm building on the ghc-7.10 branch, not master.
Is it possible that this would contribute to the unrecognized --enable-tarballs-autodownload option, and/or the inability to compile C files? [1] https://phabricator.haskell.org/D1158, handles long linker command line arguments [2] https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/Windows [3] http://lpaste.net/139330 [4] http://lpaste.net/139331 -------------- next part -------------- An HTML attachment was scrubbed... URL: From lonetiger at gmail.com Fri Aug 21 05:16:58 2015 From: lonetiger at gmail.com (Tamar Christina) Date: Thu, 20 Aug 2015 22:16:58 -0700 Subject: Building on Windows Message-ID: <-8199413097024316113@unknownmsgid> Hi Michael, Those instructions are for the GHC head. For 7.10 and earlier this page https://ghc.haskell.org/trac/ghc/wiki/Building/GettingTheSources/Legacy should have been updated but it seems it never was.. To get the tarballs on that version do git clone git://git.haskell.org/ghc-tarballs.git I will update the legacy page later. Regards, Tamar ------------------------------ From: Michael Snoyman Sent: 8/21/2015 7:06 To: ghc-devs at haskell.org Subject: Building on Windows I'm trying to test a patch I wrote for Windows builds[1]. I'm following the preparation guide[2], but my configure step fails[3] with config.log contents[4]. Note that I'm building on the ghc-7.10 branch, not master. Is it possible that this would contribute to the unrecognized --enable-tarballs-autodownload option, and/or the inability to compile C files? [1] https://phabricator.haskell.org/D1158, handles long linker command line arguments [2] https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/Windows [3] http://lpaste.net/139330 [4] http://lpaste.net/139331 -------------- next part -------------- An HTML attachment was scrubbed...
URL: -------------- next part -------------- _______________________________________________ ghc-devs mailing list ghc-devs at haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From michael at snoyman.com Fri Aug 21 05:18:07 2015 From: michael at snoyman.com (Michael Snoyman) Date: Fri, 21 Aug 2015 08:18:07 +0300 Subject: Building on Windows In-Reply-To: <-8199413097024316113@unknownmsgid> References: <-8199413097024316113@unknownmsgid> Message-ID: Awesome, thanks for the quick response Tamar. Cloning now. On Fri, Aug 21, 2015 at 8:16 AM, Tamar Christina wrote: > Hi Michael, > > Those instructions are for the GHC head. For 7.10 and earlier this page > https://ghc.haskell.org/trac/ghc/wiki/Building/GettingTheSources/Legacy > should have been updated but it seems it never was.. > > To get the tarballs on that version do > git clone git://git.haskell.org/ghc-tarballs.git > > I will update the legacy page later. > > Regards, > Tamar > ------------------------------ > From: Michael Snoyman > Sent: 8/21/2015 7:06 > To: ghc-devs at haskell.org > Subject: Building on Windows > > I'm trying to test a patch I wrote for Windows builds[1]. I'm following > the preparation guide[2], but my configure step fails[3] with config.log > contents[4]. Note that I'm building on the ghc-7.10 branch, not master. Is > it possible that this would contribute to the unrecognized > --enable-tarballs-autodownload option, and/or the inability to compile C > files? > > [1] https://phabricator.haskell.org/D1158, handles long linker command > line arguments > [2] https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/Windows > [3] http://lpaste.net/139330 > [4] http://lpaste.net/139331 > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From michael at snoyman.com Fri Aug 21 06:26:40 2015 From: michael at snoyman.com (Michael Snoyman) Date: Fri, 21 Aug 2015 09:26:40 +0300 Subject: Building on Windows In-Reply-To: References: <-8199413097024316113@unknownmsgid> Message-ID: That worked, and got me much farther. If you don't mind one more newb question, I'm now seeing the following. Any thoughts? ===--- building final phase make -r --no-print-directory -f ghc.mk phase=final all /usr/bin/install -c -m 755 utils/hp2ps/dist/build/tmp/hp2ps.exe inplace/bin/hp2ps.exe cp driver/ghc-usage.txt inplace/lib/ghc-usage.txt cp driver/ghci-usage.txt inplace/lib/ghci-usage.txt "inplace/bin/ghc-stage1.exe" -o driver/ghci/dist/build/tmp/ghci.exe -hisuf hi -osuf o -hcsuf hc -static -H32m -O -i -idriver/ghci/. -idriver/ghci/dist/build -idriver/ghci/dist/build/autogen -Idriver/ghci/dist/build -Idriver/ghci/dist/build/autogen -no-user-package-db -rtsopts -odir driver/ghci/dist/build -hidir driver/ghci/dist/build -stubdir driver/ghci/dist/build -static -H32m -O -i -idriver/ghci/. -idriver/ghci/dist/build -idriver/ghci/dist/build/autogen -Idriver/ghci/dist/build -Idriver/ghci/dist/build/autogen -no-user-package-db -rtsopts -no-auto-link-packages -no-hs-main driver/ghci/dist/build/ghci.o driver/ghci/dist/build/../utils/cwrapper.o driver/ghci/dist/build/../utils/getLocation.o driver/ghci/ghci.res Warning: -rtsopts and -with-rtsopts have no effect with -no-hs-main. Call hs_init_ghc() from your main() function to set these options. 
gcc.exe: error: driverghcidistbuildghci.o: No such file or directory
gcc.exe: error: driverghcidistbuild..utilscwrapper.o: No such file or directory
gcc.exe: error: driverghcidistbuild..utilsgetLocation.o: No such file or directory
gcc.exe: error: driverghcighci.res: No such file or directory
gcc.exe: error: C:msys64-2tmpghc6528_0ghc_4.o: No such file or directory
gcc.exe: error: C:msys64-2tmpghc6528_0ghc_2.o: No such file or directory
driver/ghci/ghc.mk:39: recipe for target 'driver/ghci/dist/build/tmp/ghci.exe' failed
make[1]: *** [driver/ghci/dist/build/tmp/ghci.exe] Error 1
Makefile:71: recipe for target 'all' failed
make: *** [all] Error 2

On Fri, Aug 21, 2015 at 8:18 AM, Michael Snoyman wrote:

> Awesome, thanks for the quick response Tamar. Cloning now.
>
> On Fri, Aug 21, 2015 at 8:16 AM, Tamar Christina
> wrote:
>
>> Hi Michael,
>>
>> Those instructions are for the GHC head. For 7.10 and earlier this page
>> https://ghc.haskell.org/trac/ghc/wiki/Building/GettingTheSources/Legacy
>> should have been updated but it seems it never was..
>>
>> To get the tarballs on that version do
>> git clone git://git.haskell.org/ghc-tarballs.git
>>
>> I will update the legacy page later.
>>
>> Regards,
>> Tamar
>> ------------------------------
>> From: Michael Snoyman
>> Sent: 8/21/2015 7:06
>> To: ghc-devs at haskell.org
>> Subject: Building on Windows
>>
>> I'm trying to test a patch I wrote for Windows builds[1]. I'm following
>> the preparation guide[2], but my configure step fails[3] with config.log
>> contents[4]. Note that I'm building on the ghc-7.10 branch, not master. Is
>> it possible that this would contribute to the unrecognized
>> --enable-tarballs-autodownload option, and/or the inability to compile C
>> files?
>> >> [1] https://phabricator.haskell.org/D1158, handles long linker command >> line arguments >> [2] https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/Windows >> [3] http://lpaste.net/139330 >> [4] http://lpaste.net/139331 >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lonetiger at gmail.com Fri Aug 21 06:52:44 2015 From: lonetiger at gmail.com (Tamar Christina) Date: Thu, 20 Aug 2015 23:52:44 -0700 Subject: Building on Windows Message-ID: <-9089258717179065625@unknownmsgid> Hmm no that doesn't seem familiar to me. It looks like some Windows style paths are being passed around but don't know why.. Are you running the non-emulating shells? E.g. The MinGW-w64 Win64 Shell bat? ------------------------------ From: Michael Snoyman Sent: 8/21/2015 8:27 To: Tamar Christina Cc: ghc-devs at haskell.org Subject: Re: Building on Windows That worked, and got me much farther. If you don't mind one more newb question, I'm now seeing the following. Any thoughts? ===--- building final phase make -r --no-print-directory -f ghc.mk phase=final all /usr/bin/install -c -m 755 utils/hp2ps/dist/build/tmp/hp2ps.exe inplace/bin/hp2ps.exe cp driver/ghc-usage.txt inplace/lib/ghc-usage.txt cp driver/ghci-usage.txt inplace/lib/ghci-usage.txt "inplace/bin/ghc-stage1.exe" -o driver/ghci/dist/build/tmp/ghci.exe -hisuf hi -osuf o -hcsuf hc -static -H32m -O -i -idriver/ghci/. -idriver/ghci/dist/build -idriver/ghci/dist/build/autogen -Idriver/ghci/dist/build -Idriver/ghci/dist/build/autogen -no-user-package-db -rtsopts -odir driver/ghci/dist/build -hidir driver/ghci/dist/build -stubdir driver/ghci/dist/build -static -H32m -O -i -idriver/ghci/.
-idriver/ghci/dist/build -idriver/ghci/dist/build/autogen -Idriver/ghci/dist/build -Idriver/ghci/dist/build/autogen -no-user-package-db -rtsopts -no-auto-link-packages -no-hs-main driver/ghci/dist/build/ghci.o driver/ghci/dist/build/../utils/cwrapper.o driver/ghci/dist/build/../utils/getLocation.o driver/ghci/ghci.res Warning: -rtsopts and -with-rtsopts have no effect with -no-hs-main. Call hs_init_ghc() from your main() function to set these options. gcc.exe: error: driverghcidistbuildghci.o: No such file or directory gcc.exe: error: driverghcidistbuild..utilscwrapper.o: No such file or directory gcc.exe: error: driverghcidistbuild..utilsgetLocation.o: No such file or directory gcc.exe: error: driverghcighci.res: No such file or directory gcc.exe: error: C:msys64-2tmpghc6528_0ghc_4.o: No such file or directory gcc.exe: error: C:msys64-2tmpghc6528_0ghc_2.o: No such file or directory driver/ghci/ghc.mk:39: recipe for target 'driver/ghci/dist/build/tmp/ghci.exe' failed make[1]: *** [driver/ghci/dist/build/tmp/ghci.exe] Error 1 Makefile:71: recipe for target 'all' failed make: *** [all] Error 2 On Fri, Aug 21, 2015 at 8:18 AM, Michael Snoyman wrote: > Awesome, thanks for the quick response Tamar. Cloning now. > > On Fri, Aug 21, 2015 at 8:16 AM, Tamar Christina > wrote: > >> Hi Michael, >> >> Those instructions are for the GHC head. For 7.10 and earlier this page >> https://ghc.haskell.org/trac/ghc/wiki/Building/GettingTheSources/Legacy >> should have been updated but it seems it never was.. >> >> To get the tarballs on that version do >> git clone git://git.haskell.org/ghc-tarballs.git >> >> I will update the legacy page later. >> >> Regards, >> Tamar >> ------------------------------ >> From: Michael Snoyman >> Sent: 8/21/2015 7:06 >> To: ghc-devs at haskell.org >> Subject: Building on Windows >> >> I'm trying to test a patch I wrote for Windows builds[1]. I'm following >> the preparation guide[2], but my configure step fails[3] with config.log >> contents[4].
Note that I'm building on the ghc-7.10 branch, not master. Is >> it possible that this would contribute to the unrecognized >> --enable-tarballs-autodownload option, and/or the inability to compile C >> files? >> >> [1] https://phabricator.haskell.org/D1158, handles long linker command >> line arguments >> [2] https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/Windows >> [3] http://lpaste.net/139330 >> [4] http://lpaste.net/139331 >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael at snoyman.com Fri Aug 21 07:03:54 2015 From: michael at snoyman.com (Michael Snoyman) Date: Fri, 21 Aug 2015 10:03:54 +0300 Subject: Building on Windows In-Reply-To: <-9089258717179065625@unknownmsgid> References: <-9089258717179065625@unknownmsgid> Message-ID: That may have been it, thanks. On Fri, Aug 21, 2015 at 9:52 AM, Tamar Christina wrote: > Hmm no that doesn't seem familiar to me. It looks like some Windows style > paths are being passed around but don't know why.. > > Are you running the non-emulating shells? E.g. The MinGW-w64 Win64 Shell > bat? > ------------------------------ > From: Michael Snoyman > Sent: 8/21/2015 8:27 > To: Tamar Christina > Cc: ghc-devs at haskell.org > Subject: Re: Building on Windows > > That worked, and got me much farther. If you don't mind one more newb > question, I'm now seeing the following. Any thoughts? > > ===--- building final phase > make -r --no-print-directory -f ghc.mk phase=final all > /usr/bin/install -c -m 755 utils/hp2ps/dist/build/tmp/hp2ps.exe > inplace/bin/hp2ps.exe > cp driver/ghc-usage.txt inplace/lib/ghc-usage.txt > cp driver/ghci-usage.txt inplace/lib/ghci-usage.txt > "inplace/bin/ghc-stage1.exe" -o driver/ghci/dist/build/tmp/ghci.exe -hisuf > hi -osuf o -hcsuf hc -static -H32m -O -i -idriver/ghci/.
> -idriver/ghci/dist/build -idriver/ghci/dist/build/autogen > -Idriver/ghci/dist/build -Idriver/ghci/dist/build/autogen > -no-user-package-db -rtsopts -odir driver/ghci/dist/build -hidir > driver/ghci/dist/build -stubdir driver/ghci/dist/build -static -H32m > -O -i -idriver/ghci/. -idriver/ghci/dist/build > -idriver/ghci/dist/build/autogen -Idriver/ghci/dist/build > -Idriver/ghci/dist/build/autogen -no-user-package-db -rtsopts > -no-auto-link-packages -no-hs-main driver/ghci/dist/build/ghci.o > driver/ghci/dist/build/../utils/cwrapper.o > driver/ghci/dist/build/../utils/getLocation.o driver/ghci/ghci.res > Warning: -rtsopts and -with-rtsopts have no effect with -no-hs-main. > Call hs_init_ghc() from your main() function to set these options. > gcc.exe: error: driverghcidistbuildghci.o: No such file or directory > gcc.exe: error: driverghcidistbuild..utilscwrapper.o: No such file or > directory > gcc.exe: error: driverghcidistbuild..utilsgetLocation.o: No such file or > directory > gcc.exe: error: driverghcighci.res: No such file or directory > gcc.exe: error: C:msys64-2tmpghc6528_0ghc_4.o: No such file or directory > gcc.exe: error: C:msys64-2tmpghc6528_0ghc_2.o: No such file or directory > driver/ghci/ghc.mk:39: recipe for target > 'driver/ghci/dist/build/tmp/ghci.exe' failed > make[1]: *** [driver/ghci/dist/build/tmp/ghci.exe] Error 1 > Makefile:71: recipe for target 'all' failed > make: *** [all] Error 2 > > On Fri, Aug 21, 2015 at 8:18 AM, Michael Snoyman > wrote: > >> Awesome, thanks for the quick response Tamar. Cloning now. >> >> On Fri, Aug 21, 2015 at 8:16 AM, Tamar Christina >> wrote: >> >>> Hi Michael, >>> >>> Those instructions are for the GHC head. For 7.10 and earlier this page >>> https://ghc.haskell.org/trac/ghc/wiki/Building/GettingTheSources/Legacy >>> should have been updated but it seems it never was.. >>> >>> To get the tarballs on that version do >>> git clone git://git.haskell.org/ghc-tarballs.git >>> >>> I will update the legacy page later. 
>>> >>> Regards, >>> Tamar >>> ------------------------------ >>> From: Michael Snoyman >>> Sent: ?8/?21/?2015 7:06 >>> To: ghc-devs at haskell.org >>> Subject: Building on Windows >>> >>> I'm trying to test a patch I wrote for Windows builds[1]. I'm following >>> the preparation guide[2], but my configure step fails[3] with config.log >>> contents[4]. Note that I'm building on the ghc-7.10 branch, not master. Is >>> it possible that this would contribute to the unrecognized >>> --enable-tarballs-autodownload option, and/or the inability to compile C >>> files? >>> >>> [1] https://phabricator.haskell.org/D1158, handles long linker command >>> line arguments >>> [2] https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/Windows >>> [3] http://lpaste.net/139330 >>> [4] http://lpaste.net/139331 >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kyrab at mail.ru Fri Aug 21 07:17:33 2015 From: kyrab at mail.ru (kyra) Date: Fri, 21 Aug 2015 10:17:33 +0300 Subject: Building on Windows In-Reply-To: <-9089258717179065625@unknownmsgid> References: <-9089258717179065625@unknownmsgid> Message-ID: <55D6D08D.8060401@mail.ru> My original patch https://ghc.haskell.org/trac/ghc/attachment/ticket/10777/cabal-rsp.patch contains 'normslash' function. It seems, Michael have overlooked this. GNU tools wait response files containing forward slashes in paths. On 21.08.2015 9:52, Tamar Christina wrote: > Hmm no that doesn't seem familiar to me. It looks like some Windows > style paths are being passed around but don't know why.. > > Are you running the non-emulating shells? E.g. The MinGW-w64 Win64 > Shell bat? > ------------------------------------------------------------------------ > From: Michael Snoyman > Sent: ?8/?21/?2015 8:27 > To: Tamar Christina > Cc: ghc-devs at haskell.org > Subject: Re: Building on Windows > > That worked, and got me much farther. If you don't mind one more newb > question, I'm now seeing the following. 
Any thoughts? > > ===--- building final phase > make -r --no-print-directory -f ghc.mk phase=final all > /usr/bin/install -c -m 755 utils/hp2ps/dist/build/tmp/hp2ps.exe > inplace/bin/hp2ps.exe > cp driver/ghc-usage.txt inplace/lib/ghc-usage.txt > cp driver/ghci-usage.txt inplace/lib/ghci-usage.txt > "inplace/bin/ghc-stage1.exe" -o driver/ghci/dist/build/tmp/ghci.exe > -hisuf hi -osuf o -hcsuf hc -static -H32m -O -i -idriver/ghci/. > -idriver/ghci/dist/build -idriver/ghci/dist/build/autogen > -Idriver/ghci/dist/build -Idriver/ghci/dist/build/autogen > -no-user-package-db -rtsopts -odir driver/ghci/dist/build -hidir > driver/ghci/dist/build -stubdir driver/ghci/dist/build -static -H32m > -O -i -idriver/ghci/. -idriver/ghci/dist/build > -idriver/ghci/dist/build/autogen -Idriver/ghci/dist/build > -Idriver/ghci/dist/build/autogen -no-user-package-db > -rtsopts -no-auto-link-packages -no-hs-main > driver/ghci/dist/build/ghci.o > driver/ghci/dist/build/../utils/cwrapper.o > driver/ghci/dist/build/../utils/getLocation.o driver/ghci/ghci.res > Warning: -rtsopts and -with-rtsopts have no effect with -no-hs-main. > Call hs_init_ghc() from your main() function to set these options. > gcc.exe: error: driverghcidistbuildghci.o: No such file or directory > gcc.exe: error: driverghcidistbuild..utilscwrapper.o: No such file or > directory > gcc.exe: error: driverghcidistbuild..utilsgetLocation.o: No such file > or directory > gcc.exe: error: driverghcighci.res: No such file or directory > gcc.exe: error: C:msys64-2tmpghc6528_0ghc_4.o: No such file or directory > gcc.exe: error: C:msys64-2tmpghc6528_0ghc_2.o: No such file or directory > driver/ghci/ghc.mk:39 : recipe for target > 'driver/ghci/dist/build/tmp/ghci.exe' failed > make[1]: *** [driver/ghci/dist/build/tmp/ghci.exe] Error 1 > Makefile:71: recipe for target 'all' failed > make: *** [all] Error 2 > > On Fri, Aug 21, 2015 at 8:18 AM, Michael Snoyman > wrote: > > Awesome, thanks for the quick response Tamar. Cloning now. 
> > On Fri, Aug 21, 2015 at 8:16 AM, Tamar Christina > > wrote: > > Hi Michael, > > Those instructions are for the GHC head. For 7.10 and earlier > this page > https://ghc.haskell.org/trac/ghc/wiki/Building/GettingTheSources/Legacy > should have been updated but it seems it never was.. > > To get the tarballs on that version do > git clone git://git.haskell.org/ghc-tarballs.git > > > I will update the legacy page later. > > Regards, > Tamar > ------------------------------------------------------------------------ > From: Michael Snoyman > Sent: ?8/?21/?2015 7:06 > To: ghc-devs at haskell.org > Subject: Building on Windows > > I'm trying to test a patch I wrote for Windows builds[1]. I'm > following the preparation guide[2], but my configure step > fails[3] with config.log contents[4]. Note that I'm building > on the ghc-7.10 branch, not master. Is it possible that this > would contribute to the unrecognized > --enable-tarballs-autodownload option, and/or the inability to > compile C files? > > [1] https://phabricator.haskell.org/D1158, handles long linker > command line arguments > [2] > https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/Windows > [3] http://lpaste.net/139330 > [4] http://lpaste.net/139331 > > > > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From michael at snoyman.com Fri Aug 21 07:24:47 2015 From: michael at snoyman.com (Michael Snoyman) Date: Fri, 21 Aug 2015 10:24:47 +0300 Subject: Building on Windows In-Reply-To: <55D6D08D.8060401@mail.ru> References: <-9089258717179065625@unknownmsgid> <55D6D08D.8060401@mail.ru> Message-ID: I'd like to discuss that in Phabricator to keep the discussion centralized. Could you add a comment? On Fri, Aug 21, 2015 at 10:17 AM, kyra wrote: > My original patch > https://ghc.haskell.org/trac/ghc/attachment/ticket/10777/cabal-rsp.patch > contains 'normslash' function. 
It seems, Michael have overlooked this. GNU > tools wait response files containing forward slashes in paths. > > On 21.08.2015 9:52, Tamar Christina wrote: > >> Hmm no that doesn't seem familiar to me. It looks like some Windows style >> paths are being passed around but don't know why.. >> >> Are you running the non-emulating shells? E.g. The MinGW-w64 Win64 Shell >> bat? >> ------------------------------------------------------------------------ >> From: Michael Snoyman >> Sent: ?8/?21/?2015 8:27 >> To: Tamar Christina >> Cc: ghc-devs at haskell.org >> Subject: Re: Building on Windows >> >> That worked, and got me much farther. If you don't mind one more newb >> question, I'm now seeing the following. Any thoughts? >> >> ===--- building final phase >> make -r --no-print-directory -f ghc.mk phase=final all >> /usr/bin/install -c -m 755 utils/hp2ps/dist/build/tmp/hp2ps.exe >> inplace/bin/hp2ps.exe >> cp driver/ghc-usage.txt inplace/lib/ghc-usage.txt >> cp driver/ghci-usage.txt inplace/lib/ghci-usage.txt >> "inplace/bin/ghc-stage1.exe" -o driver/ghci/dist/build/tmp/ghci.exe >> -hisuf hi -osuf o -hcsuf hc -static -H32m -O -i -idriver/ghci/. >> -idriver/ghci/dist/build -idriver/ghci/dist/build/autogen >> -Idriver/ghci/dist/build -Idriver/ghci/dist/build/autogen >> -no-user-package-db -rtsopts -odir driver/ghci/dist/build -hidir >> driver/ghci/dist/build -stubdir driver/ghci/dist/build -static -H32m -O >> -i -idriver/ghci/. -idriver/ghci/dist/build >> -idriver/ghci/dist/build/autogen -Idriver/ghci/dist/build >> -Idriver/ghci/dist/build/autogen -no-user-package-db -rtsopts >> -no-auto-link-packages -no-hs-main driver/ghci/dist/build/ghci.o >> driver/ghci/dist/build/../utils/cwrapper.o >> driver/ghci/dist/build/../utils/getLocation.o driver/ghci/ghci.res >> Warning: -rtsopts and -with-rtsopts have no effect with -no-hs-main. >> Call hs_init_ghc() from your main() function to set these options. 
>> gcc.exe: error: driverghcidistbuildghci.o: No such file or directory >> gcc.exe: error: driverghcidistbuild..utilscwrapper.o: No such file or >> directory >> gcc.exe: error: driverghcidistbuild..utilsgetLocation.o: No such file or >> directory >> gcc.exe: error: driverghcighci.res: No such file or directory >> gcc.exe: error: C:msys64-2tmpghc6528_0ghc_4.o: No such file or directory >> gcc.exe: error: C:msys64-2tmpghc6528_0ghc_2.o: No such file or directory >> driver/ghci/ghc.mk:39 : recipe for target >> 'driver/ghci/dist/build/tmp/ghci.exe' failed >> make[1]: *** [driver/ghci/dist/build/tmp/ghci.exe] Error 1 >> Makefile:71: recipe for target 'all' failed >> make: *** [all] Error 2 >> >> On Fri, Aug 21, 2015 at 8:18 AM, Michael Snoyman > > wrote: >> >> Awesome, thanks for the quick response Tamar. Cloning now. >> >> On Fri, Aug 21, 2015 at 8:16 AM, Tamar Christina >> > wrote: >> >> Hi Michael, >> >> Those instructions are for the GHC head. For 7.10 and earlier >> this page >> >> https://ghc.haskell.org/trac/ghc/wiki/Building/GettingTheSources/Legacy >> should have been updated but it seems it never was.. >> >> To get the tarballs on that version do >> git clone git://git.haskell.org/ghc-tarballs.git >> >> >> I will update the legacy page later. >> >> Regards, >> Tamar >> >> ------------------------------------------------------------------------ >> From: Michael Snoyman >> Sent: ?8/?21/?2015 7:06 >> To: ghc-devs at haskell.org >> Subject: Building on Windows >> >> I'm trying to test a patch I wrote for Windows builds[1]. I'm >> following the preparation guide[2], but my configure step >> fails[3] with config.log contents[4]. Note that I'm building >> on the ghc-7.10 branch, not master. Is it possible that this >> would contribute to the unrecognized >> --enable-tarballs-autodownload option, and/or the inability to >> compile C files? 
>> >> [1] https://phabricator.haskell.org/D1158, handles long linker >> command line arguments >> [2] >> >> https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/Windows >> [3] http://lpaste.net/139330 >> [4] http://lpaste.net/139331 >> >> >> >> >> >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael at snoyman.com Fri Aug 21 07:25:50 2015 From: michael at snoyman.com (Michael Snoyman) Date: Fri, 21 Aug 2015 10:25:50 +0300 Subject: Building on Windows In-Reply-To: References: <-9089258717179065625@unknownmsgid> <55D6D08D.8060401@mail.ru> Message-ID: And I just understood that you're saying that's the cause of the failed compile, thank you :) Nonetheless, could you add the comment to Phabricator? On Fri, Aug 21, 2015 at 10:24 AM, Michael Snoyman wrote: > I'd like to discuss that in Phabricator to keep the discussion > centralized. Could you add a comment? > > On Fri, Aug 21, 2015 at 10:17 AM, kyra wrote: > >> My original patch >> https://ghc.haskell.org/trac/ghc/attachment/ticket/10777/cabal-rsp.patch >> contains 'normslash' function. It seems, Michael have overlooked this. GNU >> tools wait response files containing forward slashes in paths. >> >> On 21.08.2015 9:52, Tamar Christina wrote: >> >>> Hmm no that doesn't seem familiar to me. It looks like some Windows >>> style paths are being passed around but don't know why.. >>> >>> Are you running the non-emulating shells? E.g. The MinGW-w64 Win64 Shell >>> bat? >>> ------------------------------------------------------------------------ >>> From: Michael Snoyman >>> Sent: ?8/?21/?2015 8:27 >>> To: Tamar Christina >>> Cc: ghc-devs at haskell.org >>> Subject: Re: Building on Windows >>> >>> That worked, and got me much farther. If you don't mind one more newb >>> question, I'm now seeing the following. 
Any thoughts? >>> >>> ===--- building final phase >>> make -r --no-print-directory -f ghc.mk phase=final all >>> /usr/bin/install -c -m 755 utils/hp2ps/dist/build/tmp/hp2ps.exe >>> inplace/bin/hp2ps.exe >>> cp driver/ghc-usage.txt inplace/lib/ghc-usage.txt >>> cp driver/ghci-usage.txt inplace/lib/ghci-usage.txt >>> "inplace/bin/ghc-stage1.exe" -o driver/ghci/dist/build/tmp/ghci.exe >>> -hisuf hi -osuf o -hcsuf hc -static -H32m -O -i -idriver/ghci/. >>> -idriver/ghci/dist/build -idriver/ghci/dist/build/autogen >>> -Idriver/ghci/dist/build -Idriver/ghci/dist/build/autogen >>> -no-user-package-db -rtsopts -odir driver/ghci/dist/build -hidir >>> driver/ghci/dist/build -stubdir driver/ghci/dist/build -static -H32m -O >>> -i -idriver/ghci/. -idriver/ghci/dist/build >>> -idriver/ghci/dist/build/autogen -Idriver/ghci/dist/build >>> -Idriver/ghci/dist/build/autogen -no-user-package-db -rtsopts >>> -no-auto-link-packages -no-hs-main driver/ghci/dist/build/ghci.o >>> driver/ghci/dist/build/../utils/cwrapper.o >>> driver/ghci/dist/build/../utils/getLocation.o driver/ghci/ghci.res >>> Warning: -rtsopts and -with-rtsopts have no effect with -no-hs-main. >>> Call hs_init_ghc() from your main() function to set these options. 
>>> gcc.exe: error: driverghcidistbuildghci.o: No such file or directory >>> gcc.exe: error: driverghcidistbuild..utilscwrapper.o: No such file or >>> directory >>> gcc.exe: error: driverghcidistbuild..utilsgetLocation.o: No such file or >>> directory >>> gcc.exe: error: driverghcighci.res: No such file or directory >>> gcc.exe: error: C:msys64-2tmpghc6528_0ghc_4.o: No such file or directory >>> gcc.exe: error: C:msys64-2tmpghc6528_0ghc_2.o: No such file or directory >>> driver/ghci/ghc.mk:39 : recipe for target >>> 'driver/ghci/dist/build/tmp/ghci.exe' failed >>> make[1]: *** [driver/ghci/dist/build/tmp/ghci.exe] Error 1 >>> Makefile:71: recipe for target 'all' failed >>> make: *** [all] Error 2 >>> >>> On Fri, Aug 21, 2015 at 8:18 AM, Michael Snoyman >> > wrote: >>> >>> Awesome, thanks for the quick response Tamar. Cloning now. >>> >>> On Fri, Aug 21, 2015 at 8:16 AM, Tamar Christina >>> > wrote: >>> >>> Hi Michael, >>> >>> Those instructions are for the GHC head. For 7.10 and earlier >>> this page >>> >>> https://ghc.haskell.org/trac/ghc/wiki/Building/GettingTheSources/Legacy >>> should have been updated but it seems it never was.. >>> >>> To get the tarballs on that version do >>> git clone git://git.haskell.org/ghc-tarballs.git >>> >>> >>> I will update the legacy page later. >>> >>> Regards, >>> Tamar >>> >>> ------------------------------------------------------------------------ >>> From: Michael Snoyman >>> Sent: ?8/?21/?2015 7:06 >>> To: ghc-devs at haskell.org >>> Subject: Building on Windows >>> >>> I'm trying to test a patch I wrote for Windows builds[1]. I'm >>> following the preparation guide[2], but my configure step >>> fails[3] with config.log contents[4]. Note that I'm building >>> on the ghc-7.10 branch, not master. Is it possible that this >>> would contribute to the unrecognized >>> --enable-tarballs-autodownload option, and/or the inability to >>> compile C files? 
>>> >>> [1] https://phabricator.haskell.org/D1158, handles long linker >>> command line arguments >>> [2] >>> >>> https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/Windows >>> [3] http://lpaste.net/139330 >>> [4] http://lpaste.net/139331 >>> >>> >>> >>> >>> >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs at haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthewtpickering at gmail.com Fri Aug 21 09:23:48 2015 From: matthewtpickering at gmail.com (Matthew Pickering) Date: Fri, 21 Aug 2015 11:23:48 +0200 Subject: Record syntax for pattern synonyms In-Reply-To: References: Message-ID: The patch now validates at D1152. https://phabricator.haskell.org/D1152 On Tue, Aug 11, 2015 at 11:26 PM, Matthew Pickering wrote: > Thank you for your comments Richard. > > >> I'm assuming `pattern Foo{bar, baz} = (bar, baz)` from the wiki page, without any further pattern type signature. This example then looks straightforward to me -- I feel I'm missing the subtlety. `foo` would get the type `(a,b) -> (b,b)` and would be roughly equivalent to `foo a@(bar, baz) = case a of (_, baz2) -> (baz, baz2)`. The case statement and baz2 is necessary just to provide a predictable desugaring of record updates; handwritten code should clearly be more succinct. > > This is how I imagined it to work. > >> This would desugar to `foo x = case x of Just _ -> Just 5`. I'm not sure about pattern exhaustiveness warnings, but I would expect such a record update to be partial. The partiality of record updates has been surprising in the past, but I don't think adding pattern synonyms to the mix should change that. > > Yes, I agree. > >> I would like to keep record updates for the same reasons you appear to. I will warn that they are quite hard to work with, though! 
About 220 lines of dense code (including comments) are necessary to type-check regular old record updates. This isn't to scare you off, but to have you suitably forewarned and forearmed. > > I consider myself warned! > > >> What do you mean here? Without checking, I assumed that the x in `x { ... }` had to be a variable. But this is wrong! See 3.15.3 of the Haskell 2010 report (https://www.haskell.org/onlinereport/haskell2010/haskellch3.html#x8-490003.15). So I think it's already generalized. > > Good news. This should simplify the implementation. > >> >> Many thanks for taking this on! >> Richard >> From fryguybob at gmail.com Fri Aug 21 13:49:47 2015 From: fryguybob at gmail.com (Ryan Yates) Date: Fri, 21 Aug 2015 09:49:47 -0400 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> Message-ID: Hi Edward, I've been working on removing indirection in STM and I added a heap object like SmallArray, but with a mix of words and pointers (as well as a header with metadata for STM). It appears to work well now, but it is missing the type information. All the pointers have the same type which works fine for your Upper. In my case I use it to represent a red-black tree node [1]. Also all the structures I make are fixed size and it would be nice if the compiler could treat that fix size like a constant in code generation. I don't know what the right design is or what would be needed, but it seems simple enough to give the right typing information to something like this and basically get a mutable struct. I'm talking about this work at HIW and really hope to find someone interested in extending this expressiveness to let us write something that looks clear in Haskell, but gives the heap representation that we really need for performance. From the RTS perspective I think there are any obstacles. 
[1]: https://github.com/fryguybob/ghc-stm-benchmarks/blob/master/benchmarks/RBTree-Throughput/RBTreeNode.hs Ryan On Fri, Aug 21, 2015 at 12:25 AM, Edward Kmett wrote: > When (ab)using them for this purpose, SmallArrayArray's would be very handy > as well. > > Consider right now if I have something like an order-maintenance structure I > have: > > data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} > !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) > > data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} > !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-} > !(MutVar s (Lower s)) > > The former contains, logically, a mutable integer and two pointers, one for > forward and one for backwards. The latter is basically the same thing with a > mutable reference up pointing at the structure above. > > On the heap this is an object that points to a structure for the bytearray, > and points to another structure for each mutvar which each point to the > other 'Upper' structure. So there is a level of indirection smeared over > everything. > > So this is a pair of doubly linked lists with an upward link from the > structure below to the structure above. > > Converted into ArrayArray#s I'd get > > data Upper s = Upper (MutableArrayArray# s) > > w/ the first slot being a pointer to a MutableByteArray#, and the next 2 > slots pointing to the previous and next previous objects, represented just > as their MutableArrayArray#s. I can use sameMutableArrayArray# on these for > object identity, which lets me check for the ends of the lists by tying > things back on themselves. > > and below that > > data Lower s = Lower (MutableArrayArray# s) > > is similar, with an extra MutableArrayArray slot pointing up to an upper > structure. 
> > I can then write a handful of combinators for getting out the slots in > question, while it has gained a level of indirection between the wrapper to > put it in * and the MutableArrayArray# s in #, that one can be basically > erased by ghc. > > Unlike before I don't have several separate objects on the heap for each > thing. I only have 2 now. The MutableArrayArray# for the object itself, and > the MutableByteArray# that it references to carry around the mutable int. > > The only pain points are > > 1.) the aforementioned limitation that currently prevents me from stuffing > normal boxed data through a SmallArray or Array into an ArrayArray leaving > me in a little ghetto disconnected from the rest of Haskell, > > and > > 2.) the lack of SmallArrayArray's, which could let us avoid the card marking > overhead. These objects are all small, 3-4 pointers wide. Card marking > doesn't help. > > Alternately I could just try to do really evil things and convert the whole > mess to SmallArrays and then figure out how to unsafeCoerce my way to glory, > stuffing the #'d references to the other arrays directly into the SmallArray > as slots, removing the limitation we see here by aping the > MutableArrayArray# s API, but that gets really really dangerous! > > I'm pretty much willing to sacrifice almost anything on the altar of speed > here, but I'd like to be able to let the GC move them and collect them which > rules out simpler Ptr and Addr based solutions. > > -Edward > > On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty > wrote: >> >> That?s an interesting idea. >> >> Manuel >> >> > Edward Kmett : >> > >> > Would it be possible to add unsafe primops to add Array# and SmallArray# >> > entries to an ArrayArray#? 
The fact that the ArrayArray# entries are all >> > directly unlifted avoiding a level of indirection for the containing >> > structure is amazing, but I can only currently use it if my leaf level data >> > can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be >> > able to have the ability to put SmallArray# a stuff down at the leaves to >> > hold lifted contents. >> > >> > I accept fully that if I name the wrong type when I go to access one of >> > the fields it'll lie to me, but I suppose it'd do that if i tried to use one >> > of the members that held a nested ArrayArray# as a ByteArray# anyways, so it >> > isn't like there is a safety story preventing this. >> > >> > I've been hunting for ways to try to kill the indirection problems I get >> > with Haskell and mutable structures, and I could shoehorn a number of them >> > into ArrayArrays if this worked. >> > >> > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection >> > compared to c/java and this could reduce that pain to just 1 level of >> > unnecessary indirection. >> > >> > -Edward >> > _______________________________________________ >> > ghc-devs mailing list >> > ghc-devs at haskell.org >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > From ekmett at gmail.com Fri Aug 21 14:58:00 2015 From: ekmett at gmail.com (Edward Kmett) Date: Fri, 21 Aug 2015 10:58:00 -0400 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> Message-ID: On Fri, Aug 21, 2015 at 9:49 AM, Ryan Yates wrote: > Hi Edward, > > I've been working on removing indirection in STM and I added a heap > object like SmallArray, but with a mix of words and pointers (as well > as a header with metadata for STM). It appears to work well now, but > it is missing the type information. 
> All the pointers have the same
> type which works fine for your Upper. In my case I use it to
> represent a red-black tree node [1].
>

This would be perfect for my purposes.

> Also all the structures I make are fixed size and it would be nice if
> the compiler could treat that fix size like a constant in code
> generation.

To make the fixed sized thing work without an extra couple of size
parameters in the arguments, you'd want to be able to build an info table
for each generated size. That sounds messy.

> I don't know what the right design is or what would be
> needed, but it seems simple enough to give the right typing
> information to something like this and basically get a mutable struct.
> I'm talking about this work at HIW and really hope to find someone
> interested in extending this expressiveness to let us write something
> that looks clear in Haskell, but gives the heap representation that we
> really need for performance.

I'll be there. Let's talk.

> From the RTS perspective I think there are any obstacles.
>

FWIW- I was able to get some code put together that let me scribble
unlifted SmallMutableArray#s directly into other SmallMutableArray#s, which
nicely "just works" as long as you fix up all the fields that are supposed
to be arrays before you ever dare use them.

writeSmallMutableArraySmallArray# :: SmallMutableArray# s Any -> Int# -> SmallMutableArray# s Any -> State# s -> State# s
writeSmallMutableArraySmallArray# m i a s = unsafeCoerce# writeSmallArray# m i a s
{-# INLINE writeSmallMutableArraySmallArray# #-}

readSmallMutableArraySmallArray# :: SmallMutableArray# s Any -> Int# -> State# s -> (# State# s, SmallMutableArray# s Any #)
readSmallMutableArraySmallArray# m i s = unsafeCoerce# readSmallArray# m i s
{-# INLINE readSmallMutableArraySmallArray# #-}

With some support for typed 'Field's I can write code now that looks like:

order :: PrimMonad m => Upper (PrimState m) -> Int -> Order (PrimState m) -> Order (PrimState m) -> m (Order (PrimState m))
order p a l r = st $ do
  this <- primitive $ \s -> case unsafeCoerce# newSmallArray# 4# a s of
    (# s', b #) -> (# s', Order b #)
  set parent this p
  set next this l
  set prev this r
  return this

and in there basically build my own little strict, mutable, universe and
with some careful monitoring of the core make sure that the little Order
wrappers at the fringes get removed.

Here I'm using one of the slots as a pointer to a boxed Int for testing,
rather than as a pointer to a MutableByteArray that holds the Int.

-Edward
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bburdette at gmail.com Fri Aug 21 15:14:53 2015
From: bburdette at gmail.com (Ben Burdette)
Date: Fri, 21 Aug 2015 09:14:53 -0600
Subject: armv7 "invalid instruction" problem building cabal
Message-ID: <55D7406D.4020901@gmail.com>

I've been trying to get going with ghc 7.10.2 on armv7 debian, problem
is described here:

http://stackoverflow.com/questions/32124334/ghc-armv7-binary-cabal-illegal-instruction

thanks for any help!

Ben

From mle+hs at mega-nerd.com Fri Aug 21 21:16:02 2015
From: mle+hs at mega-nerd.com (Erik de Castro Lopo)
Date: Sat, 22 Aug 2015 07:16:02 +1000
Subject: armv7 "invalid instruction" problem building cabal
In-Reply-To: <55D7406D.4020901@gmail.com>
References: <55D7406D.4020901@gmail.com>
Message-ID: <20150822071602.61297d40116f40dfa5b38e3f@mega-nerd.com>

Ben Burdette wrote:

> I've been trying to get going with ghc 7.10.2 on armv7 debian, problem
> is described here:
>
> http://stackoverflow.com/questions/32124334/ghc-armv7-binary-cabal-illegal-instruction

Sorry, I'll answer here instead of SO if that's OK. Mostly because I
don't have an answer and need to ask you to provide more info.

As a first step, what is the output of "ghc --info"?

Cheers,
Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

From bburdette at gmail.com Fri Aug 21 21:18:44 2015
From: bburdette at gmail.com (Ben Burdette)
Date: Fri, 21 Aug 2015 15:18:44 -0600
Subject: armv7 "invalid instruction" problem building cabal
In-Reply-To: <20150822071602.61297d40116f40dfa5b38e3f@mega-nerd.com>
References: <55D7406D.4020901@gmail.com> <20150822071602.61297d40116f40dfa5b38e3f@mega-nerd.com>
Message-ID: <55D795B4.10005@gmail.com>

On 08/21/2015 03:16 PM, Erik de Castro Lopo wrote:
> Ben Burdette wrote:
>
>> I've been trying to get going with ghc 7.10.2 on armv7 debian, problem
>> is described here:
>>
>> http://stackoverflow.com/questions/32124334/ghc-armv7-binary-cabal-illegal-instruction
> Sorry, I'll answer here instead of SO if that's OK. Mostly because I
> don't have an answer and need to ask you to provide more info.
>
> As a first step, what is the output of "ghc --info"?
>
> Cheers,
> Erik

No prob! Either forum is ok for me.
get info output:

 [("Project name","The Glorious Glasgow Haskell Compilation System")
 ,("GCC extra via C opts"," -fwrapv")
 ,("C compiler command","/usr/bin/gcc")
 ,("C compiler flags"," -fno-stack-protector")
 ,("C compiler link flags"," -fuse-ld=gold -Wl,-z,noexecstack")
 ,("Haskell CPP command","/usr/bin/gcc")
 ,("Haskell CPP flags","-E -undef -traditional ")
 ,("ld command","/usr/bin/ld.gold")
 ,("ld flags"," -z noexecstack")
 ,("ld supports compact unwind","YES")
 ,("ld supports build-id","YES")
 ,("ld supports filelist","NO")
 ,("ld is GNU ld","YES")
 ,("ar command","/usr/bin/ar")
 ,("ar flags","q")
 ,("ar supports at file","YES")
 ,("touch command","touch")
 ,("dllwrap command","/bin/false")
 ,("windres command","/bin/false")
 ,("libtool command","libtool")
 ,("perl command","/usr/bin/perl")
 ,("cross compiling","NO")
 ,("target os","OSLinux")
 ,("target arch","ArchARM {armISA = ARMv7, armISAExt = [VFPv3,NEON], armABI = HARD}")
 ,("target word size","4")
 ,("target has GNU nonexec stack","False")
 ,("target has .ident directive","True")
 ,("target has subsections via symbols","False")
 ,("Unregisterised","NO")
 ,("LLVM llc command","/usr/bin/llc-3.5")
 ,("LLVM opt command","/usr/bin/opt-3.5")
 ,("Project version","7.10.2")
 ,("Project Git commit id","0da488c4438d88c9252e0b860426b8e74b5fc9e8")
 ,("Booter version","7.6.3")
 ,("Stage","2")
 ,("Build platform","arm-unknown-linux")
 ,("Host platform","arm-unknown-linux")
 ,("Target platform","arm-unknown-linux")
 ,("Have interpreter","YES")
 ,("Object splitting supported","NO")
 ,("Have native code generator","NO")
 ,("Support SMP","YES")
 ,("Tables next to code","YES")
 ,("RTS ways","l debug thr thr_debug thr_l thr_p dyn debug_dyn thr_dyn thr_debug_dyn l_dyn thr_l_dyn")
 ,("Support dynamic-too","YES")
 ,("Support parallel --make","YES")
 ,("Support reexported-modules","YES")
 ,("Support thinning and renaming package flags","YES")
 ,("Uses package keys","YES")
 ,("Dynamic by default","NO")
 ,("GHC Dynamic","YES")
 ,("Leading underscore","NO")
 ,("Debug on","False")
 ,("LibDir","/usr/local/lib/ghc-7.10.2")
 ,("Global Package DB","/usr/local/lib/ghc-7.10.2/package.conf.d")
 ]

From mle+hs at mega-nerd.com Fri Aug 21 21:44:23 2015
From: mle+hs at mega-nerd.com (Erik de Castro Lopo)
Date: Sat, 22 Aug 2015 07:44:23 +1000
Subject: armv7 "invalid instruction" problem building cabal
In-Reply-To: <55D795B4.10005@gmail.com>
References: <55D7406D.4020901@gmail.com> <20150822071602.61297d40116f40dfa5b38e3f@mega-nerd.com> <55D795B4.10005@gmail.com>
Message-ID: <20150822074423.379449a019180345baa73c21@mega-nerd.com>

Ben Burdette wrote:

> No prob! Either forum is ok for me. get info output:

Ok, this:

> ,("ld command","/usr/bin/ld.gold")

means GHC uses ld.gold explicitly and this

> ,("LLVM llc command","/usr/bin/llc-3.5")
> ,("LLVM opt command","/usr/bin/opt-3.5")

means it's using the right versions of the llvm tool.

Try compiling and running a simple "Hello world" type program. Try
compiling with and without optimisation.

Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

From mle+hs at mega-nerd.com Fri Aug 21 22:00:55 2015
From: mle+hs at mega-nerd.com (Erik de Castro Lopo)
Date: Sat, 22 Aug 2015 08:00:55 +1000
Subject: armv7 "invalid instruction" problem building cabal
In-Reply-To: <55D7406D.4020901@gmail.com>
References: <55D7406D.4020901@gmail.com>
Message-ID: <20150822080055.ac53c1ce4fe8711fcbd2e34a@mega-nerd.com>

Ben Burdette wrote:

> I've been trying to get going with ghc 7.10.2 on armv7 debian, problem
> is described here:
>
> http://stackoverflow.com/questions/32124334/ghc-armv7-binary-cabal-illegal-instruction

I suspect your problem may be related to this one:

https://ghc.haskell.org/trac/ghc/ticket/10375

which I started work on, got stuck and haven't had time to return to.
Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/ From bburdette at gmail.com Fri Aug 21 23:43:46 2015 From: bburdette at gmail.com (Ben Burdette) Date: Fri, 21 Aug 2015 17:43:46 -0600 Subject: armv7 "invalid instruction" problem building cabal In-Reply-To: <20150822080055.ac53c1ce4fe8711fcbd2e34a@mega-nerd.com> References: <55D7406D.4020901@gmail.com> <20150822080055.ac53c1ce4fe8711fcbd2e34a@mega-nerd.com> Message-ID: <55D7B7B2.1020207@gmail.com> On 08/21/2015 04:00 PM, Erik de Castro Lopo wrote: > Ben Burdette wrote: > >> I've been trying to get going with ghc 7.10.2 on armv7 debian, problem >> is described here: >> >> http://stackoverflow.com/questions/32124334/ghc-armv7-binary-cabal-illegal-instruction > I suspect your problem may be related to this one: > > https://ghc.haskell.org/trac/ghc/ticket/10375 > > which I started work on, got stuck and haven't had time to return to. > > Erik That does seem to be the case. I was able to duplicate their ghci error: GHCi, version 7.10.2: http://www.haskell.org/ghc/ :? for help Prelude> data Planet = Mercury | Venus deriving Eq Prelude> Mercury == Mercury Illegal instruction bburdette at jessie-rpi:~$ And my hello world program: main = do putStrLn "hello" The results: bburdette at jessie-rpi:~$ ghc hello.hs [1 of 1] Compiling Main ( hello.hs, hello.o ) Linking hello ... bburdette at jessie-rpi:~$ ls bin ghc-7.10.2-arm-unknown-linux.tar.xz hello.hi hello.o code hello hello.hs bburdette at jessie-rpi:~$ ./hello Illegal instruction bburdette at jessie-rpi:~$ ghc -O2 hello.hs bburdette at jessie-rpi:~$ ./hello Illegal instruction bburdette at jessie-rpi:~$ Ok, thx for the help - I'll follow progress on the bug. Let me know if there's anything else you'd like me to try. Ben -------------- next part -------------- An HTML attachment was scrubbed...
URL: From mike at izbicki.me Fri Aug 21 23:57:28 2015 From: mike at izbicki.me (Mike Izbicki) Date: Fri, 21 Aug 2015 16:57:28 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: Ahh... I get it now! Thanks for your patience :) I have a new question: I'm working on supporting literals now. I'm having trouble creating something that looks like `(App (Var F#) (Lit 1.0))` because I don't know how to create a variable that corresponds to the `F#` constructor. The mkWiredInName function looks promising, but overly complicated. Is this the correct function? If so, what do I pass in for the Module, Unique, TyThing, and BuiltInSyntax parameters? On Thu, Aug 20, 2015 at 5:55 PM, Andrew Farmer wrote: > The `buildDictionary` function takes a Var with a dictionary type, and > builds the expression which implements that dictionary. > > For instance, you might create a new Var: > > x :: Num Float > > and pass that to buildDictionary. It will return: > > (x, [NonRec x $fNumFloat]) > > which you could blindly turn into: > > let x = $fNumFloat > in x > > or you could do what buildDictionaryT (a bit further down in the same > module), and spot that case and just return $fNumFloat directly. (The > list can have more than one element in the case that dictionaries are > built in terms of other dictionaries.) > > Thus, you've built a dictionary expression of type Num Float. > > As I understand it, you want to pass something 'log' and get back the > dictionary argument. You'll need to choose a type (like Float), but > once that is done, it should be easy to use buildDictionary to build > the dictionary arguments... just take apart the type of 'log @ Float', > make a new Var with the argument type, build a dictionary expression, > and apply it. > > On Thu, Aug 20, 2015 at 5:05 PM, Mike Izbicki wrote: >> I'm pretty sure the `buildDictionary` function doesn't do what I need. 
>> AFAICT, you pass it a `Var` which contains a dictionary, and it tells >> you what is in that dictionary. What I need is a function with type >> `Var -> Var` where the first `Var` contains a function, and the output >> `Var` is the dictionary. >> >> For example, given the expression: >> >> log (a1+a2) >> >> In core, this might look like: >> >> log @ Float $fFloatingFloat (+ @ Float $fNumFloat a1 a2) >> >> I want to mechanically construct the core code above. When doing so, >> each function within a type class has an extra argument, which is the >> dictionary for that type class. `log` no longer takes one parameter; >> in core, it takes two. I'm having trouble figuring out how to get the >> appropriate dictionary to pass as the "dictionary parameter" to these >> functions. >> >> On Mon, Aug 17, 2015 at 4:21 PM, Andrew Farmer wrote: >>> HERMIT has some code for building dictionaries for a given predicate >>> type (by invoking the typechecker functions that do this): >>> >>> https://github.com/ku-fpg/hermit/blob/master/src/HERMIT/Dictionary/GHC.hs#L223 >>> >>> The functions to run TcM computations inside CoreM are here: >>> >>> https://github.com/ku-fpg/hermit/blob/master/src/HERMIT/Monad.hs#L242 >>> and >>> https://github.com/ku-fpg/hermit/blob/master/src/HERMIT/GHC/Typechecker.hs#L47 >>> >>> Perhaps that will help get you started? >>> >>> I would like to push these interfaces back into the GHC API at some >>> point, but just haven't done it yet. >>> >>> HTH >>> Andrew >>> >>> On Mon, Aug 17, 2015 at 4:12 PM, Mike Izbicki wrote: >>>> I'm not sure how either of those two functions can help me. The >>>> problem is that given an operator (e.g. `+`), I don't know the name of >>>> the dictionary that needs to be passed in as the first argument to the >>>> operator. I could probably hard code these names, but then the plugin >>>> wouldn't be able to work with alternative preludes. >>>> >>>> On Fri, Aug 7, 2015 at 11:20 PM, Edward Z. 
Yang wrote: >>>>> Hello Mike, >>>>> >>>>> Give importDecl from LoadIface a try, or maybe tcLookupGlobal if >>>>> you're in TcM. >>>>> >>>>> Edward >>>>> >>>>> Excerpts from Mike Izbicki's message of 2015-08-07 15:40:30 -0700: >>>>>> I'm trying to write a GHC plugin. The purpose of the plugin is to >>>>>> provide Haskell bindings to Herbie. Herbie >>>>>> (https://github.com/uwplse/herbie) is a program that takes a >>>>>> mathematical statement as input, and gives you a numerically stable >>>>>> formula to compute it as output. The plugin is supposed to automate >>>>>> this process for Haskell programs. >>>>>> >>>>>> I can convert the core expressions into a format for Herbie just fine. >>>>>> Where I'm having trouble is converting the output from Herbie back >>>>>> into core. Given a string that represents a numeric operator (e.g. >>>>>> "log" or "+"), I can get that converted into a Name that matches the >>>>>> Name of the version of that operator in scope at the location. But in >>>>>> order to create an Expr, I need to convert the Name into a Var. All >>>>>> the functions that I can find for this (e.g. mkGlobalVar) also require >>>>>> the type of the variable. But I can't find a way to figure out the >>>>>> Type given a Name. How can I do this? >>>> _______________________________________________ >>>> ghc-devs mailing list >>>> ghc-devs at haskell.org >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>> >> From omeragacan at gmail.com Sat Aug 22 14:35:09 2015 From: omeragacan at gmail.com (=?UTF-8?Q?=C3=96mer_Sinan_A=C4=9Facan?=) Date: Sat, 22 Aug 2015 10:35:09 -0400 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: > I have a new question: I'm working on supporting literals now. I'm having > trouble creating something that looks like `(App (Var F#) (Lit 1.0))` because > I don't know how to create a variable that corresponds to the `F#` > constructor. 
The mkWiredInName function looks promising, but overly > complicated. Is this the correct function? If so, what do I pass in for the > Module, Unique, TyThing, and BuiltInSyntax parameters? mkConApp intDataCon [mkIntLit dynFlags PUT_YOUR_INTEGER_HERE] mkConApp floatDataCon [mkFloatLit dynFlags PUT_YOUR_FLOAT_HERE] Similarly for other literals... From rpglover64 at gmail.com Sat Aug 22 15:50:47 2015 From: rpglover64 at gmail.com (Alex Rozenshteyn) Date: Sat, 22 Aug 2015 11:50:47 -0400 Subject: Request for input on #7253: Top-level bindings in GHCI Message-ID: I'm thinking of working on this ticket ( https://ghc.haskell.org/trac/ghc/ticket/7253), so, as per mpickering's suggestion (https://phabricator.haskell.org/chatlog/channel/3/?at=1353572), I'm emailing the list to solicit input. My first instinct was to treat declarations like "a = 1" in GHCI as equivalent to "let a = 1"; this would be a straightforward matter of parsing. On the other hand, as thoughtpolice comments, let-bound variables are treated subtly differently than top-level bindings, so the proper solution may be more involved. Comments? -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.feuer at gmail.com Sat Aug 22 16:54:47 2015 From: david.feuer at gmail.com (David Feuer) Date: Sat, 22 Aug 2015 12:54:47 -0400 Subject: Access to class defaults and derived instances Message-ID: From time to time, a library lacks an instance for something that I want. For example, I may need to convert data Foo = Bar (Vector Baz) to FishFood, but (to avoid unreasonable dependencies) Vector doesn't have a ToFishFood instance, so I can't just write instance ToFishFood Foo and (using Generic magic) be done with it. Instead, I must write the instance completely by hand, which could be painful. I *could* write an orphan instance, but orphans are evil.
What I wish I could do: newtype Vec a = Vec (Vector a) instance ToFishFood a => (newtype Vec) a where -- if needed toFishFood (v :: Vector a) = ... That is, I want to write a super-secret orphan instance for Vector and transfer it to Vec via GND precisely when it is legal to do so. The secret instance could itself be derived (if the constructors are visible) or could make use of default member definitions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From omeragacan at gmail.com Sat Aug 22 22:26:25 2015 From: omeragacan at gmail.com (=?UTF-8?Q?=C3=96mer_Sinan_A=C4=9Facan?=) Date: Sat, 22 Aug 2015 18:26:25 -0400 Subject: How is this Generic-based instance implementation optimized by GHC? Message-ID: Hi all, I'm very confused by an optimization GHC is doing. I have this code: data Tree a = Leaf a | Branch (Tree a) (Tree a) deriving (Generic, Show, NFData) data Tree1 a = Leaf1 a | Branch1 (Tree1 a) (Tree1 a) deriving (Show) instance NFData a => NFData (Tree1 a) where rnf (Leaf1 a) = rnf a rnf (Branch1 t1 t2) = rnf t1 `seq` rnf t2 When I benchmarked rnf calls I realized that they're too close, and I looked at simplifier outputs. I believe these are relevant parts: Rec { Main.$fNFDataTree_$crnf [Occ=LoopBreaker] :: forall a_ab5v. NFData a_ab5v => Tree a_ab5v -> () Main.$fNFDataTree_$crnf = \ (@ a17_ab5v) ($dNFData_ab5w :: NFData a17_ab5v) (eta_B1 :: Tree a17_ab5v) -> case eta_B1 of _ [Occ=Dead] { Leaf g1_aaHO -> ($dNFData_ab5w `cast` (Control.DeepSeq.NTCo:NFData[0] _N :: NFData a17_ab5v ~R# (a17_ab5v -> ()))) g1_aaHO; Branch g1_aaHP g2_aaHQ -> case Main.$fNFDataTree_$crnf @ a17_ab5v $dNFData_ab5w g1_aaHP of _ [Occ=Dead] { () -> Main.$fNFDataTree_$crnf @ a17_ab5v $dNFData_ab5w g2_aaHQ } } end Rec } Rec { Main.$fNFDataTree1_$crnf [Occ=LoopBreaker] :: forall a_abd4. 
NFData a_abd4 => Tree1 a_abd4 -> () Main.$fNFDataTree1_$crnf = \ (@ a17_abd4) ($dNFData_abd5 :: NFData a17_abd4) (eta_B1 :: Tree1 a17_abd4) -> case eta_B1 of _ [Occ=Dead] { Leaf1 a18_a4tg -> ($dNFData_abd5 `cast` (Control.DeepSeq.NTCo:NFData[0] _N :: NFData a17_abd4 ~R# (a17_abd4 -> ()))) a18_a4tg; Branch1 t1_a4th t2_a4ti -> case Main.$fNFDataTree1_$crnf @ a17_abd4 $dNFData_abd5 t1_a4th of _ [Occ=Dead] { () -> Main.$fNFDataTree1_$crnf @ a17_abd4 $dNFData_abd5 t2_a4ti } } end Rec } First one is generated by GHC and second one is hand-written. If you compare, you'll see that they're identical. This looks like some serious magic, because first one is generated from a default method that uses Generic methods and types. Does anyone know how is that possible? Which optimization passes are involved in this? Thanks. From dreixel at gmail.com Sat Aug 22 23:01:16 2015 From: dreixel at gmail.com (=?UTF-8?Q?Jos=C3=A9_Pedro_Magalh=C3=A3es?=) Date: Sun, 23 Aug 2015 00:01:16 +0100 Subject: How is this Generic-based instance implementation optimized by GHC? In-Reply-To: References: Message-ID: Hi there, GHC can often do a pretty good job at optimising generics. I wrote a paper that looks at that in detail: José Pedro Magalhães. Optimisation of Generic Programs through Inlining. In 24th Symposium on Implementation and Application of Functional Languages (IFL'12), 2013. http://dreixel.net/research/pdf/ogpi.pdf Cheers, Pedro On Sat, Aug 22, 2015 at 11:26 PM, Ömer Sinan Ağacan wrote: > Hi all, > > I'm very confused by an optimization GHC is doing. I have this code: > > > data Tree a = Leaf a | Branch (Tree a) (Tree a) > deriving (Generic, Show, NFData) > > data Tree1 a = Leaf1 a | Branch1 (Tree1 a) (Tree1 a) > deriving (Show) > > instance NFData a => NFData (Tree1 a) where > rnf (Leaf1 a) = rnf a > rnf (Branch1 t1 t2) = rnf t1 `seq` rnf t2 > > > When I benchmarked rnf calls I realized that they're too close, and I > looked at > simplifier outputs.
I believe these are relevant parts: > > Rec { > Main.$fNFDataTree_$crnf [Occ=LoopBreaker] > :: forall a_ab5v. NFData a_ab5v => Tree a_ab5v -> () > Main.$fNFDataTree_$crnf = > \ (@ a17_ab5v) > ($dNFData_ab5w :: NFData a17_ab5v) > (eta_B1 :: Tree a17_ab5v) -> > case eta_B1 of _ [Occ=Dead] { > Leaf g1_aaHO -> > ($dNFData_ab5w > `cast` (Control.DeepSeq.NTCo:NFData[0] _N > :: NFData a17_ab5v ~R# (a17_ab5v -> ()))) > g1_aaHO; > Branch g1_aaHP g2_aaHQ -> > case Main.$fNFDataTree_$crnf @ a17_ab5v $dNFData_ab5w g1_aaHP > of _ [Occ=Dead] { () -> > Main.$fNFDataTree_$crnf @ a17_ab5v $dNFData_ab5w g2_aaHQ > } > } > end Rec } > > Rec { > Main.$fNFDataTree1_$crnf [Occ=LoopBreaker] > :: forall a_abd4. NFData a_abd4 => Tree1 a_abd4 -> () > Main.$fNFDataTree1_$crnf = > \ (@ a17_abd4) > ($dNFData_abd5 :: NFData a17_abd4) > (eta_B1 :: Tree1 a17_abd4) -> > case eta_B1 of _ [Occ=Dead] { > Leaf1 a18_a4tg -> > ($dNFData_abd5 > `cast` (Control.DeepSeq.NTCo:NFData[0] _N > :: NFData a17_abd4 ~R# (a17_abd4 -> ()))) > a18_a4tg; > Branch1 t1_a4th t2_a4ti -> > case Main.$fNFDataTree1_$crnf @ a17_abd4 $dNFData_abd5 t1_a4th > of _ [Occ=Dead] { () -> > Main.$fNFDataTree1_$crnf @ a17_abd4 $dNFData_abd5 t2_a4ti > } > } > end Rec } > > First one is generated by GHC and second one is hand-written. If you > compare, > you'll see that they're identical. This looks like some serious magic, > because > first one is generated from a default method that uses Generic methods and > types. Does anyone know how is that possible? Which optimization passes are > involved in this? > > Thanks. > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From andrew.gibiansky at gmail.com Sat Aug 22 23:13:51 2015 From: andrew.gibiansky at gmail.com (Andrew Gibiansky) Date: Sat, 22 Aug 2015 16:13:51 -0700 Subject: Request for input on #7253: Top-level bindings in GHCI In-Reply-To: References: Message-ID: I would suggest treating "a = 1" as a declaration. This is what IHaskell does, and it seems more intuitive than hackily parsing it into a "let a = 1". The implementation should be easy using runDecls from InteractiveEval and parseDeclaration from Parser.y to do the actual parsing. -- Andrew On Sat, Aug 22, 2015 at 8:50 AM, Alex Rozenshteyn wrote: > I'm thinking of working on this ticket ( > https://ghc.haskell.org/trac/ghc/ticket/7253), so, as per mpickering's > suggestion (https://phabricator.haskell.org/chatlog/channel/3/?at=1353572), > I'm emailing the list to solicit input. > > My first instinct was to treat declarations like "a = 1" in GHCI as > equivalent to "let a = 1"; this would be a straightforward matter of > parsing. On the other hand, as thoughtpolice comments, let-bound variables > are treated subtly differently than top-level bindings, so the proper > solution may be more involved. > > Comments? > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From omeragacan at gmail.com Sun Aug 23 00:23:26 2015 From: omeragacan at gmail.com (=?UTF-8?Q?=C3=96mer_Sinan_A=C4=9Facan?=) Date: Sat, 22 Aug 2015 20:23:26 -0400 Subject: How is this Generic-based instance implementation optimized by GHC? In-Reply-To: References: Message-ID: Awesome, thanks for the pointer, Pedro. 2015-08-22 19:01 GMT-04:00 José Pedro Magalhães : > Hi there, > > GHC can often do a pretty good job at optimising generics. I wrote a paper > that looks at that in detail: > > José Pedro Magalhães. Optimisation of Generic Programs through Inlining.
In > 24th Symposium on Implementation and Application of Functional Languages > (IFL'12), 2013. > http://dreixel.net/research/pdf/ogpi.pdf > > > Cheers, > Pedro > > On Sat, Aug 22, 2015 at 11:26 PM, Ömer Sinan Ağacan > wrote: >> >> Hi all, >> >> I'm very confused by an optimization GHC is doing. I have this code: >> >> >> data Tree a = Leaf a | Branch (Tree a) (Tree a) >> deriving (Generic, Show, NFData) >> >> data Tree1 a = Leaf1 a | Branch1 (Tree1 a) (Tree1 a) >> deriving (Show) >> >> instance NFData a => NFData (Tree1 a) where >> rnf (Leaf1 a) = rnf a >> rnf (Branch1 t1 t2) = rnf t1 `seq` rnf t2 >> >> >> When I benchmarked rnf calls I realized that they're too close, and I >> looked at >> simplifier outputs. I believe these are relevant parts: >> >> Rec { >> Main.$fNFDataTree_$crnf [Occ=LoopBreaker] >> :: forall a_ab5v. NFData a_ab5v => Tree a_ab5v -> () >> Main.$fNFDataTree_$crnf = >> \ (@ a17_ab5v) >> ($dNFData_ab5w :: NFData a17_ab5v) >> (eta_B1 :: Tree a17_ab5v) -> >> case eta_B1 of _ [Occ=Dead] { >> Leaf g1_aaHO -> >> ($dNFData_ab5w >> `cast` (Control.DeepSeq.NTCo:NFData[0] _N >> :: NFData a17_ab5v ~R# (a17_ab5v -> ()))) >> g1_aaHO; >> Branch g1_aaHP g2_aaHQ -> >> case Main.$fNFDataTree_$crnf @ a17_ab5v $dNFData_ab5w g1_aaHP >> of _ [Occ=Dead] { () -> >> Main.$fNFDataTree_$crnf @ a17_ab5v $dNFData_ab5w g2_aaHQ >> } >> } >> end Rec } >> >> Rec { >> Main.$fNFDataTree1_$crnf [Occ=LoopBreaker] >> :: forall a_abd4.
NFData a_abd4 => Tree1 a_abd4 -> () >> Main.$fNFDataTree1_$crnf = >> \ (@ a17_abd4) >> ($dNFData_abd5 :: NFData a17_abd4) >> (eta_B1 :: Tree1 a17_abd4) -> >> case eta_B1 of _ [Occ=Dead] { >> Leaf1 a18_a4tg -> >> ($dNFData_abd5 >> `cast` (Control.DeepSeq.NTCo:NFData[0] _N >> :: NFData a17_abd4 ~R# (a17_abd4 -> ()))) >> a18_a4tg; >> Branch1 t1_a4th t2_a4ti -> >> case Main.$fNFDataTree1_$crnf @ a17_abd4 $dNFData_abd5 t1_a4th >> of _ [Occ=Dead] { () -> >> Main.$fNFDataTree1_$crnf @ a17_abd4 $dNFData_abd5 t2_a4ti >> } >> } >> end Rec } >> >> First one is generated by GHC and second one is hand-written. If you >> compare, >> you'll see that they're identical. This looks like some serious magic, >> because >> first one is generated from a default method that uses Generic methods and >> types. Does anyone know how is that possible? Which optimization passes >> are >> involved in this? >> >> Thanks. >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > From ndmitchell at gmail.com Sun Aug 23 12:43:28 2015 From: ndmitchell at gmail.com (Neil Mitchell) Date: Sun, 23 Aug 2015 13:43:28 +0100 Subject: Using GHC API to compile Haskell file Message-ID: Hi, Is this the right place for GHC API queries? If not, is there anywhere better? I want to compile a Haskell module, much like `ghc --make` or `ghc -c` does. 
The sample code on the Haskell wiki (https://wiki.haskell.org/GHC/As_a_library#A_Simple_Example), StackOverflow (http://stackoverflow.com/a/5631338/160673) and in GHC API slides (http://sneezy.cs.nott.ac.uk/fplunch/weblog/wp-content/uploads/2008/12/ghc-api-slidesnotes.pdf) says: import GHC import GHC.Paths ( libdir ) import DynFlags main = defaultErrorHandler defaultFatalMessager defaultFlushOut $ do runGhc (Just libdir) $ do dflags <- getSessionDynFlags setSessionDynFlags dflags target <- guessTarget "Test.hs" Nothing setTargets [target] load LoadAllTargets However, given a `Test.hs` file with the contents `main = print 1`, I get the error: C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: cannot find -lHSbase-4.7.0.1-ghc7.8.3 C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: cannot find -lHSinteger-gmp-0.5.1.0-ghc7.8.3 C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: cannot find -lHSghc-prim-0.3.1.0-ghc7.8.3 C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: cannot find -lHSrts-ghc7.8.3 C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: cannot find -lffi-6 collect2: ld returned 1 exit status Has the recipe changed? By turning up the verbosity, I was able to compare the command line passed to the linker. The failing GHC API call contains: "-lHSbase-4.7.0.1-ghc7.8.3" "-lHSinteger-gmp-0.5.1.0-ghc7.8.3" "-lHSghc-prim-0.3.1.0-ghc7.8.3" "-lHSrts-ghc7.8.3" "-lffi-6" While the succeeding ghc --make contains: "-lHSbase-4.7.0.1" "-lHSinteger-gmp-0.5.1.0" "-lHSghc-prim-0.3.1.0" "-lHSrts" "-lCffi-6" Should I be getting DynFlags differently to influence those link variables? Thanks, Neil From ezyang at mit.edu Sun Aug 23 23:00:35 2015 From: ezyang at mit.edu (Edward Z. 
Yang) Date: Sun, 23 Aug 2015 16:00:35 -0700 Subject: Using GHC API to compile Haskell file In-Reply-To: References: Message-ID: <1440368677-sup-472@sabre> The problem is that the default code is trying to build a dynamically linked executable, but the Windows distributions don't come with dlls by default. Why doesn't the GHC API code pick this up? Based on snooping ghc/Main.hs, it's probably because you need to call parseDynamicFlags* which will call updateWays which will turn off -dynamic-too if the platform doesn't support it. GHC bug? Absolutely! Please file a ticket. Edward Excerpts from Neil Mitchell's message of 2015-08-23 05:43:28 -0700: > Hi, > > Is this the right place for GHC API queries? If not, is there anywhere better? > > I want to compile a Haskell module, much like `ghc --make` or `ghc -c` > does. The sample code on the Haskell wiki > (https://wiki.haskell.org/GHC/As_a_library#A_Simple_Example), > StackOverflow (http://stackoverflow.com/a/5631338/160673) and in GHC > API slides (http://sneezy.cs.nott.ac.uk/fplunch/weblog/wp-content/uploads/2008/12/ghc-api-slidesnotes.pdf) > says: > > import GHC > import GHC.Paths ( libdir ) > import DynFlags > > main = > defaultErrorHandler defaultFatalMessager defaultFlushOut $ do > runGhc (Just libdir) $ do > dflags <- getSessionDynFlags > setSessionDynFlags dflags > target <- guessTarget "Test.hs" Nothing > setTargets [target] > load LoadAllTargets > > However, given a `Test.hs` file with the contents `main = print 1`, I > get the error: > > C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: > cannot find -lHSbase-4.7.0.1-ghc7.8.3 > C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: > cannot find -lHSinteger-gmp-0.5.1.0-ghc7.8.3 > C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: > cannot find -lHSghc-prim-0.3.1.0-ghc7.8.3 > C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: > cannot find -lHSrts-ghc7.8.3 > C:/Program Files 
(x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: > cannot find -lffi-6 > collect2: ld returned 1 exit status > > Has the recipe changed? > > By turning up the verbosity, I was able to compare the command line > passed to the linker. The failing GHC API call contains: > > "-lHSbase-4.7.0.1-ghc7.8.3" "-lHSinteger-gmp-0.5.1.0-ghc7.8.3" > "-lHSghc-prim-0.3.1.0-ghc7.8.3" "-lHSrts-ghc7.8.3" "-lffi-6" > > While the succeeding ghc --make contains: > > "-lHSbase-4.7.0.1" "-lHSinteger-gmp-0.5.1.0" > "-lHSghc-prim-0.3.1.0" "-lHSrts" "-lCffi-6" > > Should I be getting DynFlags differently to influence those link variables? > > Thanks, Neil From jhendrix at galois.com Mon Aug 24 06:23:40 2015 From: jhendrix at galois.com (Joe Hendrix) Date: Sun, 23 Aug 2015 23:23:40 -0700 Subject: Releasing resources associated with a RTS worker thread? Message-ID: <8ECB280E-1853-428A-A71F-2B97BE7A18DF@galois.com> I am working on FFI bindings to C++ code that associates several memory pools to each thread using the library. When a thread is done using the library, it can call a C function to release the memory pool objects. In the Haskell bindings, I'd like to be able to attach a finalizer that calls this cleanup code whenever an RTS OS thread is shut down. It appeared that the main place that happens is here: https://github.com/ghc/ghc/blob/master/rts/Capability.c#L562-L577 I did not see a way to register a finalizer that is run before the call to shutdownThread. Could this be something worth adding to the RTS? It seems like the most straightforward way would be to add a finalizer list to each task object, and expose a C function for adding callbacks to it. As a fallback, it should be possible to modify the memory pools in the C++ code so that they can be disabled. According to the author of the C++ code, this should have about a 10% slowdown on execution, so it would be nice if I could keep them enabled without a risk of memory leaks.
Regards, Joe From ndmitchell at gmail.com Mon Aug 24 07:42:48 2015 From: ndmitchell at gmail.com (Neil Mitchell) Date: Mon, 24 Aug 2015 08:42:48 +0100 Subject: Using GHC API to compile Haskell file In-Reply-To: <1440368677-sup-472@sabre> References: <1440368677-sup-472@sabre> Message-ID: Thanks Edward, that fixed the issue with GHC 7.8.3. While trying to replicate with 7.10.2 to submit a bug report, I got a different error, even with your fix included: C:\Users\NDMIT_~1\AppData\Local\Temp\ghc2428_1\ghc_4.o:ghc_3.c:(.text+0x55): undefined reference to `ZCMain_main_closure' Doing another diff of the command lines, I see ghc --make includes "Test.o" on the Link line, but the API doesn't. Thanks, Neil On Mon, Aug 24, 2015 at 12:00 AM, Edward Z. Yang wrote: > The problem is that the default code is trying to build a dynamically > linked executable, but the Windows distributions don't come with dlls > by default. > > Why doesn't the GHC API code pick this up? Based on snooping > ghc/Main.hs, it's probably because you need to call parseDynamicFlags* > which will call updateWays which will turn off -dynamic-too if the > platform doesn't support it. > > GHC bug? Absolutely! Please file a ticket. > > Edward > > Excerpts from Neil Mitchell's message of 2015-08-23 05:43:28 -0700: >> Hi, >> >> Is this the right place for GHC API queries? If not, is there anywhere better? >> >> I want to compile a Haskell module, much like `ghc --make` or `ghc -c` >> does. 
The sample code on the Haskell wiki >> (https://wiki.haskell.org/GHC/As_a_library#A_Simple_Example), >> StackOverflow (http://stackoverflow.com/a/5631338/160673) and in GHC >> API slides (http://sneezy.cs.nott.ac.uk/fplunch/weblog/wp-content/uploads/2008/12/ghc-api-slidesnotes.pdf) >> says: >> >> import GHC >> import GHC.Paths ( libdir ) >> import DynFlags >> >> main = >> defaultErrorHandler defaultFatalMessager defaultFlushOut $ do >> runGhc (Just libdir) $ do >> dflags <- getSessionDynFlags >> setSessionDynFlags dflags >> target <- guessTarget "Test.hs" Nothing >> setTargets [target] >> load LoadAllTargets >> >> However, given a `Test.hs` file with the contents `main = print 1`, I >> get the error: >> >> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >> cannot find -lHSbase-4.7.0.1-ghc7.8.3 >> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >> cannot find -lHSinteger-gmp-0.5.1.0-ghc7.8.3 >> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >> cannot find -lHSghc-prim-0.3.1.0-ghc7.8.3 >> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >> cannot find -lHSrts-ghc7.8.3 >> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >> cannot find -lffi-6 >> collect2: ld returned 1 exit status >> >> Has the recipe changed? >> >> By turning up the verbosity, I was able to compare the command line >> passed to the linker. The failing GHC API call contains: >> >> "-lHSbase-4.7.0.1-ghc7.8.3" "-lHSinteger-gmp-0.5.1.0-ghc7.8.3" >> "-lHSghc-prim-0.3.1.0-ghc7.8.3" "-lHSrts-ghc7.8.3" "-lffi-6" >> >> While the succeeding ghc --make contains: >> >> "-lHSbase-4.7.0.1" "-lHSinteger-gmp-0.5.1.0" >> "-lHSghc-prim-0.3.1.0" "-lHSrts" "-lCffi-6" >> >> Should I be getting DynFlags differently to influence those link variables? 
>> >> Thanks, Neil From mle+hs at mega-nerd.com Mon Aug 24 11:54:46 2015 From: mle+hs at mega-nerd.com (Erik de Castro Lopo) Date: Mon, 24 Aug 2015 21:54:46 +1000 Subject: Two step allocator for 64-bit systems Message-ID: <20150824215446.b989ae5b0765bc58fbec83e3@mega-nerd.com> Dear Giovanni, Your commit: commit 0d1a8d09f452977aadef7897aa12a8d41c7a4af0 Author: Giovanni Campagna Date: Fri Jul 17 11:55:49 2015 +0100 Two step allocator for 64-bit systems fails for me on Arm64 (also known as AArch64) Linux. I was wondering if you might be able to look at this ticket I've raised and hopefully shed some light on this issue: https://ghc.haskell.org/trac/ghc/ticket/10682 Cheers, Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/ From eir at cis.upenn.edu Mon Aug 24 12:40:52 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Mon, 24 Aug 2015 08:40:52 -0400 Subject: Access to class defaults and derived instances In-Reply-To: References: Message-ID: <5684F9AD-11F6-4170-A116-28925AAEDFED@cis.upenn.edu> I have a hard time fully understanding this request without more context. But I do think I understand the last paragraph. And it seems bound to create class incoherence. What if someone else *does* write that orphan instance you're avoiding writing? Richard On Aug 22, 2015, at 12:54 PM, David Feuer wrote: > From time to time, a library lacks an instance for something that I want. For example, I may need to convert > > data Foo = Bar (Vector Baz) > > to FishFood, but (to avoid unreasonable dependencies) Vector doesn't have a ToFishFood instance, so I can't just write > > instance ToFishFood Foo > > and (using Generic magic) be done with it. Instead, I must write the instance completely by hand, which could be painful. I *could* write an orphan instance, but orphans are evil. 
> > What I wish I could do: > > newtype Vec a = Vec (Vector a) > > instance ToFishFood a => (newtype Vec) a where > -- if needed > toFishFood (v :: Vector a) = ... > > That is, I want to write a super-secret orphan instance for Vector and transfer it to Vec via GND precisely when it is legal to do so. The secret instance could itself be derived (if the constructors are visible) or could make use of default member definitions. > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.feuer at gmail.com Mon Aug 24 12:52:31 2015 From: david.feuer at gmail.com (David Feuer) Date: Mon, 24 Aug 2015 08:52:31 -0400 Subject: Access to class defaults and derived instances In-Reply-To: <5684F9AD-11F6-4170-A116-28925AAEDFED@cis.upenn.edu> References: <5684F9AD-11F6-4170-A116-28925AAEDFED@cis.upenn.edu> Message-ID: I'm not sure if it really could work out at all. The concept is that I want the newtype wrapper to get the class defaults the wrapped type would have gotten (whether the wrapped type is actually a class instance or not). On Aug 24, 2015 8:39 AM, "Richard Eisenberg" wrote: > I have a hard time fully understanding this request without more context. > But I do think I understand the last paragraph. And it seems bound to > create class incoherence. What if someone else *does* write that orphan > instance you're avoiding writing? > > Richard > > On Aug 22, 2015, at 12:54 PM, David Feuer wrote: > > From time to time, a library lacks an instance for something that I want. > For example, I may need to convert > > data Foo = Bar (Vector Baz) > > to FishFood, but (to avoid unreasonable dependencies) Vector doesn't have > a ToFishFood instance, so I can't just write > > instance ToFishFood Foo > > and (using Generic magic) be done with it. 
Instead, I must write the > instance completely by hand, which could be painful. I *could* write an > orphan instance, but orphans are evil. > > What I wish I could do: > > newtype Vec a = Vec (Vector a) > > instance ToFishFood a => (newtype Vec) a where > -- if needed > toFishFood (v :: Vector a) = ... > > That is, I want to write a super-secret orphan instance for Vector and > transfer it to Vec via GND precisely when it is legal to do so. The secret > instance could itself be derived (if the constructors are visible) or could > make use of default member definitions. > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eir at cis.upenn.edu Mon Aug 24 13:06:26 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Mon, 24 Aug 2015 09:06:26 -0400 Subject: Request for input on #7253: Top-level bindings in GHCI In-Reply-To: References: Message-ID: <451FB135-A2D0-4937-88B1-7BE49B6EC732@cis.upenn.edu> I don't think there is a user-visible difference between treating "a = 1" as a declaration vs as a let. One might be easier to implement. My guess is that Austin's thought in that chat log is around MonoLocalBinds. MonoLocalBinds (implied by GADTs and TypeFamilies) means that some let declarations are not generalized. In particular, let declarations that are manifestly *not* top-level. Like this: foo x = let y = x in y That `y` is manifestly not top-level because its RHS mentions a local variable. So, `y` is not generalized if MonoLocalBinds is in effect. But, in GHCi, this matters not. Anything the user writes in a top-level variable assignment can only possibly refer to top-level things, never to local things (because there are no local things). So MonoLocalBinds will not trigger, and treating an assignment as either a declaration or a "let" should have the same meaning. 
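A minimal sketch of the generalisation rule described above, assuming GHC with MonoLocalBinds switched on. The names `closedBinding`, `openBinding`, `ident` and `addX` are illustrative assumptions and do not come from the thread or from GHC itself. A local binding that mentions nothing from the enclosing function is still generalised, while one that mentions a local variable is not.

```haskell
{-# LANGUAGE MonoLocalBinds #-}
module Main where

-- `ident` is closed: its right-hand side mentions no variable bound by
-- the enclosing definition, so it is still generalised and can be used
-- at two different types.
closedBinding :: (Int, Bool)
closedBinding =
  let ident v = v
  in (ident 1, ident True)

-- `addX` mentions the local variable `x`, so under MonoLocalBinds it is
-- not generalised; here it simply gets the monomorphic type Int -> Int.
-- (By the same rule, with `x :: a`, something like
--  `let pair v = (x, v) in (pair 1, pair True)` would be expected to be
--  rejected, because `pair` could not be used at two types.)
openBinding :: Int -> (Int, Int)
openBinding x =
  let addX v = v + x
  in (addX 1, addX 2)

main :: IO ()
main = do
  print closedBinding      -- prints (1,True)
  print (openBinding 10)   -- prints (11,12)
```

At the GHCi prompt every binding is of the first, closed kind, which is why the choice between a declaration and a "let" should not be observable.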
Richard On Aug 22, 2015, at 11:50 AM, Alex Rozenshteyn wrote: > I'm thinking of working on this ticket (https://ghc.haskell.org/trac/ghc/ticket/7253), so, as per mpickering's suggestion (https://phabricator.haskell.org/chatlog/channel/3/?at=1353572), I'm emailing the list to solicit input. > > My first instinct was to treat declarations like "a = 1" in GHCI as equivalent to "let a = 1"; this would be a straightforward matter of parsing. On the other hand, as thoughtpolice comments, let-bound variables are treated subtly differently than top-level bindings, so the proper solution may be more involved. > > Comments? From karel.gardas at centrum.cz Mon Aug 24 14:43:06 2015 From: karel.gardas at centrum.cz (Karel Gardas) Date: Mon, 24 Aug 2015 16:43:06 +0200 Subject: Two step allocator for 64-bit systems In-Reply-To: <20150824215446.b989ae5b0765bc58fbec83e3@mega-nerd.com> References: <20150824215446.b989ae5b0765bc58fbec83e3@mega-nerd.com> Message-ID: <55DB2D7A.8040503@centrum.cz> Dear Giovanni, thanks to Erik for pointing this out. A very similar issue, a 1TB allocation on dll-split, is also happening on amd64-solaris11. I can't yet say that this is the same issue, as I have not fully debugged it -- will do as time permits, but at least you know that something similarly suspicious also happens on yet another platform... Thanks, Karel On 08/24/15 01:54 PM, Erik de Castro Lopo wrote: > Dear Giovanni, > > Your commit: > > commit 0d1a8d09f452977aadef7897aa12a8d41c7a4af0 > Author: Giovanni Campagna > Date: Fri Jul 17 11:55:49 2015 +0100 > > Two step allocator for 64-bit systems > > fails for me on Arm64 (also known as AArch64) Linux.
> > I was wondering if you might be able to look at this ticket I've > raised and hopefully shed some light on this issue: > > https://ghc.haskell.org/trac/ghc/ticket/10682 > > Cheers, > Erik > From mike at izbicki.me Mon Aug 24 21:42:47 2015 From: mike at izbicki.me (Mike Izbicki) Date: Mon, 24 Aug 2015 14:42:47 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: Thanks! Now one more question :) The code Andrew Farmer showed me for getting dictionaries works great when I have a concrete type (e.g. Float) I want a dictionary for. But now I'm working on polymorphic code and running into a problem. Let's say I'm running the plugin on a function with signature `Floating a => a -> a`; then the plugin has access to the `Floating` dictionary for the type. But if I want to add two numbers together, I need the `Num` dictionary. I know I should have access to `Num` since it's a superclass of `Floating`. How can I get access to these superclass dictionaries? On Sat, Aug 22, 2015 at 7:35 AM, Ömer Sinan Ağacan wrote: >> I have a new question: I'm working on supporting literals now. I'm having >> trouble creating something that looks like `(App (Var F#) (Lit 1.0))` because >> I don't know how to create a variable that corresponds to the `F#` >> constructor. The mkWiredInName function looks promising, but overly >> complicated. Is this the correct function? If so, what do I pass in for the >> Module, Unique, TyThing, and BuiltInSyntax parameters? > > mkConApp intDataCon [mkIntLit dynFlags PUT_YOUR_INTEGER_HERE] > mkConApp floatDataCon [mkFloatLit dynFlags PUT_YOUR_FLOAT_HERE] > > Similarly for other literals...
From afarmer at ittc.ku.edu Mon Aug 24 22:06:17 2015 From: afarmer at ittc.ku.edu (Andrew Farmer) Date: Mon, 24 Aug 2015 15:06:17 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: I'm not positive, but I believe each dictionary has a field for its superclass dictionary. So if you have a dictionary for `Floating Float`, one of the fields will be the `Num Float` dictionary. How to get the projector function for the field... I'm not sure. But perhaps you can find it by type? On Mon, Aug 24, 2015 at 2:42 PM, Mike Izbicki wrote: > Thanks! Now one more question :) > > The code Andrew Farmer showed me for getting dictionaries works great > when I have a concrete type (e.g. Float) I want a dictionary for. But > now I'm working on polymorphic code and running into a problem. > > Let's say I'm running the plugin on a function with signature `Floating > a => a -> a`, then the plugin has access to the `Floating` dictionary > for the type. But if I want to add two numbers together, I need the > `Num` dictionary. I know I should have access to `Num` since it's a > superclass of `Floating`. How can I get access to these superclass > dictionaries? > > On Sat, Aug 22, 2015 at 7:35 AM, Ömer Sinan Ağacan wrote: >>> I have a new question: I'm working on supporting literals now. I'm having >>> trouble creating something that looks like `(App (Var F#) (Lit 1.0))` because >>> I don't know how to create a variable that corresponds to the `F#` >>> constructor. The mkWiredInName function looks promising, but overly >>> complicated. Is this the correct function? If so, what do I pass in for the >>> Module, Unique, TyThing, and BuiltInSyntax parameters? >> >> mkConApp intDataCon [mkIntLit dynFlags PUT_YOUR_INTEGER_HERE] >> mkConApp floatDataCon [mkFloatLit dynFlags PUT_YOUR_FLOAT_HERE] >> >> Similarly for other literals...
From omeragacan at gmail.com Tue Aug 25 01:59:40 2015 From: omeragacan at gmail.com (Ömer Sinan Ağacan) Date: Mon, 24 Aug 2015 21:59:40 -0400 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: > Let's say I'm running the plugin on a function with signature `Floating a => a > -> a`, then the plugin has access to the `Floating` dictionary for the type. > But if I want to add two numbers together, I need the `Num` dictionary. I > know I should have access to `Num` since it's a superclass of `Floating`. > How can I get access to these superclass dictionaries? I don't have working code for this, but this should get you started: let ord_dictionary :: Id = ... ord_class :: Class = ... in mkApps (Var (head (classSCSels ord_class))) [Var ord_dictionary] I don't know how to get the Class for Ord. I use `head` here because Ord has only one superclass, so `classSCSels` should return a single Id. Then I apply this selector to ord_dictionary, and it should return the dictionary for Eq. I assumed you already have ord_dictionary; it should already be passed to your function if its signature has an `(Ord a) =>` constraint. Now I realize you asked about getting Num from Floating. I think you should follow a similar path, except that you need two applications: first to get Fractional from Floating, and second to get Num from Fractional: mkApps (Var (head (classSCSels fractional_class))) [mkApps (Var (head (classSCSels floating_class))) [Var floating_dictionary]] The return value should be a Num dictionary.
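The two-step projection described above can be modelled in plain Haskell, without the GHC API. This is only a sketch under stated assumptions: the record types `NumDict`, `FractionalDict` and `FloatingDict`, their fields, and `numFromFloating` are hypothetical stand-ins for class dictionaries and superclass selectors, not GHC's internal representation.

```haskell
module Main where

-- Hypothetical model of a Num dictionary: just the one method we need.
data NumDict a = NumDict
  { plus :: a -> a -> a }

-- Hypothetical Fractional dictionary: its first field is the superclass
-- (Num) dictionary, followed by a method of its own.
data FractionalDict a = FractionalDict
  { superNum :: NumDict a
  , divide   :: a -> a -> a }

-- Hypothetical Floating dictionary: its first field is the superclass
-- (Fractional) dictionary.
data FloatingDict a = FloatingDict
  { superFractional :: FractionalDict a
  , squareRoot      :: a -> a }

-- Two selector applications: Floating -> Fractional -> Num.
numFromFloating :: FloatingDict a -> NumDict a
numFromFloating = superNum . superFractional

-- A concrete dictionary value for Double, built from the Prelude methods.
doubleFloating :: FloatingDict Double
doubleFloating = FloatingDict
  { superFractional = FractionalDict
      { superNum = NumDict { plus = (+) }
      , divide   = (/) }
  , squareRoot = sqrt }

-- Model of `test1 x1 = x1 + x1` when only the Floating dictionary is
-- available: fetch the Num dictionary first, then use its `plus`.
addViaFloating :: FloatingDict a -> a -> a
addViaFloating dict x = plus (numFromFloating dict) x x

main :: IO ()
main = print (addViaFloating doubleFloating 1.5)  -- prints 3.0
```

The nesting of `superNum . superFractional` mirrors the nested `mkApps` calls: the inner application yields the Fractional dictionary and the outer one yields the Num dictionary.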
From omeragacan at gmail.com Tue Aug 25 02:10:51 2015 From: omeragacan at gmail.com (Ömer Sinan Ağacan) Date: Mon, 24 Aug 2015 22:10:51 -0400 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: Mike, here's a piece of code that may be helpful to you: https://github.com/osa1/sc-plugin/blob/master/src/Supercompilation/Show.hs Copy this module into your plugin; it doesn't have any dependencies other than ghc itself. When your plugin is initialized, the first thing to do is update `dynFlags_ref` with your DynFlags. Then use the Show instance to print the AST directly. It's a horrible hack, but very useful for learning purposes. In fact, I don't know how else we can learn what Core is generated for a given piece of code and reverse-engineer it to figure out the details. Hope it helps. 2015-08-24 21:59 GMT-04:00 Ömer Sinan Ağacan : >> Let's say I'm running the plugin on a function with signature `Floating a => a >> -> a`, then the plugin has access to the `Floating` dictionary for the type. >> But if I want to add two numbers together, I need the `Num` dictionary. I >> know I should have access to `Num` since it's a superclass of `Floating`. >> How can I get access to these superclass dictionaries? > > I don't have a working code for this but this should get you started: > > let ord_dictionary :: Id = ... > ord_class :: Class = ... > in > mkApps (Var (head (classSCSels ord_class))) [Var ord_dictionary] > > I don't know how to get Class for Ord. I do `head` here because in the case of > Ord we only have one superclass so `classSCSels` should have one Id. Then I > apply ord_dictionary to this selector and it should return dictionary for Eq. > > I assumed you already have ord_dictionary, it should be passed to your function > already if you had `(Ord a) => ` in your function. > > > Now I realized you asked for getting Num from Floating.
I think you should > follow a similar path except you need two applications, first to get Fractional > from Floating and second to get Num from Fractional: > > mkApps (Var (head (classSCSels fractional_class))) > [mkApps (Var (head (classSCSels floating_class))) > [Var floating_dictionary]] > > Return value should be a Num dictionary. From johan.tibell at gmail.com Tue Aug 25 09:41:54 2015 From: johan.tibell at gmail.com (Johan Tibell) Date: Tue, 25 Aug 2015 11:41:54 +0200 Subject: OVERLAPPABLE/OVERLAPPING/OVERLAPS pragmas are confusing Message-ID: It was brought to my attention that cassava, my library, uses OverlappingInstances, which is now deprecated. There's a suggested fix here: https://github.com/tibbe/cassava/pull/95. The fix seems correct but, as Mikhail points out, makes some client code no longer compile (due to a now missing OVERLAPPABLE pragma). What's the right way to migrate code? Just switching my library to the new pragmas breaks code, so that doesn't seem very attractive. Do clients have to migrate before the libraries they use? -- Johan From simonpj at microsoft.com Tue Aug 25 11:25:18 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Tue, 25 Aug 2015 11:25:18 +0000 Subject: OVERLAPPABLE/OVERLAPPING/OVERLAPS pragmas are confusing In-Reply-To: References: Message-ID: What's the right way to migrate code? Just switching my library to the new pragmas breaks code, so that doesn't seem very attractive. I don't understand. Can you describe the problem more precisely, perhaps with an example? S From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Johan Tibell Sent: 25 August 2015 10:42 To: ghc-devs at haskell.org Subject: OVERLAPPABLE/OVERLAPPING/OVERLAPS pragmas are confusing It was brought to my attention that cassava, my library, uses OverlappingInstances, which is now deprecated.
There's a suggested fix here: https://github.com/tibbe/cassava/pull/95. The fix seems correct but, as Mikhail points out, makes some client code no longer compile (due to a now missing OVERLAPPABLE pragma). What's the right way to migrate code? Just switching my library to the new pragmas breaks code, so that doesn't seem very attractive. Do clients have to migrate before the libraries they use? -- Johan From johan.tibell at gmail.com Tue Aug 25 12:18:39 2015 From: johan.tibell at gmail.com (Johan Tibell) Date: Tue, 25 Aug 2015 14:18:39 +0200 Subject: OVERLAPPABLE/OVERLAPPING/OVERLAPS pragmas are confusing In-Reply-To: References: Message-ID: The proposed change to my library is here: https://github.com/tibbe/cassava/pull/95/files We remove the OverlappingInstances pragma and instead add an OVERLAPPABLE pragma like so: instance {-# OVERLAPPABLE #-} FromField a => FromField (Maybe a) where This causes clients of the library that previously compiled (e.g. the music-parts package) to no longer compile, due to a now lacking OVERLAPPING pragma in their code. The issue here is that I'm trying to do the right thing (move to the new pragmas), but that causes clients to fail to compile. My question is: how do we avoid that? Would it be OK if they added the OVERLAPPING pragma first and then I change my library to use OVERLAPPABLE? On Tue, Aug 25, 2015 at 1:25 PM, Simon Peyton Jones wrote: > What's the right way to migrate code? Just switching my library to the new > pragmas breaks code, so that doesn't seem very attractive. > > > > I don't understand. Can you describe the problem more precisely, perhaps > with an example?
> > > > S > > > > > > *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Johan > Tibell > *Sent:* 25 August 2015 10:42 > *To:* ghc-devs at haskell.org > *Subject:* OVERLAPPABLE/OVERLAPPING/OVERLAPS pragmas are confusing > > > > It was brought to my attention that cassava, my library, > uses OverlappingInstances, which is now deprecated. There's a suggested fix > here: https://github.com/tibbe/cassava/pull/95. > > > > The fix seems correct but, as Mikhail points out, makes some client code > no longer compile (due to a now missing OVERLAPPABLE pragma). > > > > What's the right way to migrate code? Just switching my library to the new > pragmas breaks code, so that doesn't seem very attractive. Do clients have > to migrate before the libraries they use? > > > > -- Johan > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Tue Aug 25 12:30:01 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Tue, 25 Aug 2015 12:30:01 +0000 Subject: OVERLAPPABLE/OVERLAPPING/OVERLAPS pragmas are confusing In-Reply-To: References: Message-ID: Would it be OK if they added the OVERLAPPING pragma first and then I change my library to use OVERLAPPABLE? I think so, yes. Does that not work? Is it bad? Do you think the semantics is wrong? Simon From: Johan Tibell [mailto:johan.tibell at gmail.com] Sent: 25 August 2015 13:19 To: Simon Peyton Jones Cc: ghc-devs at haskell.org Subject: Re: OVERLAPPABLE/OVERLAPPING/OVERLAPS pragmas are confusing The proposed change to my library is here: https://github.com/tibbe/cassava/pull/95/files We remove the OverlappingInstances pragma and instead add an OVERLAPPABLE pragma like so: instance {-# OVERLAPPABLE #-} FromField a => FromField (Maybe a) where This causes clients of the library that previously compiled (e.g. the music-parts package) to no longer compile, due to a now lacking OVERLAPPING pragma in their code. 
The issue here is that I'm trying to do the right thing (move to the new pragmas), but that causes clients to fail to compile. My question is: how do we avoid that? Would it be OK if they added the OVERLAPPING pragma first and then I change my library to use OVERLAPPABLE? On Tue, Aug 25, 2015 at 1:25 PM, Simon Peyton Jones wrote: What's the right way to migrate code? Just switching my library to the new pragmas breaks code, so that doesn't seem very attractive. I don't understand. Can you describe the problem more precisely, perhaps with an example? S From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Johan Tibell Sent: 25 August 2015 10:42 To: ghc-devs at haskell.org Subject: OVERLAPPABLE/OVERLAPPING/OVERLAPS pragmas are confusing It was brought to my attention that cassava, my library, uses OverlappingInstances, which is now deprecated. There's a suggested fix here: https://github.com/tibbe/cassava/pull/95. The fix seems correct but, as Mikhail points out, makes some client code no longer compile (due to a now missing OVERLAPPABLE pragma). What's the right way to migrate code? Just switching my library to the new pragmas breaks code, so that doesn't seem very attractive. Do clients have to migrate before the libraries they use? -- Johan From the.dead.shall.rise at gmail.com Tue Aug 25 15:46:33 2015 From: the.dead.shall.rise at gmail.com (Mikhail Glushenkov) Date: Tue, 25 Aug 2015 17:46:33 +0200 Subject: OVERLAPPABLE/OVERLAPPING/OVERLAPS pragmas are confusing In-Reply-To: References: Message-ID: Hi, On 25 August 2015 at 14:18, Johan Tibell wrote: > The proposed change to my library is here: > https://github.com/tibbe/cassava/pull/95/files > > We remove the OverlappingInstances pragma and instead add an OVERLAPPABLE > pragma like so: > > instance {-# OVERLAPPABLE #-} FromField a => FromField (Maybe a) where > > This causes clients of the library that previously compiled (e.g.
the > music-parts package) to no longer compile, due to a now lacking OVERLAPPING > pragma in their code. No, it's not quite like that. Client code can start to break when {-# LANGUAGE OverlappingInstances #-} is removed, as happened with the music-parts package. Adding an OVERLAPPABLE pragma to cassava's code made that error go away. Client code can usually work around the problem of missing OVERLAPPABLE pragmas in the library by adding OVERLAPPING pragmas to their instances. The reason I suggested bumping cassava's version is that there may be some places in cassava that still need new pragmas that I've overlooked. If GHC had an option for detecting overlapping instances at the definition site, that'd help, I think, since then it'd be easier to find instances that need new pragmas. From iavor.diatchki at gmail.com Tue Aug 25 16:15:31 2015 From: iavor.diatchki at gmail.com (Iavor Diatchki) Date: Tue, 25 Aug 2015 09:15:31 -0700 Subject: OVERLAPPABLE/OVERLAPPING/OVERLAPS pragmas are confusing In-Reply-To: References: Message-ID: Johan, to summarize: 1. If an instance is marked as OVERLAPPABLE, then clients may overlap it without having any pragmas. 2. If an instance is NOT marked OVERLAPPABLE, then clients may still overlap it, but then they have to use an explicit OVERLAPPING pragma. So you should either add OVERLAPPABLE to your library, and then clients don't need to do anything, or you should remove it, and require that clients add OVERLAPPING. Note that using this mechanism across modules can be quite error prone. For example, you have to be very careful not to use an OVERLAPPABLE instance in your library, because if you do, parts of the program might end up using one instance, and other parts may end up using another instance---GHC has no way of knowing about overlapping instances in client libraries, so it will simply use the best possible *local* instance.
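A minimal single-module sketch of point 1 above, assuming GHC 7.10 or later (where the per-instance pragmas are available). The class `Describe` and its instances are illustrative assumptions and are not taken from cassava or music-parts.

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

-- Hypothetical class standing in for something like cassava's FromField.
class Describe a where
  describe :: a -> String

-- The general, "library side" instance, marked OVERLAPPABLE so that more
-- specific instances may overlap it without any pragma of their own.
instance {-# OVERLAPPABLE #-} Describe [a] where
  describe _ = "some list"

-- The more specific, "client side" instance. It carries no pragma; had
-- the instance above not been OVERLAPPABLE, this one would have to be
-- written as `instance {-# OVERLAPPING #-} Describe [Char]` (point 2).
instance Describe [Char] where
  describe s = "string: " ++ s

main :: IO ()
main = do
  putStrLn (describe [1 :: Int, 2])  -- prints: some list
  putStrLn (describe "hi")           -- prints: string: hi
```

Because both instances live in one module here, the cross-module hazard mentioned above does not arise; the sketch only shows which pragma has to appear on which side.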
-Iavor On Tue, Aug 25, 2015 at 8:46 AM, Mikhail Glushenkov < the.dead.shall.rise at gmail.com> wrote: > Hi, > > On 25 August 2015 at 14:18, Johan Tibell wrote: > > The proposed change to my library is here: > > https://github.com/tibbe/cassava/pull/95/files > > > > We remove the OverlappingInstances pragma and instead add an OVERLAPPABLE > > pragma like so: > > > > instance {-# OVERLAPPABLE #-} FromField a => FromField (Maybe a) > where > > > > This causes clients of the library that previously compiled (e.g. the > > music-parts package) to no longer compile, due to a now lacking > OVERLAPPING > > pragma in their code. > > No, it's not quite like that. Client code can start to break when {-# > LANGUAGE OverlappingInstances #-} is removed, as happened with the > music-parts package. Adding an OVERLAPPABLE pragma to cassava's code > made that error go away. > > Client code can usually work around the problem of missing > OVERLAPPABLE pragmas in the library by adding OVERLAPPING pragmas to > their instances. The reason I suggested bumping cassava's version is > that there may be some places in cassava that still need new pragmas > that I've overlooked. > > If GHC had an option for detecting overlapping instances at definition > site, that'd help, I think, since then it'd be easier to find > instances that need new pragmas. > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndmitchell at gmail.com Tue Aug 25 20:27:35 2015 From: ndmitchell at gmail.com (Neil Mitchell) Date: Tue, 25 Aug 2015 21:27:35 +0100 Subject: Using GHC API to compile Haskell file In-Reply-To: References: <1440368677-sup-472@sabre> Message-ID: So after all the linking bugs, I decided to give up on the linking step (for now), as it's not essential to my use case. 
I can compile two files with: forM_ ["Test1.hs","Test.hs"] $ \file -> do runGhc (Just libdir) $ do liftIO $ print file dflags <- getSessionDynFlags (dflags, _, _) <- parseDynamicFlags dflags [] setSessionDynFlags dflags{ghcMode=OneShot, hscTarget = HscAsm, ghcLink=NoLink} setTargets [Target (TargetFile file Nothing) True Nothing] load LoadAllTargets And assuming Test.hs imports Test1.hs then everything is fine. However, if I switch the runGhc and forM lines, so I compile both in the same Ghc session, then I get an error about "Can't find interface-file declaration for variable test1" when compiling Test.hs, and when compiling Test.hs it does a fresh dependency check on Test1 and would compile it if I hadn't done so already. The motivation for using a single Ghc session is that I'd like to share things like loading the Prelude and not have to repeat that work. Is there any way of doing that with two compiles which other than sharing a cache I'd like to be separate? Thanks, Neil On Mon, Aug 24, 2015 at 8:42 AM, Neil Mitchell wrote: > Thanks Edward, that fixed the issue with GHC 7.8.3. While trying to > replicate with 7.10.2 to submit a bug report, I got a different error, > even with your fix included: > > C:\Users\NDMIT_~1\AppData\Local\Temp\ghc2428_1\ghc_4.o:ghc_3.c:(.text+0x55): > undefined reference to `ZCMain_main_closure' > > Doing another diff of the command lines, I see ghc --make includes > "Test.o" on the Link line, but the API doesn't. > > Thanks, Neil > > > On Mon, Aug 24, 2015 at 12:00 AM, Edward Z. Yang wrote: >> The problem is that the default code is trying to build a dynamically >> linked executable, but the Windows distributions don't come with dlls >> by default. >> >> Why doesn't the GHC API code pick this up? Based on snooping >> ghc/Main.hs, it's probably because you need to call parseDynamicFlags* >> which will call updateWays which will turn off -dynamic-too if the >> platform doesn't support it. >> >> GHC bug? Absolutely! 
Please file a ticket. >> >> Edward >> >> Excerpts from Neil Mitchell's message of 2015-08-23 05:43:28 -0700: >>> Hi, >>> >>> Is this the right place for GHC API queries? If not, is there anywhere better? >>> >>> I want to compile a Haskell module, much like `ghc --make` or `ghc -c` >>> does. The sample code on the Haskell wiki >>> (https://wiki.haskell.org/GHC/As_a_library#A_Simple_Example), >>> StackOverflow (http://stackoverflow.com/a/5631338/160673) and in GHC >>> API slides (http://sneezy.cs.nott.ac.uk/fplunch/weblog/wp-content/uploads/2008/12/ghc-api-slidesnotes.pdf) >>> says: >>> >>> import GHC >>> import GHC.Paths ( libdir ) >>> import DynFlags >>> >>> main = >>> defaultErrorHandler defaultFatalMessager defaultFlushOut $ do >>> runGhc (Just libdir) $ do >>> dflags <- getSessionDynFlags >>> setSessionDynFlags dflags >>> target <- guessTarget "Test.hs" Nothing >>> setTargets [target] >>> load LoadAllTargets >>> >>> However, given a `Test.hs` file with the contents `main = print 1`, I >>> get the error: >>> >>> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >>> cannot find -lHSbase-4.7.0.1-ghc7.8.3 >>> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >>> cannot find -lHSinteger-gmp-0.5.1.0-ghc7.8.3 >>> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >>> cannot find -lHSghc-prim-0.3.1.0-ghc7.8.3 >>> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >>> cannot find -lHSrts-ghc7.8.3 >>> C:/Program Files (x86)/MinGHC-7.8.3/ghc-7.8.3/mingw/bin/ld.exe: >>> cannot find -lffi-6 >>> collect2: ld returned 1 exit status >>> >>> Has the recipe changed? >>> >>> By turning up the verbosity, I was able to compare the command line >>> passed to the linker. 
The failing GHC API call contains: >>> >>> "-lHSbase-4.7.0.1-ghc7.8.3" "-lHSinteger-gmp-0.5.1.0-ghc7.8.3" >>> "-lHSghc-prim-0.3.1.0-ghc7.8.3" "-lHSrts-ghc7.8.3" "-lffi-6" >>> >>> While the succeeding ghc --make contains: >>> >>> "-lHSbase-4.7.0.1" "-lHSinteger-gmp-0.5.1.0" >>> "-lHSghc-prim-0.3.1.0" "-lHSrts" "-lCffi-6" >>> >>> Should I be getting DynFlags differently to influence those link variables? >>> >>> Thanks, Neil From mike at izbicki.me Tue Aug 25 22:50:53 2015 From: mike at izbicki.me (Mike Izbicki) Date: Tue, 25 Aug 2015 15:50:53 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: Thanks Ömer! I'm able to get dictionaries for the superclasses of a class now, but I get an error whenever I try to get a dictionary for a super-superclass. Here's the Haskell expression I'm working with: test1 :: Floating a => a -> a test1 x1 = x1+x1 The original core is: + @ a $dNum_aJu x1 x1 But my plugin is replacing it with the core: + @ a ($p1Fractional ($p1Floating $dFloating_aJq)) x1 x1 The only difference is the way I'm getting the Num dictionary. The corresponding AST (annotated with variable names and types) is: App (App (App (App (Var +::forall a. Num a => a -> a -> a) (Type a) ) (App (Var $p1Fractional::forall a. Fractional a => Num a) (App (Var $p1Floating::forall a. Floating a => Fractional a) (Var $dFloating_aJq::Floating a) ) ) ) (Var x1::'a') ) (Var x1::'a') When I insert this, GHC gives the following error: ghc: panic! (the 'impossible' happened) (GHC version 7.10.1 for x86_64-unknown-linux): expectJust cpeBody:collect_args What am I doing wrong with extracting these super-superclass dictionaries? I've looked up the code for cpeBody in GHC, but I can't figure out what it's trying to do, so I'm not sure why it's failing on my core.
On Mon, Aug 24, 2015 at 7:10 PM, Ömer Sinan Ağacan wrote: > Mike, here's a piece of code that may be helpful to you: > > https://github.com/osa1/sc-plugin/blob/master/src/Supercompilation/Show.hs > > Copy this module to your plugin, it doesn't have any dependencies other than > ghc itself. When your plugin is initialized, update `dynFlags_ref` with your > DynFlags as first thing to do. Then use Show instance to print AST directly. > > Horrible hack, but very useful for learning purposes. In fact, I don't know how > else we can learn what Core is generated for a given code, and reverse-engineer > to figure out details. > > Hope it helps. > > 2015-08-24 21:59 GMT-04:00 Ömer Sinan Ağacan : >>> Lets say I'm running the plugin on a function with signature `Floating a => a >>> -> a`, then the plugin has access to the `Floating` dictionary for the type. >>> But if I want to add two numbers together, I need the `Num` dictionary. I >>> know I should have access to `Num` since it's a superclass of `Floating`. >>> How can I get access to these superclass dictionaries? >> >> I don't have a working code for this but this should get you started: >> >> let ord_dictionary :: Id = ... >> ord_class :: Class = ... >> in >> mkApps (Var (head (classSCSels ord_class))) [Var ord_dictionary] >> >> I don't know how to get Class for Ord. I do `head` here because in the case of >> Ord we only have one superclass so `classSCSels` should have one Id. Then I >> apply ord_dictionary to this selector and it should return dictionary for Eq. >> >> I assumed you already have ord_dictionary, it should be passed to your function >> already if you had `(Ord a) => ` in your function. >> >> >> Now I realized you asked for getting Num from Floating.
I think you should >> follow a similar path except you need two applications, first to get Fractional >> from Floating and second to get Num from Fractional: >> >> mkApps (Var (head (classSCSels fractional_class))) >> [mkApps (Var (head (classSCSels floating_class))) >> [Var floating_dictionary]] >> >> Return value should be a Num dictionary. From omeragacan at gmail.com Wed Aug 26 02:17:36 2015 From: omeragacan at gmail.com (Ömer Sinan Ağacan) Date: Tue, 25 Aug 2015 22:17:36 -0400 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: It seems like your App syntax has a non-function in function position. You can see this by looking at what the failing function (splitFunTy_maybe) is doing: splitFunTy_maybe :: Type -> Maybe (Type, Type) -- ^ Attempts to extract the argument and result types from a type ... (definition is not important) ... Then it's used like this at the error site: (arg_ty, res_ty) = expectJust "cpeBody:collect_args" $ splitFunTy_maybe fun_ty In your case this function is returning Nothing, and then expectJust is signalling the panic. Your code looked correct to me; I don't see any problems with it. Maybe you're using the wrong thing as selectors. Could you paste the CoreExpr representation of your program? It may also be the case that the panic is caused by something else; maybe your syntax is invalidating some assumptions/invariants in GHC that are not immediately checked. Working at the Core level is frustrating at times. Can I ask what kind of plugin you are working on? (Btw, how did you generate this representation of the AST? Did you write it manually? If you have a pretty-printer, would you mind sharing it?) 2015-08-25 18:50 GMT-04:00 Mike Izbicki : > Thanks Ömer! > > I'm able to get dictionaries for the superclasses of a class now, but > I get an error whenever I try to get a dictionary for a > super-superclass.
Here's the Haskell expression I'm working with: > > test1 :: Floating a => a -> a > test1 x1 = x1+x1 > > The original core is: > > + @ a $dNum_aJu x1 x1 > > But my plugin is replacing it with the core: > > + @ a ($p1Fractional ($p1Floating $dFloating_aJq)) x1 x1 > > The only difference is the way I'm getting the Num dictionary. The > corresponding AST (annotated with variable names and types) is: > > App > (App > (App > (App > (Var +::forall a. Num a => a -> a -> a) > (Type a) > ) > (App > (Var $p1Fractional::forall a. Fractional a => Num a) > (App > (Var $p1Floating::forall a. Floating a => Fractional a) > (Var $dFloating_aJq::Floating a) > ) > ) > ) > (Var x1::'a') > ) > (Var x1::'a') > > When I insert, GHC gives the following error: > > ghc: panic! (the 'impossible' happened) > (GHC version 7.10.1 for x86_64-unknown-linux): > expectJust cpeBody:collect_args > > What am I doing wrong with extracting these super-superclass > dictionaries? I've looked up the code for cpeBody in GHC, but I can't > figure out what it's trying to do, so I'm not sure why it's failing on > my core. > > On Mon, Aug 24, 2015 at 7:10 PM, Ömer Sinan Ağacan wrote: >> Mike, here's a piece of code that may be helpful to you: >> >> https://github.com/osa1/sc-plugin/blob/master/src/Supercompilation/Show.hs >> >> Copy this module to your plugin, it doesn't have any dependencies other than >> ghc itself. When your plugin is initialized, update `dynFlags_ref` with your >> DynFlags as first thing to do. Then use Show instance to print AST directly. >> >> Horrible hack, but very useful for learning purposes. In fact, I don't know how >> else we can learn what Core is generated for a given code, and reverse-engineer >> to figure out details. >> >> Hope it helps. >> >> 2015-08-24 21:59 GMT-04:00 Ömer Sinan Ağacan : >>>> Lets say I'm running the plugin on a function with signature `Floating a => a >>>> -> a`, then the plugin has access to the `Floating` dictionary for the type.
>>>> But if I want to add two numbers together, I need the `Num` dictionary. I >>>> know I should have access to `Num` since it's a superclass of `Floating`. >>>> How can I get access to these superclass dictionaries? >>> >>> I don't have a working code for this but this should get you started: >>> >>> let ord_dictionary :: Id = ... >>> ord_class :: Class = ... >>> in >>> mkApps (Var (head (classSCSels ord_class))) [Var ord_dictionary] >>> >>> I don't know how to get Class for Ord. I do `head` here because in the case of >>> Ord we only have one superclass so `classSCSels` should have one Id. Then I >>> apply ord_dictionary to this selector and it should return dictionary for Eq. >>> >>> I assumed you already have ord_dictionary, it should be passed to your function >>> already if you had `(Ord a) => ` in your function. >>> >>> >>> Now I realized you asked for getting Num from Floating. I think you should >>> follow a similar path except you need two applications, first to get Fractional >>> from Floating and second to get Num from Fractional: >>> >>> mkApps (Var (head (classSCSels fractional_class))) >>> [mkApps (Var (head (classSCSels floating_class))) >>> [Var floating_dictionary]] >>> >>> Return value should be a Num dictionary. From eir at cis.upenn.edu Wed Aug 26 02:34:16 2015 From: eir at cis.upenn.edu (Richard Eisenberg) Date: Tue, 25 Aug 2015 22:34:16 -0400 Subject: www.haskell.org/ghc Message-ID: <22AE21F3-1ED3-4A9D-99D0-C7EF65DDE048@cis.upenn.edu> Hi all, I want to write a URL to represent GHC. It seems that www.haskell.org/ghc is the right one. But that page is quite ugly! A full redesign is always a challenge, so I'll make a simple request: remove announcements of old releases, for some definition of old. (I suggest: all releases from current major version + last release from previous major version.) Right now, I have to scroll down to get to "What is GHC?" and it's a little embarrassing. Thanks! 
Richard

From dongen at cs.ucc.ie Wed Aug 26 04:21:48 2015
From: dongen at cs.ucc.ie (dongen)
Date: Wed, 26 Aug 2015 05:21:48 +0100
Subject: www.haskell.org/ghc
In-Reply-To: <22AE21F3-1ED3-4A9D-99D0-C7EF65DDE048@cis.upenn.edu>
References: <22AE21F3-1ED3-4A9D-99D0-C7EF65DDE048@cis.upenn.edu>
Message-ID: <20150826042148.GA3265@csmvddesktop>

* Richard Eisenberg [2015-08-25 22:34:16 -0400]:

: I want to write a URL to represent GHC. It seems that
: www.haskell.org/ghc is the right one. But that page is quite ugly!
: A full redesign is always a challenge, so I'll make a simple request:
: remove announcements of old releases, for some definition of old.
: (I suggest: all releases from current major version + last release
: from previous major version.) Right now, I have to scroll down to
: get to "What is GHC?" and it's a little embarrassing.

Thanks Richard. You could also put in a _release history_ hyperlink
to a separate page.

Regards,


Marc van Dongen

From dave.laing.80 at gmail.com Wed Aug 26 04:23:00 2015
From: dave.laing.80 at gmail.com (David Laing)
Date: Wed, 26 Aug 2015 14:23:00 +1000
Subject: www.haskell.org/ghc
In-Reply-To: <20150826042148.GA3265@csmvddesktop>
References: <22AE21F3-1ED3-4A9D-99D0-C7EF65DDE048@cis.upenn.edu> <20150826042148.GA3265@csmvddesktop>
Message-ID:

Hi,

A low-effort alternative would be to swap the order of the 'Latest News'
and 'What is GHC?' sections.

Cheers,

Dave

On Wed, Aug 26, 2015 at 2:21 PM, dongen wrote:

> * Richard Eisenberg [2015-08-25 22:34:16 -0400]:
>
> : I want to write a URL to represent GHC. It seems that
> : www.haskell.org/ghc is the right one. But that page is quite ugly!
> : A full redesign is always a challenge, so I'll make a simple request:
> : remove announcements of old releases, for some definition of old.
> : (I suggest: all releases from current major version + last release
> : from previous major version.) Right now, I have to scroll down to
> : get to "What is GHC?" and it's a little embarrassing.
> > Thanks Richard. You could also put in a _release history_ hyperlink > to a separate page. > > Regards, > > > Marc van Dongen > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at izbicki.me Wed Aug 26 04:24:58 2015 From: mike at izbicki.me (Mike Izbicki) Date: Tue, 25 Aug 2015 21:24:58 -0700 Subject: question about GHC API on GHC plugin In-Reply-To: References: <1439014742-sup-2126@sabre> Message-ID: The purpose of the plugin is to automatically improve the numerical stability of Haskell code. It is supposed to identify numeric expressions, then use Herbie (https://github.com/uwplse/herbie) to generate a numerically stable version, then rewrite the numerically stable version back into the code. The first two steps were really easy. It's the last step of inserting back into the code that I'm having tons of trouble with. Core is a lot more complicated than I thought :) I'm not sure what you mean by the CoreExpr representation? 
Here's the output of the pretty printer you gave: App (App (App (App (Var Id{+,r2T,ForAllTy TyVar{a} (FunTy (TyConApp Num [TyVarTy TyVar{a}]) (FunTy (TyVarTy TyVar{a}) (FunTy (TyVarTy TyVar{a}) (TyVarTy TyVar{a})))),VanillaId,Info{0,SpecInfo [] ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = Nothing, inl_act = AlwaysActive, inl_rule = FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many Used},0}}) (Type (TyVarTy TyVar{a}))) (App (Var Id{$p1Fractional,rh3,ForAllTy TyVar{a} (FunTy (TyConApp Fractional [TyVarTy TyVar{a}]) (TyConApp Num [TyVarTy TyVar{a}])),ClassOpId ,Info{1,SpecInfo [BuiltinRule {ru_name = "Class op $p1Fractional", ru_fn = $p1Fractional, ru_nargs = 2, ru_try = }] ,NoUnfolding,NoCafRefs,NoOneShotInfo,InlinePragma {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = Nothing, inl_act = AlwaysActive, inl_rule = FunLike},NoOccInfo,StrictSig (DmdType [JD {strd = Str (SProd [Str HeadStr,Lazy,Lazy,Lazy]), absd = Use Many (UProd [Use Many Used,Abs,Abs,Abs])}] (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many Used},0}}) (App (Var Id{$p1Floating,rh2,ForAllTy TyVar{a} (FunTy (TyConApp Floating [TyVarTy TyVar{a}]) (TyConApp Fractional [TyVarTy TyVar{a}])),ClassOpId ,Info{1,SpecInfo [BuiltinRule {ru_name = "Class op $p1Floating", ru_fn = $p1Floating, ru_nargs = 2, ru_try = }] ,NoUnfolding,NoCafRefs,NoOneShotInfo,InlinePragma {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = Nothing, inl_act = AlwaysActive, inl_rule = FunLike},NoOccInfo,StrictSig (DmdType [JD {strd = Str (SProd [Str HeadStr,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy,Lazy]), absd = Use Many (UProd [Use Many Used,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs,Abs])}] (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many Used},0}}) (Var Id{$dFloating,aBM,TyConApp Floating [TyVarTy TyVar{a}],VanillaId,Info{0,SpecInfo 
[] ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = Nothing, inl_act = AlwaysActive, inl_rule = FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many Used},0}})))) (Var Id{x1,anU,TyVarTy TyVar{a},VanillaId,Info{0,SpecInfo [] ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = Nothing, inl_act = AlwaysActive, inl_rule = FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many Used},0}})) (Var Id{x1,anU,TyVarTy TyVar{a},VanillaId,Info{0,SpecInfo [] ,NoUnfolding,MayHaveCafRefs,NoOneShotInfo,InlinePragma {inl_src = "{-# INLINE", inl_inline = EmptyInlineSpec, inl_sat = Nothing, inl_act = AlwaysActive, inl_rule = FunLike},NoOccInfo,StrictSig (DmdType [] (Dunno NoCPR)),JD {strd = Lazy, absd = Use Many Used},0}}) You can find my pretty printer (and all the other code for the plugin) at: https://github.com/mikeizbicki/herbie-haskell/blob/master/src/Herbie.hs#L627 The function getDictMap (https://github.com/mikeizbicki/herbie-haskell/blob/master/src/Herbie.hs#L171) is where I'm constructing the dictionaries that are getting inserted back into the Core. On Tue, Aug 25, 2015 at 7:17 PM, ?mer Sinan A?acan wrote: > It seems like in your App syntax you're having a non-function in function > position. You can see this by looking at what failing function > (splitFunTy_maybe) is doing: > > splitFunTy_maybe :: Type -> Maybe (Type, Type) > -- ^ Attempts to extract the argument and result types from a type > ... (definition is not important) ... > > Then it's used like this at the error site: > > (arg_ty, res_ty) = expectJust "cpeBody:collect_args" $ > splitFunTy_maybe fun_ty > > In your case this function is returning Nothing and then exceptJust is > signalling the panic. > > Your code looked correct to me, I don't see any problems with that. 
Maybe you're > using something wrong as selectors. Could you paste CoreExpr representation of > your program? > > It may also be the case that the panic is caused by something else, maybe your > syntax is invalidating some assumptions/invariants in GHC but it's not > immediately checked etc. Working at the Core level is frustrating at times. > > Can I ask what kind of plugin are you working on? > > (Btw, how did you generate this representation of AST? Did you write it > manually? If you have a pretty-printer, would you mind sharing it?) > > 2015-08-25 18:50 GMT-04:00 Mike Izbicki : >> Thanks ?mer! >> >> I'm able to get dictionaries for the superclasses of a class now, but >> I get an error whenever I try to get a dictionary for a >> super-superclass. Here's the Haskell expression I'm working with: >> >> test1 :: Floating a => a -> a >> test1 x1 = x1+x1 >> >> The original core is: >> >> + @ a $dNum_aJu x1 x1 >> >> But my plugin is replacing it with the core: >> >> + @ a ($p1Fractional ($p1Floating $dFloating_aJq)) x1 x1 >> >> The only difference is the way I'm getting the Num dictionary. The >> corresponding AST (annotated with variable names and types) is: >> >> App >> (App >> (App >> (App >> (Var +::forall a. Num a => a -> a -> a) >> (Type a) >> ) >> (App >> (Var $p1Fractional::forall a. Fractional a => Num a) >> (App >> (Var $p1Floating::forall a. Floating a => Fractional a) >> (Var $dFloating_aJq::Floating a) >> ) >> ) >> ) >> (Var x1::'a') >> ) >> (Var x1::'a') >> >> When I insert, GHC gives the following error: >> >> ghc: panic! (the 'impossible' happened) >> (GHC version 7.10.1 for x86_64-unknown-linux): >> expectJust cpeBody:collect_args >> >> What am I doing wrong with extracting these super-superclass >> dictionaries? I've looked up the code for cpeBody in GHC, but I can't >> figure out what it's trying to do, so I'm not sure why it's failing on >> my core. 
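[Editorial aside, not part of the original messages.] The replacement Core quoted above, `+ @ a ($p1Fractional ($p1Floating $dFloating_aJq)) x1 x1`, is ordinary dictionary passing: each superclass selector takes a dictionary and returns the dictionary of its superclass. A minimal, self-contained model of that idea in plain Haskell is sketched below. The record types `NumDict`, `FractionalDict` and `FloatingDict` and their field names are assumptions made up for illustration; they are not GHC API types, and the sketch does not reproduce or diagnose the panic.

```haskell
-- Toy model of class dictionaries; none of these types exist in GHC.

-- Stand-in for a Num dictionary, holding only the method we need.
newtype NumDict a = NumDict { dictPlus :: a -> a -> a }

-- Stand-in for a Fractional dictionary. Its first field is the
-- superclass dictionary, mirroring the selector $p1Fractional.
data FractionalDict a = FractionalDict
  { p1Fractional :: NumDict a
  , dictDivide   :: a -> a -> a
  }

-- Stand-in for a Floating dictionary. Its first field mirrors $p1Floating.
data FloatingDict a = FloatingDict
  { p1Floating :: FractionalDict a
  , dictSqrt   :: a -> a
  }

-- Model of:  test1 :: Floating a => a -> a ;  test1 x1 = x1 + x1
-- The Num dictionary is reached through two superclass selections.
test1 :: FloatingDict a -> a -> a
test1 dFloating x1 = dictPlus (p1Fractional (p1Floating dFloating)) x1 x1

-- A concrete dictionary for Double, built from the real class methods.
floatingDouble :: FloatingDict Double
floatingDouble = FloatingDict
  { p1Floating = FractionalDict
      { p1Fractional = NumDict (+)
      , dictDivide   = (/)
      }
  , dictSqrt = sqrt
  }

main :: IO ()
main = print (test1 floatingDouble 1.5)
```

One thing this model hides: the record fields here are applied to a dictionary directly, whereas the selectors in the quoted AST have types of the form `forall a. Fractional a => Num a`. In real Core, a `forall`-typed function is normally given a `Type` argument before its value arguments, so that is one place worth checking; whether it explains the panic is not established in this part of the thread.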
>> >> On Mon, Aug 24, 2015 at 7:10 PM, ?mer Sinan A?acan wrote: >>> Mike, here's a piece of code that may be helpful to you: >>> >>> https://github.com/osa1/sc-plugin/blob/master/src/Supercompilation/Show.hs >>> >>> Copy this module to your plugin, it doesn't have any dependencies other than >>> ghc itself. When your plugin is initialized, update `dynFlags_ref` with your >>> DynFlags as first thing to do. Then use Show instance to print AST directly. >>> >>> Horrible hack, but very useful for learning purposes. In fact, I don't know how >>> else we can learn what Core is generated for a given code, and reverse-engineer >>> to figure out details. >>> >>> Hope it helps. >>> >>> 2015-08-24 21:59 GMT-04:00 ?mer Sinan A?acan : >>>>> Lets say I'm running the plugin on a function with signature `Floating a => a >>>>> -> a`, then the plugin has access to the `Floating` dictionary for the type. >>>>> But if I want to add two numbers together, I need the `Num` dictionary. I >>>>> know I should have access to `Num` since it's a superclass of `Floating`. >>>>> How can I get access to these superclass dictionaries? >>>> >>>> I don't have a working code for this but this should get you started: >>>> >>>> let ord_dictionary :: Id = ... >>>> ord_class :: Class = ... >>>> in >>>> mkApps (Var (head (classSCSels ord_class))) [Var ord_dictionary] >>>> >>>> I don't know how to get Class for Ord. I do `head` here because in the case of >>>> Ord we only have one superclass so `classSCSels` should have one Id. Then I >>>> apply ord_dictionary to this selector and it should return dictionary for Eq. >>>> >>>> I assumed you already have ord_dictionary, it should be passed to your function >>>> already if you had `(Ord a) => ` in your function. >>>> >>>> >>>> Now I realized you asked for getting Num from Floating. 
I think you should >>>> follow a similar path except you need two applications, first to get Fractional >>>> from Floating and second to get Num from Fractional: >>>> >>>> mkApps (Var (head (classSCSels fractional_class))) >>>> [mkApps (Var (head (classSCSels floating_class))) >>>> [Var floating_dictionary]] >>>> >>>> Return value should be a Num dictionary. From simonpj at microsoft.com Wed Aug 26 08:07:44 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Wed, 26 Aug 2015 08:07:44 +0000 Subject: www.haskell.org/ghc In-Reply-To: <22AE21F3-1ED3-4A9D-99D0-C7EF65DDE048@cis.upenn.edu> References: <22AE21F3-1ED3-4A9D-99D0-C7EF65DDE048@cis.upenn.edu> Message-ID: <089f306e72e04cf89c46463f7ea7472f@DB4PR30MB030.064d.mgd.msft.net> I agree. At the bottom it says "this page is maintained by Simon Marlow"). Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of | Richard Eisenberg | Sent: 26 August 2015 03:34 | To: ghc-devs at haskell.org Devs | Subject: www.haskell.org/ghc | | Hi all, | | I want to write a URL to represent GHC. It seems that | www.haskell.org/ghc is the right one. But that page is quite ugly! A | full redesign is always a challenge, so I'll make a simple request: | remove announcements of old releases, for some definition of old. (I | suggest: all releases from current major version + last release from | previous major version.) Right now, I have to scroll down to get to | "What is GHC?" and it's a little embarrassing. | | Thanks! 
| Richard
| _______________________________________________
| ghc-devs mailing list
| ghc-devs at haskell.org
| http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

From mgrabovsky at yahoo.com Wed Aug 26 09:42:46 2015
From: mgrabovsky at yahoo.com (Matej Grabovský)
Date: Wed, 26 Aug 2015 09:42:46 +0000 (UTC)
Subject: www.haskell.org/ghc
In-Reply-To: <22AE21F3-1ED3-4A9D-99D0-C7EF65DDE048@cis.upenn.edu>
References: <22AE21F3-1ED3-4A9D-99D0-C7EF65DDE048@cis.upenn.edu>
Message-ID: <249408335.831636.1440582166292.JavaMail.yahoo@mail.yahoo.com>

Hi.

I'm pretty sure the design of the GHC homepage was discussed previously somewhere, but I can't find the thread at the moment. Anyway, there's a repository of the website on GitHub[1]. You could perhaps try to send a pull request there.

Best regards,

Matěj

[1]: https://github.com/haskell-infra/ghc-homepage


On Wednesday, August 26, 2015 4:32 AM, Richard Eisenberg wrote:

Hi all,

I want to write a URL to represent GHC. It seems that www.haskell.org/ghc is the right one. But that page is quite ugly! A full redesign is always a challenge, so I'll make a simple request: remove announcements of old releases, for some definition of old. (I suggest: all releases from current major version + last release from previous major version.) Right now, I have to scroll down to get to "What is GHC?" and it's a little embarrassing.

Thanks!
Richard
_______________________________________________
ghc-devs mailing list
ghc-devs at haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From eir at cis.upenn.edu Wed Aug 26 12:26:36 2015
From: eir at cis.upenn.edu (Richard Eisenberg)
Date: Wed, 26 Aug 2015 08:26:36 -0400
Subject: www.haskell.org/ghc
In-Reply-To: <249408335.831636.1440582166292.JavaMail.yahoo@mail.yahoo.com>
References: <22AE21F3-1ED3-4A9D-99D0-C7EF65DDE048@cis.upenn.edu> <249408335.831636.1440582166292.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <2BB366B9-AF81-47A4-8FC1-369221D2B632@cis.upenn.edu>

Ah. I was not aware of that repo. I've posted a PR there.

Thanks!
Richard

PS: Yes, I remember some thread about a redesign. But I didn't want to get involved with all that -- just a quick fix is what I was after.

On Aug 26, 2015, at 5:42 AM, Matej Grabovský wrote:

> Hi.
>
> I'm pretty sure the design of the GHC homepage was discussed previously somewhere, but I can't find the thread at the moment. Anyway, there's a repository of the website on GitHub[1]. You could perhaps try to send a pull request there.
> Best regards,
>
> Matěj
>
> [1]: https://github.com/haskell-infra/ghc-homepage
>
>
>
> On Wednesday, August 26, 2015 4:32 AM, Richard Eisenberg wrote:
>
>
> Hi all,
>
> I want to write a URL to represent GHC. It seems that www.haskell.org/ghc is the right one. But that page is quite ugly! A full redesign is always a challenge, so I'll make a simple request: remove announcements of old releases, for some definition of old. (I suggest: all releases from current major version + last release from previous major version.) Right now, I have to scroll down to get to "What is GHC?" and it's a little embarrassing.
>
> Thanks!
> Richard
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From oleg.grenrus at iki.fi Wed Aug 26 19:29:16 2015
From: oleg.grenrus at iki.fi (Oleg Grenrus)
Date: Wed, 26 Aug 2015 22:29:16 +0300
Subject: Compiling cabal with GHC HEAD
In-Reply-To: <1440001103-sup-6417@sabre>
References: <7784B3F6-1EAF-4D6F-A770-BA594F39BD3B@iki.fi> <1440001103-sup-6417@sabre>
Message-ID:

Ah, so this happens because GHC HEAD has Cabal-1.23.0.0 bundled with it. I see.
And as Cabal master is also 1.23.0.0 they don't really work together (i.e. cabal-install uses system Cabal).

Ok, maybe we can wait till GHC's Cabal reference-commit is bumped, if it has not been already.

Thanks for the explanation.

- Oleg

> On 19 Aug 2015, at 19:19, Edward Z. Yang wrote:
>
> Oh, this is irritating.
>
> The problem is we recently updated Cabal the library to get rid of
> InstalledPackageInfo_ (so there is only InstalledPackageInfo now)
> but it looks like in some situations cabal-install can be compiled
> with an old version of Cabal (as is happening to you). I suppose
> an appropriate remedy is to bump the Cabal library dependency in
> cabal-install so we don't attempt to use the old Cabal; alternately
> we could use a preprocessor macro to make it work in both cases.
>
> Edward
>
> Excerpts from Oleg Grenrus's message of 2015-08-19 07:46:56 -0700:
>> I tried to fix compilation of Cabal using Cabal HEAD. It's a trivial patch:
>>
>> https://github.com/phadej/cabal/commit/525e0680505c74f42a321e55b357a27222790628
>>
>> but it breaks the build on every other released GHC:
>>
>> https://travis-ci.org/phadej/cabal/builds/76288656
>>
>> ?
>>
>> The original issue GHC-7.11 complained about was:
>>
>> Distribution/Client/Types.hs:71:10: error:
>>     Illegal instance declaration for
>>       ‘PackageFixedDeps InstalledPackageInfo’
>>       (All instance types must be of the form (T t1 ... tn)
>>        where T is not a synonym.
>>        Use TypeSynonymInstances if you want to disable this.)
>>     In the instance declaration for
>>       ‘PackageFixedDeps InstalledPackageInfo’
>>
>> So I had to add TypeSynonymInstances and FlexibleInstances
>>
>> And I also had to change the import of InstalledPackageInfo(exposed) in the Haddock module.
>>
>> ?
>>
>> At this point I'm really confused. I cannot find the 'InstalledPackageInfo_' symbol anywhere. Can someone explain what happens?
>

From kazu at iij.ad.jp Thu Aug 27 01:24:17 2015
From: kazu at iij.ad.jp (Kazu Yamamoto (山本和彦))
Date: Thu, 27 Aug 2015 10:24:17 +0900 (JST)
Subject: GHC 7.10 complie time regression
Message-ID: <20150827.102417.940015966115425781.kazu@iij.ad.jp>

Hi,

I found that the compile time of GHC 7.10 for specific packages
is much longer than with previous versions. Here is an example:

	https://travis-ci.org/kazu-yamamoto/iproute/builds/77427248

On my local MacBook Air, I see the same phenomenon:

GHC 7.8:  cabal build  8.81s user 0.94s system 103% cpu 9.430 total
GHC 7.10: cabal build  37.86s user 3.49s system 98% cpu 42.066 total

Is this a known regression?

--Kazu

From ezyang at mit.edu Thu Aug 27 06:35:21 2015
From: ezyang at mit.edu (Edward Z. Yang)
Date: Wed, 26 Aug 2015 23:35:21 -0700
Subject: Compiling cabal with GHC HEAD
In-Reply-To:
References: <7784B3F6-1EAF-4D6F-A770-BA594F39BD3B@iki.fi> <1440001103-sup-6417@sabre>
Message-ID: <1440657291-sup-2748@sabre>

Yes, this is a common hazard when trying to use cabal-install with GHC
HEAD. My recommendation is to use the Cabal tip distributed with
GHC HEAD.

Edward

Excerpts from Oleg Grenrus's message of 2015-08-26 12:29:16 -0700:
> Ah, so this happens because GHC HEAD has Cabal-1.23.0.0 bundled with it. I see.
> And as Cabal master is also 1.23.0.0 they don't really work together (i.e. cabal-install uses system Cabal).
>
> Ok, maybe we can wait till GHC's Cabal reference-commit will be bumped, if not already.
> > Thanks for the explanation. > > - Oleg > > > On 19 Aug 2015, at 19:19, Edward Z. Yang wrote: > > > > Oh, this is irritating. > > > > The problem is we recently updated Cabal the library to get rid of > > InstalledPackageInfo_ (so there is only InstalledPackageInfo now) > > but it looks like in some situations cabal-install can be compiled > > with an old version of Cabal (as is happening to you). I suppose > > an appropriate remedy is to bump the Cabal library dependency in > > cabal-install so we don't attempt to use the old Cabal; alternately > > we could preprocessor macro to make it work in both cases. > > > > Edward > > > > Excerpts from Oleg Grenrus's message of 2015-08-19 07:46:56 -0700: > >> I tried to fix compilation of Cabal using Cabal HEAD. It?s trivial patch: > >> > >> https://github.com/phadej/cabal/commit/525e0680505c74f42a321e55b357a27222790628 > >> > >> but it breaks build on every other released GHC: > >> > >> https://travis-ci.org/phadej/cabal/builds/76288656 > >> > >> ? > >> > >> The original issue GHC-7.11 complained was: > >> > >> Distribution/Client/Types.hs:71:10: error: > >> Illegal instance declaration for > >> ?PackageFixedDeps InstalledPackageInfo? > >> (All instance types must be of the form (T t1 ... tn) > >> where T is not a synonym. > >> Use TypeSynonymInstances if you want to disable this.) > >> In the instance declaration for > >> ?PackageFixedDeps InstalledPackageInfo? > >> > >> So I had to add TypeSynonymInstances and FlexibleInstances > >> > >> And also had to change import of InstalledPackageInfo(exposed) in Haddock module. > >> > >> ? > >> > >> At this point I?m really confused. I cannot find ?InstalledPackageInfo_? symbol anywhere. Can someone explain what happens? 
> >

From simonpj at microsoft.com Thu Aug 27 08:39:03 2015
From: simonpj at microsoft.com (Simon Peyton Jones)
Date: Thu, 27 Aug 2015 08:39:03 +0000
Subject: GHC 7.10 complie time regression
In-Reply-To: <20150827.102417.940015966115425781.kazu@iij.ad.jp>
References: <20150827.102417.940015966115425781.kazu@iij.ad.jp>
Message-ID: <831565db44da4eb3be569b9b346d8c54@DB4PR30MB030.064d.mgd.msft.net>

Kazu, no, it's not expected to take "much longer". Can you make a ticket with a reproducible test case?

Thanks!

Simon

| -----Original Message-----
| From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Kazu
| Yamamoto
| Sent: 27 August 2015 02:24
| To: ghc-devs at haskell.org
| Subject: GHC 7.10 complie time regression
|
| Hi,
|
| I found that the compile time of GHC 7.10 against specific packages
| gets much longer than previous versions. Here is example:
|
| 	https://travis-ci.org/kazu-yamamoto/iproute/builds/77427248
|
| On my local MacBook Air, I see the same phenomena:
|
| GHC 7.8:  cabal build  8.81s user 0.94s system 103% cpu 9.430
| total
| GHC 7.10: cabal build  37.86s user 3.49s system 98% cpu 42.066
| total
|
| Is this known regression?
|
| --Kazu
| _______________________________________________
| ghc-devs mailing list
| ghc-devs at haskell.org
| http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

From simonpj at microsoft.com Thu Aug 27 13:00:39 2015
From: simonpj at microsoft.com (Simon Peyton Jones)
Date: Thu, 27 Aug 2015 13:00:39 +0000
Subject: ArrayArrays
In-Reply-To:
References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au>
Message-ID:

Just to say that I have no idea what is going on in this thread. What is ArrayArray? What is the issue in general? Is there a ticket? Is there a wiki page?

If it's important, an ab-initio wiki page + ticket would be a good thing.
Simon From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Edward Kmett Sent: 21 August 2015 05:25 To: Manuel M T Chakravarty Cc: Simon Marlow; ghc-devs Subject: Re: ArrayArrays When (ab)using them for this purpose, SmallArrayArray's would be very handy as well. Consider right now if I have something like an order-maintenance structure I have: data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-} !(MutVar s (Lower s)) The former contains, logically, a mutable integer and two pointers, one for forward and one for backwards. The latter is basically the same thing with a mutable reference up pointing at the structure above. On the heap this is an object that points to a structure for the bytearray, and points to another structure for each mutvar which each point to the other 'Upper' structure. So there is a level of indirection smeared over everything. So this is a pair of doubly linked lists with an upward link from the structure below to the structure above. Converted into ArrayArray#s I'd get data Upper s = Upper (MutableArrayArray# s) w/ the first slot being a pointer to a MutableByteArray#, and the next 2 slots pointing to the previous and next previous objects, represented just as their MutableArrayArray#s. I can use sameMutableArrayArray# on these for object identity, which lets me check for the ends of the lists by tying things back on themselves. and below that data Lower s = Lower (MutableArrayArray# s) is similar, with an extra MutableArrayArray slot pointing up to an upper structure. I can then write a handful of combinators for getting out the slots in question, while it has gained a level of indirection between the wrapper to put it in * and the MutableArrayArray# s in #, that one can be basically erased by ghc. 
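[Editorial aside, not part of the original message.] For readers who want something compilable, the following is a minimal sketch of the MutVar-based `Upper` node described above, covering only the "before" layout. It assumes the `primitive` package (`Data.Primitive.ByteArray`, `Data.Primitive.MutVar`) is available. The helpers `newUpper`, `readLabel`, `prevRef`, `nextRef`, `insertAfter` and `demo` are hypothetical names invented for illustration, and the sketch does not attempt the `MutableArrayArray#` encoding.

```haskell
import Control.Monad.ST (ST, runST)
import Data.Primitive.ByteArray
  (MutableByteArray, newByteArray, readByteArray, writeByteArray)
import Data.Primitive.MutVar (MutVar, newMutVar, readMutVar, writeMutVar)

-- The MutVar-based node from the message: a mutable Int label plus
-- a backward link and a forward link.
data Upper s = Upper
  {-# UNPACK #-} !(MutableByteArray s)
  {-# UNPACK #-} !(MutVar s (Upper s))
  {-# UNPACK #-} !(MutVar s (Upper s))

prevRef, nextRef :: Upper s -> MutVar s (Upper s)
prevRef (Upper _ p _) = p
nextRef (Upper _ _ n) = n

-- Allocate a node whose links point back at itself (a one-element ring).
newUpper :: Int -> ST s (Upper s)
newUpper lbl = do
  arr <- newByteArray 8   -- assumption: 8 bytes is enough for one Int
  writeByteArray arr 0 lbl
  -- Placeholders that are overwritten before anything reads them.
  prev <- newMutVar (error "prev not yet linked")
  next <- newMutVar (error "next not yet linked")
  let node = Upper arr prev next
  writeMutVar prev node
  writeMutVar next node
  pure node

-- Read the label stored in a node.
readLabel :: Upper s -> ST s Int
readLabel (Upper arr _ _) = readByteArray arr 0

-- Insert a fresh node directly after an existing one.
insertAfter :: Upper s -> Int -> ST s (Upper s)
insertAfter node lbl = do
  new <- newUpper lbl
  old <- readMutVar (nextRef node)
  writeMutVar (prevRef new) node
  writeMutVar (nextRef new) old
  writeMutVar (prevRef old) new
  writeMutVar (nextRef node) new
  pure new

-- Build a three-element ring and walk it forward from the first node.
demo :: [Int]
demo = runST $ do
  a  <- newUpper 1
  _  <- insertAfter a 2
  _  <- insertAfter a 3
  n1 <- readMutVar (nextRef a)
  n2 <- readMutVar (nextRef n1)
  n3 <- readMutVar (nextRef n2)
  mapM readLabel [a, n1, n2, n3]

main :: IO ()
main = print demo
```

Each link followed here goes from the node to a `MutVar#` and from there to the next node, which is the extra hop per link that the message wants to remove by storing the neighbours directly as slots of one `MutableArrayArray#`.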
Unlike before I don't have several separate objects on the heap for each thing. I only have 2 now. The MutableArrayArray# for the object itself, and the MutableByteArray# that it references to carry around the mutable int. The only pain points are 1.) the aforementioned limitation that currently prevents me from stuffing normal boxed data through a SmallArray or Array into an ArrayArray leaving me in a little ghetto disconnected from the rest of Haskell, and 2.) the lack of SmallArrayArray's, which could let us avoid the card marking overhead. These objects are all small, 3-4 pointers wide. Card marking doesn't help. Alternately I could just try to do really evil things and convert the whole mess to SmallArrays and then figure out how to unsafeCoerce my way to glory, stuffing the #'d references to the other arrays directly into the SmallArray as slots, removing the limitation we see here by aping the MutableArrayArray# s API, but that gets really really dangerous! I'm pretty much willing to sacrifice almost anything on the altar of speed here, but I'd like to be able to let the GC move them and collect them which rules out simpler Ptr and Addr based solutions. -Edward On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty > wrote: That?s an interesting idea. Manuel > Edward Kmett >: > > Would it be possible to add unsafe primops to add Array# and SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# entries are all directly unlifted avoiding a level of indirection for the containing structure is amazing, but I can only currently use it if my leaf level data can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be able to have the ability to put SmallArray# a stuff down at the leaves to hold lifted contents. 
>
> I accept fully that if I name the wrong type when I go to access one of the fields it'll lie to me, but I suppose it'd do that if I tried to use one of the members that held a nested ArrayArray# as a ByteArray# anyways, so it isn't like there is a safety story preventing this.
>
> I've been hunting for ways to try to kill the indirection problems I get with Haskell and mutable structures, and I could shoehorn a number of them into ArrayArrays if this worked.
>
> Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection compared to C/Java and this could reduce that pain to just 1 level of unnecessary indirection.
>
> -Edward
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

From ben at well-typed.com Thu Aug 27 15:38:19 2015
From: ben at well-typed.com (Ben Gamari)
Date: Thu, 27 Aug 2015 17:38:19 +0200
Subject: Planning for the 7.12 release
Message-ID: <87r3mo68t0.fsf@smart-cactus.org>


Hello everyone!

With the 7.10.1 release nearly six months behind us and 7.10.2 out of the
way, now is a good time to begin looking forward to 7.12. In keeping
with the typical release pace, we are aiming to have a release
candidate ready in mid-December 2015 and a final release in January
2016.

The items that we currently believe have a good chance of making it
into 7.12 are listed on the release status page [1], which I've
summarized below (in no particular order),


  * Support for implicit parameters providing callstacks and source
    locations

  * Support for wildcards in data and type family instances

  * A new, type-indexed type representation, data TTypeRep (a :: k).
  * Introduction of visible type application

  * Support for reasoning about kind equalities

  * Support for Injective Type Families

  * Support for the Strict language extension

  * Support for Overloaded Record Fields, allowing multiple uses of
    the same field name and a form of type-directed name resolution.

  * A huge improvement to pattern matching (including much better
    coverage of GADTs)

  * Backpack is chugging along; we have a new user-facing syntax which
    allows multiple modules to be defined in a single file, and are
    hoping to release at least the ability to publish multiple "units"
    in a single Cabal file.

  * Support for Applicative Do, allowing GHC to desugar do-notation to
    Applicative where possible.

  * Improved DWARF based debugging support including backtraces from
    Haskell code

  * An Improved LLVM Backend that ships with every major Tier 1 platform.


These items are a bit less certain but may make it in if the authors
push forward quickly enough,


  * Support for Type Signature Sections, allowing you to write (:: ty)
    as a shorthand for (\x -> x :: ty).

  * A (possible) overhaul of GHC's build system to use Shake instead
    of Make.

  * A DEPRECATED pragma for exports


Is your pet project missing from this list? If you have a patch that you
believe is on track to make it in for 7.12, please let us know.

Moreover, if you have an issue that you urgently need fixed in 7.12,
please express your interest on the appropriate ticket. User feedback
helps us immensely in figuring out how best to set our priorities.

Cheers,

- Ben


[1] https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.12.1
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 472 bytes Desc: not available URL: From dluposchainsky at googlemail.com Thu Aug 27 15:48:55 2015 From: dluposchainsky at googlemail.com (David Luposchainsky) Date: Thu, 27 Aug 2015 17:48:55 +0200 Subject: Planning for the 7.12 release: MonadFail In-Reply-To: <87r3mo68t0.fsf@smart-cactus.org> References: <87r3mo68t0.fsf@smart-cactus.org> Message-ID: <55DF3167.206@gmail.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hey Ben, my summer was pretty busy, but we recently fixed our MonadFail implementation to work as desired, so that should make it in as well. We'll have to survive a heroic rebase/squash that we'll probably do in September when we're back from our holidays. David -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAEBAgAGBQJV3zFnAAoJELrQsaT5WQUsJy4H/RVVGZBfprIKWmX+a8H6c6zr SfqkMfMZE0Q1AA1pDeQspwKUi4lOemUMexsZfjdoV2FD4YruzJ/hJl2MOiFKu6gC KsvjF7Xlxxvst9JTVaW3exr0dQNJ8sKGhkHzpvaX+ecTUQ1c6vtsJt/gMcA3U6S1 1BW4lc25OWA07nphjTkVacJflZnCUki4kNlapA3x5VX0o4yN38s7sPE1muL+7Rxw afklL9XiYJBAtGapNHP81E+iCYs5BaotJdbyCm5PcmtyxW92JMPML0BP3cfS14lA zClgyWOE8H5IRfR/8qSfECAcM81+G9WQ0XuSza5szBdX0f3PiNVri1x9qsp+3CU= =Qs4g -----END PGP SIGNATURE----- From matthewtpickering at gmail.com Thu Aug 27 15:49:12 2015 From: matthewtpickering at gmail.com (Matthew Pickering) Date: Thu, 27 Aug 2015 17:49:12 +0200 Subject: Planning for the 7.12 release In-Reply-To: <87r3mo68t0.fsf@smart-cactus.org> References: <87r3mo68t0.fsf@smart-cactus.org> Message-ID: Hi Ben, I think that D1152 (Record Pattern Synonyms) will be ready for 7.12. https://phabricator.haskell.org/D1152 On Thu, Aug 27, 2015 at 5:38 PM, Ben Gamari wrote: > > Hello everyone! > > With the 7.10.1 release nearly six months behind us and 7.10.2 out of the > way, now is a good time to begin looking forward to 7.12. In keeping > with the typical release pace, we are aiming to have a release > candidate ready in mid-December 2015 and a final release in January > 2016. 
> > The items that that we currently believe have a good chance of making it > in to 7.12 are listed on the release status page [1], which I've > summarized below (in no particular order), > > > * Support for implicit parameters providing callstacks and source > locations > > * Support for wildcards in data and type family instances > > * A new, type-indexed type representation, data TTypeRep (a :: k). > > * Introduction of visible type application > > * Support for reasoning about kind equalities > > * Support for Injective Type Families > > * Support for the Strict language extension > > * Support for Overloaded Record Fields, allowing multiple uses of > the same field name and a form of type-directed name resolution. > > * A huge improvement to pattern matching (including much better > coverage of GADTs) > > * Backpack is chugging along; we have a new user-facing syntax which > allows multiple modules to be defined a single file, and are > hoping to release at least the ability to publish multiple "units" > in a single Cabal file. > > * Support for Applicative Do, allowing GHC to desugar do-notation to > Applicative where possible. > > * Improved DWARF based debugging support including backtraces from > Haskell code > > * An Improved LLVM Backend that ships with every major Tier 1 platform. > > > These items are a bit less certain but may make it in if the authors > push forward quickly enough, > > > * Support for Type Signature Sections, allowing you to write (:: ty) > as a shorthand for (\x -> x :: ty). > > * A (possible) overhaul of GHC's build system to use Shake instead > of Make. > > * A DEPRECATED pragma for exports > > > Is your pet project missing from this list? If you have a patch that you > believe is on-track to make it in for 7.12, please let us know. > > Moreover, if you have an issue that you urgently need fixed in 7.12, > please express you interest on the appropriate ticket. 
User feedback > helps us immensely in figuring out how to best place our priorities. > > Cheers, > > - Ben > > > [1] https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.12.1 > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >

From ekmett at gmail.com Thu Aug 27 15:54:10 2015
From: ekmett at gmail.com (Edward Kmett)
Date: Thu, 27 Aug 2015 11:54:10 -0400
Subject: ArrayArrays
In-Reply-To:
References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au>
Message-ID:

An ArrayArray# is just an Array# with a modified invariant. It points directly to other unlifted ArrayArray#'s or ByteArray#'s.

While those live in #, they are garbage collected objects, so this all lives on the heap.

They were added to make some of the DPH stuff fast when it has to deal with nested arrays.

I'm currently abusing them as a placeholder for a better thing.

The Problem
-----------------

Consider the scenario where you write a classic doubly-linked list in Haskell.

    data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL))

Chasing from one DLL to the next requires following 3 pointers on the heap.

    DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~> DLL

That is 3 levels of indirection.

We can trim one by simply unpacking the IORef with -funbox-strict-fields or UNPACK.

We can trim another by adding a 'Nil' constructor for DLL and worsening our representation.

    data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil

but now we're still stuck with a level of indirection

    DLL ~> MutVar# RealWorld DLL ~> DLL

This means that every operation we perform on this structure will be about half of the speed of an implementation in most other languages, assuming we're memory bound on loading things into cache!
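To make the baseline concrete, here is a minimal sketch of the 'Nil' representation above using only Data.IORef. The field order (previous, then next) and the helper names newNode, link, next and lengthFrom are assumptions made for illustration; they do not come from the thread.

```haskell
import Data.IORef

-- Doubly-linked list in the 'Nil' style from the text above.
-- Field order (previous, then next) is an assumption of this sketch.
data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil

-- Allocate an unlinked node; both links start out as Nil.
newNode :: IO DLL
newNode = DLL <$> newIORef Nil <*> newIORef Nil

-- Make b the successor of a (and a the predecessor of b).
link :: DLL -> DLL -> IO ()
link a@(DLL _ aNext) b@(DLL bPrev _) = do
  writeIORef aNext b
  writeIORef bPrev a
link _ _ = return ()

-- One step forward costs two hops: DLL ~> MutVar# (inside the IORef) ~> DLL.
next :: DLL -> IO DLL
next (DLL _ n) = readIORef n
next Nil       = return Nil

-- Count the nodes reachable by following next links.
lengthFrom :: DLL -> IO Int
lengthFrom = go 0
  where
    go acc Nil = return acc
    go acc d   = let acc' = acc + 1
                 in acc' `seq` (next d >>= go acc')

main :: IO ()
main = do
  a <- newNode
  b <- newNode
  c <- newNode
  link a b
  link b c
  lengthFrom a >>= print
```

Every call to next here reads through the MutVar# hidden inside the IORef, which is the extra hop the rest of this message tries to remove.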
Making Progress
----------------------

I have been working on a number of data structures where the indirection of going from something in * out to an object in # which contains the real pointer to my target and coming back effectively doubles my runtime.

We go out to the MutVar# because we are allowed to put the MutVar# onto the mutable list when we dirty it. There is a well defined write-barrier.

I could change out the representation to use

    data DLL = DLL (MutableArray# RealWorld DLL) | Nil

I can just store two pointers in the MutableArray# every time, but this doesn't help _much_ directly. It has reduced the number of distinct addresses in memory I touch on a walk of the DLL from 3 per object to 2.

I still have to go out to the heap from my DLL and get to the array object and then chase it to the next DLL and chase that to the next array. I do get my two pointers together in memory though. I'm paying for a card marking table as well, which I don't particularly need with just two pointers, but we can shed that with the "SmallMutableArray#" machinery added back in 7.10, which is just the old array code as a new data type, which can speed things up a bit when you don't have very big arrays:

    data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil

But what if I wanted my object itself to live in # and have two mutable fields and be able to share the same write barrier?

An ArrayArray# points directly to other unlifted array types. What if we have one # -> * wrapper on the outside to deal with the impedance mismatch between the imperative world and Haskell, and then just let the ArrayArray#'s hold other ArrayArray#'s?

    data DLL = DLL (MutableArrayArray# RealWorld)

Now I need to make up a new Nil, which I can just make be a special MutableArrayArray# I allocate on program startup. I can even abuse pattern synonyms. Alternately I can exploit the internals further to make this cheaper.
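The SmallMutableArray# representation mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the slot layout (0# for previous, 1# for next) and the helper names are made up here, while newSmallArray#, readSmallArray# and writeSmallArray# are the primops exported by GHC.Exts.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO(..))

-- Both links live side by side in one small mutable array.
-- Slot layout is an assumption of this sketch: 0# = previous, 1# = next.
data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil

-- Allocate a node whose two slots both start out as Nil.
newNode :: IO DLL
newNode = IO $ \s -> case newSmallArray# 2# Nil s of
  (# s', arr #) -> (# s', DLL arr #)

-- Make b the successor of a (and a the predecessor of b).
link :: DLL -> DLL -> IO ()
link a@(DLL aArr) b@(DLL bArr) = IO $ \s0 ->
  case writeSmallArray# aArr 1# b s0 of
    s1 -> case writeSmallArray# bArr 0# a s1 of
      s2 -> (# s2, () #)
link _ _ = return ()

-- Read the next slot: DLL ~> SmallMutableArray# ~> DLL.
next :: DLL -> IO DLL
next (DLL arr) = IO $ \s -> readSmallArray# arr 1# s
next Nil       = return Nil

-- Count the nodes reachable by following next links.
lengthFrom :: DLL -> IO Int
lengthFrom = go 0
  where
    go acc Nil = return acc
    go acc d   = let acc' = acc + 1
                 in acc' `seq` (next d >>= go acc')

main :: IO ()
main = do
  a <- newNode
  b <- newNode
  c <- newNode
  link a b
  link b c
  lengthFrom a >>= print
```

The walk still alternates between a lifted DLL constructor and the unlifted array, so this shape keeps the two pointers together in memory but does not yet remove the indirection.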
Then I can use the readMutableArrayArrayArray# and writeMutableArrayArrayArray# calls to directly access the preceding and next entry in the linked list.

So now we have one DLL wrapper which just 'bootstraps me' into a strict world, and everything there lives in #.

    next :: DLL -> IO DLL
    next (DLL m) = IO $ \s ->
      -- slot 1# is assumed here to hold the 'next' pointer
      case readMutableArrayArrayArray# m 1# s of
        (# s', n #) -> (# s', DLL n #)

It turns out GHC is quite happy to optimize all of that code to keep things unboxed. The 'DLL' wrappers get removed pretty easily when they are known strict and you chain operations of this sort!

Cleaning it Up
------------------

Now I have one outermost indirection pointing to an array that points directly to other arrays.

I'm stuck paying for a card marking table per object, but I can fix that by duplicating the code for MutableArrayArray# and using a SmallMutableArray#. I can hack up primops that let me store a mixture of SmallMutableArray# fields and normal ones in the data structure. Operationally, I can even do so by just unsafeCoercing the existing SmallMutableArray# primitives to change the kind of one of the arguments it takes.

This is almost ideal, but not quite. I often have fields that would be best left unboxed.

    data DLLInt = DLLInt !Int !(IORef DLLInt) !(IORef DLLInt) | Nil

was able to unpack the Int, but we lost that. We can currently at best point one of the entries of the SmallMutableArray# at a boxed value or at a MutableByteArray# for all of our misc. data and shove the int in question in there.

e.g. if I were to implement a hash-array-mapped-trie I need to store masks and administrivia as I walk down the tree. Having to go off to the side costs me the entire win from avoiding the first pointer chase.

But if, as Ryan suggested, we had a heap object we could construct that had n words with unsafe access and m pointers to other heap objects, one that could put itself on the mutable list when any of those pointers changed, then I could shed this last factor of two in all circumstances.
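For reference, the MutableArrayArray# wrapper described above can be sketched as a complete module. This is a hedged illustration, not the code from the prototype: it assumes a GHC whose GHC.Exts exports the ArrayArray# primops (as the 7.10-era compilers discussed here do), it assumes slot 0# holds the previous node and slot 1# the next, and it uses a node that points at itself as the end marker instead of a preallocated Nil. All helper names are made up for this sketch.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO(..))

-- One lifted wrapper around an unlifted, mutable array of arrays.
-- Slot layout is an assumption of this sketch: 0# = previous, 1# = next.
data DLL = DLL (MutableArrayArray# RealWorld)

-- Allocate a node and point both slots back at the node itself, so a
-- self-reference marks the end of the list.
newNode :: IO DLL
newNode = IO $ \s0 -> case newArrayArray# 2# s0 of
  (# s1, m #) -> case writeMutableArrayArrayArray# m 0# m s1 of
    s2 -> case writeMutableArrayArrayArray# m 1# m s2 of
      s3 -> (# s3, DLL m #)

-- Follow the next slot; no lifted value sits between the two arrays.
next :: DLL -> IO DLL
next (DLL m) = IO $ \s -> case readMutableArrayArrayArray# m 1# s of
  (# s', n #) -> (# s', DLL n #)

-- Make b the successor of a (and a the predecessor of b).
link :: DLL -> DLL -> IO ()
link (DLL a) (DLL b) = IO $ \s0 ->
  case writeMutableArrayArrayArray# a 1# b s0 of
    s1 -> case writeMutableArrayArrayArray# b 0# a s1 of
      s2 -> (# s2, () #)

-- Object identity on the underlying arrays.
sameNode :: DLL -> DLL -> Bool
sameNode (DLL a) (DLL b) = isTrue# (sameMutableArrayArray# a b)

-- Count nodes from d up to and including the node whose next is itself.
lengthFrom :: DLL -> IO Int
lengthFrom = go 1
  where
    go acc d = do
      n <- next d
      if sameNode d n
        then return acc
        else acc `seq` go (acc + 1) n

main :: IO ()
main = do
  a <- newNode
  b <- newNode
  c <- newNode
  link a b
  link b c
  lengthFrom a >>= print
```

Writing the self-links explicitly in newNode avoids depending on how newArrayArray# initialises its slots, which may differ between compiler versions.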
Prototype
-------------

Over the last few days I've put together a small prototype implementation with a few non-trivial imperative data structures for things like Tarjan's link-cut trees, the list labeling problem and order-maintenance.

https://github.com/ekmett/structs

Notable bits:

Data.Struct.Internal.LinkCut provides an implementation of link-cut trees in this style.

Data.Struct.Internal provides the rather horrifying guts that make it go fast.

Once compiled with -O or -O2, if you look at the core, almost all the references to the LinkCut or Object data constructor get optimized away, and we're left with beautiful strict code directly mutating our underlying representation.

At the very least I'll take this email and turn it into a short article.

-Edward

On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones wrote:
> Just to say that I have no idea what is going on in this thread. What is > ArrayArray? What is the issue in general? Is there a ticket? Is there a > wiki page? > > > > If it's important, an ab-initio wiki page + ticket would be a good thing. > > > > Simon > > > > *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Edward > Kmett > *Sent:* 21 August 2015 05:25 > *To:* Manuel M T Chakravarty > *Cc:* Simon Marlow; ghc-devs > *Subject:* Re: ArrayArrays > > > > When (ab)using them for this purpose, SmallArrayArray's would be very > handy as well. > > > > Consider right now if I have something like an order-maintenance structure > I have: > > > > data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} > !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) > > > > data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} > !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-} > !(MutVar s (Lower s)) > > > > The former contains, logically, a mutable integer and two pointers, one > for forward and one for backwards.
The latter is basically the same thing > with a mutable reference up pointing at the structure above. > > > > On the heap this is an object that points to a structure for the > bytearray, and points to another structure for each mutvar which each point > to the other 'Upper' structure. So there is a level of indirection smeared > over everything. > > > > So this is a pair of doubly linked lists with an upward link from the > structure below to the structure above. > > > > Converted into ArrayArray#s I'd get > > > > data Upper s = Upper (MutableArrayArray# s) > > > > w/ the first slot being a pointer to a MutableByteArray#, and the next 2 > slots pointing to the previous and next previous objects, represented just > as their MutableArrayArray#s. I can use sameMutableArrayArray# on these for > object identity, which lets me check for the ends of the lists by tying > things back on themselves. > > > > and below that > > > > data Lower s = Lower (MutableArrayArray# s) > > > > is similar, with an extra MutableArrayArray slot pointing up to an upper > structure. > > > > I can then write a handful of combinators for getting out the slots in > question, while it has gained a level of indirection between the wrapper to > put it in * and the MutableArrayArray# s in #, that one can be basically > erased by ghc. > > > > Unlike before I don't have several separate objects on the heap for each > thing. I only have 2 now. The MutableArrayArray# for the object itself, and > the MutableByteArray# that it references to carry around the mutable int. > > > > The only pain points are > > > > 1.) the aforementioned limitation that currently prevents me from stuffing > normal boxed data through a SmallArray or Array into an ArrayArray leaving > me in a little ghetto disconnected from the rest of Haskell, > > > > and > > > > 2.) the lack of SmallArrayArray's, which could let us avoid the card > marking overhead. These objects are all small, 3-4 pointers wide. Card > marking doesn't help. 
> > > > Alternately I could just try to do really evil things and convert the > whole mess to SmallArrays and then figure out how to unsafeCoerce my way to > glory, stuffing the #'d references to the other arrays directly into the > SmallArray as slots, removing the limitation we see here by aping the > MutableArrayArray# s API, but that gets really really dangerous! > > > > I'm pretty much willing to sacrifice almost anything on the altar of speed > here, but I'd like to be able to let the GC move them and collect them > which rules out simpler Ptr and Addr based solutions. > > > > -Edward > > > > On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty < > chak at cse.unsw.edu.au> wrote: > > That's an interesting idea. > > Manuel > > > Edward Kmett : > > > > > Would it be possible to add unsafe primops to add Array# and SmallArray# > entries to an ArrayArray#? The fact that the ArrayArray# entries are all > directly unlifted avoiding a level of indirection for the containing > structure is amazing, but I can only currently use it if my leaf level data > can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be > able to have the ability to put SmallArray# a stuff down at the leaves to > hold lifted contents. > > > > I accept fully that if I name the wrong type when I go to access one of > the fields it'll lie to me, but I suppose it'd do that if i tried to use > one of the members that held a nested ArrayArray# as a ByteArray# anyways, > so it isn't like there is a safety story preventing this. > > > > I've been hunting for ways to try to kill the indirection problems I get > with Haskell and mutable structures, and I could shoehorn a number of them > into ArrayArrays if this worked. > > > > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection > compared to c/java and this could reduce that pain to just 1 level of > unnecessary indirection.
> > > > -Edward > > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Thu Aug 27 16:38:46 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Thu, 27 Aug 2015 16:38:46 +0000 Subject: Planning for the 7.12 release: MonadFail In-Reply-To: <55DF3167.206@gmail.com> References: <87r3mo68t0.fsf@smart-cactus.org> <55DF3167.206@gmail.com> Message-ID: Great. Is there a ticket? If not, we'll probably lose track of it. Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of | David Luposchainsky | Sent: 27 August 2015 16:49 | To: ghc-devs at haskell.org | Subject: Re: Planning for the 7.12 release: MonadFail | | -----BEGIN PGP SIGNED MESSAGE----- | Hash: SHA1 | | Hey Ben, | | my summer was pretty busy, but we recently fixed our MonadFail | implementation to work as desired, so that should make it in as well. | We'll have to survive a heroic rebase/squash that we'll probably do in | September when we're back from our holidays. 
| | David | -----BEGIN PGP SIGNATURE----- | Version: GnuPG v1 | | iQEcBAEBAgAGBQJV3zFnAAoJELrQsaT5WQUsJy4H/RVVGZBfprIKWmX+a8H6c6zr | SfqkMfMZE0Q1AA1pDeQspwKUi4lOemUMexsZfjdoV2FD4YruzJ/hJl2MOiFKu6gC | KsvjF7Xlxxvst9JTVaW3exr0dQNJ8sKGhkHzpvaX+ecTUQ1c6vtsJt/gMcA3U6S1 | 1BW4lc25OWA07nphjTkVacJflZnCUki4kNlapA3x5VX0o4yN38s7sPE1muL+7Rxw | afklL9XiYJBAtGapNHP81E+iCYs5BaotJdbyCm5PcmtyxW92JMPML0BP3cfS14lA | zClgyWOE8H5IRfR/8qSfECAcM81+G9WQ0XuSza5szBdX0f3PiNVri1x9qsp+3CU= | =Qs4g | -----END PGP SIGNATURE----- | _______________________________________________ | ghc-devs mailing list | ghc-devs at haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From hvriedel at gmail.com Thu Aug 27 16:40:29 2015 From: hvriedel at gmail.com (Herbert Valerio Riedel) Date: Thu, 27 Aug 2015 18:40:29 +0200 Subject: Planning for the 7.12 release: MonadFail In-Reply-To: (Simon Peyton Jones's message of "Thu, 27 Aug 2015 16:38:46 +0000") References: <87r3mo68t0.fsf@smart-cactus.org> <55DF3167.206@gmail.com> Message-ID: <87r3mo8z2a.fsf@gmail.com> On 2015-08-27 at 18:38:46 +0200, Simon Peyton Jones wrote: > Great. Is there a ticket? If not, we'll probably lose track of it. https://ghc.haskell.org/trac/ghc/ticket/10751 :-) From ezyang at mit.edu Thu Aug 27 17:24:12 2015 From: ezyang at mit.edu (Edward Z. Yang) Date: Thu, 27 Aug 2015 10:24:12 -0700 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> Message-ID: <1440695490-sup-4296@sabre> It seems to me that we should take a page from OCaml's playbook and add support for native mutable fields in objects, because this is essentially what a mix of words and pointers is. The big question, as always, is what the syntax should be. 
Edward Excerpts from Ryan Yates's message of 2015-08-21 06:49:47 -0700: > Hi Edward, > > I've been working on removing indirection in STM and I added a heap > object like SmallArray, but with a mix of words and pointers (as well > as a header with metadata for STM). It appears to work well now, but > it is missing the type information. All the pointers have the same > type which works fine for your Upper. In my case I use it to > represent a red-black tree node [1]. > > Also all the structures I make are fixed size and it would be nice if > the compiler could treat that fix size like a constant in code > generation. I don't know what the right design is or what would be > needed, but it seems simple enough to give the right typing > information to something like this and basically get a mutable struct. > I'm talking about this work at HIW and really hope to find someone > interested in extending this expressiveness to let us write something > that looks clear in Haskell, but gives the heap representation that we > really need for performance. From the RTS perspective I think there > are any obstacles. > > [1]: https://github.com/fryguybob/ghc-stm-benchmarks/blob/master/benchmarks/RBTree-Throughput/RBTreeNode.hs > > Ryan > > On Fri, Aug 21, 2015 at 12:25 AM, Edward Kmett wrote: > > When (ab)using them for this purpose, SmallArrayArray's would be very handy > > as well. > > > > Consider right now if I have something like an order-maintenance structure I > > have: > > > > data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} > > !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) > > > > data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} > > !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-} > > !(MutVar s (Lower s)) > > > > The former contains, logically, a mutable integer and two pointers, one for > > forward and one for backwards. 
The latter is basically the same thing with a > > mutable reference up pointing at the structure above. > > > > On the heap this is an object that points to a structure for the bytearray, > > and points to another structure for each mutvar which each point to the > > other 'Upper' structure. So there is a level of indirection smeared over > > everything. > > > > So this is a pair of doubly linked lists with an upward link from the > > structure below to the structure above. > > > > Converted into ArrayArray#s I'd get > > > > data Upper s = Upper (MutableArrayArray# s) > > > > w/ the first slot being a pointer to a MutableByteArray#, and the next 2 > > slots pointing to the previous and next previous objects, represented just > > as their MutableArrayArray#s. I can use sameMutableArrayArray# on these for > > object identity, which lets me check for the ends of the lists by tying > > things back on themselves. > > > > and below that > > > > data Lower s = Lower (MutableArrayArray# s) > > > > is similar, with an extra MutableArrayArray slot pointing up to an upper > > structure. > > > > I can then write a handful of combinators for getting out the slots in > > question, while it has gained a level of indirection between the wrapper to > > put it in * and the MutableArrayArray# s in #, that one can be basically > > erased by ghc. > > > > Unlike before I don't have several separate objects on the heap for each > > thing. I only have 2 now. The MutableArrayArray# for the object itself, and > > the MutableByteArray# that it references to carry around the mutable int. > > > > The only pain points are > > > > 1.) the aforementioned limitation that currently prevents me from stuffing > > normal boxed data through a SmallArray or Array into an ArrayArray leaving > > me in a little ghetto disconnected from the rest of Haskell, > > > > and > > > > 2.) the lack of SmallArrayArray's, which could let us avoid the card marking > > overhead. 
These objects are all small, 3-4 pointers wide. Card marking > > doesn't help. > > > > Alternately I could just try to do really evil things and convert the whole > > mess to SmallArrays and then figure out how to unsafeCoerce my way to glory, > > stuffing the #'d references to the other arrays directly into the SmallArray > > as slots, removing the limitation we see here by aping the > > MutableArrayArray# s API, but that gets really really dangerous! > > > > I'm pretty much willing to sacrifice almost anything on the altar of speed > > here, but I'd like to be able to let the GC move them and collect them which > > rules out simpler Ptr and Addr based solutions. > > > > -Edward > > > > On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty > > wrote: > >> > >> That's an interesting idea. > >> > >> Manuel > >> > >> > Edward Kmett : > >> > > >> > Would it be possible to add unsafe primops to add Array# and SmallArray# > >> > entries to an ArrayArray#? The fact that the ArrayArray# entries are all > >> > directly unlifted avoiding a level of indirection for the containing > >> > structure is amazing, but I can only currently use it if my leaf level data > >> > can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be > >> > able to have the ability to put SmallArray# a stuff down at the leaves to > >> > hold lifted contents. > >> > > >> > I accept fully that if I name the wrong type when I go to access one of > >> > the fields it'll lie to me, but I suppose it'd do that if i tried to use one > >> > of the members that held a nested ArrayArray# as a ByteArray# anyways, so it > >> > isn't like there is a safety story preventing this. > >> > > >> > I've been hunting for ways to try to kill the indirection problems I get > >> > with Haskell and mutable structures, and I could shoehorn a number of them > >> > into ArrayArrays if this worked.
> >> > > >> > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection > >> > compared to c/java and this could reduce that pain to just 1 level of > >> > unnecessary indirection. > >> > > >> > -Edward > >> > _______________________________________________ > >> > ghc-devs mailing list > >> > ghc-devs at haskell.org > >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> > > > > > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > From ekmett at gmail.com Thu Aug 27 18:36:57 2015 From: ekmett at gmail.com (Edward Kmett) Date: Thu, 27 Aug 2015 14:36:57 -0400 Subject: ArrayArrays In-Reply-To: <1440695490-sup-4296@sabre> References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <1440695490-sup-4296@sabre> Message-ID: On Thu, Aug 27, 2015 at 1:24 PM, Edward Z. Yang wrote: > It seems to me that we should take a page from OCaml's playbook > and add support for native mutable fields in objects, because > this is essentially what a mix of words and pointers is. > That actually doesn't work as well as one might hope. We currently treat data constructor closures as so much tissue paper around a present. We tear them open, rip out all their contents, scatter them throughout our code and then we build a whole new data constructor closure when we're done, or we just leave them suspended in closures awaiting someone to demand we finally make a new data constructor. Half the time we don't even give back the data constructor closure and push it into update g frames and we just give back the items on the stack. With the machinery I mentioned above I get a world where every time I access an object I can know it is evaluated for real, so this means I'm not stuck 'entering an unknown closure', and getting it to give me back a slab of memory that we know is a real data constructor that i can bang away on mutable entries in. 
In a world where things in * could hold mutable pointers we have to care a lot more about object identity in deeply uncomfortable ways. With what I've implemented I only care about object identity between things in # that are gcptrs. The garbage collector may move them around, but it doesn't put in thunks anywhere. -Edward -------------- next part -------------- An HTML attachment was scrubbed... URL: From elliot.cameron at covenanteyes.com Thu Aug 27 23:23:52 2015 From: elliot.cameron at covenanteyes.com (Elliot Cameron) Date: Thu, 27 Aug 2015 23:23:52 +0000 Subject: Planning for the 7.12 release In-Reply-To: <87r3mo68t0.fsf@smart-cactus.org> References: <87r3mo68t0.fsf@smart-cactus.org> Message-ID: <1440717837983.74036@covenanteyes.com> I can't seem to find the exact trac ticket, but the ability to swap out "Integer" implementations at link time would be a huge relief on Windows, which suffers from various problems with dynamic linking. I believe it was originally slated for 7.12. Can someone find it? Here's what I did find: https://ghc.haskell.org/trac/ghc/wiki/ReplacingGMPNotes A related discussion: https://github.com/commercialhaskell/stack/issues/399 As it stands, we've been trying hard to find good ways to provide custom GHC variants more easily to end users. ________________________________________ From: ghc-devs on behalf of Ben Gamari Sent: Thursday, August 27, 2015 11:38 AM To: GHC developers Subject: Planning for the 7.12 release Hello everyone! With the 7.10.1 release nearly six months behind us and 7.10.2 out of the way, now is a good time to begin looking forward to 7.12. In keeping with the typical release pace, we are aiming to have a release candidate ready in mid-December 2015 and a final release in January 2016. 
The items that that we currently believe have a good chance of making it in to 7.12 are listed on the release status page [1], which I've summarized below (in no particular order), * Support for implicit parameters providing callstacks and source locations * Support for wildcards in data and type family instances * A new, type-indexed type representation, data TTypeRep (a :: k). * Introduction of visible type application * Support for reasoning about kind equalities * Support for Injective Type Families * Support for the Strict language extension * Support for Overloaded Record Fields, allowing multiple uses of the same field name and a form of type-directed name resolution. * A huge improvement to pattern matching (including much better coverage of GADTs) * Backpack is chugging along; we have a new user-facing syntax which allows multiple modules to be defined a single file, and are hoping to release at least the ability to publish multiple "units" in a single Cabal file. * Support for Applicative Do, allowing GHC to desugar do-notation to Applicative where possible. * Improved DWARF based debugging support including backtraces from Haskell code * An Improved LLVM Backend that ships with every major Tier 1 platform. These items are a bit less certain but may make it in if the authors push forward quickly enough, * Support for Type Signature Sections, allowing you to write (:: ty) as a shorthand for (\x -> x :: ty). * A (possible) overhaul of GHC's build system to use Shake instead of Make. * A DEPRECATED pragma for exports Is your pet project missing from this list? If you have a patch that you believe is on-track to make it in for 7.12, please let us know. Moreover, if you have an issue that you urgently need fixed in 7.12, please express you interest on the appropriate ticket. User feedback helps us immensely in figuring out how to best place our priorities. 
Cheers, - Ben [1] https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.12.1 From sean.leather at gmail.com Fri Aug 28 05:40:17 2015 From: sean.leather at gmail.com (Sean Leather) Date: Fri, 28 Aug 2015 07:40:17 +0200 Subject: Planning for the 7.12 release In-Reply-To: <87r3mo68t0.fsf@smart-cactus.org> References: <87r3mo68t0.fsf@smart-cactus.org> Message-ID: On Thu, Aug 27, 2015 at 5:38 PM, Ben Gamari wrote: > These items are a bit less certain but may make it in if the authors > push forward quickly enough, > > > * Support for Type Signature Sections, allowing you to write (:: ty) > as a shorthand for (\x -> x :: ty). > Once Lennart convinced me of the usefulness of this [1], I've been finding plenty of places where I wish I had it. Who's working on it? What's the status? Is there a ticket for it? Regards, Sean [1] http://augustss.blogspot.com/2014/04/a-small-haskell-extension.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Fri Aug 28 07:20:54 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Fri, 28 Aug 2015 07:20:54 +0000 Subject: Planning for the 7.12 release In-Reply-To: References: <87r3mo68t0.fsf@smart-cactus.org> Message-ID: <01013378461344a090dc67842ff5e7b3@DB4PR30MB030.064d.mgd.msft.net> Lennart always says he's going to work on it, but he's a busy man and nothing has actually happened. It's a pretty easy feature to implement I think. Simon From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Sean Leather Sent: 28 August 2015 06:40 To: Ben Gamari Cc: GHC developers Subject: Re: Planning for the 7.12 release On Thu, Aug 27, 2015 at 5:38 PM, Ben Gamari wrote: These items are a bit less certain but may make it in if the authors push forward quickly enough, * Support for Type Signature Sections, allowing you to write (:: ty) as a shorthand for (\x -> x :: ty).
Once Lennart convinced me of the usefulness of this [1], I've been finding plenty of places where I wish I had it. Who's working on it? What's the status? Is there a ticket for it? Regards, Sean [1] http://augustss.blogspot.com/2014/04/a-small-haskell-extension.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben at well-typed.com Fri Aug 28 09:38:32 2015 From: ben at well-typed.com (Ben Gamari) Date: Fri, 28 Aug 2015 11:38:32 +0200 Subject: Planning for the 7.12 release: MonadFail In-Reply-To: <55DF3167.206@gmail.com> References: <87r3mo68t0.fsf@smart-cactus.org> <55DF3167.206@gmail.com> Message-ID: <87mvxb69d3.fsf@smart-cactus.org> David Luposchainsky writes: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hey Ben, > > my summer was pretty busy, but we recently fixed our MonadFail implementation to > work as desired, so that should make it in as well. We'll have to survive a > heroic rebase/squash that we'll probably do in September when we're back from our > holidays. Great, I've added it to the list. Keep us in the loop as things progress. Cheers, - Ben -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 472 bytes Desc: not available URL: From marlowsd at gmail.com Fri Aug 28 09:41:03 2015 From: marlowsd at gmail.com (Simon Marlow) Date: Fri, 28 Aug 2015 10:41:03 +0100 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <1440695490-sup-4296@sabre> Message-ID: <55E02CAF.1090006@gmail.com> On 27/08/2015 19:36, Edward Kmett wrote: > On Thu, Aug 27, 2015 at 1:24 PM, Edward Z. Yang > wrote: > > It seems to me that we should take a page from OCaml's playbook > and add support for native mutable fields in objects, because > this is essentially what a mix of words and pointers is. > > > That actually doesn't work as well as one might hope. 
> > We currently treat data constructor closures as so much tissue paper > around a present. We tear them open, rip out all their contents, scatter > them throughout our code and then we build a whole new data constructor > closure when we're done, or we just leave them suspended in closures > awaiting someone to demand we finally make a new data constructor. > > Half the time we don't even give back the data constructor closure and > push it into update g frames and we just give back the items on the stack. > > With the machinery I mentioned above I get a world where every time I > access an object I can know it is evaluated for real, so this means I'm > not stuck 'entering an unknown closure', and getting it to give me back > a slab of memory that we know is a real data constructor that i can bang > away on mutable entries in. > > In a world where things in * could hold mutable pointers we have to care > a lot more about object identity in deeply uncomfortable ways. > > With what I've implemented I only care about object identity between > things in # that are gcptrs. The garbage collector may move them around, > but it doesn't put in thunks anywhere. Yeah, I've actually thought about whether we could have mutable fields in constructors a couple of times, and it's far from easy for the reasons you describe. A constructor with mutable fields would need to be an object with identity, with precise control over when it is created. This is nothing like an ordinary constructor. I like the alternative approach in this thread, which is to attack the problem from the other end: start with a primitive object and make it more like a constructor. I don't see any reason why we shouldn't add primops to read/write SmallArray# and other primitive objects in an ArrayArray#. Will someone make a patch? It should be pretty straightforward. 
Cheers, Simon From ben at well-typed.com Fri Aug 28 09:41:52 2015 From: ben at well-typed.com (Ben Gamari) Date: Fri, 28 Aug 2015 11:41:52 +0200 Subject: Planning for the 7.12 release In-Reply-To: References: <87r3mo68t0.fsf@smart-cactus.org> Message-ID: <87k2sf697j.fsf@smart-cactus.org> Matthew Pickering writes: > Hi Ben, > > I think that D1152 (Record Pattern Synonyms) will be ready for 7.12. > https://phabricator.haskell.org/D1152 > Ahh yes. Thanks for pointing this out. I've added it to the list. Cheers, - Ben From simonpj at microsoft.com Fri Aug 28 09:42:53 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Fri, 28 Aug 2015 09:42:53 +0000 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> Message-ID: <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> At the very least I'll take this email and turn it into a short article. Yes, please do make it into a wiki page on the GHC Trac, and maybe make a ticket for it. Thanks Simon From: Edward Kmett [mailto:ekmett at gmail.com] Sent: 27 August 2015 16:54 To: Simon Peyton Jones Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs Subject: Re: ArrayArrays An ArrayArray# is just an Array# with a modified invariant. It points directly to other unlifted ArrayArray#'s or ByteArray#'s. While those live in #, they are garbage collected objects, so this all lives on the heap. They were added to make some of the DPH stuff fast when it has to deal with nested arrays. I'm currently abusing them as a placeholder for a better thing. The Problem ----------------- Consider the scenario where you write a classic doubly-linked list in Haskell. data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) Chasing from one DLL to the next requires following 3 pointers on the heap.
DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~> DLL That is 3 levels of indirection. We can trim one by simply unpacking the IORef with -funbox-strict-fields or UNPACK We can trim another by adding a 'Nil' constructor for DLL and worsening our representation. data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil but now we're still stuck with a level of indirection DLL ~> MutVar# RealWorld DLL ~> DLL This means that every operation we perform on this structure will be about half of the speed of an implementation in most other languages assuming we're memory bound on loading things into cache! Making Progress ---------------------- I have been working on a number of data structures where the indirection of going from something in * out to an object in # which contains the real pointer to my target and coming back effectively doubles my runtime. We go out to the MutVar# because we are allowed to put the MutVar# onto the mutable list when we dirty it. There is a well defined write-barrier. I could change out the representation to use data DLL = DLL (MutableArray# RealWorld DLL) | Nil I can just store two pointers in the MutableArray# every time, but this doesn't help _much_ directly. It has reduced the amount of distinct addresses in memory I touch on a walk of the DLL from 3 per object to 2. I still have to go out to the heap from my DLL and get to the array object and then chase it to the next DLL and chase that to the next array. I do get my two pointers together in memory though. 
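[Editorial illustration of the MutableArray# representation just described. This is a minimal sketch, not code from the structs prototype: the slot layout (slot 0 = previous, slot 1 = next) and the helper names newNode, readSlot, prev, next, link and isNil are assumptions made for the example.]

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts (Int#, MutableArray#, RealWorld, newArray#, readArray#, writeArray#)
import GHC.IO (IO (..))

-- One node = one two-slot mutable array holding both links together.
-- Assumed layout for this sketch: slot 0 = previous, slot 1 = next.
data DLL = DLL (MutableArray# RealWorld DLL) | Nil

isNil :: DLL -> Bool
isNil Nil = True
isNil _   = False

-- Allocate a node whose two links both start out as Nil.
newNode :: IO DLL
newNode = IO (\s -> case newArray# 2# Nil s of
  (# s', arr #) -> (# s', DLL arr #))

-- Read one link; Nil has no links, so it just yields Nil.
readSlot :: Int# -> DLL -> IO DLL
readSlot _ Nil       = pure Nil
readSlot i (DLL arr) = IO (\s -> readArray# arr i s)

prev, next :: DLL -> IO DLL
prev = readSlot 0#
next = readSlot 1#

-- Make b the successor of a and a the predecessor of b.
link :: DLL -> DLL -> IO ()
link a@(DLL aArr) b@(DLL bArr) = IO (\s ->
  case writeArray# aArr 1# b s of
    s' -> case writeArray# bArr 0# a s' of
      s'' -> (# s'', () #))
link _ _ = pure ()
```

Walking the list still alternates between the lifted DLL wrapper and the unlifted array, which is the remaining indirection the rest of the message sets out to remove.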
I'm paying for a card marking table as well, which I don't particularly need with just two pointers, but we can shed that with the "SmallMutableArray#" machinery added back in 7.10, which is just the old array code as a new data type, which can speed things up a bit when you don't have very big arrays: data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil But what if I wanted my object itself to live in # and have two mutable fields and be able to share the same write barrier? An ArrayArray# points directly to other unlifted array types. What if we have one # -> * wrapper on the outside to deal with the impedance mismatch between the imperative world and Haskell, and then just let the ArrayArray#'s hold other arrayarrays? data DLL = DLL (MutableArrayArray# RealWorld) Now I need to make up a new Nil, which I can just make be a special MutableArrayArray# I allocate on program startup. I can even abuse pattern synonyms. Alternately I can exploit the internals further to make this cheaper. Then I can use the readMutableArrayArray# and writeMutableArrayArray# calls to directly access the preceding and next entry in the linked list. So now we have one DLL wrapper which just 'bootstraps me' into a strict world, and everything there lives in #. next :: DLL -> IO DLL next (DLL m) = IO $ \s -> case readMutableArrayArray# s of (# s', n #) -> (# s', DLL n #) It turns out GHC is quite happy to optimize all of that code to keep things unboxed. The 'DLL' wrappers get removed pretty easily when they are known strict and you chain operations of this sort! Cleaning it Up ------------------ Now I have one outermost indirection pointing to an array that points directly to other arrays. I'm stuck paying for a card marking table per object, but I can fix that by duplicating the code for MutableArrayArray# and using a SmallMutableArray#. I can hack up primops that let me store a mixture of SmallMutableArray# fields and normal ones in the data structure.
Operationally, I can even do so by just unsafeCoercing the existing SmallMutableArray# primitives to change the kind of one of the arguments it takes. This is almost ideal, but not quite. I often have fields that would be best left unboxed. data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil was able to unpack the Int, but we lost that. We can currently at best point one of the entries of the SmallMutableArray# at a boxed or at a MutableByteArray# for all of our misc. data and shove the int in question in there. e.g. if I were to implement a hash-array-mapped-trie I need to store masks and administrivia as I walk down the tree. Having to go off to the side costs me the entire win from avoiding the first pointer chase. But, if like Ryan suggested, we had a heap object we could construct that had n words with unsafe access and m pointers to other heap objects, one that could put itself on the mutable list when any of those pointers changed then I could shed this last factor of two in all circumstances. Prototype ------------- Over the last few days I've put together a small prototype implementation with a few non-trivial imperative data structures for things like Tarjan's link-cut trees, the list labeling problem and order-maintenance. https://github.com/ekmett/structs Notable bits: Data.Struct.Internal.LinkCut provides an implementation of link-cut trees in this style. Data.Struct.Internal provides the rather horrifying guts that make it go fast. Once compiled with -O or -O2, if you look at the core, almost all the references to the LinkCut or Object data constructor get optimized away, and we're left with beautiful strict code directly mutating out underlying representation. At the very least I'll take this email and turn it into a short article. -Edward On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones > wrote: Just to say that I have no idea what is going on in this thread. What is ArrayArray? What is the issue in general? Is there a ticket? 
Is there a wiki page? If it's important, an ab-initio wiki page + ticket would be a good thing. Simon From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Edward Kmett Sent: 21 August 2015 05:25 To: Manuel M T Chakravarty Cc: Simon Marlow; ghc-devs Subject: Re: ArrayArrays When (ab)using them for this purpose, SmallArrayArray's would be very handy as well. Consider right now if I have something like an order-maintenance structure I have: data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-} !(MutVar s (Lower s)) The former contains, logically, a mutable integer and two pointers, one for forward and one for backwards. The latter is basically the same thing with a mutable reference up pointing at the structure above. On the heap this is an object that points to a structure for the bytearray, and points to another structure for each mutvar which each point to the other 'Upper' structure. So there is a level of indirection smeared over everything. So this is a pair of doubly linked lists with an upward link from the structure below to the structure above. Converted into ArrayArray#s I'd get data Upper s = Upper (MutableArrayArray# s) w/ the first slot being a pointer to a MutableByteArray#, and the next 2 slots pointing to the previous and next objects, represented just as their MutableArrayArray#s. I can use sameMutableArrayArray# on these for object identity, which lets me check for the ends of the lists by tying things back on themselves. and below that data Lower s = Lower (MutableArrayArray# s) is similar, with an extra MutableArrayArray slot pointing up to an upper structure.
I can then write a handful of combinators for getting out the slots in question, while it has gained a level of indirection between the wrapper to put it in * and the MutableArrayArray# s in #, that one can be basically erased by ghc. Unlike before I don't have several separate objects on the heap for each thing. I only have 2 now. The MutableArrayArray# for the object itself, and the MutableByteArray# that it references to carry around the mutable int. The only pain points are 1.) the aforementioned limitation that currently prevents me from stuffing normal boxed data through a SmallArray or Array into an ArrayArray leaving me in a little ghetto disconnected from the rest of Haskell, and 2.) the lack of SmallArrayArray's, which could let us avoid the card marking overhead. These objects are all small, 3-4 pointers wide. Card marking doesn't help. Alternately I could just try to do really evil things and convert the whole mess to SmallArrays and then figure out how to unsafeCoerce my way to glory, stuffing the #'d references to the other arrays directly into the SmallArray as slots, removing the limitation we see here by aping the MutableArrayArray# s API, but that gets really really dangerous! I'm pretty much willing to sacrifice almost anything on the altar of speed here, but I'd like to be able to let the GC move them and collect them which rules out simpler Ptr and Addr based solutions. -Edward On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty > wrote: That?s an interesting idea. Manuel > Edward Kmett >: > > Would it be possible to add unsafe primops to add Array# and SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# entries are all directly unlifted avoiding a level of indirection for the containing structure is amazing, but I can only currently use it if my leaf level data can be 100% unboxed and distributed among ByteArray#s. 
It'd be nice to be able to have the ability to put SmallArray# a stuff down at the leaves to hold lifted contents. > > I accept fully that if I name the wrong type when I go to access one of the fields it'll lie to me, but I suppose it'd do that if i tried to use one of the members that held a nested ArrayArray# as a ByteArray# anyways, so it isn't like there is a safety story preventing this. > > I've been hunting for ways to try to kill the indirection problems I get with Haskell and mutable structures, and I could shoehorn a number of them into ArrayArrays if this worked. > > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection compared to c/java and this could reduce that pain to just 1 level of unnecessary indirection. > > -Edward > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben at well-typed.com Fri Aug 28 09:48:08 2015 From: ben at well-typed.com (Ben Gamari) Date: Fri, 28 Aug 2015 11:48:08 +0200 Subject: Planning for the 7.12 release In-Reply-To: <1440717837983.74036@covenanteyes.com> References: <87r3mo68t0.fsf@smart-cactus.org> <1440717837983.74036@covenanteyes.com> Message-ID: <87h9nj68x3.fsf@smart-cactus.org> Elliot Cameron writes: > I can't seem to find the exact trac ticket, but the ability to swap > out "Integer" implementations at link time would be a huge relief on > Windows, which suffers from various problems with dynamic linking. I > believe it was originally slated for 7.12. Can someone find it? Here's > what I did find: > https://ghc.haskell.org/trac/ghc/wiki/ReplacingGMPNotes > > A related discussion: > https://github.com/commercialhaskell/stack/issues/399 > > As it stands, we've been trying hard to find good ways to provide > custom GHC variants more easily to end users. > Hmm, interesting. 
I'm not sure how realistic it is to make this a link-time option, however, considering that we may inline bindings from whatever integer package we compile against into the user's program. Herbert, do you have any thoughts on this? Cheers, - Ben From _deepfire at feelingofgreen.ru Fri Aug 28 10:31:13 2015 From: _deepfire at feelingofgreen.ru (Kosyrev Serge) Date: Fri, 28 Aug 2015 13:31:13 +0300 Subject: Planning for the 7.12 release In-Reply-To: <87r3mo68t0.fsf@smart-cactus.org> (sfid-20150827_195933_009643_694E3FB7) (Ben Gamari's message of "Thu, 27 Aug 2015 17:38:19 +0200") References: <87r3mo68t0.fsf@smart-cactus.org> Message-ID: <87h9nj4scu.fsf@feelingofgreen.ru> Ben Gamari writes: > These items are a bit less certain but may make it in if the authors > push forward quickly enough, [..] > * A (possible) overhaul of GHC's build system to use Shake instead > of Make. Is there a breakdown of what remains to be done on this front? -- respectfully, Kosyrev Serge From hvriedel at gmail.com Fri Aug 28 10:48:39 2015 From: hvriedel at gmail.com (Herbert Valerio Riedel) Date: Fri, 28 Aug 2015 12:48:39 +0200 Subject: Planning for the 7.12 release In-Reply-To: <01013378461344a090dc67842ff5e7b3@DB4PR30MB030.064d.mgd.msft.net> (Simon Peyton Jones's message of "Fri, 28 Aug 2015 07:20:54 +0000") References: <87r3mo68t0.fsf@smart-cactus.org> <01013378461344a090dc67842ff5e7b3@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <877fof7koo.fsf@gmail.com> For the record, I wanted to implement this feature a couple of months ago but then got side-tracked. If somebody wants to pick it up (which would be great), please lemme know.
In the meantime I've created https://ghc.haskell.org/trac/ghc/ticket/10803 and written up the little information I have on this topic already. On 2015-08-28 at 09:20:54 +0200, Simon Peyton Jones wrote: > Lennart always says he's going to work on it, but he's a busy man and nothing has actually happened. > > It's a pretty easy feature to implement I think. > > Simon > > From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Sean Leather > Sent: 28 August 2015 06:40 > To: Ben Gamari > Cc: GHC developers > Subject: Re: Planning for the 7.12 release > > On Thu, Aug 27, 2015 at 5:38 PM, Ben Gamari wrote: > These items are a bit less certain but may make it in if the authors > push forward quickly enough, > > > * Support for Type Signature Sections, allowing you to write (:: ty) > as a shorthand for (\x -> x :: ty). > > Once Lennart convinced me of the usefulness of this [1], I've been finding plenty of places where I wish I had it. Who's working on it? What's the status? Is there a ticket for it? > > Regards, > Sean > > [1] http://augustss.blogspot.com/2014/04/a-small-haskell-extension.html > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -- "Elegance is not optional" -- Richard O'Keefe From corentin.dupont at gmail.com Fri Aug 28 10:54:48 2015 From: corentin.dupont at gmail.com (Corentin Dupont) Date: Fri, 28 Aug 2015 12:54:48 +0200 Subject: Fwd: Thread-safe Hint In-Reply-To: <311B1CE2-1D74-4381-9489-517CD1181DE0@gmail.com> References: <311B1CE2-1D74-4381-9489-517CD1181DE0@gmail.com> Message-ID: Hello, I am wondering if GHC is now thread-safe. I am using Hint, and it reports that GHC is not thread-safe, and that I can't safely run two instances of the interpreter simultaneously. Is that still the case? Thanks!
Corentin ---------- Forwarded message ---------- From: Daniel Gorín Date: Thu, Aug 27, 2015 at 5:09 PM Subject: Re: Thread-safe Hint To: Corentin Dupont Hi Corentin, sorry for the late reply. Until relatively recently, the problem was still on. But I too remember seeing something related to this issue being fixed (iirc, the problem was the runtime linker, which used global state), so perhaps it is already fixed in 7.10. If you can verify this, it shouldn't be hard to show the error message only on old versions of ghc. I'll be away for a couple of weeks, but if you want to look into this and send a patch, I'll merge it when I return. Cheers, Daniel > On 24 Aug 2015, at 10:43 am, Corentin Dupont wrote: > > Hello Daniel, > I noticed the following message in Hint: > This version of GHC is not thread-safe,can't safely run two instances of the interpreter simultaneously. > > Is it still the case with recent versions of GHC? > It would be neat to be able to launch several instances of the interpreter. In my game Nomyx I have several "match-up" going on and having one instance of the interpreter would be nicer. Otherwise I am obliged to reset the interpreter each time I want to interpret something, which is time consuming (2-3 seconds). > > Thanks, > C From simonpj at microsoft.com Fri Aug 28 10:55:52 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Fri, 28 Aug 2015 10:55:52 +0000 Subject: Planning for the 7.12 release In-Reply-To: <87h9nj4scu.fsf@feelingofgreen.ru> References: <87r3mo68t0.fsf@smart-cactus.org> <87h9nj4scu.fsf@feelingofgreen.ru> Message-ID: <8aa9d66f973241e7861629223aedfa47@DB4PR30MB030.064d.mgd.msft.net> Andrey Mokhov is the man to ask. I'm copying him.
Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of | Kosyrev Serge | Sent: 28 August 2015 11:31 | To: Ben Gamari | Cc: GHC developers | Subject: Re: Planning for the 7.12 release | | Ben Gamari writes: | > These items are a bit less certain but may make it in if the authors | > push forward quickly enough, | | [..] | | > * A (possible) overhaul of GHC's build system to use Shake | instead | > of Make. | | Is there a breakdown of what remains to be done on this front? | | -- | ? ???????e? / respectfully, | ??????? ?????? | _______________________________________________ | ghc-devs mailing list | ghc-devs at haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs From hvriedel at gmail.com Fri Aug 28 10:58:19 2015 From: hvriedel at gmail.com (Herbert Valerio Riedel) Date: Fri, 28 Aug 2015 12:58:19 +0200 Subject: Planning for the 7.12 release In-Reply-To: <87h9nj4scu.fsf@feelingofgreen.ru> (Kosyrev Serge's message of "Fri, 28 Aug 2015 13:31:13 +0300") References: <87r3mo68t0.fsf@smart-cactus.org> <87h9nj4scu.fsf@feelingofgreen.ru> Message-ID: <871ten7k8k.fsf@gmail.com> On 2015-08-28 at 12:31:13 +0200, Kosyrev Serge wrote: [...] >> * A (possible) overhaul of GHC's build system to use Shake instead >> of Make. > > Is there a breakdown of what remains to be done on this front? 
Btw, here's the GitHub repo were you can track the progress: https://github.com/snowleopard/shaking-up-ghc From ben at well-typed.com Fri Aug 28 11:46:44 2015 From: ben at well-typed.com (Ben Gamari) Date: Fri, 28 Aug 2015 13:46:44 +0200 Subject: Planning for the 7.12 release In-Reply-To: <87h9nj4scu.fsf@feelingofgreen.ru> References: <87r3mo68t0.fsf@smart-cactus.org> <87h9nj4scu.fsf@feelingofgreen.ru> Message-ID: <87bndrvdnf.fsf@smart-cactus.org> Kosyrev Serge <_deepfire at feelingofgreen.ru> writes: > Ben Gamari writes: >> These items are a bit less certain but may make it in if the authors >> push forward quickly enough, > > [..] > >> * A (possible) overhaul of GHC's build system to use Shake instead >> of Make. > > Is there a breakdown of what remains to be done on this front? > I'm actually not entirely clear on the general plan regarding the Shake-up and perhaps this is a good time to discuss it. How gradual of a transition do we envision this will be? Specifically, for how long do we want the two build systems to coexist (if at all)? If this is intended to be an immediate wholesale switch to Shake I would be very skeptical of merging for 7.12 as three months is, in my opinion, very little time to test such a sweeping change. However, a long transition time is also not terribly desirable as maintaining two largely independent build systems potentially carries significant cost. Can we in principle expect the Shake build system to build all of the configurations GHC currently supports from day-one (including, for instance, cross compilation)? The batch files in the repository suggest that it has been used on Windows but has it been tested on our other Tier 1 platforms? I just attempted to use the current state of the repository and sadly found that things fell apart pretty quickly [1]. I would love to see this happen, but obviously we need to tread carefully when performing such a major overhaul to code so central to the project. 
Moreover, I would really like to minimize the probability that we increase the maintenance burden of our build infrastructure. I think if there is clear communication regarding what remains to be done and motivation to quickly finish these items then Shakification is still an possibility for 7.12. Otherwise I personally think we may want to be a bit conservative. End users can always check out the shaking-up-ghc repository into their GHC trees themselves if they want to try using it. I'm certainly willing to consider other opinions, however. Cheers, - Ben [1] After I manually ran ./boot, $ _shake/build --lint --directory ".." $@ ... # various output from ./configure Reading shake/cfg/system.config... Reading package dependencies... Error when running Shake build system: * OracleQ (PackageDataKey ("libraries/bin-package-db/dist-boot/package-data.mk","libraries_bin-package-db_dist-boot_LIB_NAME")) * libraries/bin-package-db/dist-boot/package-data.mk * libraries/bin-package-db/dist-boot/package-data.mk libraries/bin-package-db/dist-boot/haddock-prologue.txt libraries/bin-package-db/dist-boot/inplace-pkg-config libraries/bin-package-db/dist-boot/setup-config libraries/bin-package-db/dist-boot/build/autogen/cabal_macros.h * libraries/binary/dist-boot/package-data.mk * libraries/binary/dist-boot/package-data.mk libraries/binary/dist-boot/haddock-prologue.txt libraries/binary/dist-boot/inplace-pkg-config libraries/binary/dist-boot/setup-config libraries/binary/dist-boot/build/autogen/cabal_macros.h * /mnt/work/ghc/ghc-shake/inplace/bin/ghc-cabal Error, file does not exist and no rule available: /mnt/work/ghc/ghc-shake/inplace/bin/ghc-cabal -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 472 bytes Desc: not available URL: From mail at nh2.me Fri Aug 28 12:15:50 2015 From: mail at nh2.me (=?windows-1252?Q?Niklas_Hamb=FCchen?=) Date: Fri, 28 Aug 2015 14:15:50 +0200 Subject: Planning for the 7.12 release In-Reply-To: <87bndrvdnf.fsf@smart-cactus.org> References: <87r3mo68t0.fsf@smart-cactus.org> <87h9nj4scu.fsf@feelingofgreen.ru> <87bndrvdnf.fsf@smart-cactus.org> Message-ID: <55E050F6.7060102@nh2.me> On 28/08/15 13:46, Ben Gamari wrote: > If this is intended to be an immediate wholesale switch to Shake I would > be very skeptical of merging for 7.12 as three months is, in my > opinion, very little time to test such a sweeping change. One thing I don't understand in this discussion: Why need the change of GHC's build system be tied to any release at all? Since it's nothing really affecting GHC users, only GHC devs, can't it happen at any point in time between releases? From simonpj at microsoft.com Fri Aug 28 12:24:05 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Fri, 28 Aug 2015 12:24:05 +0000 Subject: Planning for the 7.12 release In-Reply-To: <87bndrvdnf.fsf@smart-cactus.org> References: <87r3mo68t0.fsf@smart-cactus.org> <87h9nj4scu.fsf@feelingofgreen.ru> <87bndrvdnf.fsf@smart-cactus.org> Message-ID: <73f40469b133402ead3b0c09d5d1cd08@DB4PR30MB030.064d.mgd.msft.net> | If this is intended to be an immediate wholesale switch to Shake I | would be very skeptical of merging for 7.12 as three months is, in my | opinion, very little time to test such a sweeping change. No there is no chance that we switch over to Shake for 7.12. At the moment it's nowhere near ready for prime time. But I do hope that we'll have material progress on a side-by-side build system before 7.12 comes out. It probably won't do everything but it should do some things well. I hope. 
Simon | | However, a long transition time is also not terribly desirable as | maintaining two largely independent build systems potentially carries | significant cost. | | Can we in principle expect the Shake build system to build all of the | configurations GHC currently supports from day-one (including, for | instance, cross compilation)? The batch files in the repository | suggest that it has been used on Windows but has it been tested on our | other Tier 1 platforms? | | I just attempted to use the current state of the repository and sadly | found that things fell apart pretty quickly [1]. | | I would love to see this happen, but obviously we need to tread | carefully when performing such a major overhaul to code so central to | the project. Moreover, I would really like to minimize the probability | that we increase the maintenance burden of our build infrastructure. | | I think if there is clear communication regarding what remains to be | done and motivation to quickly finish these items then Shakification | is still an possibility for 7.12. | | Otherwise I personally think we may want to be a bit conservative. End | users can always check out the shaking-up-ghc repository into their | GHC trees themselves if they want to try using it. I'm certainly | willing to consider other opinions, however. | | Cheers, | | - Ben | | | [1] After I manually ran ./boot, | | $ _shake/build --lint --directory ".." $@ | ... # various output from ./configure | Reading shake/cfg/system.config... | Reading package dependencies... 
| Error when running Shake build system: | * OracleQ (PackageDataKey ("libraries/bin-package-db/dist- | boot/package-data.mk","libraries_bin-package-db_dist-boot_LIB_NAME")) | * libraries/bin-package-db/dist-boot/package-data.mk | * libraries/bin-package-db/dist-boot/package-data.mk | libraries/bin-package-db/dist-boot/haddock-prologue.txt libraries/bin- | package-db/dist-boot/inplace-pkg-config libraries/bin-package-db/dist- | boot/setup-config libraries/bin-package-db/dist- | boot/build/autogen/cabal_macros.h | * libraries/binary/dist-boot/package-data.mk | * libraries/binary/dist-boot/package-data.mk | libraries/binary/dist-boot/haddock-prologue.txt libraries/binary/dist- | boot/inplace-pkg-config libraries/binary/dist-boot/setup-config | libraries/binary/dist-boot/build/autogen/cabal_macros.h | * /mnt/work/ghc/ghc-shake/inplace/bin/ghc-cabal | Error, file does not exist and no rule available: | /mnt/work/ghc/ghc-shake/inplace/bin/ghc-cabal From ryan.gl.scott at gmail.com Fri Aug 28 15:01:11 2015 From: ryan.gl.scott at gmail.com (Ryan Scott) Date: Fri, 28 Aug 2015 11:01:11 -0400 Subject: Planning for the 7.12 release Message-ID: Something I wanted to try to implement before GHC 7.12 are extensions to DeriveFunctor, DeriveFoldable, and DeriveTraversable which would allow for automatic derivation of Bifunctor, Bifoldable, and Bitraversable. (This has already been implemented using Template Haskell in an unreleased version of the bifunctors package [1], so implementing it in GHC should be straightforward.) I haven't made an issue for this yet since I was waiting on the results of #10448 [2] (which would put Bifoldable and Bitraversable in base, in addition to Bifunctor, which was added in 7.10). Currently, Edward Kmett is listed as the owner of that issue, but I would be happy to tackle it if it means we could get it in time for GHC 7.12. Ryan S. 
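[Editorial illustration of the Bifunctor derivation idea above. This is a sketch only: the Tagged type is hypothetical and the instance is written by hand. It is meant to show roughly the kind of instance a deriving mechanism would be expected to produce, not the actual output of the bifunctors Template Haskell code or of any GHC extension.]

```haskell
import Data.Bifunctor (Bifunctor (..))

-- Hypothetical example type with two type parameters.
data Tagged a b
  = OnlyTag a
  | OnlyVal b
  | TagAndVal a b
  deriving (Eq, Show)

-- Hand-written instance: apply f wherever an 'a' occurs and
-- g wherever a 'b' occurs, leaving the constructor unchanged.
instance Bifunctor Tagged where
  bimap f _ (OnlyTag a)     = OnlyTag (f a)
  bimap _ g (OnlyVal b)     = OnlyVal (g b)
  bimap f g (TagAndVal a b) = TagAndVal (f a) (g b)
```

The instance is purely structural, one clause per constructor, which is why it looks like a natural candidate for automatic derivation alongside DeriveFunctor.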
----- [1] https://github.com/ekmett/bifunctors/tree/8e975aead363802610dedccf414b884f9b39b1f4 [2] https://ghc.haskell.org/trac/ghc/ticket/10448 From greg at gregweber.info Fri Aug 28 15:43:22 2015 From: greg at gregweber.info (Greg Weber) Date: Fri, 28 Aug 2015 08:43:22 -0700 Subject: Planning for the 7.12 release In-Reply-To: <87r3mo68t0.fsf@smart-cactus.org> References: <87r3mo68t0.fsf@smart-cactus.org> Message-ID: Can we call this GHC 8.0 instead of 7.12 ? Overloaded record fields and backtraces are a huge missing piece to Haskell. It would be nice to have the bump to celebrate this occasion and say that Haskell 8 is "ready". I have had a hard time seriously recommending Haskell due to those last missing features. Now I should be able to say without reservation: "use Haskell > 8; it is great!" On Thu, Aug 27, 2015 at 8:38 AM, Ben Gamari wrote: > > Hello everyone! > > With the 7.10.1 release nearly six months behind us and 7.10.2 out of the > way, now is a good time to begin looking forward to 7.12. In keeping > with the typical release pace, we are aiming to have a release > candidate ready in mid-December 2015 and a final release in January > 2016. > > The items that that we currently believe have a good chance of making it > in to 7.12 are listed on the release status page [1], which I've > summarized below (in no particular order), > > > * Support for implicit parameters providing callstacks and source > locations > > * Support for wildcards in data and type family instances > > * A new, type-indexed type representation, data TTypeRep (a :: k). > > * Introduction of visible type application > > * Support for reasoning about kind equalities > > * Support for Injective Type Families > > * Support for the Strict language extension > > * Support for Overloaded Record Fields, allowing multiple uses of > the same field name and a form of type-directed name resolution. 
> > * A huge improvement to pattern matching (including much better > coverage of GADTs) > > * Backpack is chugging along; we have a new user-facing syntax which > allows multiple modules to be defined in a single file, and are > hoping to release at least the ability to publish multiple "units" > in a single Cabal file. > > * Support for Applicative Do, allowing GHC to desugar do-notation to > Applicative where possible. > > * Improved DWARF based debugging support including backtraces from > Haskell code > > * An Improved LLVM Backend that ships with every major Tier 1 platform. > > > These items are a bit less certain but may make it in if the authors > push forward quickly enough, > > > * Support for Type Signature Sections, allowing you to write (:: ty) > as a shorthand for (\x -> x :: ty). > > * A (possible) overhaul of GHC's build system to use Shake instead > of Make. > > * A DEPRECATED pragma for exports > > > Is your pet project missing from this list? If you have a patch that you > believe is on-track to make it in for 7.12, please let us know. > > Moreover, if you have an issue that you urgently need fixed in 7.12, > please express your interest on the appropriate ticket. User feedback > helps us immensely in figuring out how to best place our priorities. > > Cheers, > > - Ben > > > [1] https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.12.1 > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonpj at microsoft.com Fri Aug 28 15:45:08 2015 From: simonpj at microsoft.com (Simon Peyton Jones) Date: Fri, 28 Aug 2015 15:45:08 +0000 Subject: Planning for the 7.12 release In-Reply-To: References: <87r3mo68t0.fsf@smart-cactus.org> Message-ID: Actually that's a good idea.
Simon From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Greg Weber Sent: 28 August 2015 16:43 To: Ben Gamari Cc: GHC developers Subject: Re: Planning for the 7.12 release Can we call this GHC 8.0 instead of 7.12? Overloaded record fields and backtraces are a huge missing piece to Haskell. It would be nice to have the bump to celebrate this occasion and say that Haskell 8 is "ready". I have had a hard time seriously recommending Haskell due to those last missing features. Now I should be able to say without reservation: "use Haskell > 8; it is great!" On Thu, Aug 27, 2015 at 8:38 AM, Ben Gamari wrote: Hello everyone! With the 7.10.1 release nearly six months behind us and 7.10.2 out of the way, now is a good time to begin looking forward to 7.12. In keeping with the typical release pace, we are aiming to have a release candidate ready in mid-December 2015 and a final release in January 2016. The items that we currently believe have a good chance of making it into 7.12 are listed on the release status page [1], which I've summarized below (in no particular order), * Support for implicit parameters providing callstacks and source locations * Support for wildcards in data and type family instances * A new, type-indexed type representation, data TTypeRep (a :: k). * Introduction of visible type application * Support for reasoning about kind equalities * Support for Injective Type Families * Support for the Strict language extension * Support for Overloaded Record Fields, allowing multiple uses of the same field name and a form of type-directed name resolution. * A huge improvement to pattern matching (including much better coverage of GADTs) * Backpack is chugging along; we have a new user-facing syntax which allows multiple modules to be defined in a single file, and are hoping to release at least the ability to publish multiple "units" in a single Cabal file.
* Support for Applicative Do, allowing GHC to desugar do-notation to Applicative where possible. * Improved DWARF based debugging support including backtraces from Haskell code * An Improved LLVM Backend that ships with every major Tier 1 platform. These items are a bit less certain but may make it in if the authors push forward quickly enough, * Support for Type Signature Sections, allowing you to write (:: ty) as a shorthand for (\x -> x :: ty). * A (possible) overhaul of GHC's build system to use Shake instead of Make. * A DEPRECATED pragma for exports Is your pet project missing from this list? If you have a patch that you believe is on-track to make it in for 7.12, please let us know. Moreover, if you have an issue that you urgently need fixed in 7.12, please express your interest on the appropriate ticket. User feedback helps us immensely in figuring out how to best place our priorities. Cheers, - Ben [1] https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.12.1 _______________________________________________ ghc-devs mailing list ghc-devs at haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben at well-typed.com Fri Aug 28 17:26:55 2015 From: ben at well-typed.com (Ben Gamari) Date: Fri, 28 Aug 2015 19:26:55 +0200 Subject: Planning for the 7.12 release In-Reply-To: <73f40469b133402ead3b0c09d5d1cd08@DB4PR30MB030.064d.mgd.msft.net> References: <87r3mo68t0.fsf@smart-cactus.org> <87h9nj4scu.fsf@feelingofgreen.ru> <87bndrvdnf.fsf@smart-cactus.org> <73f40469b133402ead3b0c09d5d1cd08@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <871tenuxwg.fsf@smart-cactus.org> Simon Peyton Jones writes: > | If this is intended to be an immediate wholesale switch to Shake I > | would be very skeptical of merging for 7.12 as three months is, in my > | opinion, very little time to test such a sweeping change. > > No there is no chance that we switch over to Shake for 7.12.
At the > moment it's nowhere near ready for prime time. > Thanks for clarifying! > But I do hope that we'll have material progress on a side-by-side > build system before 7.12 comes out. It probably won't do everything > but it should do some things well. I hope. > Andrey, I look forward to seeing an update on things. Cheers, - Ben -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 472 bytes Desc: not available URL: From ben at well-typed.com Fri Aug 28 17:33:13 2015 From: ben at well-typed.com (Ben Gamari) Date: Fri, 28 Aug 2015 19:33:13 +0200 Subject: Planning for the 7.12 release In-Reply-To: References: <87r3mo68t0.fsf@smart-cactus.org> Message-ID: <87y4gvtj1i.fsf@smart-cactus.org> Simon Peyton Jones writes: > Actually that's a good idea. > > Simon > > > From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Greg Weber > Sent: 28 August 2015 16:43 > To: Ben Gamari > Cc: GHC developers > Subject: Re: Planning for the 7.12 release > > Can we call this GHC 8.0 instead of 7.12? > Overloaded record fields and backtraces are a huge missing piece to > Haskell. It would be nice to have the bump to celebrate this occasion > and say that Haskell 8 is "ready". I have had a hard time seriously > recommending Haskell due to those last missing features. Now I should > be able to say without reservation: "use Haskell > 8; it is great!" > I was discussing this very matter yesterday with a few folks. I think we certainly have enough features in this release to do a major bump. I half-jokingly suggested that 8.0 should only come with Phase 2 of Richard's Dependent Haskell work, but I'm willing to settle for merely kind equality. I think doing a major bump would be a great idea. Cheers, - Ben -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 472 bytes Desc: not available URL: From ekmett at gmail.com Fri Aug 28 17:55:12 2015 From: ekmett at gmail.com (Edward Kmett) Date: Fri, 28 Aug 2015 13:55:12 -0400 Subject: ArrayArrays In-Reply-To: <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> Message-ID: I posted a summary article on "what this lets you do" to https://www.fpcomplete.com/user/edwardk/unlifted-structures I can see about making a more proposal/feature-oriented summary for the Haskell Wiki. It may have to wait until after ICFP though. -Edward On Fri, Aug 28, 2015 at 5:42 AM, Simon Peyton Jones wrote: > At the very least I'll take this email and turn it into a short article. > > Yes, please do make it into a wiki page on the GHC Trac, and maybe make a > ticket for it. > > > Thanks > > > > Simon > > > > *From:* Edward Kmett [mailto:ekmett at gmail.com] > *Sent:* 27 August 2015 16:54 > *To:* Simon Peyton Jones > *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs > *Subject:* Re: ArrayArrays > > > > An ArrayArray# is just an Array# with a modified invariant. It points > directly to other unlifted ArrayArray#'s or ByteArray#'s. > > > > While those live in #, they are garbage collected objects, so this all > lives on the heap. > > > > They were added to make some of the DPH stuff fast when it has to deal > with nested arrays. > > > > I'm currently abusing them as a placeholder for a better thing. > > > > The Problem > > ----------------- > > > > Consider the scenario where you write a classic doubly-linked list in > Haskell. > > > > data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) > > > > Chasing from one DLL to the next requires following 3 pointers on the heap. > > > > DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~> > DLL > > > > That is 3 levels of indirection.
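The "classic" representation described above can be sketched as a small runnable program. This is only an illustration of the layout being criticised, not code from the thread: the field order (previous first, next second) and the helper names newNode, link, next and lengthFrom are assumptions made for the example.

```haskell
import Data.IORef

-- Two mutable links per node; assumed order: previous, then next.
data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL))

-- A fresh, unlinked node.
newNode :: IO DLL
newNode = DLL <$> newIORef Nothing <*> newIORef Nothing

-- link a b makes b the successor of a and a the predecessor of b.
link :: DLL -> DLL -> IO ()
link a@(DLL _ aNext) b@(DLL bPrev _) = do
  writeIORef aNext (Just b)
  writeIORef bPrev (Just a)

-- One step forward: DLL ~> IORef ~> Maybe DLL ~> DLL.
next :: DLL -> IO (Maybe DLL)
next (DLL _ n) = readIORef n

-- Count the nodes reachable by following next links,
-- including the starting node.
lengthFrom :: DLL -> IO Int
lengthFrom = go 1
  where
    go acc node = do
      mn <- next node
      case mn of
        Nothing -> pure acc
        Just n  -> acc `seq` go (acc + 1) n

main :: IO ()
main = do
  a <- newNode
  b <- newNode
  c <- newNode
  link a b
  link b c
  lengthFrom a >>= print
```

Every call to next here goes through the IORef and the Maybe before reaching the following node, which is the pointer chasing the message counts.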
> > > > We can trim one by simply unpacking the IORef with -funbox-strict-fields > or UNPACK > > > > We can trim another by adding a 'Nil' constructor for DLL and worsening > our representation. > > > > data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil > > > > but now we're still stuck with a level of indirection > > > > DLL ~> MutVar# RealWorld DLL ~> DLL > > > > This means that every operation we perform on this structure will be about > half of the speed of an implementation in most other languages assuming > we're memory bound on loading things into cache! > > > > Making Progress > > ---------------------- > > > > I have been working on a number of data structures where the indirection > of going from something in * out to an object in # which contains the real > pointer to my target and coming back effectively doubles my runtime. > > > > We go out to the MutVar# because we are allowed to put the MutVar# onto > the mutable list when we dirty it. There is a well defined write-barrier. > > > > I could change out the representation to use > > > > data DLL = DLL (MutableArray# RealWorld DLL) | Nil > > > > I can just store two pointers in the MutableArray# every time, but this > doesn't help _much_ directly. It has reduced the amount of distinct > addresses in memory I touch on a walk of the DLL from 3 per object to 2. > > > > I still have to go out to the heap from my DLL and get to the array object > and then chase it to the next DLL and chase that to the next array. I do > get my two pointers together in memory though. 
I'm paying for a card > marking table as well, which I don't particularly need with just two > pointers, but we can shed that with the "SmallMutableArray#" machinery > added back in 7.10, which is just the old array code as a new data type, > which can speed things up a bit when you don't have very big arrays: > > > > data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil > > > > But what if I wanted my object itself to live in # and have two mutable > fields and be able to share the same write barrier? > > > > An ArrayArray# points directly to other unlifted array types. What if we > have one # -> * wrapper on the outside to deal with the impedance mismatch > between the imperative world and Haskell, and then just let the > ArrayArray#'s hold other arrayarrays? > > > > data DLL = DLL (MutableArrayArray# RealWorld) > > > > now I need to make up a new Nil, which I can just make be a special > MutableArrayArray# I allocate on program startup. I can even abuse pattern > synonyms. Alternately I can exploit the internals further to make this > cheaper. > > > > Then I can use the readMutableArrayArray# and writeMutableArrayArray# > calls to directly access the preceding and next entry in the linked list. > > > > So now we have one DLL wrapper which just 'bootstraps me' into a strict > world, and everything there lives in #. > > > > next :: DLL -> IO DLL > > next (DLL m) = IO $ \s -> case readMutableArrayArray# s of > > (# s', n #) -> (# s', DLL n #) > > > > It turns out GHC is quite happy to optimize all of that code to keep > things unboxed. The 'DLL' wrappers get removed pretty easily when they are > known strict and you chain operations of this sort! > > > > Cleaning it Up > > ------------------ > > > > Now I have one outermost indirection pointing to an array that points > directly to other arrays. > > > > I'm stuck paying for a card marking table per object, but I can fix that > by duplicating the code for MutableArrayArray# and using a > SmallMutableArray#.
I can hack up primops that let me store a mixture of > SmallMutableArray# fields and normal ones in the data structure. > Operationally, I can even do so by just unsafeCoercing the existing > SmallMutableArray# primitives to change the kind of one of the arguments it > takes. > > > > This is almost ideal, but not quite. I often have fields that would be > best left unboxed. > > > > data DLLInt = DLL !Int !(IORef DLLInt) !(IORef DLLInt) | Nil > > > > was able to unpack the Int, but we lost that. We can currently at best > point one of the entries of the SmallMutableArray# at a boxed or at a > MutableByteArray# for all of our misc. data and shove the int in question > in there. > > > > e.g. if I were to implement a hash-array-mapped-trie I need to store masks > and administrivia as I walk down the tree. Having to go off to the side > costs me the entire win from avoiding the first pointer chase. > > > > But, if like Ryan suggested, we had a heap object we could construct that > had n words with unsafe access and m pointers to other heap objects, one > that could put itself on the mutable list when any of those pointers > changed then I could shed this last factor of two in all circumstances. > > > > Prototype > > ------------- > > > > Over the last few days I've put together a small prototype implementation > with a few non-trivial imperative data structures for things like Tarjan's > link-cut trees, the list labeling problem and order-maintenance. > > > > https://github.com/ekmett/structs > > > > Notable bits: > > > > Data.Struct.Internal.LinkCut > > provides an implementation of link-cut trees in this style. > > > > Data.Struct.Internal > > provides the rather horrifying guts that make it go fast. > > > > Once compiled with -O or -O2, if you look at the core, almost all the > references to the LinkCut or Object data constructor get optimized away, > and we're left with beautiful strict code directly mutating our underlying > representation.
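The strict-field variant discussed above (a Nil constructor in place of Maybe, strict links, and an unpacked Int payload) can be sketched with plain IORefs as follows. This only illustrates the data layout; it is not the unlifted representation used in the structs prototype. The constructor name Node and the helpers newNode, link and sumForward are assumptions made for the example.

```haskell
import Data.IORef

-- Strict fields let GHC unpack the Int and the IORef wrappers
-- into the constructor when optimisation is on.
data DLLInt
  = Node {-# UNPACK #-} !Int
         {-# UNPACK #-} !(IORef DLLInt)  -- previous
         {-# UNPACK #-} !(IORef DLLInt)  -- next
  | Nil

-- A fresh node whose links both point at Nil.
newNode :: Int -> IO DLLInt
newNode x = Node x <$> newIORef Nil <*> newIORef Nil

-- link a b makes b the successor of a; linking Nil is a no-op.
link :: DLLInt -> DLLInt -> IO ()
link a@(Node _ _ aNext) b@(Node _ bPrev _) = do
  writeIORef aNext b
  writeIORef bPrev a
link _ _ = pure ()

-- Sum the payloads from the given node to the end of the list.
sumForward :: DLLInt -> IO Int
sumForward = go 0
  where
    go acc Nil = pure acc
    go acc (Node x _ n) = do
      nxt <- readIORef n
      let acc' = acc + x
      acc' `seq` go acc' nxt

main :: IO ()
main = do
  a <- newNode 1
  b <- newNode 2
  c <- newNode 3
  link a b
  link b c
  sumForward a >>= print
```

Even with everything unpacked, each step still goes from the node to the MutVar# behind the IORef and back to the next node, which is the remaining indirection the message wants to eliminate.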
> > > > At the very least I'll take this email and turn it into a short article. > > > > -Edward > > > > On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones > wrote: > > Just to say that I have no idea what is going on in this thread. What is > ArrayArray? What is the issue in general? Is there a ticket? Is there a > wiki page? > > > > If it's important, an ab-initio wiki page + ticket would be a good thing. > > > > Simon > > > > *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Edward > Kmett > *Sent:* 21 August 2015 05:25 > *To:* Manuel M T Chakravarty > *Cc:* Simon Marlow; ghc-devs > *Subject:* Re: ArrayArrays > > > > When (ab)using them for this purpose, SmallArrayArray's would be very > handy as well. > > > > Consider right now if I have something like an order-maintenance structure > I have: > > > > data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} > !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) > > > > data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} > !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-} > !(MutVar s (Lower s)) > > > > The former contains, logically, a mutable integer and two pointers, one > for forward and one for backwards. The latter is basically the same thing > with a mutable reference up pointing at the structure above. > > > > On the heap this is an object that points to a structure for the > bytearray, and points to another structure for each mutvar which each point > to the other 'Upper' structure. So there is a level of indirection smeared > over everything. > > > > So this is a pair of doubly linked lists with an upward link from the > structure below to the structure above.
> > > > Converted into ArrayArray#s I'd get > > > > data Upper s = Upper (MutableArrayArray# s) > > > > w/ the first slot being a pointer to a MutableByteArray#, and the next 2 > slots pointing to the previous and next previous objects, represented just > as their MutableArrayArray#s. I can use sameMutableArrayArray# on these for > object identity, which lets me check for the ends of the lists by tying > things back on themselves. > > > > and below that > > > > data Lower s = Lower (MutableArrayArray# s) > > > > is similar, with an extra MutableArrayArray slot pointing up to an upper > structure. > > > > I can then write a handful of combinators for getting out the slots in > question, while it has gained a level of indirection between the wrapper to > put it in * and the MutableArrayArray# s in #, that one can be basically > erased by ghc. > > > > Unlike before I don't have several separate objects on the heap for each > thing. I only have 2 now. The MutableArrayArray# for the object itself, and > the MutableByteArray# that it references to carry around the mutable int. > > > > The only pain points are > > > > 1.) the aforementioned limitation that currently prevents me from stuffing > normal boxed data through a SmallArray or Array into an ArrayArray leaving > me in a little ghetto disconnected from the rest of Haskell, > > > > and > > > > 2.) the lack of SmallArrayArray's, which could let us avoid the card > marking overhead. These objects are all small, 3-4 pointers wide. Card > marking doesn't help. > > > > Alternately I could just try to do really evil things and convert the > whole mess to SmallArrays and then figure out how to unsafeCoerce my way to > glory, stuffing the #'d references to the other arrays directly into the > SmallArray as slots, removing the limitation we see here by aping the > MutableArrayArray# s API, but that gets really really dangerous! 
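The idea of keeping both links in a single mutable array and detecting the end of a list by object identity can be sketched with the ordinary MutableArray# primops from GHC.Exts. Note the swap: this uses MutableArray# and sameMutableArray# rather than the MutableArrayArray# primops described in the message, so the array elements are still lifted DLL values and one level of indirection remains. The slot layout (0 = previous, 1 = next) and all function names are assumptions made for the example.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.IO (IO (..))

-- One array per node; assumed layout: slot 0 = previous, slot 1 = next.
data DLL = DLL (MutableArray# RealWorld DLL)

-- A fresh node whose links are tied back to itself, so a node that
-- is its own successor marks the end of a list.
newNode :: IO DLL
newNode = IO $ \s0 ->
  case newArray# 2# (error "DLL: uninitialised slot") s0 of
    (# s1, arr #) ->
      let self = DLL arr
      in case writeArray# arr 0# self s1 of
           s2 -> case writeArray# arr 1# self s2 of
             s3 -> (# s3, self #)

-- Read the successor from slot 1.
next :: DLL -> IO DLL
next (DLL arr) = IO $ \s -> readArray# arr 1# s

-- link a b makes b the successor of a and a the predecessor of b.
link :: DLL -> DLL -> IO ()
link a@(DLL aArr) b@(DLL bArr) = IO $ \s0 ->
  case writeArray# aArr 1# b s0 of
    s1 -> case writeArray# bArr 0# a s1 of
      s2 -> (# s2, () #)

-- Object identity on the underlying arrays.
sameNode :: DLL -> DLL -> Bool
sameNode (DLL a) (DLL b) = isTrue# (sameMutableArray# a b)

-- A node is at the end of the list when it is its own successor.
atEnd :: DLL -> IO Bool
atEnd n = sameNode n <$> next n

main :: IO ()
main = do
  a <- newNode
  b <- newNode
  link a b
  endA <- atEnd a
  endB <- atEnd b
  print (endA, endB)
```

Tying the last node back on itself avoids a separate Nil value, at the cost of having to compare identities before every step.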
> > > > I'm pretty much willing to sacrifice almost anything on the altar of speed > here, but I'd like to be able to let the GC move them and collect them > which rules out simpler Ptr and Addr based solutions. > > > > -Edward > > > > On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty < > chak at cse.unsw.edu.au> wrote: > > That's an interesting idea. > > Manuel > > > Edward Kmett : > > > > > Would it be possible to add unsafe primops to add Array# and SmallArray# > entries to an ArrayArray#? The fact that the ArrayArray# entries are all > directly unlifted avoiding a level of indirection for the containing > structure is amazing, but I can only currently use it if my leaf level data > can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be > able to have the ability to put SmallArray# a stuff down at the leaves to > hold lifted contents. > > > > I accept fully that if I name the wrong type when I go to access one of > the fields it'll lie to me, but I suppose it'd do that if I tried to use > one of the members that held a nested ArrayArray# as a ByteArray# anyways, > so it isn't like there is a safety story preventing this. > > > > I've been hunting for ways to try to kill the indirection problems I get > with Haskell and mutable structures, and I could shoehorn a number of them > into ArrayArrays if this worked. > > > > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection > compared to C/Java and this could reduce that pain to just 1 level of > unnecessary indirection. > > > > -Edward > > > _______________________________________________ > > ghc-devs mailing list > > ghc-devs at haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From elliot.cameron at covenanteyes.com Fri Aug 28 19:36:58 2015 From: elliot.cameron at covenanteyes.com (Elliot Cameron) Date: Fri, 28 Aug 2015 19:36:58 +0000 Subject: Planning for the 7.12 release In-Reply-To: <87h9nj68x3.fsf@smart-cactus.org> References: <87r3mo68t0.fsf@smart-cactus.org> <1440717837983.74036@covenanteyes.com>,<87h9nj68x3.fsf@smart-cactus.org> Message-ID: <1440790625443.31084@covenanteyes.com> I'm going by my rather poor memory for this. Frankly, I don't really care where the option sits, as long as I don't need a separate build of GHC to avoid LGPL. ________________________________________ From: Ben Gamari Sent: Friday, August 28, 2015 5:48 AM To: Elliot Cameron; GHC developers Cc: Herbert Valerio Riedel Subject: Re: Planning for the 7.12 release Elliot Cameron writes: > I can't seem to find the exact trac ticket, but the ability to swap > out "Integer" implementations at link time would be a huge relief on > Windows, which suffers from various problems with dynamic linking. I > believe it was originally slated for 7.12. Can someone find it? Here's > what I did find: > https://ghc.haskell.org/trac/ghc/wiki/ReplacingGMPNotes > > A related discussion: > https://github.com/commercialhaskell/stack/issues/399 > > As it stands, we've been trying hard to find good ways to provide > custom GHC variants more easily to end users. > Hmm, interesting. I'm not sure how realistic it is to make this a link-time option, however, considering that we may inline bindings from whatever integer package we compile against into the user's program. Herbert, do you have any thoughts on this? 
Cheers, - Ben From hvriedel at gmail.com Fri Aug 28 20:42:14 2015 From: hvriedel at gmail.com (Herbert Valerio Riedel) Date: Fri, 28 Aug 2015 22:42:14 +0200 Subject: Planning for the 7.12 release In-Reply-To: <87h9nj68x3.fsf@smart-cactus.org> (Ben Gamari's message of "Fri, 28 Aug 2015 11:48:08 +0200") References: <87r3mo68t0.fsf@smart-cactus.org> <1440717837983.74036@covenanteyes.com> <87h9nj68x3.fsf@smart-cactus.org> Message-ID: <87si735emx.fsf@gmail.com> On 2015-08-28 at 11:48:08 +0200, Ben Gamari wrote: [...] > Hmm, interesting. I'm not sure how realistic it is to make this a > link-time option, however, considering that we may inline bindings from > whatever integer package we compile against into the user's program. > > Herbert, do you have any thoughts on this? It's quite realistic, because we can do this at the C ABI level, where inlining is not an issue. We wouldn't switch between various Haskell packages, but rather between different C libraries, all providing the same API at the C-ABI level. This was hinted at in https://ghc.haskell.org/trac/ghc/wiki/Design/IntegerGmp2 The first step towards that is finding an alternative bignum library (maybe 'bsdnt') that can be modified to operate on the same array-of-limb representation as GMP, and get this working as a GHC build-time configuration option of the `integer-gmp`-package. Once this is accomplished, turning this into a `ghc` link-time flag should be relatively easy. Cheers, hvr From roma at ro-che.info Fri Aug 28 20:51:17 2015 From: roma at ro-che.info (Roman Cheplyaka) Date: Fri, 28 Aug 2015 23:51:17 +0300 Subject: Planning for the 7.12 release In-Reply-To: <87y4gvtj1i.fsf@smart-cactus.org> References: <87r3mo68t0.fsf@smart-cactus.org> <87y4gvtj1i.fsf@smart-cactus.org> Message-ID: <55E0C9C5.7070707@ro-che.info> On 28/08/15 20:33, Ben Gamari wrote: > I half-jokingly suggested that 8.0 should only come with Phase 2 of > Richard's Dependent Haskell work Ah, that would be perfect. 
-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From rrnewton at gmail.com Fri Aug 28 21:30:45 2015 From: rrnewton at gmail.com (Ryan Newton) Date: Fri, 28 Aug 2015 21:30:45 +0000 Subject: ArrayArrays In-Reply-To: <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> Message-ID: I like the possibility of a general solution for mutable structs (like Ed said), and I'm trying to fully understand why it's hard. So, we can't unpack MutVar into constructors because of object identity problems. But what about directly supporting an extensible set of unlifted MutStruct# objects, generalizing (and even replacing) MutVar#? That may be too much work, but is it problematic otherwise? Needless to say, this is also critical if we ever want best in class lockfree mutable structures, just like their Stm and sequential counterparts. On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones wrote: > At the very least I'll take this email and turn it into a short article. > > Yes, please do make it into a wiki page on the GHC Trac, and maybe make a > ticket for it. > > > Thanks > > > > Simon > > > > *From:* Edward Kmett [mailto:ekmett at gmail.com] > *Sent:* 27 August 2015 16:54 > *To:* Simon Peyton Jones > *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs > *Subject:* Re: ArrayArrays > > > > An ArrayArray# is just an Array# with a modified invariant. It points > directly to other unlifted ArrayArray#'s or ByteArray#'s. > > > > While those live in #, they are garbage collected objects, so this all > lives on the heap. > > > > They were added to make some of the DPH stuff fast when it has to deal > with nested arrays. > > > > I'm currently abusing them as a placeholder for a better thing. 
> > > > The Problem > > ----------------- > > > > Consider the scenario where you write a classic doubly-linked list in > Haskell. > > > > data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) > > > > Chasing from one DLL to the next requires following 3 pointers on the heap. > > > > DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~> > DLL > > > > That is 3 levels of indirection. > > > > We can trim one by simply unpacking the IORef with -funbox-strict-fields > or UNPACK > > > > We can trim another by adding a 'Nil' constructor for DLL and worsening > our representation. > > > > data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil > > > > but now we're still stuck with a level of indirection > > > > DLL ~> MutVar# RealWorld DLL ~> DLL > > > > This means that every operation we perform on this structure will be about > half of the speed of an implementation in most other languages assuming > we're memory bound on loading things into cache! > > > > Making Progress > > ---------------------- > > > > I have been working on a number of data structures where the indirection > of going from something in * out to an object in # which contains the real > pointer to my target and coming back effectively doubles my runtime. > > > > We go out to the MutVar# because we are allowed to put the MutVar# onto > the mutable list when we dirty it. There is a well defined write-barrier. > > > > I could change out the representation to use > > > > data DLL = DLL (MutableArray# RealWorld DLL) | Nil > > > > I can just store two pointers in the MutableArray# every time, but this > doesn't help _much_ directly. It has reduced the amount of distinct > addresses in memory I touch on a walk of the DLL from 3 per object to 2. > > > > I still have to go out to the heap from my DLL and get to the array object > and then chase it to the next DLL and chase that to the next array. I do > get my two pointers together in memory though.
I'm paying for a card > marking table as well, which I don't particularly need with just two > pointers, but we can shed that with the "SmallMutableArray#" machinery > added back in 7.10, which is just the old array code as a new data type, > which can speed things up a bit when you don't have very big arrays: > > > > data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil > > > > But what if I wanted my object itself to live in # and have two mutable > fields and be able to share the same write barrier? > > > > An ArrayArray# points directly to other unlifted array types. What if we > have one # -> * wrapper on the outside to deal with the impedance mismatch > between the imperative world and Haskell, and then just let the > ArrayArray#'s hold other arrayarrays? > > > > data DLL = DLL (MutableArrayArray# RealWorld) > > > > now I need to make up a new Nil, which I can just make be a special > MutableArrayArray# I allocate on program startup. I can even abuse pattern > synonyms. Alternately I can exploit the internals further to make this > cheaper. > > > > Then I can use the readMutableArrayArray# and writeMutableArrayArray# > calls to directly access the preceding and next entry in the linked list. > > > > So now we have one DLL wrapper which just 'bootstraps me' into a strict > world, and everything there lives in #. > > > > next :: DLL -> IO DLL > > next (DLL m) = IO $ \s -> case readMutableArrayArray# s of > > (# s', n #) -> (# s', DLL n #) > > > > It turns out GHC is quite happy to optimize all of that code to keep > things unboxed. The 'DLL' wrappers get removed pretty easily when they are > known strict and you chain operations of this sort! > > > > Cleaning it Up > > ------------------ > > > > Now I have one outermost indirection pointing to an array that points > directly to other arrays. > > > > I'm stuck paying for a card marking table per object, but I can fix that > by duplicating the code for MutableArrayArray# and using a > SmallMutableArray#.
I can hack up primops that let me store a mixture of > SmallMutableArray# fields and normal ones in the data structure. > Operationally, I can even do so by just unsafeCoercing the existing > SmallMutableArray# primitives to change the kind of one of the arguments it > takes. > > > > This is almost ideal, but not quite. I often have fields that would be > best left unboxed. > > > > data DLLInt = DLL !Int !(IORef DLLInt) !(IORef DLLInt) | Nil > > > > was able to unpack the Int, but we lost that. We can currently at best > point one of the entries of the SmallMutableArray# at a boxed or at a > MutableByteArray# for all of our misc. data and shove the int in question > in there. > > > > e.g. if I were to implement a hash-array-mapped-trie I need to store masks > and administrivia as I walk down the tree. Having to go off to the side > costs me the entire win from avoiding the first pointer chase. > > > > But, if like Ryan suggested, we had a heap object we could construct that > had n words with unsafe access and m pointers to other heap objects, one > that could put itself on the mutable list when any of those pointers > changed then I could shed this last factor of two in all circumstances. > > > > Prototype > > ------------- > > > > Over the last few days I've put together a small prototype implementation > with a few non-trivial imperative data structures for things like Tarjan's > link-cut trees, the list labeling problem and order-maintenance. > > > > https://github.com/ekmett/structs > > > > Notable bits: > > > > Data.Struct.Internal.LinkCut > > provides an implementation of link-cut trees in this style. > > > > Data.Struct.Internal > > provides the rather horrifying guts that make it go fast. > > > > Once compiled with -O or -O2, if you look at the core, almost all the > references to the LinkCut or Object data constructor get optimized away, > and we're left with beautiful strict code directly mutating our underlying > representation.
>
> At the very least I'll take this email and turn it into a short article.
>
> -Edward
>
> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones wrote:
>
> Just to say that I have no idea what is going on in this thread. What is
> ArrayArray? What is the issue in general? Is there a ticket? Is there a
> wiki page?
>
> If it's important, an ab-initio wiki page + ticket would be a good thing.
>
> Simon
>
> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of* Edward Kmett
> *Sent:* 21 August 2015 05:25
> *To:* Manuel M T Chakravarty
> *Cc:* Simon Marlow; ghc-devs
> *Subject:* Re: ArrayArrays
>
> When (ab)using them for this purpose, SmallArrayArray's would be very
> handy as well.
>
> Consider right now if I have something like an order-maintenance structure
> I have:
>
> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-}
> !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s))
>
> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-}
> !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-}
> !(MutVar s (Lower s))
>
> The former contains, logically, a mutable integer and two pointers, one
> for forward and one for backwards. The latter is basically the same thing
> with a mutable reference up pointing at the structure above.
>
> On the heap this is an object that points to a structure for the
> bytearray, and points to another structure for each mutvar which each point
> to the other 'Upper' structure. So there is a level of indirection smeared
> over everything.
>
> So this is a pair of doubly linked lists with an upward link from the
> structure below to the structure above.
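
To make the shape of that two-level structure concrete, here is a small sketch using only `IORef` from base. It is not the representation from the message (which uses `MutableByteArray` and `MutVar`); the mutable label stands in for the "mutable integer", and every helper name is an assumption made for the example.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Upper: a mutable label plus mutable prev/next links.
data Upper = Upper !(IORef Int) !(IORef Upper) !(IORef Upper)

-- Lower: a mutable link up to its Upper, a mutable label, prev/next links.
data Lower = Lower !(IORef Upper) !(IORef Int) !(IORef Lower) !(IORef Lower)

-- A fresh Upper is a one-element circular list: prev and next point at itself.
newUpper :: Int -> IO Upper
newUpper label = do
  lbl <- newIORef label
  p <- newIORef (error "prev not linked yet")
  n <- newIORef (error "next not linked yet")
  let u = Upper lbl p n
  writeIORef p u
  writeIORef n u
  pure u

-- A fresh Lower is likewise circular and points up at the given Upper.
newLower :: Upper -> Int -> IO Lower
newLower up label = do
  upRef <- newIORef up
  lbl <- newIORef label
  p <- newIORef (error "prev not linked yet")
  n <- newIORef (error "next not linked yet")
  let l = Lower upRef lbl p n
  writeIORef p l
  writeIORef n l
  pure l

-- Follow the upward link and read the label stored in the Upper node.
upperLabel :: Lower -> IO Int
upperLabel (Lower upRef _ _ _) = do
  Upper lbl _ _ <- readIORef upRef
  readIORef lbl

-- Overwrite the label of an Upper node in place.
setUpperLabel :: Upper -> Int -> IO ()
setUpperLabel (Upper lbl _ _) = writeIORef lbl
```

Every field access here goes through a separate `IORef` heap object, which is the smeared-out indirection the message complains about.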
>
> Converted into ArrayArray#s I'd get
>
> data Upper s = Upper (MutableArrayArray# s)
>
> w/ the first slot being a pointer to a MutableByteArray#, and the next 2
> slots pointing to the previous and next objects, represented just as their
> MutableArrayArray#s. I can use sameMutableArrayArray# on these for object
> identity, which lets me check for the ends of the lists by tying things
> back on themselves.
>
> and below that
>
> data Lower s = Lower (MutableArrayArray# s)
>
> is similar, with an extra MutableArrayArray# slot pointing up to an upper
> structure.
>
> I can then write a handful of combinators for getting out the slots in
> question. While it has gained a level of indirection between the wrapper to
> put it in * and the MutableArrayArray# s in #, that one can be basically
> erased by GHC.
>
> Unlike before I don't have several separate objects on the heap for each
> thing. I only have 2 now: the MutableArrayArray# for the object itself, and
> the MutableByteArray# that it references to carry around the mutable int.
>
> The only pain points are
>
> 1.) the aforementioned limitation that currently prevents me from stuffing
> normal boxed data through a SmallArray or Array into an ArrayArray, leaving
> me in a little ghetto disconnected from the rest of Haskell,
>
> and
>
> 2.) the lack of SmallArrayArray's, which could let us avoid the card
> marking overhead. These objects are all small, 3-4 pointers wide. Card
> marking doesn't help.
>
> Alternately I could just try to do really evil things and convert the
> whole mess to SmallArrays and then figure out how to unsafeCoerce my way to
> glory, stuffing the #'d references to the other arrays directly into the
> SmallArray as slots, removing the limitation we see here by aping the
> MutableArrayArray# s API, but that gets really really dangerous!
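
The trick of detecting the end of a list by object identity, with unlinked nodes tied back on themselves, can be sketched with primitives exported from GHC.Exts. This sketch deliberately uses `SmallMutableArray#` with a lifted `Node` wrapper in each slot rather than `MutableArrayArray#`, so it does not remove the indirection discussed above; the slot layout (0 = previous, 1 = next) and all helper names are assumptions for illustration.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO (..))

-- One node: a two-slot mutable array, slot 0 = previous, slot 1 = next.
data Node = Node (SmallMutableArray# RealWorld Node)

-- A fresh node has both links tied back on itself.
newNode :: IO Node
newNode = IO $ \s -> case newSmallArray# 2# (error "uninitialised slot") s of
  (# s1, arr #) ->
    let self = Node arr
    in case writeSmallArray# arr 0# self s1 of
         s2 -> case writeSmallArray# arr 1# self s2 of
           s3 -> (# s3, self #)

-- Object identity: do both wrappers refer to the same underlying array?
sameNode :: Node -> Node -> Bool
sameNode (Node a) (Node b) = isTrue# (sameSmallMutableArray# a b)

readSlot :: Int# -> Node -> IO Node
readSlot i (Node arr) = IO (readSmallArray# arr i)

writeSlot :: Int# -> Node -> Node -> IO ()
writeSlot i (Node arr) x = IO $ \s -> case writeSmallArray# arr i x s of
  s' -> (# s', () #)

-- Make b the successor of a, and a the predecessor of b.
link :: Node -> Node -> IO ()
link a b = writeSlot 1# a b >> writeSlot 0# b a

-- A node is the last one when its next link points back at itself.
atEnd :: Node -> IO Bool
atEnd n = sameNode n <$> readSlot 1# n

-- Count the nodes from n up to and including the last one.
lengthFrom :: Node -> IO Int
lengthFrom n = do
  end <- atEnd n
  if end then pure 1 else readSlot 1# n >>= fmap (+ 1) . lengthFrom
```

No separate `Nil` constructor is needed in this layout, because the self-loop plays that role.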
>
> I'm pretty much willing to sacrifice almost anything on the altar of speed
> here, but I'd like to be able to let the GC move them and collect them,
> which rules out simpler Ptr and Addr based solutions.
>
> -Edward
>
> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty <
> chak at cse.unsw.edu.au> wrote:
>
> That's an interesting idea.
>
> Manuel
>
> > Edward Kmett :
> >
> > Would it be possible to add unsafe primops to add Array# and SmallArray#
> > entries to an ArrayArray#? The fact that the ArrayArray# entries are all
> > directly unlifted avoiding a level of indirection for the containing
> > structure is amazing, but I can only currently use it if my leaf level
> > data can be 100% unboxed and distributed among ByteArray#s. It'd be nice
> > to be able to have the ability to put SmallArray# a stuff down at the
> > leaves to hold lifted contents.
> >
> > I accept fully that if I name the wrong type when I go to access one of
> > the fields it'll lie to me, but I suppose it'd do that if I tried to use
> > one of the members that held a nested ArrayArray# as a ByteArray# anyways,
> > so it isn't like there is a safety story preventing this.
> >
> > I've been hunting for ways to try to kill the indirection problems I get
> > with Haskell and mutable structures, and I could shoehorn a number of them
> > into ArrayArrays if this worked.
> >
> > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection
> > compared to C/Java and this could reduce that pain to just 1 level of
> > unnecessary indirection.
> >
> > -Edward
>
> > _______________________________________________
> > ghc-devs mailing list
> > ghc-devs at haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ekmett at gmail.com Fri Aug 28 21:43:06 2015
From: ekmett at gmail.com (Edward Kmett)
Date: Fri, 28 Aug 2015 17:43:06 -0400
Subject: ArrayArrays
In-Reply-To:
References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net>
Message-ID:

Some form of MutableStruct# with a known number of words and a known number
of pointers is basically what Ryan Yates was suggesting above, but where the
word counts were stored in the objects themselves.

Given that it'd have a couple of words for those counts it'd likely want to
be something we build in addition to MutVar# rather than a replacement.

On the other hand, if we had to fix those numbers and build info tables that
knew them, and typechecker support, for instance, it'd get rather invasive.

Also, a number of things that we can do with the 'sized' versions above,
like working with evil unsized c-style arrays directly inline at the end of
the structure cease to be possible, so it isn't even a pure win if we did
the engineering effort.

I think 90% of the needs I have are covered just by adding the one
primitive. The last 10% gets pretty invasive.

-Edward

On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton wrote:

> I like the possibility of a general solution for mutable structs (like Ed
> said), and I'm trying to fully understand why it's hard.
>
> So, we can't unpack MutVar into constructors because of object identity
> problems. But what about directly supporting an extensible set of unlifted
> MutStruct# objects, generalizing (and even replacing) MutVar#? That may be
> too much work, but is it problematic otherwise?
>
> Needless to say, this is also critical if we ever want best in class
> lockfree mutable structures, just like their Stm and sequential
> counterparts.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From rrnewton at gmail.com Fri Aug 28 21:51:30 2015 From: rrnewton at gmail.com (Ryan Newton) Date: Fri, 28 Aug 2015 21:51:30 +0000 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> Message-ID: So that primitive is an array like thing (Same pointed type, unbounded length) with extra payload. I can see how we can do without structs if we have arrays, especially with the extra payload at front. But wouldn't the general solution for structs be one that that allows new user data type defs for # types? On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett wrote: > Some form of MutableStruct# with a known number of words and a known > number of pointers is basically what Ryan Yates was suggesting above, but > where the word counts were stored in the objects themselves. > > Given that it'd have a couple of words for those counts it'd likely want > to be something we build in addition to MutVar# rather than a replacement. > > On the other hand, if we had to fix those numbers and build info tables > that knew them, and typechecker support, for instance, it'd get rather > invasive. > > Also, a number of things that we can do with the 'sized' versions above, > like working with evil unsized c-style arrays directly inline at the end of > the structure cease to be possible, so it isn't even a pure win if we did > the engineering effort. > > I think 90% of the needs I have are covered just by adding the one > primitive. The last 10% gets pretty invasive. > > -Edward > > On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton wrote: > >> I like the possibility of a general solution for mutable structs (like Ed >> said), and I'm trying to fully understand why it's hard. >> >> So, we can't unpack MutVar into constructors because of object identity >> problems. 
But what about directly supporting an extensible set of unlifted >> MutStruct# objects, generalizing (and even replacing) MutVar#? That may be >> too much work, but is it problematic otherwise? >> >> Needless to say, this is also critical if we ever want best in class >> lockfree mutable structures, just like their Stm and sequential >> counterparts. >> >> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >> wrote: >> >>> At the very least I'll take this email and turn it into a short article. >>> >>> Yes, please do make it into a wiki page on the GHC Trac, and maybe make >>> a ticket for it. >>> >>> >>> Thanks >>> >>> >>> >>> Simon >>> >>> >>> >>> *From:* Edward Kmett [mailto:ekmett at gmail.com] >>> *Sent:* 27 August 2015 16:54 >>> *To:* Simon Peyton Jones >>> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs >>> *Subject:* Re: ArrayArrays >>> >>> >>> >>> An ArrayArray# is just an Array# with a modified invariant. It points >>> directly to other unlifted ArrayArray#'s or ByteArray#'s. >>> >>> >>> >>> While those live in #, they are garbage collected objects, so this all >>> lives on the heap. >>> >>> >>> >>> They were added to make some of the DPH stuff fast when it has to deal >>> with nested arrays. >>> >>> >>> >>> I'm currently abusing them as a placeholder for a better thing. >>> >>> >>> >>> The Problem >>> >>> ----------------- >>> >>> >>> >>> Consider the scenario where you write a classic doubly-linked list in >>> Haskell. >>> >>> >>> >>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL) >>> >>> >>> >>> Chasing from one DLL to the next requires following 3 pointers on the >>> heap. >>> >>> >>> >>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL >>> ~> DLL >>> >>> >>> >>> That is 3 levels of indirection. >>> >>> >>> >>> We can trim one by simply unpacking the IORef with -funbox-strict-fields >>> or UNPACK >>> >>> >>> >>> We can trim another by adding a 'Nil' constructor for DLL and worsening >>> our representation. 
>>> >>> >>> >>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>> >>> >>> >>> but now we're still stuck with a level of indirection >>> >>> >>> >>> DLL ~> MutVar# RealWorld DLL ~> DLL >>> >>> >>> >>> This means that every operation we perform on this structure will be >>> about half of the speed of an implementation in most other languages >>> assuming we're memory bound on loading things into cache! >>> >>> >>> >>> Making Progress >>> >>> ---------------------- >>> >>> >>> >>> I have been working on a number of data structures where the indirection >>> of going from something in * out to an object in # which contains the real >>> pointer to my target and coming back effectively doubles my runtime. >>> >>> >>> >>> We go out to the MutVar# because we are allowed to put the MutVar# onto >>> the mutable list when we dirty it. There is a well defined write-barrier. >>> >>> >>> >>> I could change out the representation to use >>> >>> >>> >>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >>> >>> >>> >>> I can just store two pointers in the MutableArray# every time, but this >>> doesn't help _much_ directly. It has reduced the amount of distinct >>> addresses in memory I touch on a walk of the DLL from 3 per object to 2. >>> >>> >>> >>> I still have to go out to the heap from my DLL and get to the array >>> object and then chase it to the next DLL and chase that to the next array. >>> I do get my two pointers together in memory though. I'm paying for a card >>> marking table as well, which I don't particularly need with just two >>> pointers, but we can shed that with the "SmallMutableArray#" machinery >>> added back in 7.10, which is just the old array code a a new data type, >>> which can speed things up a bit when you don't have very big arrays: >>> >>> >>> >>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >>> >>> >>> >>> But what if I wanted my object itself to live in # and have two mutable >>> fields and be able to share the sme write barrier? 
>>> >>> >>> >>> An ArrayArray# points directly to other unlifted array types. What if we >>> have one # -> * wrapper on the outside to deal with the impedence mismatch >>> between the imperative world and Haskell, and then just let the >>> ArrayArray#'s hold other arrayarrays. >>> >>> >>> >>> data DLL = DLL (MutableArrayArray# RealWorld) >>> >>> >>> >>> now I need to make up a new Nil, which I can just make be a special >>> MutableArrayArray# I allocate on program startup. I can even abuse pattern >>> synonyms. Alternately I can exploit the internals further to make this >>> cheaper. >>> >>> >>> >>> Then I can use the readMutableArrayArray# and writeMutableArrayArray# >>> calls to directly access the preceding and next entry in the linked list. >>> >>> >>> >>> So now we have one DLL wrapper which just 'bootstraps me' into a strict >>> world, and everything there lives in #. >>> >>> >>> >>> next :: DLL -> IO DLL >>> >>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of >>> >>> (# s', n #) -> (# s', DLL n #) >>> >>> >>> >>> It turns out GHC is quite happy to optimize all of that code to keep >>> things unboxed. The 'DLL' wrappers get removed pretty easily when they are >>> known strict and you chain operations of this sort! >>> >>> >>> >>> Cleaning it Up >>> >>> ------------------ >>> >>> >>> >>> Now I have one outermost indirection pointing to an array that points >>> directly to other arrays. >>> >>> >>> >>> I'm stuck paying for a card marking table per object, but I can fix that >>> by duplicating the code for MutableArrayArray# and using a >>> SmallMutableArray#. I can hack up primops that let me store a mixture of >>> SmallMutableArray# fields and normal ones in the data structure. >>> Operationally, I can even do so by just unsafeCoercing the existing >>> SmallMutableArray# primitives to change the kind of one of the arguments it >>> takes. >>> >>> >>> >>> This is almost ideal, but not quite. I often have fields that would be >>> best left unboxed. 
>>> >>> >>> >>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>> >>> >>> >>> was able to unpack the Int, but we lost that. We can currently at best >>> point one of the entries of the SmallMutableArray# at a boxed or at a >>> MutableByteArray# for all of our misc. data and shove the int in question >>> in there. >>> >>> >>> >>> e.g. if I were to implement a hash-array-mapped-trie I need to store >>> masks and administrivia as I walk down the tree. Having to go off to the >>> side costs me the entire win from avoiding the first pointer chase. >>> >>> >>> >>> But, if like Ryan suggested, we had a heap object we could construct >>> that had n words with unsafe access and m pointers to other heap objects, >>> one that could put itself on the mutable list when any of those pointers >>> changed then I could shed this last factor of two in all circumstances. >>> >>> >>> >>> Prototype >>> >>> ------------- >>> >>> >>> >>> Over the last few days I've put together a small prototype >>> implementation with a few non-trivial imperative data structures for things >>> like Tarjan's link-cut trees, the list labeling problem and >>> order-maintenance. >>> >>> >>> >>> https://github.com/ekmett/structs >>> >>> >>> >>> Notable bits: >>> >>> >>> >>> Data.Struct.Internal.LinkCut >>> >>> provides an implementation of link-cut trees in this style. >>> >>> >>> >>> Data.Struct.Internal >>> >>> provides the rather horrifying guts that make it go fast. >>> >>> >>> >>> Once compiled with -O or -O2, if you look at the core, almost all the >>> references to the LinkCut or Object data constructor get optimized away, >>> and we're left with beautiful strict code directly mutating out underlying >>> representation. >>> >>> >>> >>> At the very least I'll take this email and turn it into a short article. 
>>> >>> >>> >>> -Edward >>> >>> >>> >>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones < >>> simonpj at microsoft.com> wrote: >>> >>> Just to say that I have no idea what is going on in this thread. What >>> is ArrayArray? What is the issue in general? Is there a ticket? Is there >>> a wiki page? >>> >>> >>> >>> If it?s important, an ab-initio wiki page + ticket would be a good thing. >>> >>> >>> >>> Simon >>> >>> >>> >>> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Edward >>> Kmett >>> *Sent:* 21 August 2015 05:25 >>> *To:* Manuel M T Chakravarty >>> *Cc:* Simon Marlow; ghc-devs >>> *Subject:* Re: ArrayArrays >>> >>> >>> >>> When (ab)using them for this purpose, SmallArrayArray's would be very >>> handy as well. >>> >>> >>> >>> Consider right now if I have something like an order-maintenance >>> structure I have: >>> >>> >>> >>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} >>> !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) >>> >>> >>> >>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} >>> !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-} >>> !(MutVar s (Lower s)) >>> >>> >>> >>> The former contains, logically, a mutable integer and two pointers, one >>> for forward and one for backwards. The latter is basically the same thing >>> with a mutable reference up pointing at the structure above. >>> >>> >>> >>> On the heap this is an object that points to a structure for the >>> bytearray, and points to another structure for each mutvar which each point >>> to the other 'Upper' structure. So there is a level of indirection smeared >>> over everything. >>> >>> >>> >>> So this is a pair of doubly linked lists with an upward link from the >>> structure below to the structure above. 
>>> >>> >>> >>> Converted into ArrayArray#s I'd get >>> >>> >>> >>> data Upper s = Upper (MutableArrayArray# s) >>> >>> >>> >>> w/ the first slot being a pointer to a MutableByteArray#, and the next 2 >>> slots pointing to the previous and next previous objects, represented just >>> as their MutableArrayArray#s. I can use sameMutableArrayArray# on these for >>> object identity, which lets me check for the ends of the lists by tying >>> things back on themselves. >>> >>> >>> >>> and below that >>> >>> >>> >>> data Lower s = Lower (MutableArrayArray# s) >>> >>> >>> >>> is similar, with an extra MutableArrayArray slot pointing up to an upper >>> structure. >>> >>> >>> >>> I can then write a handful of combinators for getting out the slots in >>> question, while it has gained a level of indirection between the wrapper to >>> put it in * and the MutableArrayArray# s in #, that one can be basically >>> erased by ghc. >>> >>> >>> >>> Unlike before I don't have several separate objects on the heap for each >>> thing. I only have 2 now. The MutableArrayArray# for the object itself, and >>> the MutableByteArray# that it references to carry around the mutable int. >>> >>> >>> >>> The only pain points are >>> >>> >>> >>> 1.) the aforementioned limitation that currently prevents me from >>> stuffing normal boxed data through a SmallArray or Array into an ArrayArray >>> leaving me in a little ghetto disconnected from the rest of Haskell, >>> >>> >>> >>> and >>> >>> >>> >>> 2.) the lack of SmallArrayArray's, which could let us avoid the card >>> marking overhead. These objects are all small, 3-4 pointers wide. Card >>> marking doesn't help. 
>>> >>> >>> >>> Alternately I could just try to do really evil things and convert the >>> whole mess to SmallArrays and then figure out how to unsafeCoerce my way to >>> glory, stuffing the #'d references to the other arrays directly into the >>> SmallArray as slots, removing the limitation we see here by aping the >>> MutableArrayArray# s API, but that gets really really dangerous! >>> >>> >>> >>> I'm pretty much willing to sacrifice almost anything on the altar of >>> speed here, but I'd like to be able to let the GC move them and collect >>> them which rules out simpler Ptr and Addr based solutions. >>> >>> >>> >>> -Edward >>> >>> >>> >>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty < >>> chak at cse.unsw.edu.au> wrote: >>> >>> That's an interesting idea. >>> >>> Manuel >>> >>> > Edward Kmett : >>> >>> > >>> > Would it be possible to add unsafe primops to add Array# and >>> SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# >>> entries are all directly unlifted avoiding a level of indirection for the >>> containing structure is amazing, but I can only currently use it if my leaf >>> level data can be 100% unboxed and distributed among ByteArray#s. It'd be >>> nice to be able to have the ability to put SmallArray# a stuff down at the >>> leaves to hold lifted contents. >>> > >>> > I accept fully that if I name the wrong type when I go to access one >>> of the fields it'll lie to me, but I suppose it'd do that if I tried to use >>> one of the members that held a nested ArrayArray# as a ByteArray# anyways, >>> so it isn't like there is a safety story preventing this. >>> > >>> > I've been hunting for ways to try to kill the indirection problems I >>> get with Haskell and mutable structures, and I could shoehorn a number of >>> them into ArrayArrays if this worked.
>>> > >>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary >>> indirection compared to c/java and this could reduce that pain to just 1 >>> level of unnecessary indirection. >>> > >>> > -Edward >>> >>> > _______________________________________________ >>> > ghc-devs mailing list >>> > ghc-devs at haskell.org >>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >>> >>> >>> >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs at haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekmett at gmail.com Fri Aug 28 22:07:44 2015 From: ekmett at gmail.com (Edward Kmett) Date: Fri, 28 Aug 2015 18:07:44 -0400 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> Message-ID: I think both are useful, but the one you suggest requires a lot more plumbing and doesn't subsume all of the usecases of the other. -Edward On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton wrote: > So that primitive is an array like thing (Same pointed type, unbounded > length) with extra payload. > > I can see how we can do without structs if we have arrays, especially with > the extra payload at front. But wouldn't the general solution for structs > be one that that allows new user data type defs for # types? > > > > On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett wrote: > >> Some form of MutableStruct# with a known number of words and a known >> number of pointers is basically what Ryan Yates was suggesting above, but >> where the word counts were stored in the objects themselves. >> >> Given that it'd have a couple of words for those counts it'd likely want >> to be something we build in addition to MutVar# rather than a replacement. 
>> >> On the other hand, if we had to fix those numbers and build info tables >> that knew them, and typechecker support, for instance, it'd get rather >> invasive. >> >> Also, a number of things that we can do with the 'sized' versions above, >> like working with evil unsized c-style arrays directly inline at the end of >> the structure cease to be possible, so it isn't even a pure win if we did >> the engineering effort. >> >> I think 90% of the needs I have are covered just by adding the one >> primitive. The last 10% gets pretty invasive. >> >> -Edward >> >> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton wrote: >> >>> I like the possibility of a general solution for mutable structs (like >>> Ed said), and I'm trying to fully understand why it's hard. >>> >>> So, we can't unpack MutVar into constructors because of object identity >>> problems. But what about directly supporting an extensible set of unlifted >>> MutStruct# objects, generalizing (and even replacing) MutVar#? That may be >>> too much work, but is it problematic otherwise? >>> >>> Needless to say, this is also critical if we ever want best in class >>> lockfree mutable structures, just like their Stm and sequential >>> counterparts. >>> >>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones < >>> simonpj at microsoft.com> wrote: >>> >>>> At the very least I'll take this email and turn it into a short article. >>>> >>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe make >>>> a ticket for it. >>>> >>>> >>>> Thanks >>>> >>>> >>>> >>>> Simon >>>> >>>> >>>> >>>> *From:* Edward Kmett [mailto:ekmett at gmail.com] >>>> *Sent:* 27 August 2015 16:54 >>>> *To:* Simon Peyton Jones >>>> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs >>>> *Subject:* Re: ArrayArrays >>>> >>>> >>>> >>>> An ArrayArray# is just an Array# with a modified invariant. It points >>>> directly to other unlifted ArrayArray#'s or ByteArray#'s. 
>>>> >>>> >>>> >>>> While those live in #, they are garbage collected objects, so this all >>>> lives on the heap. >>>> >>>> >>>> >>>> They were added to make some of the DPH stuff fast when it has to deal >>>> with nested arrays. >>>> >>>> >>>> >>>> I'm currently abusing them as a placeholder for a better thing. >>>> >>>> >>>> >>>> The Problem >>>> >>>> ----------------- >>>> >>>> >>>> >>>> Consider the scenario where you write a classic doubly-linked list in >>>> Haskell. >>>> >>>> >>>> >>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) >>>> >>>> >>>> >>>> Chasing from one DLL to the next requires following 3 pointers on the >>>> heap. >>>> >>>> >>>> >>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL >>>> ~> DLL >>>> >>>> >>>> >>>> That is 3 levels of indirection. >>>> >>>> >>>> >>>> We can trim one by simply unpacking the IORef with >>>> -funbox-strict-fields or UNPACK >>>> >>>> >>>> >>>> We can trim another by adding a 'Nil' constructor for DLL and worsening >>>> our representation. >>>> >>>> >>>> >>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>>> >>>> >>>> >>>> but now we're still stuck with a level of indirection >>>> >>>> >>>> >>>> DLL ~> MutVar# RealWorld DLL ~> DLL >>>> >>>> >>>> >>>> This means that every operation we perform on this structure will be >>>> about half of the speed of an implementation in most other languages >>>> assuming we're memory bound on loading things into cache! >>>> >>>> >>>> >>>> Making Progress >>>> >>>> ---------------------- >>>> >>>> >>>> >>>> I have been working on a number of data structures where the >>>> indirection of going from something in * out to an object in # which >>>> contains the real pointer to my target and coming back effectively doubles >>>> my runtime. >>>> >>>> >>>> >>>> We go out to the MutVar# because we are allowed to put the MutVar# onto >>>> the mutable list when we dirty it. There is a well defined write-barrier.
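The IORef-based list with a 'Nil' constructor from "The Problem" above can be sketched roughly as follows. This is a minimal illustration only: the Int payload and the helper names (newNode, link, values) are assumptions added here and are not part of the original message. Each step of the walk goes DLL ~> MutVar# ~> DLL, which is the remaining level of indirection being discussed.

```haskell
import Data.IORef

-- A node with previous/next links; 'Nil' marks both ends of the list.
-- The Int payload is hypothetical, added so the walk has something to show.
data DLL = Nil | DLL !Int !(IORef DLL) !(IORef DLL)

-- Allocate an unlinked node.
newNode :: Int -> IO DLL
newNode x = DLL x <$> newIORef Nil <*> newIORef Nil

-- Make 'b' the successor of 'a' (and 'a' the predecessor of 'b').
link :: DLL -> DLL -> IO ()
link a@(DLL _ _ aNext) b@(DLL _ bPrev _) = do
  writeIORef aNext b
  writeIORef bPrev a
link _ _ = return ()

-- Walk forward; every step reads through one IORef (one MutVar# hop).
values :: DLL -> IO [Int]
values Nil = return []
values (DLL x _ nextRef) = do
  rest <- readIORef nextRef >>= values
  return (x : rest)

main :: IO ()
main = do
  a <- newNode 1
  b <- newNode 2
  c <- newNode 3
  link a b
  link b c
  values a >>= print  -- prints [1,2,3]
```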
>>>> >>>> >>>> >>>> I could change out the representation to use >>>> >>>> >>>> >>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >>>> >>>> >>>> >>>> I can just store two pointers in the MutableArray# every time, but this >>>> doesn't help _much_ directly. It has reduced the number of distinct >>>> addresses in memory I touch on a walk of the DLL from 3 per object to 2. >>>> >>>> >>>> >>>> I still have to go out to the heap from my DLL and get to the array >>>> object and then chase it to the next DLL and chase that to the next array. >>>> I do get my two pointers together in memory though. I'm paying for a card >>>> marking table as well, which I don't particularly need with just two >>>> pointers, but we can shed that with the "SmallMutableArray#" machinery >>>> added back in 7.10, which is just the old array code as a new data type, >>>> which can speed things up a bit when you don't have very big arrays: >>>> >>>> >>>> >>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >>>> >>>> >>>> >>>> But what if I wanted my object itself to live in # and have two mutable >>>> fields and be able to share the same write barrier? >>>> >>>> >>>> >>>> An ArrayArray# points directly to other unlifted array types. What if >>>> we have one # -> * wrapper on the outside to deal with the impedance >>>> mismatch between the imperative world and Haskell, and then just let the >>>> ArrayArray#'s hold other arrayarrays. >>>> >>>> >>>> >>>> data DLL = DLL (MutableArrayArray# RealWorld) >>>> >>>> >>>> >>>> now I need to make up a new Nil, which I can just make be a special >>>> MutableArrayArray# I allocate on program startup. I can even abuse pattern >>>> synonyms. Alternately I can exploit the internals further to make this >>>> cheaper. >>>> >>>> >>>> >>>> Then I can use the readMutableArrayArray# and writeMutableArrayArray# >>>> calls to directly access the preceding and next entry in the linked list.
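The SmallMutableArray# variant described above, with both links stored side by side in one small array, might look like the sketch below. The slot layout (slot 0 = previous, slot 1 = next) and the helper names (newNode, link, next, len) are assumptions of this sketch, not something the original message specifies. It uses only the SmallMutableArray# primops exported from GHC.Exts.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO (..))

-- One node = one small mutable array holding both links.
-- Assumed layout: slot 0 = previous, slot 1 = next. 'Nil' marks the ends.
data DLL = Nil | DLL (SmallMutableArray# RealWorld DLL)

-- Allocate a node with both links initialised to Nil.
newNode :: IO DLL
newNode = IO (\s -> case newSmallArray# 2# Nil s of
  (# s', m #) -> (# s', DLL m #))

-- Make 'b' the successor of 'a' and 'a' the predecessor of 'b'.
link :: DLL -> DLL -> IO ()
link a@(DLL am) b@(DLL bm) = IO (\s ->
  case writeSmallArray# am 1# b s of
    s1 -> case writeSmallArray# bm 0# a s1 of
      s2 -> (# s2, () #))
link _ _ = return ()

-- Follow the 'next' link: a single read from the node's own array.
next :: DLL -> IO DLL
next Nil     = return Nil
next (DLL m) = IO (\s -> readSmallArray# m 1# s)

-- Count the nodes reachable by following 'next'.
len :: DLL -> IO Int
len Nil = return 0
len d   = do
  n <- next d
  k <- len n
  return (k + 1)

main :: IO ()
main = do
  a <- newNode
  b <- newNode
  c <- newNode
  link a b
  link b c
  len a >>= print  -- prints 3
```

Compared with the IORef version, both links of a node share one heap object and one write barrier, but the lifted DLL wrapper still sits between consecutive arrays; removing that last hop is what the ArrayArray# trick is for.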
>>>> >>>> >>>> >>>> So now we have one DLL wrapper which just 'bootstraps me' into a strict >>>> world, and everything there lives in #. >>>> >>>> >>>> >>>> next :: DLL -> IO DLL >>>> >>>> next (DLL m) = IO $ \s -> case readMutableArrayArrayArray# m 1# s of >>>> >>>> (# s', n #) -> (# s', DLL n #) >>>> >>>> >>>> >>>> It turns out GHC is quite happy to optimize all of that code to keep >>>> things unboxed. The 'DLL' wrappers get removed pretty easily when they are >>>> known strict and you chain operations of this sort! >>>> >>>> >>>> >>>> Cleaning it Up >>>> >>>> ------------------ >>>> >>>> >>>> >>>> Now I have one outermost indirection pointing to an array that points >>>> directly to other arrays. >>>> >>>> >>>> >>>> I'm stuck paying for a card marking table per object, but I can fix >>>> that by duplicating the code for MutableArrayArray# and using a >>>> SmallMutableArray#. I can hack up primops that let me store a mixture of >>>> SmallMutableArray# fields and normal ones in the data structure. >>>> Operationally, I can even do so by just unsafeCoercing the existing >>>> SmallMutableArray# primitives to change the kind of one of the arguments it >>>> takes. >>>> >>>> >>>> >>>> This is almost ideal, but not quite. I often have fields that would be >>>> best left unboxed. >>>> >>>> >>>> >>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>>> >>>> >>>> >>>> was able to unpack the Int, but we lost that. We can currently at best >>>> point one of the entries of the SmallMutableArray# at a boxed or at a >>>> MutableByteArray# for all of our misc. data and shove the int in question >>>> in there. >>>> >>>> >>>> >>>> e.g. if I were to implement a hash-array-mapped-trie I need to store >>>> masks and administrivia as I walk down the tree. Having to go off to the >>>> side costs me the entire win from avoiding the first pointer chase.
>>>> >>>> >>>> >>>> But, if like Ryan suggested, we had a heap object we could construct >>>> that had n words with unsafe access and m pointers to other heap objects, >>>> one that could put itself on the mutable list when any of those pointers >>>> changed then I could shed this last factor of two in all circumstances. >>>> >>>> >>>> >>>> Prototype >>>> >>>> ------------- >>>> >>>> >>>> >>>> Over the last few days I've put together a small prototype >>>> implementation with a few non-trivial imperative data structures for things >>>> like Tarjan's link-cut trees, the list labeling problem and >>>> order-maintenance. >>>> >>>> >>>> >>>> https://github.com/ekmett/structs >>>> >>>> >>>> >>>> Notable bits: >>>> >>>> >>>> >>>> Data.Struct.Internal.LinkCut >>>> >>>> provides an implementation of link-cut trees in this style. >>>> >>>> >>>> >>>> Data.Struct.Internal >>>> >>>> provides the rather horrifying guts that make it go fast. >>>> >>>> >>>> >>>> Once compiled with -O or -O2, if you look at the core, almost all the >>>> references to the LinkCut or Object data constructor get optimized away, >>>> and we're left with beautiful strict code directly mutating our underlying >>>> representation. >>>> >>>> >>>> >>>> At the very least I'll take this email and turn it into a short article. >>>> >>>> >>>> >>>> -Edward >>>> >>>> >>>> >>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones < >>>> simonpj at microsoft.com> wrote: >>>> >>>> Just to say that I have no idea what is going on in this thread. What >>>> is ArrayArray? What is the issue in general? Is there a ticket? Is there >>>> a wiki page? >>>> >>>> >>>> >>>> If it's important, an ab-initio wiki page + ticket would be a good >>>> thing.
>>>> >>>> >>>> >>>> Simon >>>> >>>> >>>> >>>> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Edward >>>> Kmett >>>> *Sent:* 21 August 2015 05:25 >>>> *To:* Manuel M T Chakravarty >>>> *Cc:* Simon Marlow; ghc-devs >>>> *Subject:* Re: ArrayArrays >>>> >>>> >>>> >>>> When (ab)using them for this purpose, SmallArrayArray's would be very >>>> handy as well. >>>> >>>> >>>> >>>> Consider right now if I have something like an order-maintenance >>>> structure I have: >>>> >>>> >>>> >>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK >>>> #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) >>>> >>>> >>>> >>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK >>>> #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK >>>> #-} !(MutVar s (Lower s)) >>>> >>>> >>>> >>>> The former contains, logically, a mutable integer and two pointers, one >>>> for forward and one for backwards. The latter is basically the same thing >>>> with a mutable reference up pointing at the structure above. >>>> >>>> >>>> >>>> On the heap this is an object that points to a structure for the >>>> bytearray, and points to another structure for each mutvar which each point >>>> to the other 'Upper' structure. So there is a level of indirection smeared >>>> over everything. >>>> >>>> >>>> >>>> So this is a pair of doubly linked lists with an upward link from the >>>> structure below to the structure above. >>>> >>>> >>>> >>>> Converted into ArrayArray#s I'd get >>>> >>>> >>>> >>>> data Upper s = Upper (MutableArrayArray# s) >>>> >>>> >>>> >>>> w/ the first slot being a pointer to a MutableByteArray#, and the next >>>> 2 slots pointing to the previous and next previous objects, represented >>>> just as their MutableArrayArray#s. I can use sameMutableArrayArray# on >>>> these for object identity, which lets me check for the ends of the lists by >>>> tying things back on themselves. 
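The "tie things back on themselves" idea for detecting the end of a list by object identity can be illustrated with a small sketch. Note the substitution: this uses the Eq instance of IORef (which compares reference identity) as a stand-in for sameMutableArrayArray#, so that the example needs nothing beyond base. The Node type and helper names are hypothetical.

```haskell
import Data.IORef

-- A node holds only a 'next' reference. The end of the list is a node
-- whose 'next' reference contains itself, so no separate Nil is needed.
newtype Node = Node (IORef Node)

-- Allocate the end marker: a node tied back on itself.
newEnd :: IO Node
newEnd = do
  r <- newIORef (error "uninitialised next pointer")
  writeIORef r (Node r)
  return (Node r)

-- Allocate a node that points at an existing node.
newBefore :: Node -> IO Node
newBefore n = Node <$> newIORef n

-- Identity test: does this node's 'next' refer back to the node itself?
-- (==) on IORef compares references, not contents.
isEnd :: Node -> IO Bool
isEnd (Node r) = do
  Node r' <- readIORef r
  return (r == r')

-- Number of hops from a node to the self-referential end marker.
stepsToEnd :: Node -> IO Int
stepsToEnd n@(Node r) = do
  end <- isEnd n
  if end
    then return 0
    else do
      n' <- readIORef r
      k  <- stepsToEnd n'
      return (k + 1)

main :: IO ()
main = do
  e <- newEnd
  a <- newBefore e
  b <- newBefore a
  stepsToEnd b >>= print  -- prints 2
  isEnd e >>= print       -- prints True
```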
>>>> >>>> >>>> >>>> and below that >>>> >>>> >>>> >>>> data Lower s = Lower (MutableArrayArray# s) >>>> >>>> >>>> >>>> is similar, with an extra MutableArrayArray slot pointing up to an >>>> upper structure. >>>> >>>> >>>> >>>> I can then write a handful of combinators for getting out the slots in >>>> question, while it has gained a level of indirection between the wrapper to >>>> put it in * and the MutableArrayArray# s in #, that one can be basically >>>> erased by ghc. >>>> >>>> >>>> >>>> Unlike before I don't have several separate objects on the heap for >>>> each thing. I only have 2 now. The MutableArrayArray# for the object >>>> itself, and the MutableByteArray# that it references to carry around the >>>> mutable int. >>>> >>>> >>>> >>>> The only pain points are >>>> >>>> >>>> >>>> 1.) the aforementioned limitation that currently prevents me from >>>> stuffing normal boxed data through a SmallArray or Array into an ArrayArray >>>> leaving me in a little ghetto disconnected from the rest of Haskell, >>>> >>>> >>>> >>>> and >>>> >>>> >>>> >>>> 2.) the lack of SmallArrayArray's, which could let us avoid the card >>>> marking overhead. These objects are all small, 3-4 pointers wide. Card >>>> marking doesn't help. >>>> >>>> >>>> >>>> Alternately I could just try to do really evil things and convert the >>>> whole mess to SmallArrays and then figure out how to unsafeCoerce my way to >>>> glory, stuffing the #'d references to the other arrays directly into the >>>> SmallArray as slots, removing the limitation we see here by aping the >>>> MutableArrayArray# s API, but that gets really really dangerous! >>>> >>>> >>>> >>>> I'm pretty much willing to sacrifice almost anything on the altar of >>>> speed here, but I'd like to be able to let the GC move them and collect >>>> them which rules out simpler Ptr and Addr based solutions. 
>>>> >>>> >>>> >>>> -Edward >>>> >>>> >>>> >>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty < >>>> chak at cse.unsw.edu.au> wrote: >>>> >>>> That's an interesting idea. >>>> >>>> Manuel >>>> >>>> > Edward Kmett : >>>> >>>> > >>>> > Would it be possible to add unsafe primops to add Array# and >>>> SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# >>>> entries are all directly unlifted avoiding a level of indirection for the >>>> containing structure is amazing, but I can only currently use it if my leaf >>>> level data can be 100% unboxed and distributed among ByteArray#s. It'd be >>>> nice to be able to have the ability to put SmallArray# a stuff down at the >>>> leaves to hold lifted contents. >>>> > >>>> > I accept fully that if I name the wrong type when I go to access one >>>> of the fields it'll lie to me, but I suppose it'd do that if I tried to use >>>> one of the members that held a nested ArrayArray# as a ByteArray# anyways, >>>> so it isn't like there is a safety story preventing this. >>>> > >>>> > I've been hunting for ways to try to kill the indirection problems I >>>> get with Haskell and mutable structures, and I could shoehorn a number of >>>> them into ArrayArrays if this worked. >>>> > >>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary >>>> indirection compared to c/java and this could reduce that pain to just 1 >>>> level of unnecessary indirection. >>>> > >>>> > -Edward >>>> >>>> > _______________________________________________ >>>> > ghc-devs mailing list >>>> > ghc-devs at haskell.org >>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> ghc-devs mailing list >>>> ghc-devs at haskell.org >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed...
URL: From rrnewton at gmail.com Fri Aug 28 22:14:21 2015 From: rrnewton at gmail.com (Ryan Newton) Date: Fri, 28 Aug 2015 22:14:21 +0000 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> Message-ID: Yes. And for the short term I can imagine places we will settle with arrays even if it means tracking lengths unnecessarily and unsafeCoercing pointers whose types don't actually match their siblings. Is there anything to recommend the hacks mentioned for fixed sized array objects *other* than using them to fake structs? (Much to derecommend, as you mentioned!) On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett wrote: > I think both are useful, but the one you suggest requires a lot more > plumbing and doesn't subsume all of the usecases of the other. > > -Edward > > On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton wrote: > >> So that primitive is an array like thing (Same pointed type, unbounded >> length) with extra payload. >> >> I can see how we can do without structs if we have arrays, especially >> with the extra payload at front. But wouldn't the general solution for >> structs be one that that allows new user data type defs for # types? >> >> >> >> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett wrote: >> >>> Some form of MutableStruct# with a known number of words and a known >>> number of pointers is basically what Ryan Yates was suggesting above, but >>> where the word counts were stored in the objects themselves. >>> >>> Given that it'd have a couple of words for those counts it'd likely want >>> to be something we build in addition to MutVar# rather than a replacement. >>> >>> On the other hand, if we had to fix those numbers and build info tables >>> that knew them, and typechecker support, for instance, it'd get rather >>> invasive. 
>>> >>> Also, a number of things that we can do with the 'sized' versions above, >>> like working with evil unsized c-style arrays directly inline at the end of >>> the structure cease to be possible, so it isn't even a pure win if we did >>> the engineering effort. >>> >>> I think 90% of the needs I have are covered just by adding the one >>> primitive. The last 10% gets pretty invasive. >>> >>> -Edward >>> >>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton wrote: >>> >>>> I like the possibility of a general solution for mutable structs (like >>>> Ed said), and I'm trying to fully understand why it's hard. >>>> >>>> So, we can't unpack MutVar into constructors because of object identity >>>> problems. But what about directly supporting an extensible set of unlifted >>>> MutStruct# objects, generalizing (and even replacing) MutVar#? That may be >>>> too much work, but is it problematic otherwise? >>>> >>>> Needless to say, this is also critical if we ever want best in class >>>> lockfree mutable structures, just like their Stm and sequential >>>> counterparts. >>>> >>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones < >>>> simonpj at microsoft.com> wrote: >>>> >>>>> At the very least I'll take this email and turn it into a short >>>>> article. >>>>> >>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe >>>>> make a ticket for it. >>>>> >>>>> >>>>> Thanks >>>>> >>>>> >>>>> >>>>> Simon >>>>> >>>>> >>>>> >>>>> *From:* Edward Kmett [mailto:ekmett at gmail.com] >>>>> *Sent:* 27 August 2015 16:54 >>>>> *To:* Simon Peyton Jones >>>>> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs >>>>> *Subject:* Re: ArrayArrays >>>>> >>>>> >>>>> >>>>> An ArrayArray# is just an Array# with a modified invariant. It points >>>>> directly to other unlifted ArrayArray#'s or ByteArray#'s. >>>>> >>>>> >>>>> >>>>> While those live in #, they are garbage collected objects, so this all >>>>> lives on the heap. 
>>>>> >>>>> >>>>> >>>>> They were added to make some of the DPH stuff fast when it has to deal >>>>> with nested arrays. >>>>> >>>>> >>>>> >>>>> I'm currently abusing them as a placeholder for a better thing. >>>>> >>>>> >>>>> >>>>> The Problem >>>>> >>>>> ----------------- >>>>> >>>>> >>>>> >>>>> Consider the scenario where you write a classic doubly-linked list in >>>>> Haskell. >>>>> >>>>> >>>>> >>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) >>>>> >>>>> >>>>> >>>>> Chasing from one DLL to the next requires following 3 pointers on the >>>>> heap. >>>>> >>>>> >>>>> >>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL >>>>> ~> DLL >>>>> >>>>> >>>>> >>>>> That is 3 levels of indirection. >>>>> >>>>> >>>>> >>>>> We can trim one by simply unpacking the IORef with >>>>> -funbox-strict-fields or UNPACK >>>>> >>>>> >>>>> >>>>> We can trim another by adding a 'Nil' constructor for DLL and >>>>> worsening our representation. >>>>> >>>>> >>>>> >>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>>>> >>>>> >>>>> >>>>> but now we're still stuck with a level of indirection >>>>> >>>>> >>>>> >>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >>>>> >>>>> >>>>> >>>>> This means that every operation we perform on this structure will be >>>>> about half of the speed of an implementation in most other languages >>>>> assuming we're memory bound on loading things into cache! >>>>> >>>>> >>>>> >>>>> Making Progress >>>>> >>>>> ---------------------- >>>>> >>>>> >>>>> >>>>> I have been working on a number of data structures where the >>>>> indirection of going from something in * out to an object in # which >>>>> contains the real pointer to my target and coming back effectively doubles >>>>> my runtime. >>>>> >>>>> >>>>> >>>>> We go out to the MutVar# because we are allowed to put the MutVar# >>>>> onto the mutable list when we dirty it. There is a well defined >>>>> write-barrier.
>>>>> >>>>> >>>>> >>>>> I could change out the representation to use >>>>> >>>>> >>>>> >>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >>>>> >>>>> >>>>> >>>>> I can just store two pointers in the MutableArray# every time, but >>>>> this doesn't help _much_ directly. It has reduced the number of distinct >>>>> addresses in memory I touch on a walk of the DLL from 3 per object to 2. >>>>> >>>>> >>>>> >>>>> I still have to go out to the heap from my DLL and get to the array >>>>> object and then chase it to the next DLL and chase that to the next array. >>>>> I do get my two pointers together in memory though. I'm paying for a card >>>>> marking table as well, which I don't particularly need with just two >>>>> pointers, but we can shed that with the "SmallMutableArray#" machinery >>>>> added back in 7.10, which is just the old array code as a new data type, >>>>> which can speed things up a bit when you don't have very big arrays: >>>>> >>>>> >>>>> >>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >>>>> >>>>> >>>>> >>>>> But what if I wanted my object itself to live in # and have two >>>>> mutable fields and be able to share the same write barrier? >>>>> >>>>> >>>>> >>>>> An ArrayArray# points directly to other unlifted array types. What if >>>>> we have one # -> * wrapper on the outside to deal with the impedance >>>>> mismatch between the imperative world and Haskell, and then just let the >>>>> ArrayArray#'s hold other arrayarrays. >>>>> >>>>> >>>>> >>>>> data DLL = DLL (MutableArrayArray# RealWorld) >>>>> >>>>> >>>>> >>>>> now I need to make up a new Nil, which I can just make be a special >>>>> MutableArrayArray# I allocate on program startup. I can even abuse pattern >>>>> synonyms. Alternately I can exploit the internals further to make this >>>>> cheaper. >>>>> >>>>> >>>>> >>>>> Then I can use the readMutableArrayArray# and writeMutableArrayArray# >>>>> calls to directly access the preceding and next entry in the linked list.
>>>>> >>>>> >>>>> >>>>> So now we have one DLL wrapper which just 'bootstraps me' into a >>>>> strict world, and everything there lives in #. >>>>> >>>>> >>>>> >>>>> next :: DLL -> IO DLL >>>>> >>>>> next (DLL m) = IO $ \s -> case readMutableArrayArrayArray# m 1# s of >>>>> >>>>> (# s', n #) -> (# s', DLL n #) >>>>> >>>>> >>>>> >>>>> It turns out GHC is quite happy to optimize all of that code to keep >>>>> things unboxed. The 'DLL' wrappers get removed pretty easily when they are >>>>> known strict and you chain operations of this sort! >>>>> >>>>> >>>>> >>>>> Cleaning it Up >>>>> >>>>> ------------------ >>>>> >>>>> >>>>> >>>>> Now I have one outermost indirection pointing to an array that points >>>>> directly to other arrays. >>>>> >>>>> >>>>> >>>>> I'm stuck paying for a card marking table per object, but I can fix >>>>> that by duplicating the code for MutableArrayArray# and using a >>>>> SmallMutableArray#. I can hack up primops that let me store a mixture of >>>>> SmallMutableArray# fields and normal ones in the data structure. >>>>> Operationally, I can even do so by just unsafeCoercing the existing >>>>> SmallMutableArray# primitives to change the kind of one of the arguments it >>>>> takes. >>>>> >>>>> >>>>> >>>>> This is almost ideal, but not quite. I often have fields that would be >>>>> best left unboxed. >>>>> >>>>> >>>>> >>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>>>> >>>>> >>>>> >>>>> was able to unpack the Int, but we lost that. We can currently at best >>>>> point one of the entries of the SmallMutableArray# at a boxed or at a >>>>> MutableByteArray# for all of our misc. data and shove the int in question >>>>> in there. >>>>> >>>>> >>>>> >>>>> e.g. if I were to implement a hash-array-mapped-trie I need to store >>>>> masks and administrivia as I walk down the tree. Having to go off to the >>>>> side costs me the entire win from avoiding the first pointer chase.
>>>>> >>>>> >>>>> >>>>> But, if like Ryan suggested, we had a heap object we could construct >>>>> that had n words with unsafe access and m pointers to other heap objects, >>>>> one that could put itself on the mutable list when any of those pointers >>>>> changed then I could shed this last factor of two in all circumstances. >>>>> >>>>> >>>>> >>>>> Prototype >>>>> >>>>> ------------- >>>>> >>>>> >>>>> >>>>> Over the last few days I've put together a small prototype >>>>> implementation with a few non-trivial imperative data structures for things >>>>> like Tarjan's link-cut trees, the list labeling problem and >>>>> order-maintenance. >>>>> >>>>> >>>>> >>>>> https://github.com/ekmett/structs >>>>> >>>>> >>>>> >>>>> Notable bits: >>>>> >>>>> >>>>> >>>>> Data.Struct.Internal.LinkCut >>>>> >>>>> provides an implementation of link-cut trees in this style. >>>>> >>>>> >>>>> >>>>> Data.Struct.Internal >>>>> >>>>> provides the rather horrifying guts that make it go fast. >>>>> >>>>> >>>>> >>>>> Once compiled with -O or -O2, if you look at the core, almost all the >>>>> references to the LinkCut or Object data constructor get optimized away, >>>>> and we're left with beautiful strict code directly mutating our underlying >>>>> representation. >>>>> >>>>> >>>>> >>>>> At the very least I'll take this email and turn it into a short >>>>> article. >>>>> >>>>> >>>>> >>>>> -Edward >>>>> >>>>> >>>>> >>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones < >>>>> simonpj at microsoft.com> wrote: >>>>> >>>>> Just to say that I have no idea what is going on in this thread. What >>>>> is ArrayArray? What is the issue in general? Is there a ticket? Is there >>>>> a wiki page? >>>>> >>>>> >>>>> >>>>> If it's important, an ab-initio wiki page + ticket would be a good >>>>> thing.
>>>>> >>>>> >>>>> >>>>> Simon >>>>> >>>>> >>>>> >>>>> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Edward >>>>> Kmett >>>>> *Sent:* 21 August 2015 05:25 >>>>> *To:* Manuel M T Chakravarty >>>>> *Cc:* Simon Marlow; ghc-devs >>>>> *Subject:* Re: ArrayArrays >>>>> >>>>> >>>>> >>>>> When (ab)using them for this purpose, SmallArrayArray's would be very >>>>> handy as well. >>>>> >>>>> >>>>> >>>>> Consider right now if I have something like an order-maintenance >>>>> structure I have: >>>>> >>>>> >>>>> >>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK >>>>> #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) >>>>> >>>>> >>>>> >>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK >>>>> #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK >>>>> #-} !(MutVar s (Lower s)) >>>>> >>>>> >>>>> >>>>> The former contains, logically, a mutable integer and two pointers, >>>>> one for forward and one for backwards. The latter is basically the same >>>>> thing with a mutable reference up pointing at the structure above. >>>>> >>>>> >>>>> >>>>> On the heap this is an object that points to a structure for the >>>>> bytearray, and points to another structure for each mutvar which each point >>>>> to the other 'Upper' structure. So there is a level of indirection smeared >>>>> over everything. >>>>> >>>>> >>>>> >>>>> So this is a pair of doubly linked lists with an upward link from the >>>>> structure below to the structure above. >>>>> >>>>> >>>>> >>>>> Converted into ArrayArray#s I'd get >>>>> >>>>> >>>>> >>>>> data Upper s = Upper (MutableArrayArray# s) >>>>> >>>>> >>>>> >>>>> w/ the first slot being a pointer to a MutableByteArray#, and the next >>>>> 2 slots pointing to the previous and next previous objects, represented >>>>> just as their MutableArrayArray#s. 
I can use sameMutableArrayArray# on >>>>> these for object identity, which lets me check for the ends of the lists by >>>>> tying things back on themselves. >>>>> >>>>> >>>>> >>>>> and below that >>>>> >>>>> >>>>> >>>>> data Lower s = Lower (MutableArrayArray# s) >>>>> >>>>> >>>>> >>>>> is similar, with an extra MutableArrayArray slot pointing up to an >>>>> upper structure. >>>>> >>>>> >>>>> >>>>> I can then write a handful of combinators for getting out the slots in >>>>> question, while it has gained a level of indirection between the wrapper to >>>>> put it in * and the MutableArrayArray# s in #, that one can be basically >>>>> erased by ghc. >>>>> >>>>> >>>>> >>>>> Unlike before I don't have several separate objects on the heap for >>>>> each thing. I only have 2 now. The MutableArrayArray# for the object >>>>> itself, and the MutableByteArray# that it references to carry around the >>>>> mutable int. >>>>> >>>>> >>>>> >>>>> The only pain points are >>>>> >>>>> >>>>> >>>>> 1.) the aforementioned limitation that currently prevents me from >>>>> stuffing normal boxed data through a SmallArray or Array into an ArrayArray >>>>> leaving me in a little ghetto disconnected from the rest of Haskell, >>>>> >>>>> >>>>> >>>>> and >>>>> >>>>> >>>>> >>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the card >>>>> marking overhead. These objects are all small, 3-4 pointers wide. Card >>>>> marking doesn't help. >>>>> >>>>> >>>>> >>>>> Alternately I could just try to do really evil things and convert the >>>>> whole mess to SmallArrays and then figure out how to unsafeCoerce my way to >>>>> glory, stuffing the #'d references to the other arrays directly into the >>>>> SmallArray as slots, removing the limitation we see here by aping the >>>>> MutableArrayArray# s API, but that gets really really dangerous! 
>>>>> >>>>> >>>>> >>>>> I'm pretty much willing to sacrifice almost anything on the altar of >>>>> speed here, but I'd like to be able to let the GC move them and collect >>>>> them which rules out simpler Ptr and Addr based solutions. >>>>> >>>>> >>>>> >>>>> -Edward >>>>> >>>>> >>>>> >>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty < >>>>> chak at cse.unsw.edu.au> wrote: >>>>> >>>>> That?s an interesting idea. >>>>> >>>>> Manuel >>>>> >>>>> > Edward Kmett : >>>>> >>>>> > >>>>> > Would it be possible to add unsafe primops to add Array# and >>>>> SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# >>>>> entries are all directly unlifted avoiding a level of indirection for the >>>>> containing structure is amazing, but I can only currently use it if my leaf >>>>> level data can be 100% unboxed and distributed among ByteArray#s. It'd be >>>>> nice to be able to have the ability to put SmallArray# a stuff down at the >>>>> leaves to hold lifted contents. >>>>> > >>>>> > I accept fully that if I name the wrong type when I go to access one >>>>> of the fields it'll lie to me, but I suppose it'd do that if i tried to use >>>>> one of the members that held a nested ArrayArray# as a ByteArray# anyways, >>>>> so it isn't like there is a safety story preventing this. >>>>> > >>>>> > I've been hunting for ways to try to kill the indirection problems I >>>>> get with Haskell and mutable structures, and I could shoehorn a number of >>>>> them into ArrayArrays if this worked. >>>>> > >>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary >>>>> indirection compared to c/java and this could reduce that pain to just 1 >>>>> level of unnecessary indirection. 
From ekmett at gmail.com Fri Aug 28 22:36:20 2015
From: ekmett at gmail.com (Edward Kmett)
Date: Fri, 28 Aug 2015 18:36:20 -0400
Subject: ArrayArrays
In-Reply-To:
References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net>
Message-ID:

Well, on the plus side you'd save 16 bytes per object, which adds up if the objects are small enough and there are enough of them. You also get somewhat better locality of reference in terms of what fits in the first cache line of each object.

-Edward

On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton wrote:

> Yes. And for the short term I can imagine places we will settle for arrays even if it means tracking lengths unnecessarily and unsafeCoercing pointers whose types don't actually match their siblings.
>
> Is there anything to recommend the hacks mentioned for fixed-size array objects *other* than using them to fake structs? (Much to derecommend, as you mentioned!)
>
> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett wrote:
>
>> I think both are useful, but the one you suggest requires a lot more plumbing and doesn't subsume all of the use cases of the other.
>>
>> -Edward
>>
>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton wrote:
>>
>>> So that primitive is an array-like thing (same pointed type, unbounded length) with extra payload.
>>>
>>> I can see how we can do without structs if we have arrays, especially with the extra payload at front.
>>> But wouldn't the general solution for structs be one that allows new user data type defs for # types?
>>>
>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett wrote:
>>>
>>>> Some form of MutableStruct# with a known number of words and a known number of pointers is basically what Ryan Yates was suggesting above, but where the word counts were stored in the objects themselves.
>>>>
>>>> Given that it'd have a couple of words for those counts it'd likely want to be something we build in addition to MutVar# rather than a replacement.
>>>>
>>>> On the other hand, if we had to fix those numbers and build info tables that knew them, and typechecker support, for instance, it'd get rather invasive.
>>>>
>>>> Also, a number of things that we can do with the 'sized' versions above, like working with evil unsized C-style arrays directly inline at the end of the structure, cease to be possible, so it isn't even a pure win if we did the engineering effort.
>>>>
>>>> I think 90% of the needs I have are covered just by adding the one primitive. The last 10% gets pretty invasive.
>>>>
>>>> -Edward
>>>>
>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton wrote:
>>>>
>>>>> I like the possibility of a general solution for mutable structs (like Ed said), and I'm trying to fully understand why it's hard.
>>>>>
>>>>> So, we can't unpack MutVar into constructors because of object identity problems. But what about directly supporting an extensible set of unlifted MutStruct# objects, generalizing (and even replacing) MutVar#? That may be too much work, but is it problematic otherwise?
>>>>>
>>>>> Needless to say, this is also critical if we ever want best-in-class lock-free mutable structures, just like their STM and sequential counterparts.
>>>>>
>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones <simonpj at microsoft.com> wrote:
>>>>>
>>>>>> At the very least I'll take this email and turn it into a short article.
>>>>>>
>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe make a ticket for it.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Simon
>>>>>>
>>>>>> *From:* Edward Kmett [mailto:ekmett at gmail.com]
>>>>>> *Sent:* 27 August 2015 16:54
>>>>>> *To:* Simon Peyton Jones
>>>>>> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs
>>>>>> *Subject:* Re: ArrayArrays
>>>>>>
>>>>>> An ArrayArray# is just an Array# with a modified invariant. It points directly to other unlifted ArrayArray#'s or ByteArray#'s.
>>>>>>
>>>>>> While those live in #, they are garbage collected objects, so this all lives on the heap.
>>>>>>
>>>>>> They were added to make some of the DPH stuff fast when it has to deal with nested arrays.
>>>>>>
>>>>>> I'm currently abusing them as a placeholder for a better thing.
>>>>>>
>>>>>> The Problem
>>>>>> -----------------
>>>>>>
>>>>>> Consider the scenario where you write a classic doubly-linked list in Haskell.
>>>>>>
>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL))
>>>>>>
>>>>>> Chasing from one DLL to the next requires following 3 pointers on the heap.
>>>>>>
>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~> DLL
>>>>>>
>>>>>> That is 3 levels of indirection.
>>>>>>
>>>>>> We can trim one by simply unpacking the IORef with -funbox-strict-fields or UNPACK.
>>>>>>
>>>>>> We can trim another by adding a 'Nil' constructor for DLL and worsening our representation.
>>>>>>
>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil
>>>>>>
>>>>>> but now we're still stuck with a level of indirection
>>>>>>
>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL
>>>>>>
>>>>>> This means that every operation we perform on this structure will be about half of the speed of an implementation in most other languages, assuming we're memory bound on loading things into cache!
>>>>>>
>>>>>> Making Progress
>>>>>> ----------------------
>>>>>>
>>>>>> I have been working on a number of data structures where the indirection of going from something in * out to an object in # which contains the real pointer to my target and coming back effectively doubles my runtime.
>>>>>>
>>>>>> We go out to the MutVar# because we are allowed to put the MutVar# onto the mutable list when we dirty it. There is a well-defined write barrier.
>>>>>>
>>>>>> I could change out the representation to use
>>>>>>
>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil
>>>>>>
>>>>>> I can just store two pointers in the MutableArray# every time, but this doesn't help _much_ directly. It has reduced the number of distinct addresses in memory I touch on a walk of the DLL from 3 per object to 2.
>>>>>>
>>>>>> I still have to go out to the heap from my DLL and get to the array object and then chase it to the next DLL and chase that to the next array. I do get my two pointers together in memory though. I'm paying for a card marking table as well, which I don't particularly need with just two pointers, but we can shed that with the "SmallMutableArray#" machinery added back in 7.10, which is just the old array code as a new data type, and which can speed things up a bit when you don't have very big arrays:
>>>>>>
>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil
>>>>>>
>>>>>> But what if I wanted my object itself to live in # and have two mutable fields and be able to share the same write barrier?
>>>>>>
>>>>>> An ArrayArray# points directly to other unlifted array types. What if we have one # -> * wrapper on the outside to deal with the impedance mismatch between the imperative world and Haskell, and then just let the ArrayArray#'s hold other ArrayArray#'s?
>>>>>>
>>>>>> data DLL = DLL (MutableArrayArray# RealWorld)
>>>>>>
>>>>>> Now I need to make up a new Nil, which I can just make be a special MutableArrayArray# I allocate on program startup. I can even abuse pattern synonyms. Alternately I can exploit the internals further to make this cheaper.
>>>>>>
>>>>>> Then I can use the readMutableArrayArrayArray# and writeMutableArrayArrayArray# calls to directly access the preceding and next entry in the linked list.
>>>>>>
>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a strict world, and everything there lives in #.
>>>>>>
>>>>>> next :: DLL -> IO DLL
>>>>>> -- slot 1 is assumed here to hold the 'next' pointer
>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArrayArray# m 1# s of
>>>>>>   (# s', n #) -> (# s', DLL n #)
>>>>>>
>>>>>> It turns out GHC is quite happy to optimize all of that code to keep things unboxed. The 'DLL' wrappers get removed pretty easily when they are known strict and you chain operations of this sort!
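A minimal sketch of the array-backed list described above, under stated assumptions: it uses the SmallMutableArray# variant with lifted DLL elements (so Nil stays a constructor) rather than MutableArrayArray#. The slot layout (0 = previous, 1 = next) and the helper names newNode, link and lengthFrom are illustrative and are not taken from the structs prototype.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts (RealWorld, SmallMutableArray#, newSmallArray#, readSmallArray#, writeSmallArray#)
import GHC.IO (IO (..))

-- | A node owns one two-slot mutable array: slot 0 = previous, slot 1 = next.
-- (This slot layout is an assumption made for the sketch.)
data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil

-- | Allocate a node whose two links both start out as Nil.
newNode :: IO DLL
newNode = IO $ \s -> case newSmallArray# 2# Nil s of
  (# s', arr #) -> (# s', DLL arr #)

-- | Follow the 'next' link: a single read from the node's own array.
next :: DLL -> IO DLL
next Nil = pure Nil
next (DLL m) = IO $ \s -> readSmallArray# m 1# s

-- | Make b the successor of a and a the predecessor of b.
link :: DLL -> DLL -> IO ()
link a@(DLL ma) b@(DLL mb) = IO $ \s0 ->
  case writeSmallArray# ma 1# b s0 of
    s1 -> case writeSmallArray# mb 0# a s1 of
      s2 -> (# s2, () #)
link _ _ = pure ()

-- | Count the nodes reachable by following 'next' until Nil.
lengthFrom :: DLL -> IO Int
lengthFrom = go 0
  where
    go acc Nil = pure acc
    go acc node = do
      n <- next node
      let acc' = acc + 1
      acc' `seq` go acc' n
```

Both links of a node sit side by side in one heap object, which is the point of the representation; the lifted DLL wrapper around each element is the remaining indirection that the ArrayArray# approach is meant to remove.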
>>>>>>
>>>>>> Cleaning it Up
>>>>>> ------------------
>>>>>>
>>>>>> Now I have one outermost indirection pointing to an array that points directly to other arrays.
>>>>>>
>>>>>> I'm stuck paying for a card marking table per object, but I can fix that by duplicating the code for MutableArrayArray# and using a SmallMutableArray#. I can hack up primops that let me store a mixture of SmallMutableArray# fields and normal ones in the data structure. Operationally, I can even do so by just unsafeCoercing the existing SmallMutableArray# primitives to change the kind of one of the arguments they take.
>>>>>>
>>>>>> This is almost ideal, but not quite. I often have fields that would be best left unboxed.
>>>>>>
>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil
>>>>>>
>>>>>> was able to unpack the Int, but we lost that. We can currently at best point one of the entries of the SmallMutableArray# at a boxed value or at a MutableByteArray# for all of our misc. data and shove the int in question in there.
>>>>>>
>>>>>> E.g. if I were to implement a hash-array-mapped trie, I would need to store masks and administrivia as I walk down the tree. Having to go off to the side costs me the entire win from avoiding the first pointer chase.
>>>>>>
>>>>>> But if, as Ryan suggested, we had a heap object we could construct that had n words with unsafe access and m pointers to other heap objects, one that could put itself on the mutable list when any of those pointers changed, then I could shed this last factor of two in all circumstances.
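The "off to the side" storage for an unboxed field can be sketched as a one-word MutableByteArray# holding the Int. This is only an illustration of the extra heap object the paragraph above complains about; the names MutInt, newMutInt, readMutInt and modifyMutInt are hypothetical and do not come from the thread or from the structs package.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import Foreign.Storable (sizeOf)
import GHC.Exts (Int (I#), MutableByteArray#, RealWorld, newByteArray#, readIntArray#, writeIntArray#)
import GHC.IO (IO (..))

-- | A mutable, unboxed Int living in its own heap object.
data MutInt = MutInt (MutableByteArray# RealWorld)

-- | Allocate a byte array big enough for one Int and store the initial value.
newMutInt :: Int -> IO MutInt
newMutInt (I# v) = IO $ \s0 ->
  case sizeOf (0 :: Int) of
    I# bytes -> case newByteArray# bytes s0 of
      (# s1, mba #) -> case writeIntArray# mba 0# v s1 of
        s2 -> (# s2, MutInt mba #)

-- | Read the current value (index 0, measured in Int-sized elements).
readMutInt :: MutInt -> IO Int
readMutInt (MutInt mba) = IO $ \s -> case readIntArray# mba 0# s of
  (# s', v #) -> (# s', I# v #)

-- | Apply a function to the stored value in place.
modifyMutInt :: MutInt -> (Int -> Int) -> IO ()
modifyMutInt (MutInt mba) f = IO $ \s0 -> case readIntArray# mba 0# s0 of
  (# s1, v #) -> case f (I# v) of
    I# v' -> case writeIntArray# mba 0# v' s1 of
      s2 -> (# s2, () #)
```

A node that keeps its pointers in one array and its counters in a MutInt like this still needs two heap objects per node, which is why a single object with n words and m pointers would be the better primitive.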
>>>>>>
>>>>>> Prototype
>>>>>> -------------
>>>>>>
>>>>>> Over the last few days I've put together a small prototype implementation with a few non-trivial imperative data structures for things like Tarjan's link-cut trees, the list labeling problem and order-maintenance.
>>>>>>
>>>>>> https://github.com/ekmett/structs
>>>>>>
>>>>>> Notable bits:
>>>>>>
>>>>>> Data.Struct.Internal.LinkCut
>>>>>> provides an implementation of link-cut trees in this style.
>>>>>>
>>>>>> Data.Struct.Internal
>>>>>> provides the rather horrifying guts that make it go fast.
>>>>>>
>>>>>> Once compiled with -O or -O2, if you look at the core, almost all the references to the LinkCut or Object data constructor get optimized away, and we're left with beautiful strict code directly mutating our underlying representation.
>>>>>>
>>>>>> At the very least I'll take this email and turn it into a short article.
>>>>>>
>>>>>> -Edward
>>>>>>
>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones <simonpj at microsoft.com> wrote:
>>>>>>
>>>>>> Just to say that I have no idea what is going on in this thread. What is ArrayArray? What is the issue in general? Is there a ticket? Is there a wiki page?
>>>>>>
>>>>>> If it's important, an ab-initio wiki page + ticket would be a good thing.
>>>>>>
>>>>>> Simon
>>>>>>
>>>>>> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of* Edward Kmett
>>>>>> *Sent:* 21 August 2015 05:25
>>>>>> *To:* Manuel M T Chakravarty
>>>>>> *Cc:* Simon Marlow; ghc-devs
>>>>>> *Subject:* Re: ArrayArrays
>>>>>>
>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be very handy as well.
>>>>>>
>>>>>> Consider right now if I have something like an order-maintenance structure I have:
>>>>>>
>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s))
>>>>>>
>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-} !(MutVar s (Lower s))
>>>>>>
>>>>>> The former contains, logically, a mutable integer and two pointers, one for forward and one for backwards. The latter is basically the same thing with a mutable reference up pointing at the structure above.
>>>>>>
>>>>>> On the heap this is an object that points to a structure for the bytearray, and points to another structure for each mutvar which each point to the other 'Upper' structure. So there is a level of indirection smeared over everything.
>>>>>>
>>>>>> So this is a pair of doubly linked lists with an upward link from the structure below to the structure above.
>>>>>>
>>>>>> Converted into ArrayArray#s I'd get
>>>>>>
>>>>>> data Upper s = Upper (MutableArrayArray# s)
>>>>>>
>>>>>> with the first slot being a pointer to a MutableByteArray#, and the next 2 slots pointing to the previous and next objects, represented just as their MutableArrayArray#s. I can use sameMutableArrayArray# on these for object identity, which lets me check for the ends of the lists by tying things back on themselves.
>>>>>>
>>>>>> and below that
>>>>>>
>>>>>> data Lower s = Lower (MutableArrayArray# s)
>>>>>>
>>>>>> is similar, with an extra MutableArrayArray slot pointing up to an upper structure.
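The idea of detecting the ends of a list by object identity, "tying things back on themselves", can be sketched as a circular list with a sentinel node. This is a sketch under assumptions, not the prototype's code: it swaps in SmallMutableArray# and sameSmallMutableArray# (as exported by GHC.Exts) for the ArrayArray# primitives, and the names Node, newSelfLoop, getSlot, setSlot, insertAfter and ringLength are hypothetical.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts (Int (I#), RealWorld, SmallMutableArray#, isTrue#, newSmallArray#, readSmallArray#, sameSmallMutableArray#, writeSmallArray#)
import GHC.IO (IO (..))

-- | Slot 0 = previous, slot 1 = next (layout assumed for this sketch).
data Node = Node (SmallMutableArray# RealWorld Node)

-- | Identity comparison on the underlying mutable arrays.
sameNode :: Node -> Node -> Bool
sameNode (Node a) (Node b) = isTrue# (sameSmallMutableArray# a b)

getSlot :: Int -> Node -> IO Node
getSlot (I# i) (Node a) = IO $ \s -> readSmallArray# a i s

setSlot :: Int -> Node -> Node -> IO ()
setSlot (I# i) (Node a) v = IO $ \s -> case writeSmallArray# a i v s of
  s' -> (# s', () #)

-- | Allocate a node whose previous and next slots point back at itself.
newSelfLoop :: IO Node
newSelfLoop = IO $ \s0 ->
  case newSmallArray# 2# (error "Node: uninitialised slot") s0 of
    (# s1, arr #) ->
      let self = Node arr
       in case writeSmallArray# arr 0# self s1 of
            s2 -> case writeSmallArray# arr 1# self s2 of
              s3 -> (# s3, self #)

-- | Splice a fresh node in directly after the given one.
insertAfter :: Node -> IO Node
insertAfter p = do
  n <- getSlot 1 p
  new <- newSelfLoop
  setSlot 0 new p
  setSlot 1 new n
  setSlot 1 p new
  setSlot 0 n new
  pure new

-- | Walk 'next' links until we arrive back at the starting node.
ringLength :: Node -> IO Int
ringLength start = getSlot 1 start >>= go 1
  where
    go acc node
      | sameNode node start = pure acc
      | otherwise = do
          n <- getSlot 1 node
          let acc' = acc + 1
          acc' `seq` go acc' n
```

Because the end of the list is recognised by pointer identity with the sentinel, no Nil constructor or Maybe wrapper is needed in the slots.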
>>>>>>
>>>>>> I can then write a handful of combinators for getting out the slots in question. While it has gained a level of indirection between the wrapper to put it in * and the MutableArrayArray# s in #, that one can be basically erased by GHC.
>>>>>>
>>>>>> Unlike before I don't have several separate objects on the heap for each thing. I only have 2 now: the MutableArrayArray# for the object itself, and the MutableByteArray# that it references to carry around the mutable int.
>>>>>>
>>>>>> The only pain points are
>>>>>>
>>>>>> 1.) the aforementioned limitation that currently prevents me from stuffing normal boxed data through a SmallArray or Array into an ArrayArray, leaving me in a little ghetto disconnected from the rest of Haskell,
>>>>>>
>>>>>> and
>>>>>>
>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the card marking overhead. These objects are all small, 3-4 pointers wide. Card marking doesn't help.
>>>>>>
>>>>>> Alternately I could just try to do really evil things and convert the whole mess to SmallArrays and then figure out how to unsafeCoerce my way to glory, stuffing the #'d references to the other arrays directly into the SmallArray as slots, removing the limitation we see here by aping the MutableArrayArray# s API, but that gets really really dangerous!
>>>>>>
>>>>>> I'm pretty much willing to sacrifice almost anything on the altar of speed here, but I'd like to be able to let the GC move them and collect them, which rules out simpler Ptr and Addr based solutions.
>>>>>>
>>>>>> -Edward
>>>>>>
>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty <chak at cse.unsw.edu.au> wrote:
>>>>>>
>>>>>> That's an interesting idea.
>>>>>>
>>>>>> Manuel
>>>>>>
>>>>>> > Edward Kmett :
>>>>>> >
>>>>>> > Would it be possible to add unsafe primops to add Array# and SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# entries are all directly unlifted, avoiding a level of indirection for the containing structure, is amazing, but I can only currently use it if my leaf level data can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be able to put SmallArray# a stuff down at the leaves to hold lifted contents.
>>>>>> >
>>>>>> > I accept fully that if I name the wrong type when I go to access one of the fields it'll lie to me, but I suppose it'd do that if I tried to use one of the members that held a nested ArrayArray# as a ByteArray# anyways, so it isn't like there is a safety story preventing this.
>>>>>> >
>>>>>> > I've been hunting for ways to try to kill the indirection problems I get with Haskell and mutable structures, and I could shoehorn a number of them into ArrayArrays if this worked.
>>>>>> >
>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection compared to C/Java and this could reduce that pain to just 1 level of unnecessary indirection.
>>>>>> >
>>>>>> > -Edward
From ekmett at gmail.com Fri Aug 28 22:39:10 2015
From: ekmett at gmail.com (Edward Kmett)
Date: Fri, 28 Aug 2015 18:39:10 -0400
Subject: ArrayArrays
In-Reply-To:
References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net>
Message-ID:

Also there are 4 different "things" here, basically depending on two independent questions:

a.) if you want to shove the sizes into the info table, and
b.) if you want card marking.

Versions with/without card marking for different sizes can be done pretty easily, but as noted, the info table variants are pretty invasive.

-Edward

On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett wrote:

> Well, on the plus side you'd save 16 bytes per object, which adds up if they were small enough and there are enough of them. You get a bit better locality of reference in terms of what fits in the first cache line of them.
>
> -Edward
>
> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton wrote:
>
>> Yes. And for the short term I can imagine places we will settle for arrays even if it means tracking lengths unnecessarily and unsafeCoercing pointers whose types don't actually match their siblings.
>>
>> Is there anything to recommend the hacks mentioned for fixed-size array objects *other* than using them to fake structs? (Much to derecommend, as you mentioned!)
>>
>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett wrote:
>>
>>> I think both are useful, but the one you suggest requires a lot more plumbing and doesn't subsume all of the use cases of the other.
>>>
>>> -Edward
>>>
>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton wrote:
>>>
>>>> So that primitive is an array-like thing (same pointed type, unbounded length) with extra payload.
>>>>
>>>> I can see how we can do without structs if we have arrays, especially with the extra payload at front. But wouldn't the general solution for structs be one that allows new user data type defs for # types?
>>>> >>>> >>>> >>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett wrote: >>>> >>>>> Some form of MutableStruct# with a known number of words and a known >>>>> number of pointers is basically what Ryan Yates was suggesting above, but >>>>> where the word counts were stored in the objects themselves. >>>>> >>>>> Given that it'd have a couple of words for those counts it'd likely >>>>> want to be something we build in addition to MutVar# rather than a >>>>> replacement. >>>>> >>>>> On the other hand, if we had to fix those numbers and build info >>>>> tables that knew them, and typechecker support, for instance, it'd get >>>>> rather invasive. >>>>> >>>>> Also, a number of things that we can do with the 'sized' versions >>>>> above, like working with evil unsized c-style arrays directly inline at the >>>>> end of the structure cease to be possible, so it isn't even a pure win if >>>>> we did the engineering effort. >>>>> >>>>> I think 90% of the needs I have are covered just by adding the one >>>>> primitive. The last 10% gets pretty invasive. >>>>> >>>>> -Edward >>>>> >>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton >>>>> wrote: >>>>> >>>>>> I like the possibility of a general solution for mutable structs >>>>>> (like Ed said), and I'm trying to fully understand why it's hard. >>>>>> >>>>>> So, we can't unpack MutVar into constructors because of object >>>>>> identity problems. But what about directly supporting an extensible set of >>>>>> unlifted MutStruct# objects, generalizing (and even replacing) MutVar#? >>>>>> That may be too much work, but is it problematic otherwise? >>>>>> >>>>>> Needless to say, this is also critical if we ever want best in class >>>>>> lockfree mutable structures, just like their Stm and sequential >>>>>> counterparts. >>>>>> >>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones < >>>>>> simonpj at microsoft.com> wrote: >>>>>> >>>>>>> At the very least I'll take this email and turn it into a short >>>>>>> article. 
>>>>>>> >>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe >>>>>>> make a ticket for it. >>>>>>> >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> >>>>>>> >>>>>>> Simon >>>>>>> >>>>>>> >>>>>>> >>>>>>> *From:* Edward Kmett [mailto:ekmett at gmail.com] >>>>>>> *Sent:* 27 August 2015 16:54 >>>>>>> *To:* Simon Peyton Jones >>>>>>> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs >>>>>>> *Subject:* Re: ArrayArrays >>>>>>> >>>>>>> >>>>>>> >>>>>>> An ArrayArray# is just an Array# with a modified invariant. It >>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s. >>>>>>> >>>>>>> >>>>>>> >>>>>>> While those live in #, they are garbage collected objects, so this >>>>>>> all lives on the heap. >>>>>>> >>>>>>> >>>>>>> >>>>>>> They were added to make some of the DPH stuff fast when it has to >>>>>>> deal with nested arrays. >>>>>>> >>>>>>> >>>>>>> >>>>>>> I'm currently abusing them as a placeholder for a better thing. >>>>>>> >>>>>>> >>>>>>> >>>>>>> The Problem >>>>>>> >>>>>>> ----------------- >>>>>>> >>>>>>> >>>>>>> >>>>>>> Consider the scenario where you write a classic doubly-linked list >>>>>>> in Haskell. >>>>>>> >>>>>>> >>>>>>> >>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL) >>>>>>> >>>>>>> >>>>>>> >>>>>>> Chasing from one DLL to the next requires following 3 pointers on >>>>>>> the heap. >>>>>>> >>>>>>> >>>>>>> >>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe >>>>>>> DLL ~> DLL >>>>>>> >>>>>>> >>>>>>> >>>>>>> That is 3 levels of indirection. >>>>>>> >>>>>>> >>>>>>> >>>>>>> We can trim one by simply unpacking the IORef with >>>>>>> -funbox-strict-fields or UNPACK >>>>>>> >>>>>>> >>>>>>> >>>>>>> We can trim another by adding a 'Nil' constructor for DLL and >>>>>>> worsening our representation. 
>>>>>>> >>>>>>> >>>>>>> >>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>>>>>> >>>>>>> >>>>>>> >>>>>>> but now we're still stuck with a level of indirection >>>>>>> >>>>>>> >>>>>>> >>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >>>>>>> >>>>>>> >>>>>>> >>>>>>> This means that every operation we perform on this structure will be >>>>>>> about half of the speed of an implementation in most other languages >>>>>>> assuming we're memory bound on loading things into cache! >>>>>>> >>>>>>> >>>>>>> >>>>>>> Making Progress >>>>>>> >>>>>>> ---------------------- >>>>>>> >>>>>>> >>>>>>> >>>>>>> I have been working on a number of data structures where the >>>>>>> indirection of going from something in * out to an object in # which >>>>>>> contains the real pointer to my target and coming back effectively doubles >>>>>>> my runtime. >>>>>>> >>>>>>> >>>>>>> >>>>>>> We go out to the MutVar# because we are allowed to put the MutVar# >>>>>>> onto the mutable list when we dirty it. There is a well defined >>>>>>> write-barrier. >>>>>>> >>>>>>> >>>>>>> >>>>>>> I could change out the representation to use >>>>>>> >>>>>>> >>>>>>> >>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >>>>>>> >>>>>>> >>>>>>> >>>>>>> I can just store two pointers in the MutableArray# every time, but >>>>>>> this doesn't help _much_ directly. It has reduced the amount of distinct >>>>>>> addresses in memory I touch on a walk of the DLL from 3 per object to 2. >>>>>>> >>>>>>> >>>>>>> >>>>>>> I still have to go out to the heap from my DLL and get to the array >>>>>>> object and then chase it to the next DLL and chase that to the next array. >>>>>>> I do get my two pointers together in memory though. 
I'm paying for a card >>>>>>> marking table as well, which I don't particularly need with just two >>>>>>> pointers, but we can shed that with the "SmallMutableArray#" machinery >>>>>>> added back in 7.10, which is just the old array code a a new data type, >>>>>>> which can speed things up a bit when you don't have very big arrays: >>>>>>> >>>>>>> >>>>>>> >>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >>>>>>> >>>>>>> >>>>>>> >>>>>>> But what if I wanted my object itself to live in # and have two >>>>>>> mutable fields and be able to share the sme write barrier? >>>>>>> >>>>>>> >>>>>>> >>>>>>> An ArrayArray# points directly to other unlifted array types. What >>>>>>> if we have one # -> * wrapper on the outside to deal with the impedence >>>>>>> mismatch between the imperative world and Haskell, and then just let the >>>>>>> ArrayArray#'s hold other arrayarrays. >>>>>>> >>>>>>> >>>>>>> >>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >>>>>>> >>>>>>> >>>>>>> >>>>>>> now I need to make up a new Nil, which I can just make be a special >>>>>>> MutableArrayArray# I allocate on program startup. I can even abuse pattern >>>>>>> synonyms. Alternately I can exploit the internals further to make this >>>>>>> cheaper. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Then I can use the readMutableArrayArray# and >>>>>>> writeMutableArrayArray# calls to directly access the preceding and next >>>>>>> entry in the linked list. >>>>>>> >>>>>>> >>>>>>> >>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a >>>>>>> strict world, and everything there lives in #. >>>>>>> >>>>>>> >>>>>>> >>>>>>> next :: DLL -> IO DLL >>>>>>> >>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of >>>>>>> >>>>>>> (# s', n #) -> (# s', DLL n #) >>>>>>> >>>>>>> >>>>>>> >>>>>>> It turns out GHC is quite happy to optimize all of that code to keep >>>>>>> things unboxed. 
The 'DLL' wrappers get removed pretty easily when they are >>>>>>> known strict and you chain operations of this sort! >>>>>>> >>>>>>> >>>>>>> >>>>>>> Cleaning it Up >>>>>>> >>>>>>> ------------------ >>>>>>> >>>>>>> >>>>>>> >>>>>>> Now I have one outermost indirection pointing to an array that >>>>>>> points directly to other arrays. >>>>>>> >>>>>>> >>>>>>> >>>>>>> I'm stuck paying for a card marking table per object, but I can fix >>>>>>> that by duplicating the code for MutableArrayArray# and using a >>>>>>> SmallMutableArray#. I can hack up primops that let me store a mixture of >>>>>>> SmallMutableArray# fields and normal ones in the data structure. >>>>>>> Operationally, I can even do so by just unsafeCoercing the existing >>>>>>> SmallMutableArray# primitives to change the kind of one of the arguments it >>>>>>> takes. >>>>>>> >>>>>>> >>>>>>> >>>>>>> This is almost ideal, but not quite. I often have fields that would >>>>>>> be best left unboxed. >>>>>>> >>>>>>> >>>>>>> >>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>>>>>> >>>>>>> >>>>>>> >>>>>>> was able to unpack the Int, but we lost that. We can currently at >>>>>>> best point one of the entries of the SmallMutableArray# at a boxed or at a >>>>>>> MutableByteArray# for all of our misc. data and shove the int in question >>>>>>> in there. >>>>>>> >>>>>>> >>>>>>> >>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to store >>>>>>> masks and administrivia as I walk down the tree. Having to go off to the >>>>>>> side costs me the entire win from avoiding the first pointer chase. >>>>>>> >>>>>>> >>>>>>> >>>>>>> But, if like Ryan suggested, we had a heap object we could construct >>>>>>> that had n words with unsafe access and m pointers to other heap objects, >>>>>>> one that could put itself on the mutable list when any of those pointers >>>>>>> changed then I could shed this last factor of two in all circumstances. 
>>>>>>>
>>>>>>> Prototype
>>>>>>> -------------
>>>>>>>
>>>>>>> Over the last few days I've put together a small prototype
>>>>>>> implementation with a few non-trivial imperative data structures for things
>>>>>>> like Tarjan's link-cut trees, the list labeling problem and
>>>>>>> order-maintenance.
>>>>>>>
>>>>>>> https://github.com/ekmett/structs
>>>>>>>
>>>>>>> Notable bits:
>>>>>>>
>>>>>>> Data.Struct.Internal.LinkCut
>>>>>>> provides an implementation of link-cut trees in this style.
>>>>>>>
>>>>>>> Data.Struct.Internal
>>>>>>> provides the rather horrifying guts that make it go fast.
>>>>>>>
>>>>>>> Once compiled with -O or -O2, if you look at the core, almost all
>>>>>>> the references to the LinkCut or Object data constructor get optimized
>>>>>>> away, and we're left with beautiful strict code directly mutating our
>>>>>>> underlying representation.
>>>>>>>
>>>>>>> At the very least I'll take this email and turn it into a short
>>>>>>> article.
>>>>>>>
>>>>>>> -Edward
>>>>>>>
>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones <
>>>>>>> simonpj at microsoft.com> wrote:
>>>>>>>
>>>>>>> Just to say that I have no idea what is going on in this thread.
>>>>>>> What is ArrayArray? What is the issue in general? Is there a ticket? Is
>>>>>>> there a wiki page?
>>>>>>>
>>>>>>> If it's important, an ab-initio wiki page + ticket would be a good
>>>>>>> thing.
>>>>>>> >>>>>>> >>>>>>> >>>>>>> Simon >>>>>>> >>>>>>> >>>>>>> >>>>>>> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf >>>>>>> Of *Edward Kmett >>>>>>> *Sent:* 21 August 2015 05:25 >>>>>>> *To:* Manuel M T Chakravarty >>>>>>> *Cc:* Simon Marlow; ghc-devs >>>>>>> *Subject:* Re: ArrayArrays >>>>>>> >>>>>>> >>>>>>> >>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be >>>>>>> very handy as well. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Consider right now if I have something like an order-maintenance >>>>>>> structure I have: >>>>>>> >>>>>>> >>>>>>> >>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK >>>>>>> #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) >>>>>>> >>>>>>> >>>>>>> >>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK >>>>>>> #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK >>>>>>> #-} !(MutVar s (Lower s)) >>>>>>> >>>>>>> >>>>>>> >>>>>>> The former contains, logically, a mutable integer and two pointers, >>>>>>> one for forward and one for backwards. The latter is basically the same >>>>>>> thing with a mutable reference up pointing at the structure above. >>>>>>> >>>>>>> >>>>>>> >>>>>>> On the heap this is an object that points to a structure for the >>>>>>> bytearray, and points to another structure for each mutvar which each point >>>>>>> to the other 'Upper' structure. So there is a level of indirection smeared >>>>>>> over everything. >>>>>>> >>>>>>> >>>>>>> >>>>>>> So this is a pair of doubly linked lists with an upward link from >>>>>>> the structure below to the structure above. 
>>>>>>> >>>>>>> >>>>>>> >>>>>>> Converted into ArrayArray#s I'd get >>>>>>> >>>>>>> >>>>>>> >>>>>>> data Upper s = Upper (MutableArrayArray# s) >>>>>>> >>>>>>> >>>>>>> >>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and the >>>>>>> next 2 slots pointing to the previous and next previous objects, >>>>>>> represented just as their MutableArrayArray#s. I can use >>>>>>> sameMutableArrayArray# on these for object identity, which lets me check >>>>>>> for the ends of the lists by tying things back on themselves. >>>>>>> >>>>>>> >>>>>>> >>>>>>> and below that >>>>>>> >>>>>>> >>>>>>> >>>>>>> data Lower s = Lower (MutableArrayArray# s) >>>>>>> >>>>>>> >>>>>>> >>>>>>> is similar, with an extra MutableArrayArray slot pointing up to an >>>>>>> upper structure. >>>>>>> >>>>>>> >>>>>>> >>>>>>> I can then write a handful of combinators for getting out the slots >>>>>>> in question, while it has gained a level of indirection between the wrapper >>>>>>> to put it in * and the MutableArrayArray# s in #, that one can be basically >>>>>>> erased by ghc. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Unlike before I don't have several separate objects on the heap for >>>>>>> each thing. I only have 2 now. The MutableArrayArray# for the object >>>>>>> itself, and the MutableByteArray# that it references to carry around the >>>>>>> mutable int. >>>>>>> >>>>>>> >>>>>>> >>>>>>> The only pain points are >>>>>>> >>>>>>> >>>>>>> >>>>>>> 1.) the aforementioned limitation that currently prevents me from >>>>>>> stuffing normal boxed data through a SmallArray or Array into an ArrayArray >>>>>>> leaving me in a little ghetto disconnected from the rest of Haskell, >>>>>>> >>>>>>> >>>>>>> >>>>>>> and >>>>>>> >>>>>>> >>>>>>> >>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the card >>>>>>> marking overhead. These objects are all small, 3-4 pointers wide. Card >>>>>>> marking doesn't help. 
>>>>>>>
>>>>>>> Alternately I could just try to do really evil things and convert
>>>>>>> the whole mess to SmallArrays and then figure out how to unsafeCoerce my
>>>>>>> way to glory, stuffing the #'d references to the other arrays directly into
>>>>>>> the SmallArray as slots, removing the limitation we see here by aping the
>>>>>>> MutableArrayArray# s API, but that gets really really dangerous!
>>>>>>>
>>>>>>> I'm pretty much willing to sacrifice almost anything on the altar of
>>>>>>> speed here, but I'd like to be able to let the GC move them and collect
>>>>>>> them which rules out simpler Ptr and Addr based solutions.
>>>>>>>
>>>>>>> -Edward
>>>>>>>
>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty <
>>>>>>> chak at cse.unsw.edu.au> wrote:
>>>>>>>
>>>>>>> That's an interesting idea.
>>>>>>>
>>>>>>> Manuel
>>>>>>>
>>>>>>> > Edward Kmett :
>>>>>>> >
>>>>>>> > Would it be possible to add unsafe primops to add Array# and
>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the ArrayArray#
>>>>>>> > entries are all directly unlifted avoiding a level of indirection for the
>>>>>>> > containing structure is amazing, but I can only currently use it if my leaf
>>>>>>> > level data can be 100% unboxed and distributed among ByteArray#s. It'd be
>>>>>>> > nice to be able to have the ability to put SmallArray# a stuff down at the
>>>>>>> > leaves to hold lifted contents.
>>>>>>> >
>>>>>>> > I accept fully that if I name the wrong type when I go to access
>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do that if I tried to
>>>>>>> > use one of the members that held a nested ArrayArray# as a ByteArray#
>>>>>>> > anyways, so it isn't like there is a safety story preventing this.
>>>>>>> >
>>>>>>> > I've been hunting for ways to try to kill the indirection problems
>>>>>>> > I get with Haskell and mutable structures, and I could shoehorn a number of
>>>>>>> > them into ArrayArrays if this worked.
>>>>>>> >
>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary
>>>>>>> > indirection compared to c/java and this could reduce that pain to just 1
>>>>>>> > level of unnecessary indirection.
>>>>>>> >
>>>>>>> > -Edward

From mle+hs at mega-nerd.com Fri Aug 28 22:50:49 2015
From: mle+hs at mega-nerd.com (Erik de Castro Lopo)
Date: Sat, 29 Aug 2015 08:50:49 +1000
Subject: GHC 7.10 complie time regression
In-Reply-To: <831565db44da4eb3be569b9b346d8c54@DB4PR30MB030.064d.mgd.msft.net>
References: <20150827.102417.940015966115425781.kazu@iij.ad.jp> <831565db44da4eb3be569b9b346d8c54@DB4PR30MB030.064d.mgd.msft.net>
Message-ID: <20150829085049.17bfe1a91b231a4a303c0fd5@mega-nerd.com>

Simon Peyton Jones wrote:

> no it's not expected to take "much longer". Can you make a ticket with
> a reproducible test case?

And make sure you are using GHC 7.10.2 and not 7.10.1, because 7.10.2 had
some significant fixes for these kinds of issues.
Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/ From rrnewton at gmail.com Fri Aug 28 23:25:50 2015 From: rrnewton at gmail.com (Ryan Newton) Date: Fri, 28 Aug 2015 16:25:50 -0700 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> Message-ID: You presumably also save a bounds check on reads by hard-coding the sizes? On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett wrote: > Also there are 4 different "things" here, basically depending on two > independent questions: > > a.) if you want to shove the sizes into the info table, and > b.) if you want cardmarking. > > Versions with/without cardmarking for different sizes can be done pretty > easily, but as noted, the infotable variants are pretty invasive. > > -Edward > > On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett wrote: > >> Well, on the plus side you'd save 16 bytes per object, which adds up if >> they were small enough and there are enough of them. You get a bit better >> locality of reference in terms of what fits in the first cache line of them. >> >> -Edward >> >> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton wrote: >> >>> Yes. And for the short term I can imagine places we will settle with >>> arrays even if it means tracking lengths unnecessarily and unsafeCoercing >>> pointers whose types don't actually match their siblings. >>> >>> Is there anything to recommend the hacks mentioned for fixed sized array >>> objects *other* than using them to fake structs? (Much to derecommend, as >>> you mentioned!) >>> >>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett wrote: >>> >>>> I think both are useful, but the one you suggest requires a lot more >>>> plumbing and doesn't subsume all of the usecases of the other. 
>>>>
>>>> -Edward

From ekmett at gmail.com Fri Aug 28 23:33:41 2015
From: ekmett at gmail.com (Edward Kmett)
Date: Fri, 28 Aug 2015 19:33:41 -0400
Subject: ArrayArrays
In-Reply-To:
References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net>
Message-ID: <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com>

They just segfault at this level. ;)

Sent from my iPhone

> On Aug 28, 2015, at 7:25 PM, Ryan Newton wrote:
>
> You presumably also save a bounds check on reads by hard-coding the sizes?
>
>> On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett wrote:
>> Also there are 4 different "things" here, basically depending on two independent questions:
>>
>> a.) if you want to shove the sizes into the info table, and
>> b.) if you want cardmarking.
>>
>> Versions with/without cardmarking for different sizes can be done pretty easily, but as noted, the infotable variants are pretty invasive.
>>
>> -Edward
>>
>>> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett wrote:
>>> Well, on the plus side you'd save 16 bytes per object, which adds up if they were small enough and there are enough of them. You get a bit better locality of reference in terms of what fits in the first cache line of them.
>>>
>>> -Edward
>>>
>>>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton wrote:
>>>> Yes. And for the short term I can imagine places we will settle with arrays even if it means tracking lengths unnecessarily and unsafeCoercing pointers whose types don't actually match their siblings.
>>>>
>>>> Is there anything to recommend the hacks mentioned for fixed sized array objects *other* than using them to fake structs? (Much to derecommend, as you mentioned!)
>>>>
>>>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett wrote:
>>>>> I think both are useful, but the one you suggest requires a lot more plumbing and doesn't subsume all of the usecases of the other.
>>>>>
>>>>> -Edward
>>>>>
>>>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton wrote:
>>>>>> So that primitive is an array like thing (Same pointed type, unbounded length) with extra payload.
>>>>>>
>>>>>> I can see how we can do without structs if we have arrays, especially with the extra payload at front. But wouldn't the general solution for structs be one that allows new user data type defs for # types?
>>>>>> >>>>>> >>>>>> >>>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett wrote: >>>>>>> Some form of MutableStruct# with a known number of words and a known number of pointers is basically what Ryan Yates was suggesting above, but where the word counts were stored in the objects themselves. >>>>>>> >>>>>>> Given that it'd have a couple of words for those counts it'd likely want to be something we build in addition to MutVar# rather than a replacement. >>>>>>> >>>>>>> On the other hand, if we had to fix those numbers and build info tables that knew them, and typechecker support, for instance, it'd get rather invasive. >>>>>>> >>>>>>> Also, a number of things that we can do with the 'sized' versions above, like working with evil unsized c-style arrays directly inline at the end of the structure cease to be possible, so it isn't even a pure win if we did the engineering effort. >>>>>>> >>>>>>> I think 90% of the needs I have are covered just by adding the one primitive. The last 10% gets pretty invasive. >>>>>>> >>>>>>> -Edward >>>>>>> >>>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton wrote: >>>>>>>> I like the possibility of a general solution for mutable structs (like Ed said), and I'm trying to fully understand why it's hard. >>>>>>>> >>>>>>>> So, we can't unpack MutVar into constructors because of object identity problems. But what about directly supporting an extensible set of unlifted MutStruct# objects, generalizing (and even replacing) MutVar#? That may be too much work, but is it problematic otherwise? >>>>>>>> >>>>>>>> Needless to say, this is also critical if we ever want best in class lockfree mutable structures, just like their Stm and sequential counterparts. >>>>>>>> >>>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones wrote: >>>>>>>>> At the very least I'll take this email and turn it into a short article. >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe make a ticket for it. 
>>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Simon >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >>>>>>>>> Sent: 27 August 2015 16:54 >>>>>>>>> To: Simon Peyton Jones >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >>>>>>>>> Subject: Re: ArrayArrays >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It points directly to other unlifted ArrayArray#'s or ByteArray#'s. >>>>>>>>> >>>>>>>>> While those live in #, they are garbage collected objects, so this all lives on the heap. >>>>>>>>> >>>>>>>>> They were added to make some of the DPH stuff fast when it has to deal with nested arrays. >>>>>>>>> >>>>>>>>> I'm currently abusing them as a placeholder for a better thing. >>>>>>>>> >>>>>>>>> The Problem >>>>>>>>> ----------------- >>>>>>>>> >>>>>>>>> Consider the scenario where you write a classic doubly-linked list in Haskell. >>>>>>>>> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) >>>>>>>>> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers on the heap. >>>>>>>>> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~> DLL >>>>>>>>> >>>>>>>>> That is 3 levels of indirection. >>>>>>>>> >>>>>>>>> We can trim one by simply unpacking the IORef with -funbox-strict-fields or UNPACK. >>>>>>>>> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and worsening our representation. >>>>>>>>> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>>>>>>>> >>>>>>>>> but now we're still stuck with a level of indirection >>>>>>>>> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >>>>>>>>> >>>>>>>>> This means that every operation we perform on this structure will be about half of the speed of an implementation in most other languages assuming we're memory bound on loading things into cache!
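For concreteness, here is a minimal sketch of the second representation quoted above (strict IORef fields plus a Nil constructor), using only Data.IORef from base. The helper names mkPair and len are illustrative assumptions and do not appear in the original message; the field order (previous, then next) is also an assumption. Each call to next still reads through one MutVar#, which is the remaining indirection the message describes.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Strict fields, so -funbox-strict-fields (or UNPACK pragmas) can unpack
-- the IORef wrappers into the constructor. Assumed order: previous, next.
data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil

-- Follow the "next" link: DLL ~> MutVar# ~> DLL.
next :: DLL -> IO DLL
next (DLL _ n) = readIORef n
next Nil = pure Nil

-- Follow the "previous" link.
prev :: DLL -> IO DLL
prev (DLL p _) = readIORef p
prev Nil = pure Nil

-- Hypothetical helper: build a two-node list and return its first node.
mkPair :: IO DLL
mkPair = do
  p1 <- newIORef Nil
  n1 <- newIORef Nil
  p2 <- newIORef Nil
  n2 <- newIORef Nil
  let a = DLL p1 n1
      b = DLL p2 n2
  writeIORef n1 b  -- a's next is b
  writeIORef p2 a  -- b's previous is a
  pure a

-- Hypothetical helper: count the nodes reachable through "next".
len :: DLL -> IO Int
len Nil = pure 0
len d = do
  rest <- next d >>= len
  pure (1 + rest)
```

Loaded into GHCi, `mkPair >>= len` walks the two nodes by reading one IORef per step.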
>>>>>>>>> >>>>>>>>> Making Progress >>>>>>>>> ---------------------- >>>>>>>>> >>>>>>>>> I have been working on a number of data structures where the indirection of going from something in * out to an object in # which contains the real pointer to my target and coming back effectively doubles my runtime. >>>>>>>>> >>>>>>>>> We go out to the MutVar# because we are allowed to put the MutVar# onto the mutable list when we dirty it. There is a well defined write-barrier. >>>>>>>>> >>>>>>>>> I could change out the representation to use >>>>>>>>> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >>>>>>>>> >>>>>>>>> I can just store two pointers in the MutableArray# every time, but this doesn't help _much_ directly. It has reduced the number of distinct addresses in memory I touch on a walk of the DLL from 3 per object to 2. >>>>>>>>> >>>>>>>>> I still have to go out to the heap from my DLL and get to the array object and then chase it to the next DLL and chase that to the next array. I do get my two pointers together in memory though. I'm paying for a card marking table as well, which I don't particularly need with just two pointers, but we can shed that with the "SmallMutableArray#" machinery added back in 7.10, which is just the old array code as a new data type, which can speed things up a bit when you don't have very big arrays: >>>>>>>>> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >>>>>>>>> >>>>>>>>> But what if I wanted my object itself to live in # and have two mutable fields and be able to share the same write barrier? >>>>>>>>> >>>>>>>>> An ArrayArray# points directly to other unlifted array types. What if we have one # -> * wrapper on the outside to deal with the impedance mismatch between the imperative world and Haskell, and then just let the ArrayArray#'s hold other ArrayArray#'s?
>>>>>>>>> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >>>>>>>>> >>>>>>>>> now I need to make up a new Nil, which I can just make be a special MutableArrayArray# I allocate on program startup. I can even abuse pattern synonyms. Alternately I can exploit the internals further to make this cheaper. >>>>>>>>> >>>>>>>>> Then I can use the readMutableArrayArray# and writeMutableArrayArray# calls to directly access the preceding and next entry in the linked list. >>>>>>>>> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a strict world, and everything there lives in #. >>>>>>>>> >>>>>>>>> next :: DLL -> IO DLL >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of >>>>>>>>> (# s', n #) -> (# s', DLL n #) >>>>>>>>> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to keep things unboxed. The 'DLL' wrappers get removed pretty easily when they are known strict and you chain operations of this sort! >>>>>>>>> >>>>>>>>> Cleaning it Up >>>>>>>>> ------------------ >>>>>>>>> >>>>>>>>> Now I have one outermost indirection pointing to an array that points directly to other arrays. >>>>>>>>> >>>>>>>>> I'm stuck paying for a card marking table per object, but I can fix that by duplicating the code for MutableArrayArray# and using a SmallMutableArray#. I can hack up primops that let me store a mixture of SmallMutableArray# fields and normal ones in the data structure. Operationally, I can even do so by just unsafeCoercing the existing SmallMutableArray# primitives to change the kind of one of the arguments it takes. >>>>>>>>> >>>>>>>>> This is almost ideal, but not quite. I often have fields that would be best left unboxed. >>>>>>>>> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>>>>>>>> >>>>>>>>> was able to unpack the Int, but we lost that. We can currently at best point one of the entries of the SmallMutableArray# at a boxed or at a MutableByteArray# for all of our misc. 
data and shove the int in question in there. >>>>>>>>> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to store masks and administrivia as I walk down the tree. Having to go off to the side costs me the entire win from avoiding the first pointer chase. >>>>>>>>> >>>>>>>>> But if, as Ryan suggested, we had a heap object we could construct that had n words with unsafe access and m pointers to other heap objects, one that could put itself on the mutable list when any of those pointers changed, then I could shed this last factor of two in all circumstances. >>>>>>>>> >>>>>>>>> Prototype >>>>>>>>> ------------- >>>>>>>>> >>>>>>>>> Over the last few days I've put together a small prototype implementation with a few non-trivial imperative data structures for things like Tarjan's link-cut trees, the list labeling problem and order-maintenance. >>>>>>>>> >>>>>>>>> https://github.com/ekmett/structs >>>>>>>>> >>>>>>>>> Notable bits: >>>>>>>>> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of link-cut trees in this style. >>>>>>>>> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that make it go fast. >>>>>>>>> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost all the references to the LinkCut or Object data constructor get optimized away, and we're left with beautiful strict code directly mutating our underlying representation. >>>>>>>>> >>>>>>>>> At the very least I'll take this email and turn it into a short article. >>>>>>>>> >>>>>>>>> -Edward >>>>>>>>> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones wrote: >>>>>>>>> Just to say that I have no idea what is going on in this thread. What is ArrayArray? What is the issue in general? Is there a ticket? Is there a wiki page? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be a good thing.
>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Simon >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Edward Kmett >>>>>>>>> Sent: 21 August 2015 05:25 >>>>>>>>> To: Manuel M T Chakravarty >>>>>>>>> Cc: Simon Marlow; ghc-devs >>>>>>>>> Subject: Re: ArrayArrays >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be very handy as well. >>>>>>>>> >>>>>>>>> Consider right now if I have something like an order-maintenance structure I have: >>>>>>>>> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) >>>>>>>>> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-} !(MutVar s (Lower s)) >>>>>>>>> >>>>>>>>> The former contains, logically, a mutable integer and two pointers, one for forward and one for backwards. The latter is basically the same thing with a mutable reference up pointing at the structure above. >>>>>>>>> >>>>>>>>> On the heap this is an object that points to a structure for the bytearray, and points to another structure for each mutvar which each point to the other 'Upper' structure. So there is a level of indirection smeared over everything. >>>>>>>>> >>>>>>>>> So this is a pair of doubly linked lists with an upward link from the structure below to the structure above. >>>>>>>>> >>>>>>>>> Converted into ArrayArray#s I'd get >>>>>>>>> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >>>>>>>>> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and the next 2 slots pointing to the previous and next previous objects, represented just as their MutableArrayArray#s. I can use sameMutableArrayArray# on these for object identity, which lets me check for the ends of the lists by tying things back on themselves. 
>>>>>>>>> >>>>>>>>> and below that >>>>>>>>> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >>>>>>>>> >>>>>>>>> is similar, with an extra MutableArrayArray# slot pointing up to an upper structure. >>>>>>>>> >>>>>>>>> I can then write a handful of combinators for getting out the slots in question. While it has gained a level of indirection between the wrapper to put it in * and the MutableArrayArray# s in #, that one can be basically erased by GHC. >>>>>>>>> >>>>>>>>> Unlike before I don't have several separate objects on the heap for each thing. I only have 2 now: the MutableArrayArray# for the object itself, and the MutableByteArray# that it references to carry around the mutable int. >>>>>>>>> >>>>>>>>> The only pain points are >>>>>>>>> >>>>>>>>> 1.) the aforementioned limitation that currently prevents me from stuffing normal boxed data through a SmallArray or Array into an ArrayArray leaving me in a little ghetto disconnected from the rest of Haskell, >>>>>>>>> >>>>>>>>> and >>>>>>>>> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the card marking overhead. These objects are all small, 3-4 pointers wide. Card marking doesn't help. >>>>>>>>> >>>>>>>>> Alternately I could just try to do really evil things and convert the whole mess to SmallArrays and then figure out how to unsafeCoerce my way to glory, stuffing the #'d references to the other arrays directly into the SmallArray as slots, removing the limitation we see here by aping the MutableArrayArray# s API, but that gets really really dangerous! >>>>>>>>> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the altar of speed here, but I'd like to be able to let the GC move them and collect them, which rules out simpler Ptr and Addr based solutions. >>>>>>>>> >>>>>>>>> -Edward >>>>>>>>> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty wrote: >>>>>>>>> That's an interesting idea.
>>>>>>>>> >>>>>>>>> Manuel >>>>>>>>> >>>>>>>>> > Edward Kmett : >>>>>>>>> > >>>>>>>>> > Would it be possible to add unsafe primops to add Array# and SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# entries are all directly unlifted avoiding a level of indirection for the containing structure is amazing, but I can only currently use it if my leaf level data can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be able to have the ability to put SmallArray# a stuff down at the leaves to hold lifted contents. >>>>>>>>> > >>>>>>>>> > I accept fully that if I name the wrong type when I go to access one of the fields it'll lie to me, but I suppose it'd do that if i tried to use one of the members that held a nested ArrayArray# as a ByteArray# anyways, so it isn't like there is a safety story preventing this. >>>>>>>>> > >>>>>>>>> > I've been hunting for ways to try to kill the indirection problems I get with Haskell and mutable structures, and I could shoehorn a number of them into ArrayArrays if this worked. >>>>>>>>> > >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection compared to c/java and this could reduce that pain to just 1 level of unnecessary indirection. >>>>>>>>> > >>>>>>>>> > -Edward >>>>>>>>> > _______________________________________________ >>>>>>>>> > ghc-devs mailing list >>>>>>>>> > ghc-devs at haskell.org >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> ghc-devs mailing list >>>>>>>>> ghc-devs at haskell.org >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fryguybob at gmail.com Sat Aug 29 00:48:46 2015 From: fryguybob at gmail.com (Ryan Yates) Date: Fri, 28 Aug 2015 20:48:46 -0400 Subject: ArrayArrays In-Reply-To: <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> Message-ID: I think from my perspective, the motivation for getting the type checker involved is primarily bringing this to the level where users could be expected to build these structures. It is reasonable to think that there are people who want to use STM (a context with mutation already) to implement a straightforward data structure that avoids the extra indirection penalty. There should be some places where knowing that things are field accesses rather than array indexing could be helpful, but I think GHC is good right now about handling constant offsets. In my code I don't do any bounds checking as I know I will only be accessing my arrays with constant indexes. I make wrappers for each field access and leave all the unsafe stuff in there. When things go wrong though, the compiler is no help. Maybe Template Haskell that generates the appropriate wrappers is the right direction to go. There is another benefit for me when working with these as arrays in that it is quite simple and direct (given the hoops already jumped through) to play with alignment. I can ensure two pointers are never on the same cache-line by just spacing things out in the array. On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett wrote: > They just segfault at this level. ;) > > Sent from my iPhone > > On Aug 28, 2015, at 7:25 PM, Ryan Newton wrote: > > You presumably also save a bounds check on reads by hard-coding the sizes? > > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett wrote: >> >> Also there are 4 different "things" here, basically depending on two >> independent questions: >> >> a.)
if you want to shove the sizes into the info table, and >> b.) if you want cardmarking. >> >> Versions with/without cardmarking for different sizes can be done pretty >> easily, but as noted, the infotable variants are pretty invasive. >> >> -Edward >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett wrote: >>> >>> Well, on the plus side you'd save 16 bytes per object, which adds up if >>> they were small enough and there are enough of them. You get a bit better >>> locality of reference in terms of what fits in the first cache line of them. >>> >>> -Edward >>> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton wrote: >>>> >>>> Yes. And for the short term I can imagine places we will settle with >>>> arrays even if it means tracking lengths unnecessarily and unsafeCoercing >>>> pointers whose types don't actually match their siblings. >>>> >>>> Is there anything to recommend the hacks mentioned for fixed sized array >>>> objects *other* than using them to fake structs? (Much to derecommend, as >>>> you mentioned!) >>>> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett wrote: >>>>> >>>>> I think both are useful, but the one you suggest requires a lot more >>>>> plumbing and doesn't subsume all of the usecases of the other. >>>>> >>>>> -Edward >>>>> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton >>>>> wrote: >>>>>> >>>>>> So that primitive is an array like thing (Same pointed type, unbounded >>>>>> length) with extra payload. >>>>>> >>>>>> I can see how we can do without structs if we have arrays, especially >>>>>> with the extra payload at front. But wouldn't the general solution for >>>>>> structs be one that that allows new user data type defs for # types? 
>>>>>> >>>>>> >>>>>> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett wrote: >>>>>>> >>>>>>> Some form of MutableStruct# with a known number of words and a known >>>>>>> number of pointers is basically what Ryan Yates was suggesting above, but >>>>>>> where the word counts were stored in the objects themselves. >>>>>>> >>>>>>> Given that it'd have a couple of words for those counts it'd likely >>>>>>> want to be something we build in addition to MutVar# rather than a >>>>>>> replacement. >>>>>>> >>>>>>> On the other hand, if we had to fix those numbers and build info >>>>>>> tables that knew them, and typechecker support, for instance, it'd get >>>>>>> rather invasive. >>>>>>> >>>>>>> Also, a number of things that we can do with the 'sized' versions >>>>>>> above, like working with evil unsized c-style arrays directly inline at the >>>>>>> end of the structure cease to be possible, so it isn't even a pure win if we >>>>>>> did the engineering effort. >>>>>>> >>>>>>> I think 90% of the needs I have are covered just by adding the one >>>>>>> primitive. The last 10% gets pretty invasive. >>>>>>> >>>>>>> -Edward >>>>>>> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton >>>>>>> wrote: >>>>>>>> >>>>>>>> I like the possibility of a general solution for mutable structs >>>>>>>> (like Ed said), and I'm trying to fully understand why it's hard. >>>>>>>> >>>>>>>> So, we can't unpack MutVar into constructors because of object >>>>>>>> identity problems. But what about directly supporting an extensible set of >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) MutVar#? That >>>>>>>> may be too much work, but is it problematic otherwise? >>>>>>>> >>>>>>>> Needless to say, this is also critical if we ever want best in class >>>>>>>> lockfree mutable structures, just like their Stm and sequential >>>>>>>> counterparts. 
>>>>>>>> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> At the very least I'll take this email and turn it into a short >>>>>>>>> article. >>>>>>>>> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe >>>>>>>>> make a ticket for it. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Simon >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >>>>>>>>> Sent: 27 August 2015 16:54 >>>>>>>>> To: Simon Peyton Jones >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >>>>>>>>> Subject: Re: ArrayArrays >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It >>>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> While those live in #, they are garbage collected objects, so this >>>>>>>>> all lives on the heap. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> They were added to make some of the DPH stuff fast when it has to >>>>>>>>> deal with nested arrays. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> I'm currently abusing them as a placeholder for a better thing. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> The Problem >>>>>>>>> >>>>>>>>> ----------------- >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Consider the scenario where you write a classic doubly-linked list >>>>>>>>> in Haskell. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL) >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers on >>>>>>>>> the heap. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe >>>>>>>>> DLL ~> DLL >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> That is 3 levels of indirection. 
>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> We can trim one by simply unpacking the IORef with >>>>>>>>> -funbox-strict-fields or UNPACK >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and >>>>>>>>> worsening our representation. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> but now we're still stuck with a level of indirection >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> This means that every operation we perform on this structure will >>>>>>>>> be about half of the speed of an implementation in most other languages >>>>>>>>> assuming we're memory bound on loading things into cache! >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Making Progress >>>>>>>>> >>>>>>>>> ---------------------- >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> I have been working on a number of data structures where the >>>>>>>>> indirection of going from something in * out to an object in # which >>>>>>>>> contains the real pointer to my target and coming back effectively doubles >>>>>>>>> my runtime. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> We go out to the MutVar# because we are allowed to put the MutVar# >>>>>>>>> onto the mutable list when we dirty it. There is a well defined >>>>>>>>> write-barrier. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> I could change out the representation to use >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> I can just store two pointers in the MutableArray# every time, but >>>>>>>>> this doesn't help _much_ directly. It has reduced the amount of distinct >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per object to 2. 
>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> I still have to go out to the heap from my DLL and get to the array >>>>>>>>> object and then chase it to the next DLL and chase that to the next array. I >>>>>>>>> do get my two pointers together in memory though. I'm paying for a card >>>>>>>>> marking table as well, which I don't particularly need with just two >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" machinery added >>>>>>>>> back in 7.10, which is just the old array code a a new data type, which can >>>>>>>>> speed things up a bit when you don't have very big arrays: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> But what if I wanted my object itself to live in # and have two >>>>>>>>> mutable fields and be able to share the sme write barrier? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> An ArrayArray# points directly to other unlifted array types. What >>>>>>>>> if we have one # -> * wrapper on the outside to deal with the impedence >>>>>>>>> mismatch between the imperative world and Haskell, and then just let the >>>>>>>>> ArrayArray#'s hold other arrayarrays. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> now I need to make up a new Nil, which I can just make be a special >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even abuse pattern >>>>>>>>> synonyms. Alternately I can exploit the internals further to make this >>>>>>>>> cheaper. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Then I can use the readMutableArrayArray# and >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding and next >>>>>>>>> entry in the linked list. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a >>>>>>>>> strict world, and everything there lives in #. 
>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> next :: DLL -> IO DLL >>>>>>>>> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of >>>>>>>>> >>>>>>>>> (# s', n #) -> (# s', DLL n #) >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty easily when they >>>>>>>>> are known strict and you chain operations of this sort! >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Cleaning it Up >>>>>>>>> >>>>>>>>> ------------------ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Now I have one outermost indirection pointing to an array that >>>>>>>>> points directly to other arrays. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> I'm stuck paying for a card marking table per object, but I can fix >>>>>>>>> that by duplicating the code for MutableArrayArray# and using a >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a mixture of >>>>>>>>> SmallMutableArray# fields and normal ones in the data structure. >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the existing >>>>>>>>> SmallMutableArray# primitives to change the kind of one of the arguments it >>>>>>>>> takes. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> This is almost ideal, but not quite. I often have fields that would >>>>>>>>> be best left unboxed. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> was able to unpack the Int, but we lost that. We can currently at >>>>>>>>> best point one of the entries of the SmallMutableArray# at a boxed or at a >>>>>>>>> MutableByteArray# for all of our misc. data and shove the int in question in >>>>>>>>> there. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to >>>>>>>>> store masks and administrivia as I walk down the tree. 
Having to go off to >>>>>>>>> the side costs me the entire win from avoiding the first pointer chase. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could >>>>>>>>> construct that had n words with unsafe access and m pointers to other heap >>>>>>>>> objects, one that could put itself on the mutable list when any of those >>>>>>>>> pointers changed then I could shed this last factor of two in all >>>>>>>>> circumstances. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Prototype >>>>>>>>> >>>>>>>>> ------------- >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Over the last few days I've put together a small prototype >>>>>>>>> implementation with a few non-trivial imperative data structures for things >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and >>>>>>>>> order-maintenance. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> https://github.com/ekmett/structs >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Notable bits: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of link-cut >>>>>>>>> trees in this style. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that make >>>>>>>>> it go fast. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost all >>>>>>>>> the references to the LinkCut or Object data constructor get optimized away, >>>>>>>>> and we're left with beautiful strict code directly mutating out underlying >>>>>>>>> representation. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> At the very least I'll take this email and turn it into a short >>>>>>>>> article. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -Edward >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Just to say that I have no idea what is going on in this thread. >>>>>>>>> What is ArrayArray? What is the issue in general? Is there a ticket? 
Is >>>>>>>>> there a wiki page? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> If it?s important, an ab-initio wiki page + ticket would be a good >>>>>>>>> thing. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Simon >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of >>>>>>>>> Edward Kmett >>>>>>>>> Sent: 21 August 2015 05:25 >>>>>>>>> To: Manuel M T Chakravarty >>>>>>>>> Cc: Simon Marlow; ghc-devs >>>>>>>>> Subject: Re: ArrayArrays >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be >>>>>>>>> very handy as well. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Consider right now if I have something like an order-maintenance >>>>>>>>> structure I have: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> The former contains, logically, a mutable integer and two pointers, >>>>>>>>> one for forward and one for backwards. The latter is basically the same >>>>>>>>> thing with a mutable reference up pointing at the structure above. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On the heap this is an object that points to a structure for the >>>>>>>>> bytearray, and points to another structure for each mutvar which each point >>>>>>>>> to the other 'Upper' structure. So there is a level of indirection smeared >>>>>>>>> over everything. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> So this is a pair of doubly linked lists with an upward link from >>>>>>>>> the structure below to the structure above. 
>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Converted into ArrayArray#s I'd get >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and the >>>>>>>>> next 2 slots pointing to the previous and next previous objects, represented >>>>>>>>> just as their MutableArrayArray#s. I can use sameMutableArrayArray# on these >>>>>>>>> for object identity, which lets me check for the ends of the lists by tying >>>>>>>>> things back on themselves. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> and below that >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing up to an >>>>>>>>> upper structure. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> I can then write a handful of combinators for getting out the slots >>>>>>>>> in question, while it has gained a level of indirection between the wrapper >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one can be basically >>>>>>>>> erased by ghc. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Unlike before I don't have several separate objects on the heap for >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for the object itself, >>>>>>>>> and the MutableByteArray# that it references to carry around the mutable >>>>>>>>> int. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> The only pain points are >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> 1.) the aforementioned limitation that currently prevents me from >>>>>>>>> stuffing normal boxed data through a SmallArray or Array into an ArrayArray >>>>>>>>> leaving me in a little ghetto disconnected from the rest of Haskell, >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> and >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the >>>>>>>>> card marking overhead. These objects are all small, 3-4 pointers wide. 
Card >>>>>>>>> marking doesn't help. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Alternately I could just try to do really evil things and convert >>>>>>>>> the whole mess to SmallArrays and then figure out how to unsafeCoerce my way >>>>>>>>> to glory, stuffing the #'d references to the other arrays directly into the >>>>>>>>> SmallArray as slots, removing the limitation we see here by aping the >>>>>>>>> MutableArrayArray# s API, but that gets really really dangerous! >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the altar >>>>>>>>> of speed here, but I'd like to be able to let the GC move them and collect >>>>>>>>> them which rules out simpler Ptr and Addr based solutions. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -Edward >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> That?s an interesting idea. >>>>>>>>> >>>>>>>>> Manuel >>>>>>>>> >>>>>>>>> > Edward Kmett : >>>>>>>>> >>>>>>>>> > >>>>>>>>> > Would it be possible to add unsafe primops to add Array# and >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# entries >>>>>>>>> > are all directly unlifted avoiding a level of indirection for the containing >>>>>>>>> > structure is amazing, but I can only currently use it if my leaf level data >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be >>>>>>>>> > able to have the ability to put SmallArray# a stuff down at the leaves to >>>>>>>>> > hold lifted contents. >>>>>>>>> > >>>>>>>>> > I accept fully that if I name the wrong type when I go to access >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do that if i tried to >>>>>>>>> > use one of the members that held a nested ArrayArray# as a ByteArray# >>>>>>>>> > anyways, so it isn't like there is a safety story preventing this. 
>>>>>>>>> >
>>>>>>>>> > I've been hunting for ways to try to kill the indirection
>>>>>>>>> > problems I get with Haskell and mutable structures, and I could shoehorn a
>>>>>>>>> > number of them into ArrayArrays if this worked.
>>>>>>>>> >
>>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary
>>>>>>>>> > indirection compared to c/java and this could reduce that pain to just 1
>>>>>>>>> > level of unnecessary indirection.
>>>>>>>>> >
>>>>>>>>> > -Edward

From ekmett at gmail.com Sat Aug 29 02:09:56 2015
From: ekmett at gmail.com (Edward Kmett)
Date: Fri, 28 Aug 2015 22:09:56 -0400
Subject: ArrayArrays
In-Reply-To:
References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com>
Message-ID:

I'd love to have that last 10%, but it's a lot of work to get there and, more importantly, I don't know quite what it should look like.

On the other hand, I do have a pretty good idea of how the primitives above could be banged out and tested in a long evening, well in time for 7.12. And as noted earlier, those remain useful even if a nicer typed version with an extra level of indirection to the sizes is built up after.
The rest sounds like a good graduate student project for someone who has graduate students lying around. Maybe somebody at Indiana University who has an interest in type theory and parallelism can find us one. =) -Edward On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates wrote: > I think from my perspective, the motivation for getting the type > checker involved is primarily bringing this to the level where users > could be expected to build these structures. it is reasonable to > think that there are people who want to use STM (a context with > mutation already) to implement a straight forward data structure that > avoids extra indirection penalty. There should be some places where > knowing that things are field accesses rather then array indexing > could be helpful, but I think GHC is good right now about handling > constant offsets. In my code I don't do any bounds checking as I know > I will only be accessing my arrays with constant indexes. I make > wrappers for each field access and leave all the unsafe stuff in > there. When things go wrong though, the compiler is no help. Maybe > template Haskell that generates the appropriate wrappers is the right > direction to go. > There is another benefit for me when working with these as arrays in > that it is quite simple and direct (given the hoops already jumped > through) to play with alignment. I can ensure two pointers are never > on the same cache-line by just spacing things out in the array. > > On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett wrote: > > They just segfault at this level. ;) > > > > Sent from my iPhone > > > > On Aug 28, 2015, at 7:25 PM, Ryan Newton wrote: > > > > You presumably also save a bounds check on reads by hard-coding the > sizes? > > > > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett wrote: > >> > >> Also there are 4 different "things" here, basically depending on two > >> independent questions: > >> > >> a.) if you want to shove the sizes into the info table, and > >> b.) 
if you want cardmarking. > >> > >> Versions with/without cardmarking for different sizes can be done pretty > >> easily, but as noted, the infotable variants are pretty invasive. > >> > >> -Edward > >> > >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett wrote: > >>> > >>> Well, on the plus side you'd save 16 bytes per object, which adds up if > >>> they were small enough and there are enough of them. You get a bit > better > >>> locality of reference in terms of what fits in the first cache line of > them. > >>> > >>> -Edward > >>> > >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton > wrote: > >>>> > >>>> Yes. And for the short term I can imagine places we will settle with > >>>> arrays even if it means tracking lengths unnecessarily and > unsafeCoercing > >>>> pointers whose types don't actually match their siblings. > >>>> > >>>> Is there anything to recommend the hacks mentioned for fixed sized > array > >>>> objects *other* than using them to fake structs? (Much to > derecommend, as > >>>> you mentioned!) > >>>> > >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett > wrote: > >>>>> > >>>>> I think both are useful, but the one you suggest requires a lot more > >>>>> plumbing and doesn't subsume all of the usecases of the other. > >>>>> > >>>>> -Edward > >>>>> > >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton > >>>>> wrote: > >>>>>> > >>>>>> So that primitive is an array like thing (Same pointed type, > unbounded > >>>>>> length) with extra payload. > >>>>>> > >>>>>> I can see how we can do without structs if we have arrays, > especially > >>>>>> with the extra payload at front. But wouldn't the general solution > for > >>>>>> structs be one that that allows new user data type defs for # types? 
> >>>>>> > >>>>>> > >>>>>> > >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett > wrote: > >>>>>>> > >>>>>>> Some form of MutableStruct# with a known number of words and a > known > >>>>>>> number of pointers is basically what Ryan Yates was suggesting > above, but > >>>>>>> where the word counts were stored in the objects themselves. > >>>>>>> > >>>>>>> Given that it'd have a couple of words for those counts it'd likely > >>>>>>> want to be something we build in addition to MutVar# rather than a > >>>>>>> replacement. > >>>>>>> > >>>>>>> On the other hand, if we had to fix those numbers and build info > >>>>>>> tables that knew them, and typechecker support, for instance, it'd > get > >>>>>>> rather invasive. > >>>>>>> > >>>>>>> Also, a number of things that we can do with the 'sized' versions > >>>>>>> above, like working with evil unsized c-style arrays directly > inline at the > >>>>>>> end of the structure cease to be possible, so it isn't even a pure > win if we > >>>>>>> did the engineering effort. > >>>>>>> > >>>>>>> I think 90% of the needs I have are covered just by adding the one > >>>>>>> primitive. The last 10% gets pretty invasive. > >>>>>>> > >>>>>>> -Edward > >>>>>>> > >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton > >>>>>>> wrote: > >>>>>>>> > >>>>>>>> I like the possibility of a general solution for mutable structs > >>>>>>>> (like Ed said), and I'm trying to fully understand why it's hard. > >>>>>>>> > >>>>>>>> So, we can't unpack MutVar into constructors because of object > >>>>>>>> identity problems. But what about directly supporting an > extensible set of > >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) > MutVar#? That > >>>>>>>> may be too much work, but is it problematic otherwise? > >>>>>>>> > >>>>>>>> Needless to say, this is also critical if we ever want best in > class > >>>>>>>> lockfree mutable structures, just like their Stm and sequential > >>>>>>>> counterparts. 
> >>>>>>>> > >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones > >>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>> At the very least I'll take this email and turn it into a short > >>>>>>>>> article. > >>>>>>>>> > >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and > maybe > >>>>>>>>> make a ticket for it. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Thanks > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Simon > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] > >>>>>>>>> Sent: 27 August 2015 16:54 > >>>>>>>>> To: Simon Peyton Jones > >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs > >>>>>>>>> Subject: Re: ArrayArrays > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It > >>>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> While those live in #, they are garbage collected objects, so > this > >>>>>>>>> all lives on the heap. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> They were added to make some of the DPH stuff fast when it has to > >>>>>>>>> deal with nested arrays. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I'm currently abusing them as a placeholder for a better thing. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> The Problem > >>>>>>>>> > >>>>>>>>> ----------------- > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Consider the scenario where you write a classic doubly-linked > list > >>>>>>>>> in Haskell. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL) > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers on > >>>>>>>>> the heap. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> > Maybe > >>>>>>>>> DLL ~> DLL > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> That is 3 levels of indirection. 
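A minimal sketch of the IORef-based doubly-linked list discussed above, to make the pointer hops concrete. The helper names (newCell, link, prev, next, lengthFrom) are hypothetical and do not come from the thread; only Data.IORef from base is assumed.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- One cell: a mutable "previous" pointer and a mutable "next" pointer.
-- Each field is a boxed IORef holding a boxed Maybe, so stepping to a
-- neighbour goes cell -> IORef -> MutVar# -> Maybe -> cell.
data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL))

-- Allocate a cell with no neighbours.
newCell :: IO DLL
newCell = DLL <$> newIORef Nothing <*> newIORef Nothing

-- Make b the successor of a, and a the predecessor of b.
link :: DLL -> DLL -> IO ()
link a@(DLL _ aNext) b@(DLL bPrev _) = do
  writeIORef aNext (Just b)
  writeIORef bPrev (Just a)

-- Step backwards or forwards by one cell.
prev, next :: DLL -> IO (Maybe DLL)
prev (DLL p _) = readIORef p
next (DLL _ n) = readIORef n

-- Count the cells reachable through "next", including the start cell.
lengthFrom :: DLL -> IO Int
lengthFrom = go 1
  where
    go acc cell = do
      mn <- next cell
      case mn of
        Nothing -> pure acc
        Just c  -> acc `seq` go (acc + 1) c
```

Every call to next here reads through the IORef and then through the Maybe before reaching the neighbouring cell, which is the indirection the message is counting.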
> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> We can trim one by simply unpacking the IORef with > >>>>>>>>> -funbox-strict-fields or UNPACK > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and > >>>>>>>>> worsening our representation. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> but now we're still stuck with a level of indirection > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> This means that every operation we perform on this structure will > >>>>>>>>> be about half of the speed of an implementation in most other > languages > >>>>>>>>> assuming we're memory bound on loading things into cache! > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Making Progress > >>>>>>>>> > >>>>>>>>> ---------------------- > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I have been working on a number of data structures where the > >>>>>>>>> indirection of going from something in * out to an object in # > which > >>>>>>>>> contains the real pointer to my target and coming back > effectively doubles > >>>>>>>>> my runtime. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> We go out to the MutVar# because we are allowed to put the > MutVar# > >>>>>>>>> onto the mutable list when we dirty it. There is a well defined > >>>>>>>>> write-barrier. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I could change out the representation to use > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I can just store two pointers in the MutableArray# every time, > but > >>>>>>>>> this doesn't help _much_ directly. It has reduced the amount of > distinct > >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per > object to 2. 
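The MutableArray# representation mentioned above might be sketched roughly as follows. This is an illustration under assumptions, not code from the thread: the slot layout (slot 0 = previous, slot 1 = next) and the names newNode, prevNode, nextNode, setNext and isNil are hypothetical. It relies only on the newArray#, readArray# and writeArray# primops exported by GHC.Exts.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts (MutableArray#, RealWorld, newArray#, readArray#, writeArray#)
import GHC.IO (IO (..))

-- Both links live in one two-slot mutable array (assumed layout:
-- slot 0 = previous, slot 1 = next). Nil marks the ends of the list.
data DLL = DLL (MutableArray# RealWorld DLL) | Nil

-- Allocate a node whose two slots both start out as Nil.
newNode :: IO DLL
newNode = IO (\s -> case newArray# 2# Nil s of
  (# s', arr #) -> (# s', DLL arr #))

-- Read the previous / next link; Nil has no neighbours.
prevNode, nextNode :: DLL -> IO DLL
prevNode Nil       = pure Nil
prevNode (DLL arr) = IO (\s -> readArray# arr 0# s)
nextNode Nil       = pure Nil
nextNode (DLL arr) = IO (\s -> readArray# arr 1# s)

-- Overwrite the next link of a node (a no-op on Nil).
setNext :: DLL -> DLL -> IO ()
setNext Nil _       = pure ()
setNext (DLL arr) n = IO (\s -> case writeArray# arr 1# n s of
  s' -> (# s', () #))

isNil :: DLL -> Bool
isNil Nil = True
isNil _   = False
```

A walk still alternates between a DLL constructor and its array, so each node costs two distinct heap addresses, matching the count given in the message.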
> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I still have to go out to the heap from my DLL and get to the > array > >>>>>>>>> object and then chase it to the next DLL and chase that to the > next array. I > >>>>>>>>> do get my two pointers together in memory though. I'm paying for > a card > >>>>>>>>> marking table as well, which I don't particularly need with just > two > >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" > machinery added > >>>>>>>>> back in 7.10, which is just the old array code a a new data > type, which can > >>>>>>>>> speed things up a bit when you don't have very big arrays: > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> But what if I wanted my object itself to live in # and have two > >>>>>>>>> mutable fields and be able to share the sme write barrier? > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> An ArrayArray# points directly to other unlifted array types. > What > >>>>>>>>> if we have one # -> * wrapper on the outside to deal with the > impedence > >>>>>>>>> mismatch between the imperative world and Haskell, and then just > let the > >>>>>>>>> ArrayArray#'s hold other arrayarrays. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> now I need to make up a new Nil, which I can just make be a > special > >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even > abuse pattern > >>>>>>>>> synonyms. Alternately I can exploit the internals further to > make this > >>>>>>>>> cheaper. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Then I can use the readMutableArrayArray# and > >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding > and next > >>>>>>>>> entry in the linked list. 
> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a > >>>>>>>>> strict world, and everything there lives in #. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> next :: DLL -> IO DLL > >>>>>>>>> > >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of > >>>>>>>>> > >>>>>>>>> (# s', n #) -> (# s', DLL n #) > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to > >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty > easily when they > >>>>>>>>> are known strict and you chain operations of this sort! > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Cleaning it Up > >>>>>>>>> > >>>>>>>>> ------------------ > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Now I have one outermost indirection pointing to an array that > >>>>>>>>> points directly to other arrays. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I'm stuck paying for a card marking table per object, but I can > fix > >>>>>>>>> that by duplicating the code for MutableArrayArray# and using a > >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a > mixture of > >>>>>>>>> SmallMutableArray# fields and normal ones in the data structure. > >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the > existing > >>>>>>>>> SmallMutableArray# primitives to change the kind of one of the > arguments it > >>>>>>>>> takes. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> This is almost ideal, but not quite. I often have fields that > would > >>>>>>>>> be best left unboxed. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> was able to unpack the Int, but we lost that. We can currently at > >>>>>>>>> best point one of the entries of the SmallMutableArray# at a > boxed or at a > >>>>>>>>> MutableByteArray# for all of our misc. 
data and shove the int in > question in > >>>>>>>>> there. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to > >>>>>>>>> store masks and administrivia as I walk down the tree. Having to > go off to > >>>>>>>>> the side costs me the entire win from avoiding the first pointer > chase. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> But, if like Ryan suggested, we had a heap object we could > >>>>>>>>> construct that had n words with unsafe access and m pointers to > other heap > >>>>>>>>> objects, one that could put itself on the mutable list when any > of those > >>>>>>>>> pointers changed then I could shed this last factor of two in all > >>>>>>>>> circumstances. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Prototype > >>>>>>>>> > >>>>>>>>> ------------- > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Over the last few days I've put together a small prototype > >>>>>>>>> implementation with a few non-trivial imperative data structures > for things > >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and > >>>>>>>>> order-maintenance. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> https://github.com/ekmett/structs > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Notable bits: > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of > link-cut > >>>>>>>>> trees in this style. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that > make > >>>>>>>>> it go fast. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost all > >>>>>>>>> the references to the LinkCut or Object data constructor get > optimized away, > >>>>>>>>> and we're left with beautiful strict code directly mutating out > underlying > >>>>>>>>> representation. 
> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> At the very least I'll take this email and turn it into a short > >>>>>>>>> article. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -Edward > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones > >>>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>> Just to say that I have no idea what is going on in this thread. > >>>>>>>>> What is ArrayArray? What is the issue in general? Is there a > ticket? Is > >>>>>>>>> there a wiki page? > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> If it?s important, an ab-initio wiki page + ticket would be a > good > >>>>>>>>> thing. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Simon > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf > Of > >>>>>>>>> Edward Kmett > >>>>>>>>> Sent: 21 August 2015 05:25 > >>>>>>>>> To: Manuel M T Chakravarty > >>>>>>>>> Cc: Simon Marlow; ghc-devs > >>>>>>>>> Subject: Re: ArrayArrays > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be > >>>>>>>>> very handy as well. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Consider right now if I have something like an order-maintenance > >>>>>>>>> structure I have: > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# > >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s > (Upper s)) > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# > >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s > (Lower s)) {-# > >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> The former contains, logically, a mutable integer and two > pointers, > >>>>>>>>> one for forward and one for backwards. 
The latter is basically > the same > >>>>>>>>> thing with a mutable reference up pointing at the structure > above. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On the heap this is an object that points to a structure for the > >>>>>>>>> bytearray, and points to another structure for each mutvar which > each point > >>>>>>>>> to the other 'Upper' structure. So there is a level of > indirection smeared > >>>>>>>>> over everything. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> So this is a pair of doubly linked lists with an upward link from > >>>>>>>>> the structure below to the structure above. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Converted into ArrayArray#s I'd get > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and the > >>>>>>>>> next 2 slots pointing to the previous and next previous objects, > represented > >>>>>>>>> just as their MutableArrayArray#s. I can use > sameMutableArrayArray# on these > >>>>>>>>> for object identity, which lets me check for the ends of the > lists by tying > >>>>>>>>> things back on themselves. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> and below that > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing up to > an > >>>>>>>>> upper structure. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I can then write a handful of combinators for getting out the > slots > >>>>>>>>> in question, while it has gained a level of indirection between > the wrapper > >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one can > be basically > >>>>>>>>> erased by ghc. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Unlike before I don't have several separate objects on the heap > for > >>>>>>>>> each thing. I only have 2 now. 
The MutableArrayArray# for the > object itself, > >>>>>>>>> and the MutableByteArray# that it references to carry around the > mutable > >>>>>>>>> int. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> The only pain points are > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> 1.) the aforementioned limitation that currently prevents me from > >>>>>>>>> stuffing normal boxed data through a SmallArray or Array into an > ArrayArray > >>>>>>>>> leaving me in a little ghetto disconnected from the rest of > Haskell, > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> and > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the > >>>>>>>>> card marking overhead. These objects are all small, 3-4 pointers > wide. Card > >>>>>>>>> marking doesn't help. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Alternately I could just try to do really evil things and convert > >>>>>>>>> the whole mess to SmallArrays and then figure out how to > unsafeCoerce my way > >>>>>>>>> to glory, stuffing the #'d references to the other arrays > directly into the > >>>>>>>>> SmallArray as slots, removing the limitation we see here by > aping the > >>>>>>>>> MutableArrayArray# s API, but that gets really really dangerous! > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the altar > >>>>>>>>> of speed here, but I'd like to be able to let the GC move them > and collect > >>>>>>>>> them which rules out simpler Ptr and Addr based solutions. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -Edward > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty > >>>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>> That?s an interesting idea. > >>>>>>>>> > >>>>>>>>> Manuel > >>>>>>>>> > >>>>>>>>> > Edward Kmett : > >>>>>>>>> > >>>>>>>>> > > >>>>>>>>> > Would it be possible to add unsafe primops to add Array# and > >>>>>>>>> > SmallArray# entries to an ArrayArray#? 
The fact that the > ArrayArray# entries > >>>>>>>>> > are all directly unlifted avoiding a level of indirection for > the containing > >>>>>>>>> > structure is amazing, but I can only currently use it if my > leaf level data > >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. It'd be > nice to be > >>>>>>>>> > able to have the ability to put SmallArray# a stuff down at > the leaves to > >>>>>>>>> > hold lifted contents. > >>>>>>>>> > > >>>>>>>>> > I accept fully that if I name the wrong type when I go to > access > >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do that > if i tried to > >>>>>>>>> > use one of the members that held a nested ArrayArray# as a > ByteArray# > >>>>>>>>> > anyways, so it isn't like there is a safety story preventing > this. > >>>>>>>>> > > >>>>>>>>> > I've been hunting for ways to try to kill the indirection > >>>>>>>>> > problems I get with Haskell and mutable structures, and I > could shoehorn a > >>>>>>>>> > number of them into ArrayArrays if this worked. > >>>>>>>>> > > >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary > >>>>>>>>> > indirection compared to c/java and this could reduce that pain > to just 1 > >>>>>>>>> > level of unnecessary indirection. 
> >>>>>>>>> >
> >>>>>>>>> > -Edward

From fryguybob at gmail.com Sat Aug 29 02:47:38 2015
From: fryguybob at gmail.com (Ryan Yates)
Date: Fri, 28 Aug 2015 22:47:38 -0400
Subject: ArrayArrays
In-Reply-To:
References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com>
Message-ID:

I completely agree. I would love to spend some time during ICFP and friends talking about what it could look like. My small array for STM changes for the RTS can be seen here [1]. It is on a branch somewhere between 7.8 and 7.10 and includes irrelevant STM bits and some confusing naming choices (sorry), but should cover all the details needed to implement it for a non-STM context. The biggest surprise for me was following small array too closely and having a word/byte offset mismatch [2].
[1]: https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 Ryan On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett wrote: > I'd love to have that last 10%, but its a lot of work to get there and more > importantly I don't know quite what it should look like. > > On the other hand, I do have a pretty good idea of how the primitives above > could be banged out and tested in a long evening, well in time for 7.12. And > as noted earlier, those remain useful even if a nicer typed version with an > extra level of indirection to the sizes is built up after. > > The rest sounds like a good graduate student project for someone who has > graduate students lying around. Maybe somebody at Indiana University who has > an interest in type theory and parallelism can find us one. =) > > -Edward > > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates wrote: >> >> I think from my perspective, the motivation for getting the type >> checker involved is primarily bringing this to the level where users >> could be expected to build these structures. it is reasonable to >> think that there are people who want to use STM (a context with >> mutation already) to implement a straight forward data structure that >> avoids extra indirection penalty. There should be some places where >> knowing that things are field accesses rather then array indexing >> could be helpful, but I think GHC is good right now about handling >> constant offsets. In my code I don't do any bounds checking as I know >> I will only be accessing my arrays with constant indexes. I make >> wrappers for each field access and leave all the unsafe stuff in >> there. When things go wrong though, the compiler is no help. Maybe >> template Haskell that generates the appropriate wrappers is the right >> direction to go. 
>> There is another benefit for me when working with these as arrays in >> that it is quite simple and direct (given the hoops already jumped >> through) to play with alignment. I can ensure two pointers are never >> on the same cache-line by just spacing things out in the array. >> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett wrote: >> > They just segfault at this level. ;) >> > >> > Sent from my iPhone >> > >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton wrote: >> > >> > You presumably also save a bounds check on reads by hard-coding the >> > sizes? >> > >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett wrote: >> >> >> >> Also there are 4 different "things" here, basically depending on two >> >> independent questions: >> >> >> >> a.) if you want to shove the sizes into the info table, and >> >> b.) if you want cardmarking. >> >> >> >> Versions with/without cardmarking for different sizes can be done >> >> pretty >> >> easily, but as noted, the infotable variants are pretty invasive. >> >> >> >> -Edward >> >> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett wrote: >> >>> >> >>> Well, on the plus side you'd save 16 bytes per object, which adds up >> >>> if >> >>> they were small enough and there are enough of them. You get a bit >> >>> better >> >>> locality of reference in terms of what fits in the first cache line of >> >>> them. >> >>> >> >>> -Edward >> >>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton >> >>> wrote: >> >>>> >> >>>> Yes. And for the short term I can imagine places we will settle with >> >>>> arrays even if it means tracking lengths unnecessarily and >> >>>> unsafeCoercing >> >>>> pointers whose types don't actually match their siblings. >> >>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed sized >> >>>> array >> >>>> objects *other* than using them to fake structs? (Much to >> >>>> derecommend, as >> >>>> you mentioned!) 
>> >>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett >> >>>> wrote: >> >>>>> >> >>>>> I think both are useful, but the one you suggest requires a lot more >> >>>>> plumbing and doesn't subsume all of the usecases of the other. >> >>>>> >> >>>>> -Edward >> >>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton >> >>>>> wrote: >> >>>>>> >> >>>>>> So that primitive is an array like thing (Same pointed type, >> >>>>>> unbounded >> >>>>>> length) with extra payload. >> >>>>>> >> >>>>>> I can see how we can do without structs if we have arrays, >> >>>>>> especially >> >>>>>> with the extra payload at front. But wouldn't the general solution >> >>>>>> for >> >>>>>> structs be one that that allows new user data type defs for # >> >>>>>> types? >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett >> >>>>>> wrote: >> >>>>>>> >> >>>>>>> Some form of MutableStruct# with a known number of words and a >> >>>>>>> known >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting >> >>>>>>> above, but >> >>>>>>> where the word counts were stored in the objects themselves. >> >>>>>>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd >> >>>>>>> likely >> >>>>>>> want to be something we build in addition to MutVar# rather than a >> >>>>>>> replacement. >> >>>>>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build info >> >>>>>>> tables that knew them, and typechecker support, for instance, it'd >> >>>>>>> get >> >>>>>>> rather invasive. >> >>>>>>> >> >>>>>>> Also, a number of things that we can do with the 'sized' versions >> >>>>>>> above, like working with evil unsized c-style arrays directly >> >>>>>>> inline at the >> >>>>>>> end of the structure cease to be possible, so it isn't even a pure >> >>>>>>> win if we >> >>>>>>> did the engineering effort. >> >>>>>>> >> >>>>>>> I think 90% of the needs I have are covered just by adding the one >> >>>>>>> primitive. 
The last 10% gets pretty invasive. >> >>>>>>> >> >>>>>>> -Edward >> >>>>>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton >> >>>>>>> wrote: >> >>>>>>>> >> >>>>>>>> I like the possibility of a general solution for mutable structs >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's hard. >> >>>>>>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of object >> >>>>>>>> identity problems. But what about directly supporting an >> >>>>>>>> extensible set of >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) >> >>>>>>>> MutVar#? That >> >>>>>>>> may be too much work, but is it problematic otherwise? >> >>>>>>>> >> >>>>>>>> Needless to say, this is also critical if we ever want best in >> >>>>>>>> class >> >>>>>>>> lockfree mutable structures, just like their Stm and sequential >> >>>>>>>> counterparts. >> >>>>>>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >> >>>>>>>> wrote: >> >>>>>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a short >> >>>>>>>>> article. >> >>>>>>>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and >> >>>>>>>>> maybe >> >>>>>>>>> make a ticket for it. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Thanks >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Simon >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >> >>>>>>>>> Sent: 27 August 2015 16:54 >> >>>>>>>>> To: Simon Peyton Jones >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >> >>>>>>>>> Subject: Re: ArrayArrays >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> While those live in #, they are garbage collected objects, so >> >>>>>>>>> this >> >>>>>>>>> all lives on the heap. 
>> >>>>>>>>>
>> >>>>>>>>> They were added to make some of the DPH stuff fast when it has to
>> >>>>>>>>> deal with nested arrays.
>> >>>>>>>>>
>> >>>>>>>>> I'm currently abusing them as a placeholder for a better thing.
>> >>>>>>>>>
>> >>>>>>>>> The Problem
>> >>>>>>>>> -----------------
>> >>>>>>>>>
>> >>>>>>>>> Consider the scenario where you write a classic doubly-linked list
>> >>>>>>>>> in Haskell.
>> >>>>>>>>>
>> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL))
>> >>>>>>>>>
>> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers on
>> >>>>>>>>> the heap.
>> >>>>>>>>>
>> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe
>> >>>>>>>>> DLL ~> DLL
>> >>>>>>>>>
>> >>>>>>>>> That is 3 levels of indirection.
>> >>>>>>>>>
>> >>>>>>>>> We can trim one by simply unpacking the IORef with
>> >>>>>>>>> -funbox-strict-fields or UNPACK
>> >>>>>>>>>
>> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and
>> >>>>>>>>> worsening our representation.
>> >>>>>>>>>
>> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil
>> >>>>>>>>>
>> >>>>>>>>> but now we're still stuck with a level of indirection
>> >>>>>>>>>
>> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL
>> >>>>>>>>>
>> >>>>>>>>> This means that every operation we perform on this structure will
>> >>>>>>>>> be about half of the speed of an implementation in most other
>> >>>>>>>>> languages assuming we're memory bound on loading things into cache!
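
A minimal sketch of the second representation above (strict IORef fields plus a 'Nil' constructor), assuming only Data.IORef from base. The helper names newNode, link, next, prev and lengthFrom are hypothetical, introduced here purely for illustration. Each call to next still reads through the MutVar# behind the IORef, which is the remaining level of indirection discussed above.

```haskell
import Data.IORef

-- Strict, unpacked IORef fields: the constructor holds the two MutVar#
-- pointers directly, but each link still goes DLL ~> MutVar# ~> DLL.
data DLL
  = DLL {-# UNPACK #-} !(IORef DLL) {-# UNPACK #-} !(IORef DLL)
  | Nil

-- Allocate a detached node whose previous and next links are both Nil.
newNode :: IO DLL
newNode = DLL <$> newIORef Nil <*> newIORef Nil

-- Make b the successor of a, and a the predecessor of b.
link :: DLL -> DLL -> IO ()
link a@(DLL _ aNext) b@(DLL bPrev _) = do
  writeIORef aNext b
  writeIORef bPrev a
link _ _ = pure ()

-- Follow the forward link (second field).
next :: DLL -> IO DLL
next (DLL _ n) = readIORef n
next Nil = pure Nil

-- Follow the backward link (first field).
prev :: DLL -> IO DLL
prev (DLL p _) = readIORef p
prev Nil = pure Nil

-- Count the nodes reachable by walking forward until Nil.
lengthFrom :: DLL -> IO Int
lengthFrom Nil = pure 0
lengthFrom node = (1 +) <$> (next node >>= lengthFrom)

main :: IO ()
main = do
  a <- newNode
  b <- newNode
  c <- newNode
  link a b
  link b c
  lengthFrom a >>= print
```

Whether GHC actually unpacks the IORef fields depends on optimisation settings, so the pointer counts quoted in the thread should be checked against the generated Core rather than taken from this sketch.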
>> >>>>>>>>>
>> >>>>>>>>> Making Progress
>> >>>>>>>>> ----------------------
>> >>>>>>>>>
>> >>>>>>>>> I have been working on a number of data structures where the
>> >>>>>>>>> indirection of going from something in * out to an object in #
>> >>>>>>>>> which contains the real pointer to my target and coming back
>> >>>>>>>>> effectively doubles my runtime.
>> >>>>>>>>>
>> >>>>>>>>> We go out to the MutVar# because we are allowed to put the MutVar#
>> >>>>>>>>> onto the mutable list when we dirty it. There is a well defined
>> >>>>>>>>> write-barrier.
>> >>>>>>>>>
>> >>>>>>>>> I could change out the representation to use
>> >>>>>>>>>
>> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil
>> >>>>>>>>>
>> >>>>>>>>> I can just store two pointers in the MutableArray# every time, but
>> >>>>>>>>> this doesn't help _much_ directly. It has reduced the amount of
>> >>>>>>>>> distinct addresses in memory I touch on a walk of the DLL from 3
>> >>>>>>>>> per object to 2.
>> >>>>>>>>>
>> >>>>>>>>> I still have to go out to the heap from my DLL and get to the array
>> >>>>>>>>> object and then chase it to the next DLL and chase that to the next
>> >>>>>>>>> array. I do get my two pointers together in memory though.
>> >>>>>>>>> I'm paying for a card marking table as well, which I don't
>> >>>>>>>>> particularly need with just two pointers, but we can shed that with
>> >>>>>>>>> the "SmallMutableArray#" machinery added back in 7.10, which is
>> >>>>>>>>> just the old array code as a new data type, which can speed things
>> >>>>>>>>> up a bit when you don't have very big arrays:
>> >>>>>>>>>
>> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil
>> >>>>>>>>>
>> >>>>>>>>> But what if I wanted my object itself to live in # and have two
>> >>>>>>>>> mutable fields and be able to share the same write barrier?
>> >>>>>>>>>
>> >>>>>>>>> An ArrayArray# points directly to other unlifted array types. What
>> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with the
>> >>>>>>>>> impedance mismatch between the imperative world and Haskell, and
>> >>>>>>>>> then just let the ArrayArray#'s hold other ArrayArray#'s?
>> >>>>>>>>>
>> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld)
>> >>>>>>>>>
>> >>>>>>>>> now I need to make up a new Nil, which I can just make be a special
>> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even abuse
>> >>>>>>>>> pattern synonyms. Alternately I can exploit the internals further
>> >>>>>>>>> to make this cheaper.
>> >>>>>>>>>
>> >>>>>>>>> Then I can use the readMutableArrayArray# and
>> >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding and
>> >>>>>>>>> next entry in the linked list.
>> >>>>>>>>>
>> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a
>> >>>>>>>>> strict world, and everything there lives in #.
>> >>>>>>>>>
>> >>>>>>>>> next :: DLL -> IO DLL
>> >>>>>>>>> -- slot 1 is assumed here to hold the 'next' pointer
>> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArrayArray# m 1# s of
>> >>>>>>>>>   (# s', n #) -> (# s', DLL n #)
>> >>>>>>>>>
>> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to
>> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty easily
>> >>>>>>>>> when they are known strict and you chain operations of this sort!
>> >>>>>>>>>
>> >>>>>>>>> Cleaning it Up
>> >>>>>>>>> ------------------
>> >>>>>>>>>
>> >>>>>>>>> Now I have one outermost indirection pointing to an array that
>> >>>>>>>>> points directly to other arrays.
>> >>>>>>>>>
>> >>>>>>>>> I'm stuck paying for a card marking table per object, but I can fix
>> >>>>>>>>> that by duplicating the code for MutableArrayArray# and using a
>> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a
>> >>>>>>>>> mixture of SmallMutableArray# fields and normal ones in the data
>> >>>>>>>>> structure. Operationally, I can even do so by just unsafeCoercing
>> >>>>>>>>> the existing SmallMutableArray# primitives to change the kind of
>> >>>>>>>>> one of the arguments it takes.
>> >>>>>>>>>
>> >>>>>>>>> This is almost ideal, but not quite. I often have fields that would
>> >>>>>>>>> be best left unboxed.
>> >>>>>>>>>
>> >>>>>>>>> data DLLInt = DLLInt !Int !(IORef DLLInt) !(IORef DLLInt) | Nil
>> >>>>>>>>>
>> >>>>>>>>> was able to unpack the Int, but we lost that. We can currently at
>> >>>>>>>>> best point one of the entries of the SmallMutableArray# at a boxed
>> >>>>>>>>> or at a MutableByteArray# for all of our misc.
>> >>>>>>>>> data and shove the int in question in there.
>> >>>>>>>>>
>> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to
>> >>>>>>>>> store masks and administrivia as I walk down the tree. Having to go
>> >>>>>>>>> off to the side costs me the entire win from avoiding the first
>> >>>>>>>>> pointer chase.
>> >>>>>>>>>
>> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could
>> >>>>>>>>> construct that had n words with unsafe access and m pointers to
>> >>>>>>>>> other heap objects, one that could put itself on the mutable list
>> >>>>>>>>> when any of those pointers changed then I could shed this last
>> >>>>>>>>> factor of two in all circumstances.
>> >>>>>>>>>
>> >>>>>>>>> Prototype
>> >>>>>>>>> -------------
>> >>>>>>>>>
>> >>>>>>>>> Over the last few days I've put together a small prototype
>> >>>>>>>>> implementation with a few non-trivial imperative data structures
>> >>>>>>>>> for things like Tarjan's link-cut trees, the list labeling problem
>> >>>>>>>>> and order-maintenance.
>> >>>>>>>>>
>> >>>>>>>>> https://github.com/ekmett/structs
>> >>>>>>>>>
>> >>>>>>>>> Notable bits:
>> >>>>>>>>>
>> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of link-cut
>> >>>>>>>>> trees in this style.
>> >>>>>>>>>
>> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that make
>> >>>>>>>>> it go fast.
>> >>>>>>>>>
>> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost all
>> >>>>>>>>> the references to the LinkCut or Object data constructor get
>> >>>>>>>>> optimized away, and we're left with beautiful strict code directly
>> >>>>>>>>> mutating our underlying representation.
>> >>>>>>>>>
>> >>>>>>>>> At the very least I'll take this email and turn it into a short
>> >>>>>>>>> article.
>> >>>>>>>>>
>> >>>>>>>>> -Edward
>> >>>>>>>>>
>> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones wrote:
>> >>>>>>>>>
>> >>>>>>>>> Just to say that I have no idea what is going on in this thread.
>> >>>>>>>>> What is ArrayArray? What is the issue in general? Is there a
>> >>>>>>>>> ticket? Is there a wiki page?
>> >>>>>>>>>
>> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be a good
>> >>>>>>>>> thing.
>> >>>>>>>>>
>> >>>>>>>>> Simon
>> >>>>>>>>>
>> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of
>> >>>>>>>>> Edward Kmett
>> >>>>>>>>> Sent: 21 August 2015 05:25
>> >>>>>>>>> To: Manuel M T Chakravarty
>> >>>>>>>>> Cc: Simon Marlow; ghc-devs
>> >>>>>>>>> Subject: Re: ArrayArrays
>> >>>>>>>>>
>> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be
>> >>>>>>>>> very handy as well.
>> >>>>>>>>>
>> >>>>>>>>> Consider right now if I have something like an order-maintenance
>> >>>>>>>>> structure I have:
>> >>>>>>>>>
>> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-#
>> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s
>> >>>>>>>>> (Upper s))
>> >>>>>>>>>
>> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-#
>> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s
>> >>>>>>>>> (Lower s)) {-# UNPACK #-} !(MutVar s (Lower s))
>> >>>>>>>>>
>> >>>>>>>>> The former contains, logically, a mutable integer and two pointers,
>> >>>>>>>>> one for forward and one for backwards. The latter is basically the
>> >>>>>>>>> same thing with a mutable reference up pointing at the structure
>> >>>>>>>>> above.
>> >>>>>>>>>
>> >>>>>>>>> On the heap this is an object that points to a structure for the
>> >>>>>>>>> bytearray, and points to another structure for each mutvar which
>> >>>>>>>>> each point to the other 'Upper' structure. So there is a level of
>> >>>>>>>>> indirection smeared over everything.
>> >>>>>>>>>
>> >>>>>>>>> So this is a pair of doubly linked lists with an upward link from
>> >>>>>>>>> the structure below to the structure above.
>> >>>>>>>>>
>> >>>>>>>>> Converted into ArrayArray#s I'd get
>> >>>>>>>>>
>> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s)
>> >>>>>>>>>
>> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and the
>> >>>>>>>>> next 2 slots pointing to the previous and next objects, represented
>> >>>>>>>>> just as their MutableArrayArray#s.
>> >>>>>>>>> I can use sameMutableArrayArray# on these for object identity,
>> >>>>>>>>> which lets me check for the ends of the lists by tying things back
>> >>>>>>>>> on themselves.
>> >>>>>>>>>
>> >>>>>>>>> and below that
>> >>>>>>>>>
>> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s)
>> >>>>>>>>>
>> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing up to an
>> >>>>>>>>> upper structure.
>> >>>>>>>>>
>> >>>>>>>>> I can then write a handful of combinators for getting out the slots
>> >>>>>>>>> in question, while it has gained a level of indirection between the
>> >>>>>>>>> wrapper to put it in * and the MutableArrayArray# s in #, that one
>> >>>>>>>>> can be basically erased by ghc.
>> >>>>>>>>>
>> >>>>>>>>> Unlike before I don't have several separate objects on the heap for
>> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for the
>> >>>>>>>>> object itself, and the MutableByteArray# that it references to
>> >>>>>>>>> carry around the mutable int.
>> >>>>>>>>>
>> >>>>>>>>> The only pain points are
>> >>>>>>>>>
>> >>>>>>>>> 1.) the aforementioned limitation that currently prevents me from
>> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array into an
>> >>>>>>>>> ArrayArray leaving me in a little ghetto disconnected from the rest
>> >>>>>>>>> of Haskell,
>> >>>>>>>>>
>> >>>>>>>>> and
>> >>>>>>>>>
>> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the
>> >>>>>>>>> card marking overhead. These objects are all small, 3-4 pointers
>> >>>>>>>>> wide. Card marking doesn't help.
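
The slot-accessor combinators and the identity-based end-of-list check described above might look roughly like the following. This is a sketch under stated assumptions, not the code from the thread: because the thread notes there is no SmallArrayArray, it substitutes the boxed SmallMutableArray# primops from GHC.Exts, so every slot still holds a lifted Node wrapper. The names Node, getSlot, setSlot, link and sameNode, and the slot layout (0 = previous, 1 = next), are hypothetical and chosen only for illustration.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.IO (IO (..))

-- A thin lifted wrapper around a two-slot mutable array.
-- Assumed layout: slot 0 = previous node, slot 1 = next node.
data Node = Node (SmallMutableArray# RealWorld Node)

-- Allocate a node whose links both point back at itself, so the end
-- of a list is detected by identity rather than by a Nil constructor.
newNode :: IO Node
newNode = IO $ \s0 ->
  case newSmallArray# 2# (error "uninitialised slot") s0 of
    (# s1, m #) -> case writeSmallArray# m 0# (Node m) s1 of
      s2 -> case writeSmallArray# m 1# (Node m) s2 of
        s3 -> (# s3, Node m #)

-- Read a slot. No bounds check is performed at this level.
getSlot :: Int -> Node -> IO Node
getSlot (I# i) (Node m) = IO $ \s -> readSmallArray# m i s

-- Overwrite a slot. Again, no bounds check.
setSlot :: Int -> Node -> Node -> IO ()
setSlot (I# i) (Node m) x = IO $ \s ->
  case writeSmallArray# m i x s of
    s' -> (# s', () #)

prev, next :: Node -> IO Node
prev = getSlot 0
next = getSlot 1

-- Object identity on the underlying mutable arrays.
sameNode :: Node -> Node -> Bool
sameNode (Node a) (Node b) = isTrue# (sameSmallMutableArray# a b)

-- Make b the successor of a, and a the predecessor of b.
link :: Node -> Node -> IO ()
link a b = setSlot 1 a b >> setSlot 0 b a

main :: IO ()
main = do
  a <- newNode
  b <- newNode
  link a b
  n <- next a
  e <- next b
  -- n should be b; e should be b again, since b is the last node
  print (sameNode n b, sameNode e b, sameNode a b)
```

Keeping the unchecked primops behind getSlot and setSlot confines the unsafe indexing to two small functions, in the spirit of the per-field wrappers mentioned elsewhere in the thread.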
>> >>>>>>>>>
>> >>>>>>>>> Alternately I could just try to do really evil things and convert
>> >>>>>>>>> the whole mess to SmallArrays and then figure out how to
>> >>>>>>>>> unsafeCoerce my way to glory, stuffing the #'d references to the
>> >>>>>>>>> other arrays directly into the SmallArray as slots, removing the
>> >>>>>>>>> limitation we see here by aping the MutableArrayArray# s API, but
>> >>>>>>>>> that gets really really dangerous!
>> >>>>>>>>>
>> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the altar
>> >>>>>>>>> of speed here, but I'd like to be able to let the GC move them and
>> >>>>>>>>> collect them which rules out simpler Ptr and Addr based solutions.
>> >>>>>>>>>
>> >>>>>>>>> -Edward
>> >>>>>>>>>
>> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty wrote:
>> >>>>>>>>>
>> >>>>>>>>> That's an interesting idea.
>> >>>>>>>>>
>> >>>>>>>>> Manuel
>> >>>>>>>>>
>> >>>>>>>>> > Edward Kmett :
>> >>>>>>>>> >
>> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# and
>> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the
>> >>>>>>>>> > ArrayArray# entries are all directly unlifted avoiding a level of
>> >>>>>>>>> > indirection for the containing structure is amazing, but I can
>> >>>>>>>>> > only currently use it if my leaf level data can be 100% unboxed
>> >>>>>>>>> > and distributed among ByteArray#s. It'd be nice to be able to
>> >>>>>>>>> > have the ability to put SmallArray# a stuff down at the leaves to
>> >>>>>>>>> > hold lifted contents.
>> >>>>>>>>> > >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to >> >>>>>>>>> > access >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do that >> >>>>>>>>> > if i tried to >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a >> >>>>>>>>> > ByteArray# >> >>>>>>>>> > anyways, so it isn't like there is a safety story preventing >> >>>>>>>>> > this. >> >>>>>>>>> > >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I >> >>>>>>>>> > could shoehorn a >> >>>>>>>>> > number of them into ArrayArrays if this worked. >> >>>>>>>>> > >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary >> >>>>>>>>> > indirection compared to c/java and this could reduce that pain >> >>>>>>>>> > to just 1 >> >>>>>>>>> > level of unnecessary indirection. >> >>>>>>>>> > >> >>>>>>>>> > -Edward >> >>>>>>>>> >> >>>>>>>>> > _______________________________________________ >> >>>>>>>>> > ghc-devs mailing list >> >>>>>>>>> > ghc-devs at haskell.org >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> _______________________________________________ >> >>>>>>>>> ghc-devs mailing list >> >>>>>>>>> ghc-devs at haskell.org >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >>>>>>> >> >>>>>>> >> >>>>> >> >>> >> >> >> > >> > >> > _______________________________________________ >> > ghc-devs mailing list >> > ghc-devs at haskell.org >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> > > > From hvr at gnu.org Sat Aug 29 06:56:53 2015 From: hvr at gnu.org (Herbert Valerio Riedel) Date: Sat, 29 Aug 2015 08:56:53 +0200 Subject: Please help beta test "no-reinstall Cabal" (was: Cabal and simultaneous installations of the same package) In-Reply-To: 
<68326f3ebbd943768effe6b0f2ff522c@DB4PR30MB030.064d.mgd.msft.net> (Simon Peyton Jones's message of "Mon, 23 Mar 2015 08:45:48 +0000") References: <68326f3ebbd943768effe6b0f2ff522c@DB4PR30MB030.064d.mgd.msft.net> Message-ID: <87fv32lgzu.fsf@gmail.com> Good news, everyone! ...you may be interested to know this has finally come to fruition (just in time for HIW): http://blog.ezyang.com/2015/08/help-us-beta-test-no-reinstall-cabal/ Cheers, hvr On 2015-03-23 at 09:45:48 +0100, Simon Peyton Jones wrote: > Dear Cabal developers > > You'll probably have seen the thread about the Haskell Platform. > > Among other things, this point arose: > > | Another thing we should fix is the (now false) impression that HP gets in > | the way of installing other packages and versions due to cabal hell. > > People mean different things by "cabal hell", but the inability to > simultaneously install multiple versions of the same package, > compiled against different dependencies > is certainly one of them, and I think it is the one that Yitzchak is referring to here. > > But some time now GHC has allowed multiple versions of the same > package (compiled against different dependencies) to be installed > simultaneously. So all we need to do is to fix Cabal to allow it too, > and thereby kill of a huge class of cabal-hell problems at one blow. > > But time has passed and it hasn't happened. Is this because I'm misunderstanding? Or because it is harder than I think? Or because there are much bigger problems? Or because there is insufficient effort available? Or what? > > Unless I'm way off beam, this "multiple installations of the same package" thing has been a huge pain forever, and the solution is within our grasp. What's stopping us grasping it? 
From eir at cis.upenn.edu Sat Aug 29 13:54:24 2015
From: eir at cis.upenn.edu (Richard Eisenberg)
Date: Sat, 29 Aug 2015 09:54:24 -0400
Subject: Planning for the 7.12 release
In-Reply-To: <87y4gvtj1i.fsf@smart-cactus.org>
References: <87r3mo68t0.fsf@smart-cactus.org> <87y4gvtj1i.fsf@smart-cactus.org>
Message-ID: <913B26ED-A2C0-436E-8106-1260A0D1E54C@cis.upenn.edu>

On Aug 28, 2015, at 1:33 PM, Ben Gamari wrote:
> I
> half-jokingly suggested that 8.0 should only come with Phase 2 of
> Richard's Dependent Haskell work, but I'm willing to settle for
> merely kind equality.
>
> I think doing a major bump would be a great idea.

Drat! I, too, was hoping to herald in 8.0 with -XDependentTypes, but I
guess I'm a little late. Don't hold up this change for me, though. :)

Richard

From spam at scientician.net Sat Aug 29 14:59:42 2015
From: spam at scientician.net (Bardur Arantsson)
Date: Sat, 29 Aug 2015 16:59:42 +0200
Subject: Planning for the 7.12 release
In-Reply-To: <87y4gvtj1i.fsf@smart-cactus.org>
References: <87r3mo68t0.fsf@smart-cactus.org> <87y4gvtj1i.fsf@smart-cactus.org>
Message-ID:

On 08/28/2015 07:33 PM, Ben Gamari wrote:
> Simon Peyton Jones writes:
>
>> Actually that's a good idea.
>>
>> Simon
>>
>>
>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Greg Weber
>> Sent: 28 August 2015 16:43
>> To: Ben Gamari
>> Cc: GHC developers
>> Subject: Re: Planning for the 7.12 release
>>
>> Can we call this GHC 8.0 instead of 7.12 ?
>> Overloaded record fields and backtraces are a huge missing piece to
>> Haskell. It would be nice to have the bump to celebrate this occasion
>> and say that Haskell 8 is "ready". I have had a hard time seriously
>> recommending Haskell due to those last missing features. Now I should
>> be able to say without reservation: "use Haskell > 8; it is great!"
>>
> I was discussing this very matter yesterday with a few folks. I think we
> certainly have enough features in this release to do a major bump.
I > half-jokingly suggested that 8.0 should only come with Phase 2 of > Richard's Dependent Haskell work, but I'm willing to settle for > merely kind equality. > That could be 9.0...? Let's embrace the Firefox/Chrome philosophy about versioning :) I'm very excited about the feature list for 7.12 and I agree that it's almost big enough for a new "major" release. (Thanks to all the people who've worked on it, btw!) Cheers, From rrnewton at gmail.com Sat Aug 29 14:59:45 2015 From: rrnewton at gmail.com (Ryan Newton) Date: Sat, 29 Aug 2015 07:59:45 -0700 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> Message-ID: I'd also love to meet up at ICFP and discuss this. I think the array primops plus a TH layer that lets (ab)use them many times without too much marginal cost sounds great. And I'd like to learn how we could be either early users of, or help with, this infrastructure. CC'ing in Ryan Scot and Omer Agacan who may also be interested in dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student who is currently working on concurrent data structures in Haskell, but will not be at ICFP. On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote: > I completely agree. I would love to spend some time during ICFP and > friends talking about what it could look like. My small array for STM > changes for the RTS can be seen here [1]. It is on a branch somewhere > between 7.8 and 7.10 and includes irrelevant STM bits and some > confusing naming choices (sorry), but should cover all the details > needed to implement it for a non-STM context. The biggest surprise > for me was following small array too closely and having a word/byte > offset miss-match [2]. 
> > [1]: > https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut > [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 > > Ryan > > On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett wrote: > > I'd love to have that last 10%, but its a lot of work to get there and > more > > importantly I don't know quite what it should look like. > > > > On the other hand, I do have a pretty good idea of how the primitives > above > > could be banged out and tested in a long evening, well in time for 7.12. > And > > as noted earlier, those remain useful even if a nicer typed version with > an > > extra level of indirection to the sizes is built up after. > > > > The rest sounds like a good graduate student project for someone who has > > graduate students lying around. Maybe somebody at Indiana University who > has > > an interest in type theory and parallelism can find us one. =) > > > > -Edward > > > > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates wrote: > >> > >> I think from my perspective, the motivation for getting the type > >> checker involved is primarily bringing this to the level where users > >> could be expected to build these structures. it is reasonable to > >> think that there are people who want to use STM (a context with > >> mutation already) to implement a straight forward data structure that > >> avoids extra indirection penalty. There should be some places where > >> knowing that things are field accesses rather then array indexing > >> could be helpful, but I think GHC is good right now about handling > >> constant offsets. In my code I don't do any bounds checking as I know > >> I will only be accessing my arrays with constant indexes. I make > >> wrappers for each field access and leave all the unsafe stuff in > >> there. When things go wrong though, the compiler is no help. Maybe > >> template Haskell that generates the appropriate wrappers is the right > >> direction to go. 
> >> There is another benefit for me when working with these as arrays in > >> that it is quite simple and direct (given the hoops already jumped > >> through) to play with alignment. I can ensure two pointers are never > >> on the same cache-line by just spacing things out in the array. > >> > >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett wrote: > >> > They just segfault at this level. ;) > >> > > >> > Sent from my iPhone > >> > > >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton wrote: > >> > > >> > You presumably also save a bounds check on reads by hard-coding the > >> > sizes? > >> > > >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett > wrote: > >> >> > >> >> Also there are 4 different "things" here, basically depending on two > >> >> independent questions: > >> >> > >> >> a.) if you want to shove the sizes into the info table, and > >> >> b.) if you want cardmarking. > >> >> > >> >> Versions with/without cardmarking for different sizes can be done > >> >> pretty > >> >> easily, but as noted, the infotable variants are pretty invasive. > >> >> > >> >> -Edward > >> >> > >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett > wrote: > >> >>> > >> >>> Well, on the plus side you'd save 16 bytes per object, which adds up > >> >>> if > >> >>> they were small enough and there are enough of them. You get a bit > >> >>> better > >> >>> locality of reference in terms of what fits in the first cache line > of > >> >>> them. > >> >>> > >> >>> -Edward > >> >>> > >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton > >> >>> wrote: > >> >>>> > >> >>>> Yes. And for the short term I can imagine places we will settle > with > >> >>>> arrays even if it means tracking lengths unnecessarily and > >> >>>> unsafeCoercing > >> >>>> pointers whose types don't actually match their siblings. > >> >>>> > >> >>>> Is there anything to recommend the hacks mentioned for fixed sized > >> >>>> array > >> >>>> objects *other* than using them to fake structs? 
(Much to > >> >>>> derecommend, as > >> >>>> you mentioned!) > >> >>>> > >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett > >> >>>> wrote: > >> >>>>> > >> >>>>> I think both are useful, but the one you suggest requires a lot > more > >> >>>>> plumbing and doesn't subsume all of the usecases of the other. > >> >>>>> > >> >>>>> -Edward > >> >>>>> > >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton > >> >>>>> wrote: > >> >>>>>> > >> >>>>>> So that primitive is an array like thing (Same pointed type, > >> >>>>>> unbounded > >> >>>>>> length) with extra payload. > >> >>>>>> > >> >>>>>> I can see how we can do without structs if we have arrays, > >> >>>>>> especially > >> >>>>>> with the extra payload at front. But wouldn't the general > solution > >> >>>>>> for > >> >>>>>> structs be one that that allows new user data type defs for # > >> >>>>>> types? > >> >>>>>> > >> >>>>>> > >> >>>>>> > >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett > >> >>>>>> wrote: > >> >>>>>>> > >> >>>>>>> Some form of MutableStruct# with a known number of words and a > >> >>>>>>> known > >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting > >> >>>>>>> above, but > >> >>>>>>> where the word counts were stored in the objects themselves. > >> >>>>>>> > >> >>>>>>> Given that it'd have a couple of words for those counts it'd > >> >>>>>>> likely > >> >>>>>>> want to be something we build in addition to MutVar# rather > than a > >> >>>>>>> replacement. > >> >>>>>>> > >> >>>>>>> On the other hand, if we had to fix those numbers and build info > >> >>>>>>> tables that knew them, and typechecker support, for instance, > it'd > >> >>>>>>> get > >> >>>>>>> rather invasive. 
> >> >>>>>>> > >> >>>>>>> Also, a number of things that we can do with the 'sized' > versions > >> >>>>>>> above, like working with evil unsized c-style arrays directly > >> >>>>>>> inline at the > >> >>>>>>> end of the structure cease to be possible, so it isn't even a > pure > >> >>>>>>> win if we > >> >>>>>>> did the engineering effort. > >> >>>>>>> > >> >>>>>>> I think 90% of the needs I have are covered just by adding the > one > >> >>>>>>> primitive. The last 10% gets pretty invasive. > >> >>>>>>> > >> >>>>>>> -Edward > >> >>>>>>> > >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton < > rrnewton at gmail.com> > >> >>>>>>> wrote: > >> >>>>>>>> > >> >>>>>>>> I like the possibility of a general solution for mutable > structs > >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's > hard. > >> >>>>>>>> > >> >>>>>>>> So, we can't unpack MutVar into constructors because of object > >> >>>>>>>> identity problems. But what about directly supporting an > >> >>>>>>>> extensible set of > >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) > >> >>>>>>>> MutVar#? That > >> >>>>>>>> may be too much work, but is it problematic otherwise? > >> >>>>>>>> > >> >>>>>>>> Needless to say, this is also critical if we ever want best in > >> >>>>>>>> class > >> >>>>>>>> lockfree mutable structures, just like their Stm and sequential > >> >>>>>>>> counterparts. > >> >>>>>>>> > >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones > >> >>>>>>>> wrote: > >> >>>>>>>>> > >> >>>>>>>>> At the very least I'll take this email and turn it into a > short > >> >>>>>>>>> article. > >> >>>>>>>>> > >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and > >> >>>>>>>>> maybe > >> >>>>>>>>> make a ticket for it. 
> >> >>>>>>>>>
> >> >>>>>>>>> Thanks
> >> >>>>>>>>>
> >> >>>>>>>>> Simon
> >> >>>>>>>>>
> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com]
> >> >>>>>>>>> Sent: 27 August 2015 16:54
> >> >>>>>>>>> To: Simon Peyton Jones
> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs
> >> >>>>>>>>> Subject: Re: ArrayArrays
> >> >>>>>>>>>
> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It
> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s.
> >> >>>>>>>>>
> >> >>>>>>>>> While those live in #, they are garbage collected objects, so
> >> >>>>>>>>> this all lives on the heap.
> >> >>>>>>>>>
> >> >>>>>>>>> They were added to make some of the DPH stuff fast when it has
> >> >>>>>>>>> to deal with nested arrays.
> >> >>>>>>>>>
> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better thing.
> >> >>>>>>>>>
> >> >>>>>>>>> The Problem
> >> >>>>>>>>> -----------------
> >> >>>>>>>>>
> >> >>>>>>>>> Consider the scenario where you write a classic doubly-linked
> >> >>>>>>>>> list in Haskell.
> >> >>>>>>>>>
> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL))
> >> >>>>>>>>>
> >> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers
> >> >>>>>>>>> on the heap.
> >> >>>>>>>>>
> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~>
> >> >>>>>>>>> Maybe DLL ~> DLL
> >> >>>>>>>>>
> >> >>>>>>>>> That is 3 levels of indirection.
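For anyone following along at home, the naive representation quoted above can be written out as a small compilable sketch, with both `IORef (Maybe DLL)` fields fully parenthesized. It only uses `Data.IORef` from base. The helper names `newNode`, `linkAfter`, `next` and `lengthFrom` are made up for this illustration and do not appear anywhere in the thread.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Naive doubly-linked list node: a "previous" link and a "next" link,
-- each a boxed mutable reference holding an optional neighbour.
data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL))

-- Hypothetical helper: allocate a node with no neighbours.
newNode :: IO DLL
newNode = DLL <$> newIORef Nothing <*> newIORef Nothing

-- Hypothetical helper: make b the successor of a, and a the predecessor of b.
linkAfter :: DLL -> DLL -> IO ()
linkAfter a@(DLL _ aNext) b@(DLL bPrev _) = do
  writeIORef aNext (Just b)
  writeIORef bPrev (Just a)

-- Follow one "next" link. This is the hop the quoted text describes as
-- DLL ~> IORef ~> MutVar# ~> Maybe DLL ~> DLL.
next :: DLL -> IO (Maybe DLL)
next (DLL _ n) = readIORef n

-- Count the nodes reachable from a node by following "next" links.
lengthFrom :: DLL -> IO Int
lengthFrom d = do
  mn <- next d
  case mn of
    Nothing -> return 1
    Just d' -> (1 +) <$> lengthFrom d'
```

Loading this in GHCi, allocating three nodes and linking them with `linkAfter` lets you walk the chain with `next`; every step reads through the `IORef` and then the `Maybe` before reaching the neighbouring node.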
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We can trim one by simply unpacking the IORef with > >> >>>>>>>>> -funbox-strict-fields or UNPACK > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and > >> >>>>>>>>> worsening our representation. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> but now we're still stuck with a level of indirection > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> This means that every operation we perform on this structure > >> >>>>>>>>> will > >> >>>>>>>>> be about half of the speed of an implementation in most other > >> >>>>>>>>> languages > >> >>>>>>>>> assuming we're memory bound on loading things into cache! > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Making Progress > >> >>>>>>>>> > >> >>>>>>>>> ---------------------- > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I have been working on a number of data structures where the > >> >>>>>>>>> indirection of going from something in * out to an object in # > >> >>>>>>>>> which > >> >>>>>>>>> contains the real pointer to my target and coming back > >> >>>>>>>>> effectively doubles > >> >>>>>>>>> my runtime. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the > >> >>>>>>>>> MutVar# > >> >>>>>>>>> onto the mutable list when we dirty it. There is a well > defined > >> >>>>>>>>> write-barrier. 
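The `Nil`-terminated variant described a few paragraphs up can be sketched the same way. This is only an illustration under stated assumptions: the strictness annotations and `UNPACK` pragmas follow the suggestion in the quoted text, GHC is only expected to act on them when optimisation is enabled, and the helper names (`newNode`, `linkAfter`, `next`, `lengthFrom`) are invented for the example.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Nil replaces Maybe, and the strict, unpacked fields ask GHC to store the
-- underlying MutVar# of each IORef directly in the DLL constructor.
data DLL
  = DLL {-# UNPACK #-} !(IORef DLL) {-# UNPACK #-} !(IORef DLL)
  | Nil

-- Hypothetical helper: allocate a node whose neighbours are both Nil.
newNode :: IO DLL
newNode = DLL <$> newIORef Nil <*> newIORef Nil

-- Hypothetical helper: make b the successor of a, and a the predecessor of b.
-- Linking to or from Nil is treated as a no-op in this sketch.
linkAfter :: DLL -> DLL -> IO ()
linkAfter a@(DLL _ aNext) b@(DLL bPrev _) = do
  writeIORef aNext b
  writeIORef bPrev a
linkAfter _ _ = return ()

-- Follow one "next" link: DLL ~> MutVar# ~> DLL, as in the quoted text.
next :: DLL -> IO DLL
next (DLL _ n) = readIORef n
next Nil       = return Nil

-- Count the nodes from a given node up to (not including) Nil.
lengthFrom :: DLL -> IO Int
lengthFrom Nil = return 0
lengthFrom d   = do
  d'   <- next d
  rest <- lengthFrom d'
  return (1 + rest)
```

Compared with the `Maybe`-based version, the end-of-list check becomes a constructor match on the node itself, which is the representation trade-off the quoted message calls "worsening our representation".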
> >> >>>>>>>>>
> >> >>>>>>>>> I could change out the representation to use
> >> >>>>>>>>>
> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil
> >> >>>>>>>>>
> >> >>>>>>>>> I can just store two pointers in the MutableArray# every time,
> >> >>>>>>>>> but this doesn't help _much_ directly. It has reduced the number
> >> >>>>>>>>> of distinct addresses in memory I touch on a walk of the DLL
> >> >>>>>>>>> from 3 per object to 2.
> >> >>>>>>>>>
> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to the
> >> >>>>>>>>> array object and then chase it to the next DLL and chase that to
> >> >>>>>>>>> the next array. I do get my two pointers together in memory
> >> >>>>>>>>> though. I'm paying for a card marking table as well, which I
> >> >>>>>>>>> don't particularly need with just two pointers, but we can shed
> >> >>>>>>>>> that with the "SmallMutableArray#" machinery added back in 7.10,
> >> >>>>>>>>> which is just the old array code as a new data type, which can
> >> >>>>>>>>> speed things up a bit when you don't have very big arrays:
> >> >>>>>>>>>
> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil
> >> >>>>>>>>>
> >> >>>>>>>>> But what if I wanted my object itself to live in # and have two
> >> >>>>>>>>> mutable fields and be able to share the same write barrier?
> >> >>>>>>>>>
> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types.
> >> >>>>>>>>> What if we have one # -> * wrapper on the outside to deal with
> >> >>>>>>>>> the impedance mismatch between the imperative world and Haskell,
> >> >>>>>>>>> and then just let the ArrayArray#'s hold other ArrayArray#'s?
> >> >>>>>>>>>
> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld)
> >> >>>>>>>>>
> >> >>>>>>>>> Now I need to make up a new Nil, which I can just make be a
> >> >>>>>>>>> special MutableArrayArray# I allocate on program startup. I can
> >> >>>>>>>>> even abuse pattern synonyms. Alternately I can exploit the
> >> >>>>>>>>> internals further to make this cheaper.
> >> >>>>>>>>>
> >> >>>>>>>>> Then I can use the readMutableArrayArray# and
> >> >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding
> >> >>>>>>>>> and next entry in the linked list.
> >> >>>>>>>>>
> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a
> >> >>>>>>>>> strict world, and everything there lives in #.
> >> >>>>>>>>>
> >> >>>>>>>>> next :: DLL -> IO DLL
> >> >>>>>>>>>
> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of
> >> >>>>>>>>>
> >> >>>>>>>>> (# s', n #) -> (# s', DLL n #)
> >> >>>>>>>>>
> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to
> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty
> >> >>>>>>>>> easily when they are known strict and you chain operations of
> >> >>>>>>>>> this sort!
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Cleaning it Up > >> >>>>>>>>> > >> >>>>>>>>> ------------------ > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Now I have one outermost indirection pointing to an array that > >> >>>>>>>>> points directly to other arrays. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm stuck paying for a card marking table per object, but I > can > >> >>>>>>>>> fix > >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and using > a > >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a > >> >>>>>>>>> mixture of > >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data > structure. > >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the > >> >>>>>>>>> existing > >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one of the > >> >>>>>>>>> arguments it > >> >>>>>>>>> takes. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> This is almost ideal, but not quite. I often have fields that > >> >>>>>>>>> would > >> >>>>>>>>> be best left unboxed. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> was able to unpack the Int, but we lost that. We can currently > >> >>>>>>>>> at > >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at a > >> >>>>>>>>> boxed or at a > >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the int > in > >> >>>>>>>>> question in > >> >>>>>>>>> there. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to > >> >>>>>>>>> store masks and administrivia as I walk down the tree. Having > to > >> >>>>>>>>> go off to > >> >>>>>>>>> the side costs me the entire win from avoiding the first > pointer > >> >>>>>>>>> chase. 
> >> >>>>>>>>>
> >> >>>>>>>>> But if, like Ryan suggested, we had a heap object we could
> >> >>>>>>>>> construct that had n words with unsafe access and m pointers to
> >> >>>>>>>>> other heap objects, one that could put itself on the mutable
> >> >>>>>>>>> list when any of those pointers changed, then I could shed this
> >> >>>>>>>>> last factor of two in all circumstances.
> >> >>>>>>>>>
> >> >>>>>>>>> Prototype
> >> >>>>>>>>> -------------
> >> >>>>>>>>>
> >> >>>>>>>>> Over the last few days I've put together a small prototype
> >> >>>>>>>>> implementation with a few non-trivial imperative data structures
> >> >>>>>>>>> for things like Tarjan's link-cut trees, the list labeling
> >> >>>>>>>>> problem and order-maintenance.
> >> >>>>>>>>>
> >> >>>>>>>>> https://github.com/ekmett/structs
> >> >>>>>>>>>
> >> >>>>>>>>> Notable bits:
> >> >>>>>>>>>
> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of
> >> >>>>>>>>> link-cut trees in this style.
> >> >>>>>>>>>
> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that
> >> >>>>>>>>> make it go fast.
> >> >>>>>>>>>
> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost
> >> >>>>>>>>> all the references to the LinkCut or Object data constructor get
> >> >>>>>>>>> optimized away, and we're left with beautiful strict code
> >> >>>>>>>>> directly mutating our underlying representation.
> >> >>>>>>>>>
> >> >>>>>>>>> At the very least I'll take this email and turn it into a short
> >> >>>>>>>>> article.
> >> >>>>>>>>>
> >> >>>>>>>>> -Edward
> >> >>>>>>>>>
> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones
> >> >>>>>>>>> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>> Just to say that I have no idea what is going on in this thread.
> >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is there a
> >> >>>>>>>>> ticket? Is there a wiki page?
> >> >>>>>>>>>
> >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be a
> >> >>>>>>>>> good thing.
> >> >>>>>>>>>
> >> >>>>>>>>> Simon
> >> >>>>>>>>>
> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf
> >> >>>>>>>>> Of Edward Kmett
> >> >>>>>>>>> Sent: 21 August 2015 05:25
> >> >>>>>>>>> To: Manuel M T Chakravarty
> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs
> >> >>>>>>>>> Subject: Re: ArrayArrays
> >> >>>>>>>>>
> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be
> >> >>>>>>>>> very handy as well.
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Consider right now if I have something like an > order-maintenance > >> >>>>>>>>> structure I have: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# > >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s > >> >>>>>>>>> (Upper s)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# > >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s > >> >>>>>>>>> (Lower s)) {-# > >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The former contains, logically, a mutable integer and two > >> >>>>>>>>> pointers, > >> >>>>>>>>> one for forward and one for backwards. The latter is basically > >> >>>>>>>>> the same > >> >>>>>>>>> thing with a mutable reference up pointing at the structure > >> >>>>>>>>> above. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> On the heap this is an object that points to a structure for > the > >> >>>>>>>>> bytearray, and points to another structure for each mutvar > which > >> >>>>>>>>> each point > >> >>>>>>>>> to the other 'Upper' structure. So there is a level of > >> >>>>>>>>> indirection smeared > >> >>>>>>>>> over everything. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> So this is a pair of doubly linked lists with an upward link > >> >>>>>>>>> from > >> >>>>>>>>> the structure below to the structure above. 
> >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Converted into ArrayArray#s I'd get > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and > >> >>>>>>>>> the > >> >>>>>>>>> next 2 slots pointing to the previous and next previous > objects, > >> >>>>>>>>> represented > >> >>>>>>>>> just as their MutableArrayArray#s. I can use > >> >>>>>>>>> sameMutableArrayArray# on these > >> >>>>>>>>> for object identity, which lets me check for the ends of the > >> >>>>>>>>> lists by tying > >> >>>>>>>>> things back on themselves. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> and below that > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing up > to > >> >>>>>>>>> an > >> >>>>>>>>> upper structure. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I can then write a handful of combinators for getting out the > >> >>>>>>>>> slots > >> >>>>>>>>> in question, while it has gained a level of indirection > between > >> >>>>>>>>> the wrapper > >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one can > >> >>>>>>>>> be basically > >> >>>>>>>>> erased by ghc. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> Unlike before I don't have several separate objects on the > heap > >> >>>>>>>>> for > >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for the > >> >>>>>>>>> object itself, > >> >>>>>>>>> and the MutableByteArray# that it references to carry around > the > >> >>>>>>>>> mutable > >> >>>>>>>>> int. > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> The only pain points are > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> 1.) 
the aforementioned limitation that currently prevents me from
> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array into an
> >> >>>>>>>>> ArrayArray leaving me in a little ghetto disconnected from the
> >> >>>>>>>>> rest of Haskell,
> >> >>>>>>>>>
> >> >>>>>>>>> and
> >> >>>>>>>>>
> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the
> >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 pointers
> >> >>>>>>>>> wide. Card marking doesn't help.
> >> >>>>>>>>>
> >> >>>>>>>>> Alternately I could just try to do really evil things and
> >> >>>>>>>>> convert the whole mess to SmallArrays and then figure out how to
> >> >>>>>>>>> unsafeCoerce my way to glory, stuffing the #'d references to the
> >> >>>>>>>>> other arrays directly into the SmallArray as slots, removing the
> >> >>>>>>>>> limitation we see here by aping the MutableArrayArray# s API,
> >> >>>>>>>>> but that gets really really dangerous!
> >> >>>>>>>>>
> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the
> >> >>>>>>>>> altar of speed here, but I'd like to be able to let the GC move
> >> >>>>>>>>> them and collect them which rules out simpler Ptr and Addr based
> >> >>>>>>>>> solutions.
> >> >>>>>>>>>
> >> >>>>>>>>> -Edward
> >> >>>>>>>>>
> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty
> >> >>>>>>>>> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>> That's an interesting idea.
> >> >>>>>>>>>
> >> >>>>>>>>> Manuel
> >> >>>>>>>>>
> >> >>>>>>>>> > Edward Kmett :
> >> >>>>>>>>> >
> >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# and
> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#?
The fact that the > >> >>>>>>>>> > ArrayArray# entries > >> >>>>>>>>> > are all directly unlifted avoiding a level of indirection > for > >> >>>>>>>>> > the containing > >> >>>>>>>>> > structure is amazing, but I can only currently use it if my > >> >>>>>>>>> > leaf level data > >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. It'd > be > >> >>>>>>>>> > nice to be > >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down at > >> >>>>>>>>> > the leaves to > >> >>>>>>>>> > hold lifted contents. > >> >>>>>>>>> > > >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to > >> >>>>>>>>> > access > >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do > that > >> >>>>>>>>> > if i tried to > >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a > >> >>>>>>>>> > ByteArray# > >> >>>>>>>>> > anyways, so it isn't like there is a safety story preventing > >> >>>>>>>>> > this. > >> >>>>>>>>> > > >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection > >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I > >> >>>>>>>>> > could shoehorn a > >> >>>>>>>>> > number of them into ArrayArrays if this worked. > >> >>>>>>>>> > > >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary > >> >>>>>>>>> > indirection compared to c/java and this could reduce that > pain > >> >>>>>>>>> > to just 1 > >> >>>>>>>>> > level of unnecessary indirection. 
> >> >>>>>>>>> >
> >> >>>>>>>>> > -Edward

From erkokl at gmail.com Sat Aug 29 20:41:46 2015
From: erkokl at gmail.com (Levent Erkok)
Date: Sat, 29 Aug 2015 13:41:46 -0700
Subject: Installing ghc-7.10.2 linux binary distro on SuSE
Message-ID:

Hello all,

I've been having a lot of trouble installing the binary distros on a SuSE
machine. Unfortunately, I don't have root privileges and thus my options
are rather limited.

The problem seems to boil down to the use of the function
pthread_setname_np. It appears the problem was noted before, and Simon
Marlow added a corresponding configure check for platforms that do not
have this function. See here:
https://mail.haskell.org/pipermail/ghc-devs/2014-October/006707.html

Alas, none of the binary distributions listed on
https://www.haskell.org/ghc/download_ghc_7_10_2#binaries seem to be built
against a system that does not have this function. So, I was unable to
install 7.10.2 successfully.

Essentially, I'm looking for a binary distro on SuSE, or with a libc that
doesn't have the GNU extensions such as pthread_setname_np; if anyone would
be kind enough to put out such a binary distro, that'd really be
appreciated. (Yes, I tried building from the source; but in the corporate
environment with so many things controlled, that did not go very far.)

Thanks,

-Levent.

From ekmett at gmail.com Sat Aug 29 22:07:17 2015
From: ekmett at gmail.com (Edward Kmett)
Date: Sat, 29 Aug 2015 15:07:17 -0700
Subject: ArrayArrays
In-Reply-To:
References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com>
Message-ID:

Sounds good to me. Right now I'm just hacking up composable accessors for
"typed slots" in a fairly lens-like fashion, treating the set of slots I
define and the 'new' function I build for the data type as its API, and
building atop that. This could eventually graduate to template-haskell,
but I'm not entirely satisfied with the solution I have. I currently
distinguish between what I'm calling "slots" (things that point directly to
another SmallMutableArrayArray# sans wrapper) and "fields" which point
directly to the usual Haskell data types because unifying the two notions
meant that I couldn't lift some coercions out "far enough" to make them
vanish.

I'll be happy to run through my current working set of issues in person
and -- as things get nailed down further -- in a longer-lived medium than
in personal conversations. ;)

-Edward

On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton wrote:

> I'd also love to meet up at ICFP and discuss this. I think the array
> primops plus a TH layer that lets us (ab)use them many times without too
> much marginal cost sounds great. And I'd like to learn how we could be
> either early users of, or help with, this infrastructure.
>
> CC'ing in Ryan Scot and Omer Agacan who may also be interested in dropping
> in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student who is
> currently working on concurrent data structures in Haskell, but will not be
> at ICFP.
>
>
> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote:
>
>> I completely agree. I would love to spend some time during ICFP and
>> friends talking about what it could look like. My small array for STM
>> changes for the RTS can be seen here [1]. It is on a branch somewhere
>> between 7.8 and 7.10 and includes irrelevant STM bits and some
>> confusing naming choices (sorry), but should cover all the details
>> needed to implement it for a non-STM context. The biggest surprise
>> for me was following small array too closely and having a word/byte
>> offset mismatch [2].
>>
>> [1]:
>> https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut
>> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413
>>
>> Ryan
>>
>> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett wrote:
>> > I'd love to have that last 10%, but it's a lot of work to get there and
>> > more importantly I don't know quite what it should look like.
>> >
>> > On the other hand, I do have a pretty good idea of how the primitives
>> > above could be banged out and tested in a long evening, well in time for
>> > 7.12. And as noted earlier, those remain useful even if a nicer typed
>> > version with an extra level of indirection to the sizes is built up
>> > after.
>> >
>> > The rest sounds like a good graduate student project for someone who has
>> > graduate students lying around. Maybe somebody at Indiana University
>> > who has an interest in type theory and parallelism can find us one.
>> > =)
>> >
>> > -Edward
>> >
>> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates wrote:
>> >>
>> >> I think from my perspective, the motivation for getting the type
>> >> checker involved is primarily bringing this to the level where users
>> >> could be expected to build these structures. It is reasonable to
>> >> think that there are people who want to use STM (a context with
>> >> mutation already) to implement a straightforward data structure that
>> >> avoids extra indirection penalty. There should be some places where
>> >> knowing that things are field accesses rather than array indexing
>> >> could be helpful, but I think GHC is good right now about handling
>> >> constant offsets. In my code I don't do any bounds checking as I know
>> >> I will only be accessing my arrays with constant indexes. I make
>> >> wrappers for each field access and leave all the unsafe stuff in
>> >> there. When things go wrong though, the compiler is no help. Maybe
>> >> Template Haskell that generates the appropriate wrappers is the right
>> >> direction to go.
>> >>
>> >> There is another benefit for me when working with these as arrays in
>> >> that it is quite simple and direct (given the hoops already jumped
>> >> through) to play with alignment. I can ensure two pointers are never
>> >> on the same cache-line by just spacing things out in the array.
>> >>
>> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett wrote:
>> >> > They just segfault at this level. ;)
>> >> >
>> >> > Sent from my iPhone
>> >> >
>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton wrote:
>> >> >
>> >> > You presumably also save a bounds check on reads by hard-coding the
>> >> > sizes?
>> >> >
>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett wrote:
>> >> >>
>> >> >> Also there are 4 different "things" here, basically depending on two
>> >> >> independent questions:
>> >> >>
>> >> >> a.) if you want to shove the sizes into the info table, and
>> >> >> b.) if you want cardmarking.
>> >> >> >> >> >> Versions with/without cardmarking for different sizes can be done >> >> >> pretty >> >> >> easily, but as noted, the infotable variants are pretty invasive. >> >> >> >> >> >> -Edward >> >> >> >> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett >> wrote: >> >> >>> >> >> >>> Well, on the plus side you'd save 16 bytes per object, which adds >> up >> >> >>> if >> >> >>> they were small enough and there are enough of them. You get a bit >> >> >>> better >> >> >>> locality of reference in terms of what fits in the first cache >> line of >> >> >>> them. >> >> >>> >> >> >>> -Edward >> >> >>> >> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton >> >> >>> wrote: >> >> >>>> >> >> >>>> Yes. And for the short term I can imagine places we will settle >> with >> >> >>>> arrays even if it means tracking lengths unnecessarily and >> >> >>>> unsafeCoercing >> >> >>>> pointers whose types don't actually match their siblings. >> >> >>>> >> >> >>>> Is there anything to recommend the hacks mentioned for fixed sized >> >> >>>> array >> >> >>>> objects *other* than using them to fake structs? (Much to >> >> >>>> derecommend, as >> >> >>>> you mentioned!) >> >> >>>> >> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett >> >> >>>> wrote: >> >> >>>>> >> >> >>>>> I think both are useful, but the one you suggest requires a lot >> more >> >> >>>>> plumbing and doesn't subsume all of the usecases of the other. >> >> >>>>> >> >> >>>>> -Edward >> >> >>>>> >> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton > > >> >> >>>>> wrote: >> >> >>>>>> >> >> >>>>>> So that primitive is an array like thing (Same pointed type, >> >> >>>>>> unbounded >> >> >>>>>> length) with extra payload. >> >> >>>>>> >> >> >>>>>> I can see how we can do without structs if we have arrays, >> >> >>>>>> especially >> >> >>>>>> with the extra payload at front. 
But wouldn't the general >> solution >> >> >>>>>> for >> >> >>>>>> structs be one that that allows new user data type defs for # >> >> >>>>>> types? >> >> >>>>>> >> >> >>>>>> >> >> >>>>>> >> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett >> >> >>>>>> wrote: >> >> >>>>>>> >> >> >>>>>>> Some form of MutableStruct# with a known number of words and a >> >> >>>>>>> known >> >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting >> >> >>>>>>> above, but >> >> >>>>>>> where the word counts were stored in the objects themselves. >> >> >>>>>>> >> >> >>>>>>> Given that it'd have a couple of words for those counts it'd >> >> >>>>>>> likely >> >> >>>>>>> want to be something we build in addition to MutVar# rather >> than a >> >> >>>>>>> replacement. >> >> >>>>>>> >> >> >>>>>>> On the other hand, if we had to fix those numbers and build >> info >> >> >>>>>>> tables that knew them, and typechecker support, for instance, >> it'd >> >> >>>>>>> get >> >> >>>>>>> rather invasive. >> >> >>>>>>> >> >> >>>>>>> Also, a number of things that we can do with the 'sized' >> versions >> >> >>>>>>> above, like working with evil unsized c-style arrays directly >> >> >>>>>>> inline at the >> >> >>>>>>> end of the structure cease to be possible, so it isn't even a >> pure >> >> >>>>>>> win if we >> >> >>>>>>> did the engineering effort. >> >> >>>>>>> >> >> >>>>>>> I think 90% of the needs I have are covered just by adding the >> one >> >> >>>>>>> primitive. The last 10% gets pretty invasive. >> >> >>>>>>> >> >> >>>>>>> -Edward >> >> >>>>>>> >> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton < >> rrnewton at gmail.com> >> >> >>>>>>> wrote: >> >> >>>>>>>> >> >> >>>>>>>> I like the possibility of a general solution for mutable >> structs >> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's >> hard. >> >> >>>>>>>> >> >> >>>>>>>> So, we can't unpack MutVar into constructors because of object >> >> >>>>>>>> identity problems. 
But what about directly supporting an >> >> >>>>>>>> extensible set of >> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) >> >> >>>>>>>> MutVar#? That >> >> >>>>>>>> may be too much work, but is it problematic otherwise? >> >> >>>>>>>> >> >> >>>>>>>> Needless to say, this is also critical if we ever want best in >> >> >>>>>>>> class >> >> >>>>>>>> lockfree mutable structures, just like their Stm and >> sequential >> >> >>>>>>>> counterparts. >> >> >>>>>>>> >> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >> >> >>>>>>>> wrote: >> >> >>>>>>>>> >> >> >>>>>>>>> At the very least I'll take this email and turn it into a >> short >> >> >>>>>>>>> article. >> >> >>>>>>>>> >> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and >> >> >>>>>>>>> maybe >> >> >>>>>>>>> make a ticket for it. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Thanks >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Simon >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >> >> >>>>>>>>> Sent: 27 August 2015 16:54 >> >> >>>>>>>>> To: Simon Peyton Jones >> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >> >> >>>>>>>>> Subject: Re: ArrayArrays >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. >> It >> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or >> ByteArray#'s. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> While those live in #, they are garbage collected objects, so >> >> >>>>>>>>> this >> >> >>>>>>>>> all lives on the heap. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> They were added to make some of the DPH stuff fast when it >> has >> >> >>>>>>>>> to >> >> >>>>>>>>> deal with nested arrays. 
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better >> thing. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> The Problem >> >> >>>>>>>>> >> >> >>>>>>>>> ----------------- >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Consider the scenario where you write a classic doubly-linked >> >> >>>>>>>>> list >> >> >>>>>>>>> in Haskell. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Chasing from one DLL to the next requires following 3 >> pointers >> >> >>>>>>>>> on >> >> >>>>>>>>> the heap. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> >> >> >>>>>>>>> Maybe >> >> >>>>>>>>> DLL ~> DLL >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> That is 3 levels of indirection. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> We can trim one by simply unpacking the IORef with >> >> >>>>>>>>> -funbox-strict-fields or UNPACK >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and >> >> >>>>>>>>> worsening our representation. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> but now we're still stuck with a level of indirection >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> This means that every operation we perform on this structure >> >> >>>>>>>>> will >> >> >>>>>>>>> be about half of the speed of an implementation in most other >> >> >>>>>>>>> languages >> >> >>>>>>>>> assuming we're memory bound on loading things into cache! 
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Making Progress >> >> >>>>>>>>> >> >> >>>>>>>>> ---------------------- >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I have been working on a number of data structures where the >> >> >>>>>>>>> indirection of going from something in * out to an object in >> # >> >> >>>>>>>>> which >> >> >>>>>>>>> contains the real pointer to my target and coming back >> >> >>>>>>>>> effectively doubles >> >> >>>>>>>>> my runtime. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the >> >> >>>>>>>>> MutVar# >> >> >>>>>>>>> onto the mutable list when we dirty it. There is a well >> defined >> >> >>>>>>>>> write-barrier. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I could change out the representation to use >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I can just store two pointers in the MutableArray# every >> time, >> >> >>>>>>>>> but >> >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the amount >> of >> >> >>>>>>>>> distinct >> >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per >> >> >>>>>>>>> object to 2. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to the >> >> >>>>>>>>> array >> >> >>>>>>>>> object and then chase it to the next DLL and chase that to >> the >> >> >>>>>>>>> next array. I >> >> >>>>>>>>> do get my two pointers together in memory though. 
I'm paying >> for >> >> >>>>>>>>> a card >> >> >>>>>>>>> marking table as well, which I don't particularly need with >> just >> >> >>>>>>>>> two >> >> >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" >> >> >>>>>>>>> machinery added >> >> >>>>>>>>> back in 7.10, which is just the old array code as a new data >> >> >>>>>>>>> type, which can >> >> >>>>>>>>> speed things up a bit when you don't have very big arrays: >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> But what if I wanted my object itself to live in # and have >> two >> >> >>>>>>>>> mutable fields and be able to share the same write barrier? >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types. >> >> >>>>>>>>> What >> >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with the >> >> >>>>>>>>> impedance >> >> >>>>>>>>> mismatch between the imperative world and Haskell, and then >> just >> >> >>>>>>>>> let the >> >> >>>>>>>>> ArrayArray#'s hold other ArrayArray#'s? >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Now I need to make up a new Nil, which I can just make be a >> >> >>>>>>>>> special >> >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even >> >> >>>>>>>>> abuse pattern >> >> >>>>>>>>> synonyms. Alternately I can exploit the internals further to >> >> >>>>>>>>> make this >> >> >>>>>>>>> cheaper. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Then I can use the readMutableArrayArray# and >> >> >>>>>>>>> writeMutableArrayArray# calls to directly access the >> preceding >> >> >>>>>>>>> and next >> >> >>>>>>>>> entry in the linked list.
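A minimal sketch of the SmallMutableArray# representation described above, for readers who want something to compile. It is an illustration only, under stated assumptions: the slot layout (slot 0 = previous, slot 1 = next) and the helper names newNode, link and lengthFrom are made up here and do not come from the thread. It deliberately uses SmallMutableArray# with lifted DLL elements, so it still pays the wrapper indirection that the MutableArrayArray# variant is meant to remove.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
  ( RealWorld, SmallMutableArray#
  , newSmallArray#, readSmallArray#, writeSmallArray# )
import GHC.IO (IO (..))

-- One node: a two-slot mutable array.
-- Assumed (hypothetical) layout: slot 0 = previous node, slot 1 = next node.
data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil

-- Allocate a node with both links initialised to Nil.
newNode :: IO DLL
newNode = IO $ \s -> case newSmallArray# 2# Nil s of
  (# s', arr #) -> (# s', DLL arr #)

-- Follow the 'next' link; Nil has no successor.
next :: DLL -> IO DLL
next (DLL arr) = IO (readSmallArray# arr 1#)
next Nil       = pure Nil

-- Overwrite the 'previous' / 'next' link of a node (no-op on Nil).
setPrev, setNext :: DLL -> DLL -> IO ()
setPrev (DLL arr) d = IO $ \s -> case writeSmallArray# arr 0# d s of
  s' -> (# s', () #)
setPrev Nil _ = pure ()
setNext (DLL arr) d = IO $ \s -> case writeSmallArray# arr 1# d s of
  s' -> (# s', () #)
setNext Nil _ = pure ()

-- Make b the successor of a, and a the predecessor of b.
link :: DLL -> DLL -> IO ()
link a b = setNext a b >> setPrev b a

-- Count the nodes reachable by following 'next' until Nil.
lengthFrom :: DLL -> IO Int
lengthFrom = go 0
  where
    go acc Nil = pure acc
    go acc d   = next d >>= go (acc + 1)
```

Allocating three nodes, linking them in order with link, and calling lengthFrom on the first should walk all three. No bounds checks are performed, so a wrong slot index here would simply read or write outside the array.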
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' >> into a >> >> >>>>>>>>> strict world, and everything there lives in #. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> next :: DLL -> IO DLL >> >> >>>>>>>>> >> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of >> >> >>>>>>>>> >> >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code >> to >> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty >> >> >>>>>>>>> easily when they >> >> >>>>>>>>> are known strict and you chain operations of this sort! >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Cleaning it Up >> >> >>>>>>>>> >> >> >>>>>>>>> ------------------ >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Now I have one outermost indirection pointing to an array >> that >> >> >>>>>>>>> points directly to other arrays. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I'm stuck paying for a card marking table per object, but I >> can >> >> >>>>>>>>> fix >> >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and >> using a >> >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a >> >> >>>>>>>>> mixture of >> >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data >> structure. >> >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the >> >> >>>>>>>>> existing >> >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one of >> the >> >> >>>>>>>>> arguments it >> >> >>>>>>>>> takes. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> This is almost ideal, but not quite. I often have fields that >> >> >>>>>>>>> would >> >> >>>>>>>>> be best left unboxed. 
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can >> currently >> >> >>>>>>>>> at >> >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at a >> >> >>>>>>>>> boxed or at a >> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the >> int in >> >> >>>>>>>>> question in >> >> >>>>>>>>> there. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need >> to >> >> >>>>>>>>> store masks and administrivia as I walk down the tree. >> Having to >> >> >>>>>>>>> go off to >> >> >>>>>>>>> the side costs me the entire win from avoiding the first >> pointer >> >> >>>>>>>>> chase. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could >> >> >>>>>>>>> construct that had n words with unsafe access and m pointers >> to >> >> >>>>>>>>> other heap >> >> >>>>>>>>> objects, one that could put itself on the mutable list when >> any >> >> >>>>>>>>> of those >> >> >>>>>>>>> pointers changed then I could shed this last factor of two in >> >> >>>>>>>>> all >> >> >>>>>>>>> circumstances. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Prototype >> >> >>>>>>>>> >> >> >>>>>>>>> ------------- >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Over the last few days I've put together a small prototype >> >> >>>>>>>>> implementation with a few non-trivial imperative data >> structures >> >> >>>>>>>>> for things >> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and >> >> >>>>>>>>> order-maintenance. 
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> https://github.com/ekmett/structs >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Notable bits: >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of >> >> >>>>>>>>> link-cut >> >> >>>>>>>>> trees in this style. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that >> >> >>>>>>>>> make >> >> >>>>>>>>> it go fast. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost >> >> >>>>>>>>> all >> >> >>>>>>>>> the references to the LinkCut or Object data constructor get >> >> >>>>>>>>> optimized away, >> >> >>>>>>>>> and we're left with beautiful strict code directly mutating >> out >> >> >>>>>>>>> underlying >> >> >>>>>>>>> representation. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> At the very least I'll take this email and turn it into a >> short >> >> >>>>>>>>> article. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> -Edward >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones >> >> >>>>>>>>> wrote: >> >> >>>>>>>>> >> >> >>>>>>>>> Just to say that I have no idea what is going on in this >> thread. >> >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is there >> a >> >> >>>>>>>>> ticket? Is >> >> >>>>>>>>> there a wiki page? >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> If it?s important, an ab-initio wiki page + ticket would be a >> >> >>>>>>>>> good >> >> >>>>>>>>> thing. 
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Simon >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On >> Behalf >> >> >>>>>>>>> Of >> >> >>>>>>>>> Edward Kmett >> >> >>>>>>>>> Sent: 21 August 2015 05:25 >> >> >>>>>>>>> To: Manuel M T Chakravarty >> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs >> >> >>>>>>>>> Subject: Re: ArrayArrays >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's >> would be >> >> >>>>>>>>> very handy as well. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Consider right now if I have something like an >> order-maintenance >> >> >>>>>>>>> structure I have: >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# >> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s >> >> >>>>>>>>> (Upper s)) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# >> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s >> >> >>>>>>>>> (Lower s)) {-# >> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> The former contains, logically, a mutable integer and two >> >> >>>>>>>>> pointers, >> >> >>>>>>>>> one for forward and one for backwards. The latter is >> basically >> >> >>>>>>>>> the same >> >> >>>>>>>>> thing with a mutable reference up pointing at the structure >> >> >>>>>>>>> above. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> On the heap this is an object that points to a structure for >> the >> >> >>>>>>>>> bytearray, and points to another structure for each mutvar >> which >> >> >>>>>>>>> each point >> >> >>>>>>>>> to the other 'Upper' structure. 
So there is a level of >> >> >>>>>>>>> indirection smeared >> >> >>>>>>>>> over everything. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward link >> >> >>>>>>>>> from >> >> >>>>>>>>> the structure below to the structure above. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Converted into ArrayArray#s I'd get >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and >> >> >>>>>>>>> the >> >> >>>>>>>>> next 2 slots pointing to the previous and next previous >> objects, >> >> >>>>>>>>> represented >> >> >>>>>>>>> just as their MutableArrayArray#s. I can use >> >> >>>>>>>>> sameMutableArrayArray# on these >> >> >>>>>>>>> for object identity, which lets me check for the ends of the >> >> >>>>>>>>> lists by tying >> >> >>>>>>>>> things back on themselves. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> and below that >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing up >> to >> >> >>>>>>>>> an >> >> >>>>>>>>> upper structure. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I can then write a handful of combinators for getting out the >> >> >>>>>>>>> slots >> >> >>>>>>>>> in question, while it has gained a level of indirection >> between >> >> >>>>>>>>> the wrapper >> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one >> can >> >> >>>>>>>>> be basically >> >> >>>>>>>>> erased by ghc. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Unlike before I don't have several separate objects on the >> heap >> >> >>>>>>>>> for >> >> >>>>>>>>> each thing. I only have 2 now. 
The MutableArrayArray# for the >> >> >>>>>>>>> object itself, >> >> >>>>>>>>> and the MutableByteArray# that it references to carry around >> the >> >> >>>>>>>>> mutable >> >> >>>>>>>>> int. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> The only pain points are >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> 1.) the aforementioned limitation that currently prevents me >> >> >>>>>>>>> from >> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array >> into an >> >> >>>>>>>>> ArrayArray >> >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest of >> >> >>>>>>>>> Haskell, >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> and >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid >> the >> >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 >> pointers >> >> >>>>>>>>> wide. Card >> >> >>>>>>>>> marking doesn't help. >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> Alternately I could just try to do really evil things and >> >> >>>>>>>>> convert >> >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to >> >> >>>>>>>>> unsafeCoerce my way >> >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays >> >> >>>>>>>>> directly into the >> >> >>>>>>>>> SmallArray as slots, removing the limitation we see here by >> >> >>>>>>>>> aping the >> >> >>>>>>>>> MutableArrayArray# s API, but that gets really really >> dangerous! >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the >> >> >>>>>>>>> altar >> >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move >> them >> >> >>>>>>>>> and collect >> >> >>>>>>>>> them which rules out simpler Ptr and Addr based solutions. 
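The idea mentioned in this message, tying the list back on itself and using array identity to detect the end, can be sketched roughly as follows. This is a hedged illustration, not the layout from the message: it uses SmallMutableArray# with lifted Node wrappers instead of MutableArrayArray#, the slot layout (0 = previous, 1 = next) and every function name are invented for the example, and it assumes sameSmallMutableArray# is available from GHC.Exts in the GHC being used.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
  ( RealWorld, SmallMutableArray#, isTrue#
  , newSmallArray#, readSmallArray#, writeSmallArray#
  , sameSmallMutableArray# )
import GHC.IO (IO (..))

-- A node is just its two-slot array; there is no Nil constructor.
-- Assumed (hypothetical) layout: slot 0 = previous node, slot 1 = next node.
data Node = Node (SmallMutableArray# RealWorld Node)

-- Allocate a node whose links both point back at itself (a ring of size 1).
newRing :: IO Node
newRing = IO $ \s0 ->
  case newSmallArray# 2# (error "uninitialised slot") s0 of
    (# s1, arr #) ->
      case writeSmallArray# arr 0# (Node arr) s1 of
        s2 -> case writeSmallArray# arr 1# (Node arr) s2 of
          s3 -> (# s3, Node arr #)

-- Object identity: two nodes are the same iff they share the same array.
sameNode :: Node -> Node -> Bool
sameNode (Node a) (Node b) = isTrue# (sameSmallMutableArray# a b)

next :: Node -> IO Node
next (Node arr) = IO (readSmallArray# arr 1#)

setPrev, setNext :: Node -> Node -> IO ()
setPrev (Node arr) n = IO $ \s -> case writeSmallArray# arr 0# n s of
  s' -> (# s', () #)
setNext (Node arr) n = IO $ \s -> case writeSmallArray# arr 1# n s of
  s' -> (# s', () #)

-- Allocate a fresh node and splice it in directly after the given one.
insertAfter :: Node -> IO Node
insertAfter a = do
  b <- newRing
  c <- next a
  setNext a b >> setPrev b a
  setNext b c >> setPrev c b
  pure b

-- Walk 'next' links until we arrive back at the starting node.
ringSize :: Node -> IO Int
ringSize start = go 1 =<< next start
  where
    go acc n
      | sameNode n start = pure acc
      | otherwise        = next n >>= go (acc + 1)
```

The end-of-list test is a pointer comparison on the arrays themselves, so no sentinel constructor is needed. The GC remains free to move the arrays, which is the property the message wants to keep.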
>> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> -Edward >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty >> >> >>>>>>>>> wrote: >> >> >>>>>>>>> >> >> >>>>>>>>> That?s an interesting idea. >> >> >>>>>>>>> >> >> >>>>>>>>> Manuel >> >> >>>>>>>>> >> >> >>>>>>>>> > Edward Kmett : >> >> >>>>>>>>> >> >> >>>>>>>>> > >> >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# >> and >> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the >> >> >>>>>>>>> > ArrayArray# entries >> >> >>>>>>>>> > are all directly unlifted avoiding a level of indirection >> for >> >> >>>>>>>>> > the containing >> >> >>>>>>>>> > structure is amazing, but I can only currently use it if my >> >> >>>>>>>>> > leaf level data >> >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. >> It'd be >> >> >>>>>>>>> > nice to be >> >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down at >> >> >>>>>>>>> > the leaves to >> >> >>>>>>>>> > hold lifted contents. >> >> >>>>>>>>> > >> >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to >> >> >>>>>>>>> > access >> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do >> that >> >> >>>>>>>>> > if i tried to >> >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a >> >> >>>>>>>>> > ByteArray# >> >> >>>>>>>>> > anyways, so it isn't like there is a safety story >> preventing >> >> >>>>>>>>> > this. >> >> >>>>>>>>> > >> >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection >> >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I >> >> >>>>>>>>> > could shoehorn a >> >> >>>>>>>>> > number of them into ArrayArrays if this worked. 
>> >> >>>>>>>>> > >> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary >> >> >>>>>>>>> > indirection compared to c/java and this could reduce that >> pain >> >> >>>>>>>>> > to just 1 >> >> >>>>>>>>> > level of unnecessary indirection. >> >> >>>>>>>>> > >> >> >>>>>>>>> > -Edward >> >> >>>>>>>>> >> >> >>>>>>>>> > _______________________________________________ >> >> >>>>>>>>> > ghc-devs mailing list >> >> >>>>>>>>> > ghc-devs at haskell.org >> >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> _______________________________________________ >> >> >>>>>>>>> ghc-devs mailing list >> >> >>>>>>>>> ghc-devs at haskell.org >> >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >> >>>>>>> >> >> >>>>>>> >> >> >>>>> >> >> >>> >> >> >> >> >> > >> >> > >> >> > _______________________________________________ >> >> > ghc-devs mailing list >> >> > ghc-devs at haskell.org >> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >> > >> > >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From _deepfire at feelingofgreen.ru Sat Aug 29 22:23:51 2015 From: _deepfire at feelingofgreen.ru (Kosyrev Serge) Date: Sun, 30 Aug 2015 01:23:51 +0300 Subject: Installing ghc-7.10.2 linux binary distro on SuSE In-Reply-To: (sfid-20150830_010237_522745_CAD1A97F) (Levent Erkok's message of "Sat, 29 Aug 2015 13:41:46 -0700") References: Message-ID: <87vbbxent4.fsf@andromedae.feelingofgreen.ru> Levent Erkok writes: > Hello all, > > I've been having a lot of trouble installing the binary-distro's on a > SuSE machine. Unfortunately, I don't have root privileges and thus my > options are rather limited. > > The problem seem to boil down to the use of the function > pthread_setname_np. 
It appears the problem was noted before, and Simon > Marlow added a corresponding configure check for platforms that do not > have this function. See here: > https://mail.haskell.org/pipermail/ghc-devs/2014-October/006707.html > > Alas, none of the binary distributions listed on > https://www.haskell.org/ghc/download_ghc_7_10_2#binaries seem to be > built against a system that does not have this function. So, I was > unable to install 7.10.2 successfully. > > Essentially, I'm looking for a binary distro on SuSE, or with a libc > that doesn't have the GNU extensions such as pthread_setname_np; if > anyone would be kind enough to put out such a binary distro, that'd > really be appreciated. > > (Yes, I tried building from the source; but in the corporate > environment with so many things controlled, that did not go very far.) You could try the Nix route, which, conceptually, would boil down to: 1. Installing the Nix package manager into your $HOME on the SuSE system 2. Use Nix to install GHC Which expands to: 1. Following the instructions at: https://nixos.org/wiki/How_to_install_nix_in_home_%28on_another_distribution%29#PRoot_Installation 2. Invoking: nix-env -iA haskellPackages.ghc This would require only HTTP access, which, I presume, should be available within the corporate environment. All the packages from Hackage can be had precompiled from Nixpkgs, but that's slightly more involved and requires some reading: http://nixos.org/nixpkgs/manual/#users-guide-to-the-haskell-infrastructure Should you meet trouble, you can always seek help either at nix-dev at lists.science.uu.nl, or on the #nixos/irc.freenode.net IRC channel -- both have a vibrant nightlife^W Haskell community. -- ? ???????e? / respectfully, ??????? ?????? -- ?And those who were seen dancing were thought to be insane by those who could not hear the music.? ? 
Friedrich Wilhelm Nietzsche From erkokl at gmail.com Sun Aug 30 00:02:22 2015 From: erkokl at gmail.com (Levent Erkok) Date: Sat, 29 Aug 2015 17:02:22 -0700 Subject: Installing ghc-7.10.2 linux binary distro on SuSE In-Reply-To: <87vbbxent4.fsf@andromedae.feelingofgreen.ru> References: <87vbbxent4.fsf@andromedae.feelingofgreen.ru> Message-ID: I really like the idea of nix. Alas, generating native binaries that can run on SuSE without being in the nix environment is a requirement that's hard to let go of. (Everyone in my group would have to start using nix, a tall order.) Thanks for the advice, however; it can indeed come in handy for one-off trials if needed. In the meantime, I'm still looking for a binary-linux-distro that doesn't require the pthread_setname_np functionality, if anyone can point me in that direction. Thanks, -Levent. On Sat, Aug 29, 2015 at 3:23 PM, Kosyrev Serge <_deepfire at feelingofgreen.ru> wrote: > Levent Erkok writes: > > Hello all, > > > > I've been having a lot of trouble installing the binary-distro's on a > > SuSE machine. Unfortunately, I don't have root privileges and thus my > > options are rather limited. > > > > The problem seem to boil down to the use of the function > > pthread_setname_np. It appears the problem was noted before, and Simon > > Marlow added a corresponding configure check for platforms that do not > > have this function. See here: > > https://mail.haskell.org/pipermail/ghc-devs/2014-October/006707.html > > > > Alas, none of the binary distributions listed on > > https://www.haskell.org/ghc/download_ghc_7_10_2#binaries seem to be > > built against a system that does not have this function. So, I was > > unable to install 7.10.2 successfully. > > > > Essentially, I'm looking for a binary distro on SuSE, or with a libc > > that doesn't have the GNU extensions such as pthread_setname_np; if > > anyone would be kind enough to put out such a binary distro, that'd > > really be appreciated.
> > > > (Yes, I tried building from the source; but in the corporate > > environment with so many things controlled, that did not go very far.) > > You could try the Nix route, which, conceptually, would boil down to: > > 1. Installing the Nix package manager into your $HOME on the SuSE system > 2. Use Nix to install GHC > > Which expands to: > > 1. Following the instructions at: > > > https://nixos.org/wiki/How_to_install_nix_in_home_%28on_another_distribution%29#PRoot_Installation > > 2. Invoking: > > nix-env -iA haskellPackages.ghc > > This would require only HTTP access, which, I presume, should be > available within the corporate environment. > > All the packages from Hackage can be had precompiled from Nixpkgs, > but that's slightly more involved and requires some reading: > > > http://nixos.org/nixpkgs/manual/#users-guide-to-the-haskell-infrastructure > > Should you meet trouble, you can always seek help either at > nix-dev at lists.science.uu.nl, or on the #nixos/irc.freenode.net IRC > channel -- both have a vibrant nightlife^W Haskell community. > > -- > ? ???????e? / respectfully, > ??????? ?????? > -- > ?And those who were seen dancing were thought to be insane > by those who could not hear the music.? > ? Friedrich Wilhelm Nietzsche > -------------- next part -------------- An HTML attachment was scrubbed... URL: From johan.tibell at gmail.com Sun Aug 30 00:45:41 2015 From: johan.tibell at gmail.com (Johan Tibell) Date: Sat, 29 Aug 2015 17:45:41 -0700 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> Message-ID: I'd also be interested to chat at ICFP to see if I can use this for my HAMT implementation. On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett wrote: > Sounds good to me. 
Right now I'm just hacking up composable accessors for > "typed slots" in a fairly lens-like fashion, and treating the set of slots > I define and the 'new' function I build for the data type as its API, and > build atop that. This could eventually graduate to template-haskell, but > I'm not entirely satisfied with the solution I have. I currently > distinguish between what I'm calling "slots" (things that point directly to > another SmallMutableArrayArray# sans wrapper) and "fields" which point > directly to the usual Haskell data types because unifying the two notions > meant that I couldn't lift some coercions out "far enough" to make them > vanish. > > I'll be happy to run through my current working set of issues in person > and -- as things get nailed down further -- in a longer lived medium than > in personal conversations. ;) > > -Edward > > On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton wrote: > >> I'd also love to meet up at ICFP and discuss this. I think the array >> primops plus a TH layer that lets (ab)use them many times without too much >> marginal cost sounds great. And I'd like to learn how we could be either >> early users of, or help with, this infrastructure. >> >> CC'ing in Ryan Scot and Omer Agacan who may also be interested in >> dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student >> who is currently working on concurrent data structures in Haskell, but will >> not be at ICFP. >> >> >> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote: >> >>> I completely agree. I would love to spend some time during ICFP and >>> friends talking about what it could look like. My small array for STM >>> changes for the RTS can be seen here [1]. It is on a branch somewhere >>> between 7.8 and 7.10 and includes irrelevant STM bits and some >>> confusing naming choices (sorry), but should cover all the details >>> needed to implement it for a non-STM context. 
The biggest surprise >>> for me was following small array too closely and having a word/byte >>> offset mismatch [2]. >>> >>> [1]: >>> https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut >>> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 >>> >>> Ryan >>> >>> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett wrote: >>> > I'd love to have that last 10%, but it's a lot of work to get there and >>> more >>> > importantly I don't know quite what it should look like. >>> > >>> > On the other hand, I do have a pretty good idea of how the primitives >>> above >>> > could be banged out and tested in a long evening, well in time for >>> 7.12. And >>> > as noted earlier, those remain useful even if a nicer typed version >>> with an >>> > extra level of indirection to the sizes is built up after. >>> > >>> > The rest sounds like a good graduate student project for someone who >>> has >>> > graduate students lying around. Maybe somebody at Indiana University >>> who has >>> > an interest in type theory and parallelism can find us one. =) >>> > >>> > -Edward >>> > >>> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates >>> wrote: >>> >> >>> >> I think from my perspective, the motivation for getting the type >>> >> checker involved is primarily bringing this to the level where users >>> >> could be expected to build these structures. It is reasonable to >>> >> think that there are people who want to use STM (a context with >>> >> mutation already) to implement a straightforward data structure that >>> >> avoids extra indirection penalty. There should be some places where >>> >> knowing that things are field accesses rather than array indexing >>> >> could be helpful, but I think GHC is good right now about handling >>> >> constant offsets. In my code I don't do any bounds checking as I know >>> >> I will only be accessing my arrays with constant indexes. I make >>> >> wrappers for each field access and leave all the unsafe stuff in >>> >> there.
When things go wrong though, the compiler is no help. Maybe >>> >> template Haskell that generates the appropriate wrappers is the right >>> >> direction to go. >>> >> There is another benefit for me when working with these as arrays in >>> >> that it is quite simple and direct (given the hoops already jumped >>> >> through) to play with alignment. I can ensure two pointers are never >>> >> on the same cache-line by just spacing things out in the array. >>> >> >>> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett >>> wrote: >>> >> > They just segfault at this level. ;) >>> >> > >>> >> > Sent from my iPhone >>> >> > >>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton >>> wrote: >>> >> > >>> >> > You presumably also save a bounds check on reads by hard-coding the >>> >> > sizes? >>> >> > >>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett >>> wrote: >>> >> >> >>> >> >> Also there are 4 different "things" here, basically depending on >>> two >>> >> >> independent questions: >>> >> >> >>> >> >> a.) if you want to shove the sizes into the info table, and >>> >> >> b.) if you want cardmarking. >>> >> >> >>> >> >> Versions with/without cardmarking for different sizes can be done >>> >> >> pretty >>> >> >> easily, but as noted, the infotable variants are pretty invasive. >>> >> >> >>> >> >> -Edward >>> >> >> >>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett >>> wrote: >>> >> >>> >>> >> >>> Well, on the plus side you'd save 16 bytes per object, which adds >>> up >>> >> >>> if >>> >> >>> they were small enough and there are enough of them. You get a bit >>> >> >>> better >>> >> >>> locality of reference in terms of what fits in the first cache >>> line of >>> >> >>> them. >>> >> >>> >>> >> >>> -Edward >>> >> >>> >>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton >>> >> >>> wrote: >>> >> >>>> >>> >> >>>> Yes. 
And for the short term I can imagine places we will settle >>> with >>> >> >>>> arrays even if it means tracking lengths unnecessarily and >>> >> >>>> unsafeCoercing >>> >> >>>> pointers whose types don't actually match their siblings. >>> >> >>>> >>> >> >>>> Is there anything to recommend the hacks mentioned for fixed >>> sized >>> >> >>>> array >>> >> >>>> objects *other* than using them to fake structs? (Much to >>> >> >>>> derecommend, as >>> >> >>>> you mentioned!) >>> >> >>>> >>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett >>> >> >>>> wrote: >>> >> >>>>> >>> >> >>>>> I think both are useful, but the one you suggest requires a lot >>> more >>> >> >>>>> plumbing and doesn't subsume all of the usecases of the other. >>> >> >>>>> >>> >> >>>>> -Edward >>> >> >>>>> >>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton < >>> rrnewton at gmail.com> >>> >> >>>>> wrote: >>> >> >>>>>> >>> >> >>>>>> So that primitive is an array like thing (Same pointed type, >>> >> >>>>>> unbounded >>> >> >>>>>> length) with extra payload. >>> >> >>>>>> >>> >> >>>>>> I can see how we can do without structs if we have arrays, >>> >> >>>>>> especially >>> >> >>>>>> with the extra payload at front. But wouldn't the general >>> solution >>> >> >>>>>> for >>> >> >>>>>> structs be one that that allows new user data type defs for # >>> >> >>>>>> types? >>> >> >>>>>> >>> >> >>>>>> >>> >> >>>>>> >>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett >> > >>> >> >>>>>> wrote: >>> >> >>>>>>> >>> >> >>>>>>> Some form of MutableStruct# with a known number of words and a >>> >> >>>>>>> known >>> >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting >>> >> >>>>>>> above, but >>> >> >>>>>>> where the word counts were stored in the objects themselves. 
>>> >> >>>>>>> >>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd >>> >> >>>>>>> likely >>> >> >>>>>>> want to be something we build in addition to MutVar# rather >>> than a >>> >> >>>>>>> replacement. >>> >> >>>>>>> >>> >> >>>>>>> On the other hand, if we had to fix those numbers and build >>> info >>> >> >>>>>>> tables that knew them, and typechecker support, for instance, >>> it'd >>> >> >>>>>>> get >>> >> >>>>>>> rather invasive. >>> >> >>>>>>> >>> >> >>>>>>> Also, a number of things that we can do with the 'sized' >>> versions >>> >> >>>>>>> above, like working with evil unsized c-style arrays directly >>> >> >>>>>>> inline at the >>> >> >>>>>>> end of the structure cease to be possible, so it isn't even a >>> pure >>> >> >>>>>>> win if we >>> >> >>>>>>> did the engineering effort. >>> >> >>>>>>> >>> >> >>>>>>> I think 90% of the needs I have are covered just by adding >>> the one >>> >> >>>>>>> primitive. The last 10% gets pretty invasive. >>> >> >>>>>>> >>> >> >>>>>>> -Edward >>> >> >>>>>>> >>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton < >>> rrnewton at gmail.com> >>> >> >>>>>>> wrote: >>> >> >>>>>>>> >>> >> >>>>>>>> I like the possibility of a general solution for mutable >>> structs >>> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's >>> hard. >>> >> >>>>>>>> >>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of >>> object >>> >> >>>>>>>> identity problems. But what about directly supporting an >>> >> >>>>>>>> extensible set of >>> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even >>> replacing) >>> >> >>>>>>>> MutVar#? That >>> >> >>>>>>>> may be too much work, but is it problematic otherwise? >>> >> >>>>>>>> >>> >> >>>>>>>> Needless to say, this is also critical if we ever want best >>> in >>> >> >>>>>>>> class >>> >> >>>>>>>> lockfree mutable structures, just like their Stm and >>> sequential >>> >> >>>>>>>> counterparts. 
>>> >> >>>>>>>> >>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >>> >> >>>>>>>> wrote: >>> >> >>>>>>>>> >>> >> >>>>>>>>> At the very least I'll take this email and turn it into a >>> short >>> >> >>>>>>>>> article. >>> >> >>>>>>>>> >>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and >>> >> >>>>>>>>> maybe >>> >> >>>>>>>>> make a ticket for it. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Thanks >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Simon >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >>> >> >>>>>>>>> Sent: 27 August 2015 16:54 >>> >> >>>>>>>>> To: Simon Peyton Jones >>> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >>> >> >>>>>>>>> Subject: Re: ArrayArrays >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. >>> It >>> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or >>> ByteArray#'s. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> While those live in #, they are garbage collected objects, >>> so >>> >> >>>>>>>>> this >>> >> >>>>>>>>> all lives on the heap. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when it >>> has >>> >> >>>>>>>>> to >>> >> >>>>>>>>> deal with nested arrays. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better >>> thing. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> The Problem >>> >> >>>>>>>>> >>> >> >>>>>>>>> ----------------- >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Consider the scenario where you write a classic >>> doubly-linked >>> >> >>>>>>>>> list >>> >> >>>>>>>>> in Haskell. 
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3 >>> pointers >>> >> >>>>>>>>> on >>> >> >>>>>>>>> the heap. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> >>> >> >>>>>>>>> Maybe >>> >> >>>>>>>>> DLL ~> DLL >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> That is 3 levels of indirection. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with >>> >> >>>>>>>>> -funbox-strict-fields or UNPACK. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL >>> and >>> >> >>>>>>>>> worsening our representation. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> but now we're still stuck with a level of indirection >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> This means that every operation we perform on this structure >>> >> >>>>>>>>> will >>> >> >>>>>>>>> be about half of the speed of an implementation in most >>> other >>> >> >>>>>>>>> languages >>> >> >>>>>>>>> assuming we're memory bound on loading things into cache!
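[Editorial sketch] The strict-field representation with an explicit 'Nil' described above can be exercised with ordinary IORef operations. The following is a minimal sketch, not code from the thread: the helper names (newNode, link, next, walk) and the field order (previous, then next) are assumptions made for illustration.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- The second representation from the text: strict fields plus an
-- explicit 'Nil' constructor instead of 'Maybe'.
-- Field order (previous, then next) is an assumption of this sketch.
data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil

-- Allocate a node whose neighbours are both 'Nil' (hypothetical helper).
newNode :: IO DLL
newNode = DLL <$> newIORef Nil <*> newIORef Nil

-- Make b the successor of a and a the predecessor of b.
-- Linking to or from 'Nil' is a no-op.
link :: DLL -> DLL -> IO ()
link a@(DLL _ aNext) b@(DLL bPrev _) = writeIORef aNext b >> writeIORef bPrev a
link _ _ = pure ()

-- Follow the next pointer. Each step still goes
-- DLL ~> MutVar# (inside the IORef) ~> DLL.
next :: DLL -> IO DLL
next (DLL _ n) = readIORef n
next Nil       = pure Nil

-- Count the nodes reachable by repeatedly following 'next'.
walk :: DLL -> IO Int
walk = go 0
  where
    go acc Nil  = pure acc
    go acc node = do
      n <- next node
      let acc' = acc + 1
      acc' `seq` go acc' n
```

Every call to next here dereferences one IORef, which is the remaining level of indirection the message is complaining about.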
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Making Progress >>> >> >>>>>>>>> >>> >> >>>>>>>>> ---------------------- >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I have been working on a number of data structures where the >>> >> >>>>>>>>> indirection of going from something in * out to an object >>> in # >>> >> >>>>>>>>> which >>> >> >>>>>>>>> contains the real pointer to my target and coming back >>> >> >>>>>>>>> effectively doubles >>> >> >>>>>>>>> my runtime. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the >>> >> >>>>>>>>> MutVar# >>> >> >>>>>>>>> onto the mutable list when we dirty it. There is a well >>> defined >>> >> >>>>>>>>> write-barrier. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I could change out the representation to use >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I can just store two pointers in the MutableArray# every >>> time, >>> >> >>>>>>>>> but >>> >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the >>> amount of >>> >> >>>>>>>>> distinct >>> >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per >>> >> >>>>>>>>> object to 2. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to >>> the >>> >> >>>>>>>>> array >>> >> >>>>>>>>> object and then chase it to the next DLL and chase that to >>> the >>> >> >>>>>>>>> next array. I >>> >> >>>>>>>>> do get my two pointers together in memory though. 
I'm >>> paying for >>> >> >>>>>>>>> a card >>> >> >>>>>>>>> marking table as well, which I don't particularly need with >>> just >>> >> >>>>>>>>> two >>> >> >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" >>> >> >>>>>>>>> machinery added >>> >> >>>>>>>>> back in 7.10, which is just the old array code as a new data >>> >> >>>>>>>>> type, which can >>> >> >>>>>>>>> speed things up a bit when you don't have very big arrays: >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> But what if I wanted my object itself to live in # and have >>> two >>> >> >>>>>>>>> mutable fields and be able to share the same write barrier? >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array >>> types. >>> >> >>>>>>>>> What >>> >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with >>> the >>> >> >>>>>>>>> impedance >>> >> >>>>>>>>> mismatch between the imperative world and Haskell, and then >>> just >>> >> >>>>>>>>> let the >>> >> >>>>>>>>> ArrayArray#'s hold other ArrayArray#'s? >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a >>> >> >>>>>>>>> special >>> >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even >>> >> >>>>>>>>> abuse pattern >>> >> >>>>>>>>> synonyms. Alternately I can exploit the internals further to >>> >> >>>>>>>>> make this >>> >> >>>>>>>>> cheaper. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and >>> >> >>>>>>>>> writeMutableArrayArray# calls to directly access the >>> preceding >>> >> >>>>>>>>> and next >>> >> >>>>>>>>> entry in the linked list.
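[Editorial sketch] The intermediate SmallMutableArray#-based representation mentioned above (one two-slot array per node, no card table) can be written with the small-array primops exported from GHC.Exts. This is a hedged sketch rather than the code from the structs prototype: the slot layout (0 = previous, 1 = next) and the helper names are assumptions, and it keeps the lifted DLL wrapper, so it does not yet achieve the fully unlifted layout the message goes on to describe.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.IO (IO (..))

-- One heap object per node holding both links.
-- Assumed layout: slot 0 = previous node, slot 1 = next node.
data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil

-- Allocate a node with both links initialised to 'Nil'.
newNode :: IO DLL
newNode = IO $ \s -> case newSmallArray# 2# Nil s of
  (# s', arr #) -> (# s', DLL arr #)

-- Read the previous / next link of a node.
prev, next :: DLL -> IO DLL
prev (DLL arr) = IO (readSmallArray# arr 0#)
prev Nil       = pure Nil
next (DLL arr) = IO (readSmallArray# arr 1#)
next Nil       = pure Nil

-- Overwrite the previous / next link of a node; a no-op on 'Nil'.
setPrev, setNext :: DLL -> DLL -> IO ()
setPrev (DLL arr) p = IO $ \s -> case writeSmallArray# arr 0# p s of
  s' -> (# s', () #)
setPrev Nil _ = pure ()
setNext (DLL arr) n = IO $ \s -> case writeSmallArray# arr 1# n s of
  s' -> (# s', () #)
setNext Nil _ = pure ()

-- Object identity: two nodes are the same if they share the same array.
sameNode :: DLL -> DLL -> Bool
sameNode (DLL a) (DLL b) = isTrue# (sameSmallMutableArray# a b)
sameNode Nil Nil         = True
sameNode _ _             = False
```

Both links of a node now sit next to each other in one object, but a walk still alternates between the DLL constructor and its array, which is the cost the ArrayArray# trick is meant to remove.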
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' >>> into a >>> >> >>>>>>>>> strict world, and everything there lives in #. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> next :: DLL -> IO DLL >>> >> >>>>>>>>> >>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArrayArray# m 1# s of -- slot 1 assumed to hold 'next' >>> >> >>>>>>>>> >>> >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that >>> code to >>> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty >>> >> >>>>>>>>> easily when they >>> >> >>>>>>>>> are known strict and you chain operations of this sort! >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Cleaning it Up >>> >> >>>>>>>>> >>> >> >>>>>>>>> ------------------ >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Now I have one outermost indirection pointing to an array >>> that >>> >> >>>>>>>>> points directly to other arrays. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I'm stuck paying for a card marking table per object, but I >>> can >>> >> >>>>>>>>> fix >>> >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and >>> using a >>> >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store >>> a >>> >> >>>>>>>>> mixture of >>> >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data >>> structure. >>> >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the >>> >> >>>>>>>>> existing >>> >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one of >>> the >>> >> >>>>>>>>> arguments it >>> >> >>>>>>>>> takes. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> This is almost ideal, but not quite. I often have fields >>> that >>> >> >>>>>>>>> would >>> >> >>>>>>>>> be best left unboxed.
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can >>> currently >>> >> >>>>>>>>> at >>> >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at a >>> >> >>>>>>>>> boxed or at a >>> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the >>> int in >>> >> >>>>>>>>> question in >>> >> >>>>>>>>> there. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need >>> to >>> >> >>>>>>>>> store masks and administrivia as I walk down the tree. >>> Having to >>> >> >>>>>>>>> go off to >>> >> >>>>>>>>> the side costs me the entire win from avoiding the first >>> pointer >>> >> >>>>>>>>> chase. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could >>> >> >>>>>>>>> construct that had n words with unsafe access and m >>> pointers to >>> >> >>>>>>>>> other heap >>> >> >>>>>>>>> objects, one that could put itself on the mutable list when >>> any >>> >> >>>>>>>>> of those >>> >> >>>>>>>>> pointers changed then I could shed this last factor of two >>> in >>> >> >>>>>>>>> all >>> >> >>>>>>>>> circumstances. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Prototype >>> >> >>>>>>>>> >>> >> >>>>>>>>> ------------- >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Over the last few days I've put together a small prototype >>> >> >>>>>>>>> implementation with a few non-trivial imperative data >>> structures >>> >> >>>>>>>>> for things >>> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and >>> >> >>>>>>>>> order-maintenance. 
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> https://github.com/ekmett/structs >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Notable bits: >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of >>> >> >>>>>>>>> link-cut >>> >> >>>>>>>>> trees in this style. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts >>> that >>> >> >>>>>>>>> make >>> >> >>>>>>>>> it go fast. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, >>> almost >>> >> >>>>>>>>> all >>> >> >>>>>>>>> the references to the LinkCut or Object data constructor get >>> >> >>>>>>>>> optimized away, >>> >> >>>>>>>>> and we're left with beautiful strict code directly mutating >>> our >>> >> >>>>>>>>> underlying >>> >> >>>>>>>>> representation. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> At the very least I'll take this email and turn it into a >>> short >>> >> >>>>>>>>> article. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> -Edward >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones >>> >> >>>>>>>>> wrote: >>> >> >>>>>>>>> >>> >> >>>>>>>>> Just to say that I have no idea what is going on in this >>> thread. >>> >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is >>> there a >>> >> >>>>>>>>> ticket? Is >>> >> >>>>>>>>> there a wiki page? >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be >>> a >>> >> >>>>>>>>> good >>> >> >>>>>>>>> thing.
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Simon >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On >>> Behalf >>> >> >>>>>>>>> Of >>> >> >>>>>>>>> Edward Kmett >>> >> >>>>>>>>> Sent: 21 August 2015 05:25 >>> >> >>>>>>>>> To: Manuel M T Chakravarty >>> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs >>> >> >>>>>>>>> Subject: Re: ArrayArrays >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's >>> would be >>> >> >>>>>>>>> very handy as well. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Consider right now if I have something like an >>> order-maintenance >>> >> >>>>>>>>> structure I have: >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) >>> {-# >>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s >>> >> >>>>>>>>> (Upper s)) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) >>> {-# >>> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s >>> >> >>>>>>>>> (Lower s)) {-# >>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> The former contains, logically, a mutable integer and two >>> >> >>>>>>>>> pointers, >>> >> >>>>>>>>> one for forward and one for backwards. The latter is >>> basically >>> >> >>>>>>>>> the same >>> >> >>>>>>>>> thing with a mutable reference up pointing at the structure >>> >> >>>>>>>>> above. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> On the heap this is an object that points to a structure >>> for the >>> >> >>>>>>>>> bytearray, and points to another structure for each mutvar >>> which >>> >> >>>>>>>>> each point >>> >> >>>>>>>>> to the other 'Upper' structure. 
So there is a level of >>> >> >>>>>>>>> indirection smeared >>> >> >>>>>>>>> over everything. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward link >>> >> >>>>>>>>> from >>> >> >>>>>>>>> the structure below to the structure above. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Converted into ArrayArray#s I'd get >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, >>> and >>> >> >>>>>>>>> the >>> >> >>>>>>>>> next 2 slots pointing to the previous and next previous >>> objects, >>> >> >>>>>>>>> represented >>> >> >>>>>>>>> just as their MutableArrayArray#s. I can use >>> >> >>>>>>>>> sameMutableArrayArray# on these >>> >> >>>>>>>>> for object identity, which lets me check for the ends of the >>> >> >>>>>>>>> lists by tying >>> >> >>>>>>>>> things back on themselves. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> and below that >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing >>> up to >>> >> >>>>>>>>> an >>> >> >>>>>>>>> upper structure. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I can then write a handful of combinators for getting out >>> the >>> >> >>>>>>>>> slots >>> >> >>>>>>>>> in question, while it has gained a level of indirection >>> between >>> >> >>>>>>>>> the wrapper >>> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one >>> can >>> >> >>>>>>>>> be basically >>> >> >>>>>>>>> erased by ghc. 
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Unlike before I don't have several separate objects on the >>> heap >>> >> >>>>>>>>> for >>> >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for >>> the >>> >> >>>>>>>>> object itself, >>> >> >>>>>>>>> and the MutableByteArray# that it references to carry >>> around the >>> >> >>>>>>>>> mutable >>> >> >>>>>>>>> int. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> The only pain points are >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> 1.) the aforementioned limitation that currently prevents me >>> >> >>>>>>>>> from >>> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array >>> into an >>> >> >>>>>>>>> ArrayArray >>> >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest of >>> >> >>>>>>>>> Haskell, >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> and >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid >>> the >>> >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 >>> pointers >>> >> >>>>>>>>> wide. Card >>> >> >>>>>>>>> marking doesn't help. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> Alternately I could just try to do really evil things and >>> >> >>>>>>>>> convert >>> >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to >>> >> >>>>>>>>> unsafeCoerce my way >>> >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays >>> >> >>>>>>>>> directly into the >>> >> >>>>>>>>> SmallArray as slots, removing the limitation we see here by >>> >> >>>>>>>>> aping the >>> >> >>>>>>>>> MutableArrayArray# s API, but that gets really really >>> dangerous! 
>>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the >>> >> >>>>>>>>> altar >>> >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move >>> them >>> >> >>>>>>>>> and collect >>> >> >>>>>>>>> them which rules out simpler Ptr and Addr based solutions. >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> -Edward >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty >>> >> >>>>>>>>> wrote: >>> >> >>>>>>>>> >>> >> >>>>>>>>> That's an interesting idea. >>> >> >>>>>>>>> >>> >> >>>>>>>>> Manuel >>> >> >>>>>>>>> >>> >> >>>>>>>>> > Edward Kmett : >>> >> >>>>>>>>> >>> >> >>>>>>>>> > >>> >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# >>> and >>> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the >>> >> >>>>>>>>> > ArrayArray# entries >>> >> >>>>>>>>> > are all directly unlifted avoiding a level of indirection >>> for >>> >> >>>>>>>>> > the containing >>> >> >>>>>>>>> > structure is amazing, but I can only currently use it if >>> my >>> >> >>>>>>>>> > leaf level data >>> >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. >>> It'd be >>> >> >>>>>>>>> > nice to be >>> >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down >>> at >>> >> >>>>>>>>> > the leaves to >>> >> >>>>>>>>> > hold lifted contents. >>> >> >>>>>>>>> > >>> >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to >>> >> >>>>>>>>> > access >>> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do >>> that >>> >> >>>>>>>>> > if I tried to >>> >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a >>> >> >>>>>>>>> > ByteArray# >>> >> >>>>>>>>> > anyways, so it isn't like there is a safety story >>> preventing >>> >> >>>>>>>>> > this.
>>> >> >>>>>>>>> > >>> >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection >>> >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I >>> >> >>>>>>>>> > could shoehorn a >>> >> >>>>>>>>> > number of them into ArrayArrays if this worked. >>> >> >>>>>>>>> > >>> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of >>> unnecessary >>> >> >>>>>>>>> > indirection compared to c/java and this could reduce that >>> pain >>> >> >>>>>>>>> > to just 1 >>> >> >>>>>>>>> > level of unnecessary indirection. >>> >> >>>>>>>>> > >>> >> >>>>>>>>> > -Edward >>> >> >>>>>>>>> >>> >> >>>>>>>>> > _______________________________________________ >>> >> >>>>>>>>> > ghc-devs mailing list >>> >> >>>>>>>>> > ghc-devs at haskell.org >>> >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> >>> >> >>>>>>>>> _______________________________________________ >>> >> >>>>>>>>> ghc-devs mailing list >>> >> >>>>>>>>> ghc-devs at haskell.org >>> >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >> >>>>>>> >>> >> >>>>>>> >>> >> >>>>> >>> >> >>> >>> >> >> >>> >> > >>> >> > >>> >> > _______________________________________________ >>> >> > ghc-devs mailing list >>> >> > ghc-devs at haskell.org >>> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >> > >>> > >>> > >>> >> >> > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From rf at rufflewind.com Sun Aug 30 02:00:37 2015
From: rf at rufflewind.com (Phil Ruffwind)
Date: Sat, 29 Aug 2015 22:00:37 -0400
Subject: Documentation for GHC.IO.Exception
Message-ID:

On Hackage, there seems to be no documentation for GHC.IO.Exception in
base, but the module can in fact be imported so it's not exactly an
internal module. The directory package and likely many other packages
do use the GHC-specific error types like InappropriateType in
exceptions, so it would be useful if there were a documentation page
for these things even if there isn't any text.

As it is right now, the discoverability of these error types is very
low as you need to know the magical URL to show the source code:

https://hackage.haskell.org/package/base/docs/src/GHC.IO.Exception.html

That's also partly a problem with Haddock; AFAIK there's no way to
navigate to the source pages of modules whose documentation is
disabled, even though they are in fact present if you can figure out
the URL.

So would it be OK to open up this module or is there a reason for
keeping it hidden?

From ekmett at gmail.com Sun Aug 30 04:24:58 2015
From: ekmett at gmail.com (Edward Kmett)
Date: Sat, 29 Aug 2015 21:24:58 -0700
Subject: ArrayArrays
In-Reply-To:
References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com>
Message-ID:

Without a custom primitive it doesn't help much there, you have to
store the indirection to the mask.

With a custom primitive it should cut the on heap root-to-leaf path of
everything in the HAMT in half. A shorter HashMap was actually one of
the motivating factors for me doing this. It is rather astoundingly
difficult to beat the performance of HashMap, so I had to start
cheating pretty badly. ;)

-Edward

On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell wrote:

> I'd also be interested to chat at ICFP to see if I can use this for my
> HAMT implementation.
> > On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett wrote: > >> Sounds good to me. Right now I'm just hacking up composable accessors for >> "typed slots" in a fairly lens-like fashion, and treating the set of slots >> I define and the 'new' function I build for the data type as its API, and >> build atop that. This could eventually graduate to template-haskell, but >> I'm not entirely satisfied with the solution I have. I currently >> distinguish between what I'm calling "slots" (things that point directly to >> another SmallMutableArrayArray# sans wrapper) and "fields" which point >> directly to the usual Haskell data types because unifying the two notions >> meant that I couldn't lift some coercions out "far enough" to make them >> vanish. >> >> I'll be happy to run through my current working set of issues in person >> and -- as things get nailed down further -- in a longer lived medium than >> in personal conversations. ;) >> >> -Edward >> >> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton wrote: >> >>> I'd also love to meet up at ICFP and discuss this. I think the array >>> primops plus a TH layer that lets (ab)use them many times without too much >>> marginal cost sounds great. And I'd like to learn how we could be either >>> early users of, or help with, this infrastructure. >>> >>> CC'ing in Ryan Scot and Omer Agacan who may also be interested in >>> dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student >>> who is currently working on concurrent data structures in Haskell, but will >>> not be at ICFP. >>> >>> >>> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote: >>> >>>> I completely agree. I would love to spend some time during ICFP and >>>> friends talking about what it could look like. My small array for STM >>>> changes for the RTS can be seen here [1]. 
It is on a branch somewhere >>>> between 7.8 and 7.10 and includes irrelevant STM bits and some >>>> confusing naming choices (sorry), but should cover all the details >>>> needed to implement it for a non-STM context. The biggest surprise >>>> for me was following small array too closely and having a word/byte >>>> offset mismatch [2]. >>>> >>>> [1]: >>>> https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut >>>> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413 >>>> >>>> Ryan >>>> >>>> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett >>>> wrote: >>>> > I'd love to have that last 10%, but it's a lot of work to get there >>>> and more >>>> > importantly I don't know quite what it should look like. >>>> > >>>> > On the other hand, I do have a pretty good idea of how the primitives >>>> above >>>> > could be banged out and tested in a long evening, well in time for >>>> 7.12. And >>>> > as noted earlier, those remain useful even if a nicer typed version >>>> with an >>>> > extra level of indirection to the sizes is built up after. >>>> > >>>> > The rest sounds like a good graduate student project for someone who >>>> has >>>> > graduate students lying around. Maybe somebody at Indiana University >>>> who has >>>> > an interest in type theory and parallelism can find us one. =) >>>> > >>>> > -Edward >>>> > >>>> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates >>>> wrote: >>>> >> >>>> >> I think from my perspective, the motivation for getting the type >>>> >> checker involved is primarily bringing this to the level where users >>>> >> could be expected to build these structures. it is reasonable to >>>> >> think that there are people who want to use STM (a context with >>>> >> mutation already) to implement a straight forward data structure that >>>> >> avoids extra indirection penalty.
There should be some places where >>>> >> knowing that things are field accesses rather than array indexing >>>> >> could be helpful, but I think GHC is good right now about handling >>>> >> constant offsets. In my code I don't do any bounds checking as I >>>> know >>>> >> I will only be accessing my arrays with constant indexes. I make >>>> >> wrappers for each field access and leave all the unsafe stuff in >>>> >> there. When things go wrong though, the compiler is no help. Maybe >>>> >> template Haskell that generates the appropriate wrappers is the right >>>> >> direction to go. >>>> >> There is another benefit for me when working with these as arrays in >>>> >> that it is quite simple and direct (given the hoops already jumped >>>> >> through) to play with alignment. I can ensure two pointers are never >>>> >> on the same cache-line by just spacing things out in the array. >>>> >> >>>> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett >>>> wrote: >>>> >> > They just segfault at this level. ;) >>>> >> > >>>> >> > Sent from my iPhone >>>> >> > >>>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton >>>> wrote: >>>> >> > >>>> >> > You presumably also save a bounds check on reads by hard-coding the >>>> >> > sizes? >>>> >> > >>>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett >>>> wrote: >>>> >> >> >>>> >> >> Also there are 4 different "things" here, basically depending on >>>> two >>>> >> >> independent questions: >>>> >> >> >>>> >> >> a.) if you want to shove the sizes into the info table, and >>>> >> >> b.) if you want cardmarking. >>>> >> >> >>>> >> >> Versions with/without cardmarking for different sizes can be done >>>> >> >> pretty >>>> >> >> easily, but as noted, the infotable variants are pretty invasive.
>>>> >> >> >>>> >> >> -Edward >>>> >> >> >>>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett >>>> wrote: >>>> >> >>> >>>> >> >>> Well, on the plus side you'd save 16 bytes per object, which >>>> adds up >>>> >> >>> if >>>> >> >>> they were small enough and there are enough of them. You get a >>>> bit >>>> >> >>> better >>>> >> >>> locality of reference in terms of what fits in the first cache >>>> line of >>>> >> >>> them. >>>> >> >>> >>>> >> >>> -Edward >>>> >> >>> >>>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton >>> > >>>> >> >>> wrote: >>>> >> >>>> >>>> >> >>>> Yes. And for the short term I can imagine places we will settle >>>> with >>>> >> >>>> arrays even if it means tracking lengths unnecessarily and >>>> >> >>>> unsafeCoercing >>>> >> >>>> pointers whose types don't actually match their siblings. >>>> >> >>>> >>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed >>>> sized >>>> >> >>>> array >>>> >> >>>> objects *other* than using them to fake structs? (Much to >>>> >> >>>> derecommend, as >>>> >> >>>> you mentioned!) >>>> >> >>>> >>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett >>>> >> >>>> wrote: >>>> >> >>>>> >>>> >> >>>>> I think both are useful, but the one you suggest requires a >>>> lot more >>>> >> >>>>> plumbing and doesn't subsume all of the use cases of the other. >>>> >> >>>>> >>>> >> >>>>> -Edward >>>> >> >>>>> >>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton < >>>> rrnewton at gmail.com> >>>> >> >>>>> wrote: >>>> >> >>>>>> >>>> >> >>>>>> So that primitive is an array-like thing (same pointed type, >>>> >> >>>>>> unbounded >>>> >> >>>>>> length) with extra payload. >>>> >> >>>>>> >>>> >> >>>>>> I can see how we can do without structs if we have arrays, >>>> >> >>>>>> especially >>>> >> >>>>>> with the extra payload at front. But wouldn't the general >>>> solution >>>> >> >>>>>> for >>>> >> >>>>>> structs be one that allows new user data type defs for # >>>> >> >>>>>> types?
>>>> >> >>>>>> >>>> >> >>>>>> >>>> >> >>>>>> >>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett < >>>> ekmett at gmail.com> >>>> >> >>>>>> wrote: >>>> >> >>>>>>> >>>> >> >>>>>>> Some form of MutableStruct# with a known number of words and >>>> a >>>> >> >>>>>>> known >>>> >> >>>>>>> number of pointers is basically what Ryan Yates was >>>> suggesting >>>> >> >>>>>>> above, but >>>> >> >>>>>>> where the word counts were stored in the objects themselves. >>>> >> >>>>>>> >>>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd >>>> >> >>>>>>> likely >>>> >> >>>>>>> want to be something we build in addition to MutVar# rather >>>> than a >>>> >> >>>>>>> replacement. >>>> >> >>>>>>> >>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build >>>> info >>>> >> >>>>>>> tables that knew them, and typechecker support, for >>>> instance, it'd >>>> >> >>>>>>> get >>>> >> >>>>>>> rather invasive. >>>> >> >>>>>>> >>>> >> >>>>>>> Also, a number of things that we can do with the 'sized' >>>> versions >>>> >> >>>>>>> above, like working with evil unsized c-style arrays directly >>>> >> >>>>>>> inline at the >>>> >> >>>>>>> end of the structure cease to be possible, so it isn't even >>>> a pure >>>> >> >>>>>>> win if we >>>> >> >>>>>>> did the engineering effort. >>>> >> >>>>>>> >>>> >> >>>>>>> I think 90% of the needs I have are covered just by adding >>>> the one >>>> >> >>>>>>> primitive. The last 10% gets pretty invasive. >>>> >> >>>>>>> >>>> >> >>>>>>> -Edward >>>> >> >>>>>>> >>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton < >>>> rrnewton at gmail.com> >>>> >> >>>>>>> wrote: >>>> >> >>>>>>>> >>>> >> >>>>>>>> I like the possibility of a general solution for mutable >>>> structs >>>> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's >>>> hard. >>>> >> >>>>>>>> >>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of >>>> object >>>> >> >>>>>>>> identity problems. 
But what about directly supporting an >>>> >> >>>>>>>> extensible set of >>>> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even >>>> replacing) >>>> >> >>>>>>>> MutVar#? That >>>> >> >>>>>>>> may be too much work, but is it problematic otherwise? >>>> >> >>>>>>>> >>>> >> >>>>>>>> Needless to say, this is also critical if we ever want best >>>> in >>>> >> >>>>>>>> class >>>> >> >>>>>>>> lockfree mutable structures, just like their Stm and >>>> sequential >>>> >> >>>>>>>> counterparts. >>>> >> >>>>>>>> >>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >>>> >> >>>>>>>> wrote: >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a >>>> short >>>> >> >>>>>>>>> article. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, >>>> and >>>> >> >>>>>>>>> maybe >>>> >> >>>>>>>>> make a ticket for it. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Thanks >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Simon >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >>>> >> >>>>>>>>> Sent: 27 August 2015 16:54 >>>> >> >>>>>>>>> To: Simon Peyton Jones >>>> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >>>> >> >>>>>>>>> Subject: Re: ArrayArrays >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified >>>> invariant. It >>>> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or >>>> ByteArray#'s. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> While those live in #, they are garbage collected objects, >>>> so >>>> >> >>>>>>>>> this >>>> >> >>>>>>>>> all lives on the heap. 
>>>> >> >>>>>>>>> >>>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when it has to >>>> >> >>>>>>>>> deal with nested arrays. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better thing. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> The Problem >>>> >> >>>>>>>>> ----------------- >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Consider the scenario where you write a classic doubly-linked list >>>> >> >>>>>>>>> in Haskell. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3 pointers on >>>> >> >>>>>>>>> the heap. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe >>>> >> >>>>>>>>> DLL ~> DLL >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> That is 3 levels of indirection. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with >>>> >> >>>>>>>>> -funbox-strict-fields or UNPACK. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and >>>> >> >>>>>>>>> worsening our representation.
>>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> but now we're still stuck with a level of indirection >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> This means that every operation we perform on this >>>> structure >>>> >> >>>>>>>>> will >>>> >> >>>>>>>>> be about half of the speed of an implementation in most >>>> other >>>> >> >>>>>>>>> languages >>>> >> >>>>>>>>> assuming we're memory bound on loading things into cache! >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Making Progress >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> ---------------------- >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> I have been working on a number of data structures where >>>> the >>>> >> >>>>>>>>> indirection of going from something in * out to an object >>>> in # >>>> >> >>>>>>>>> which >>>> >> >>>>>>>>> contains the real pointer to my target and coming back >>>> >> >>>>>>>>> effectively doubles >>>> >> >>>>>>>>> my runtime. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the >>>> >> >>>>>>>>> MutVar# >>>> >> >>>>>>>>> onto the mutable list when we dirty it. There is a well >>>> defined >>>> >> >>>>>>>>> write-barrier. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> I could change out the representation to use >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> I can just store two pointers in the MutableArray# every >>>> time, >>>> >> >>>>>>>>> but >>>> >> >>>>>>>>> this doesn't help _much_ directly. 
It has reduced the number of distinct >>>> >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per >>>> >> >>>>>>>>> object to 2. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to the array >>>> >> >>>>>>>>> object and then chase it to the next DLL and chase that to the next array. I >>>> >> >>>>>>>>> do get my two pointers together in memory though. I'm paying for a card >>>> >> >>>>>>>>> marking table as well, which I don't particularly need with just two >>>> >> >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" machinery added >>>> >> >>>>>>>>> back in 7.10, which is just the old array code as a new data type, which can >>>> >> >>>>>>>>> speed things up a bit when you don't have very big arrays: >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> But what if I wanted my object itself to live in # and have two >>>> >> >>>>>>>>> mutable fields and be able to share the same write barrier? >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types. What >>>> >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with the impedance >>>> >> >>>>>>>>> mismatch between the imperative world and Haskell, and then just let the >>>> >> >>>>>>>>> ArrayArray#'s hold other ArrayArray#'s.
>>>> >> >>>>>>>>> >>>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Now I need to make up a new Nil, which I can just make be a special >>>> >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even abuse pattern >>>> >> >>>>>>>>> synonyms. Alternately I can exploit the internals further to make this >>>> >> >>>>>>>>> cheaper. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Then I can use the readMutableArrayArrayArray# and >>>> >> >>>>>>>>> writeMutableArrayArrayArray# calls to directly access the preceding and next >>>> >> >>>>>>>>> entry in the linked list. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a >>>> >> >>>>>>>>> strict world, and everything there lives in #. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> next :: DLL -> IO DLL >>>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArrayArray# m 1# s of >>>> >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to >>>> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty easily when they >>>> >> >>>>>>>>> are known strict and you chain operations of this sort! >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Cleaning it Up >>>> >> >>>>>>>>> ------------------ >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Now I have one outermost indirection pointing to an array that >>>> >> >>>>>>>>> points directly to other arrays.
>>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> I'm stuck paying for a card marking table per object, but >>>> I can >>>> >> >>>>>>>>> fix >>>> >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and >>>> using a >>>> >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me >>>> store a >>>> >> >>>>>>>>> mixture of >>>> >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data >>>> structure. >>>> >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the >>>> >> >>>>>>>>> existing >>>> >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one of >>>> the >>>> >> >>>>>>>>> arguments it >>>> >> >>>>>>>>> takes. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> This is almost ideal, but not quite. I often have fields >>>> that >>>> >> >>>>>>>>> would >>>> >> >>>>>>>>> be best left unboxed. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can >>>> currently >>>> >> >>>>>>>>> at >>>> >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at >>>> a >>>> >> >>>>>>>>> boxed or at a >>>> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the >>>> int in >>>> >> >>>>>>>>> question in >>>> >> >>>>>>>>> there. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I >>>> need to >>>> >> >>>>>>>>> store masks and administrivia as I walk down the tree. >>>> Having to >>>> >> >>>>>>>>> go off to >>>> >> >>>>>>>>> the side costs me the entire win from avoiding the first >>>> pointer >>>> >> >>>>>>>>> chase. 
>>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could >>>> >> >>>>>>>>> construct that had n words with unsafe access and m >>>> pointers to >>>> >> >>>>>>>>> other heap >>>> >> >>>>>>>>> objects, one that could put itself on the mutable list >>>> when any >>>> >> >>>>>>>>> of those >>>> >> >>>>>>>>> pointers changed then I could shed this last factor of two >>>> in >>>> >> >>>>>>>>> all >>>> >> >>>>>>>>> circumstances. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Prototype >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> ------------- >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Over the last few days I've put together a small prototype >>>> >> >>>>>>>>> implementation with a few non-trivial imperative data >>>> structures >>>> >> >>>>>>>>> for things >>>> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and >>>> >> >>>>>>>>> order-maintenance. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> https://github.com/ekmett/structs >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Notable bits: >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of >>>> >> >>>>>>>>> link-cut >>>> >> >>>>>>>>> trees in this style. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts >>>> that >>>> >> >>>>>>>>> make >>>> >> >>>>>>>>> it go fast. 
>>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the Core, almost all >>>> >> >>>>>>>>> the references to the LinkCut or Object data constructor get optimized away, >>>> >> >>>>>>>>> and we're left with beautiful strict code directly mutating our underlying >>>> >> >>>>>>>>> representation. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a short >>>> >> >>>>>>>>> article. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> -Edward >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones >>>> >> >>>>>>>>> wrote: >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Just to say that I have no idea what is going on in this thread. >>>> >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is there a >>>> >> >>>>>>>>> ticket? Is there a wiki page? >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be a good >>>> >> >>>>>>>>> thing. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Simon >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of >>>> >> >>>>>>>>> Edward Kmett >>>> >> >>>>>>>>> Sent: 21 August 2015 05:25 >>>> >> >>>>>>>>> To: Manuel M T Chakravarty >>>> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs >>>> >> >>>>>>>>> Subject: Re: ArrayArrays >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be >>>> >> >>>>>>>>> very handy as well.
>>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Consider right now if I have something like an >>>> order-maintenance >>>> >> >>>>>>>>> structure I have: >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) >>>> {-# >>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s >>>> >> >>>>>>>>> (Upper s)) >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) >>>> {-# >>>> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s >>>> >> >>>>>>>>> (Lower s)) {-# >>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> The former contains, logically, a mutable integer and two >>>> >> >>>>>>>>> pointers, >>>> >> >>>>>>>>> one for forward and one for backwards. The latter is >>>> basically >>>> >> >>>>>>>>> the same >>>> >> >>>>>>>>> thing with a mutable reference up pointing at the structure >>>> >> >>>>>>>>> above. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> On the heap this is an object that points to a structure >>>> for the >>>> >> >>>>>>>>> bytearray, and points to another structure for each mutvar >>>> which >>>> >> >>>>>>>>> each point >>>> >> >>>>>>>>> to the other 'Upper' structure. So there is a level of >>>> >> >>>>>>>>> indirection smeared >>>> >> >>>>>>>>> over everything. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward >>>> link >>>> >> >>>>>>>>> from >>>> >> >>>>>>>>> the structure below to the structure above. 
>>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Converted into ArrayArray#s I'd get >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, >>>> and >>>> >> >>>>>>>>> the >>>> >> >>>>>>>>> next 2 slots pointing to the previous and next previous >>>> objects, >>>> >> >>>>>>>>> represented >>>> >> >>>>>>>>> just as their MutableArrayArray#s. I can use >>>> >> >>>>>>>>> sameMutableArrayArray# on these >>>> >> >>>>>>>>> for object identity, which lets me check for the ends of >>>> the >>>> >> >>>>>>>>> lists by tying >>>> >> >>>>>>>>> things back on themselves. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> and below that >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing >>>> up to >>>> >> >>>>>>>>> an >>>> >> >>>>>>>>> upper structure. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> I can then write a handful of combinators for getting out >>>> the >>>> >> >>>>>>>>> slots >>>> >> >>>>>>>>> in question, while it has gained a level of indirection >>>> between >>>> >> >>>>>>>>> the wrapper >>>> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one >>>> can >>>> >> >>>>>>>>> be basically >>>> >> >>>>>>>>> erased by ghc. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Unlike before I don't have several separate objects on the >>>> heap >>>> >> >>>>>>>>> for >>>> >> >>>>>>>>> each thing. I only have 2 now. 
The MutableArrayArray# for >>>> the >>>> >> >>>>>>>>> object itself, >>>> >> >>>>>>>>> and the MutableByteArray# that it references to carry >>>> around the >>>> >> >>>>>>>>> mutable >>>> >> >>>>>>>>> int. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> The only pain points are >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> 1.) the aforementioned limitation that currently prevents >>>> me >>>> >> >>>>>>>>> from >>>> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array >>>> into an >>>> >> >>>>>>>>> ArrayArray >>>> >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest of >>>> >> >>>>>>>>> Haskell, >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> and >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us >>>> avoid the >>>> >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 >>>> pointers >>>> >> >>>>>>>>> wide. Card >>>> >> >>>>>>>>> marking doesn't help. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Alternately I could just try to do really evil things and >>>> >> >>>>>>>>> convert >>>> >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to >>>> >> >>>>>>>>> unsafeCoerce my way >>>> >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays >>>> >> >>>>>>>>> directly into the >>>> >> >>>>>>>>> SmallArray as slots, removing the limitation we see here >>>> by >>>> >> >>>>>>>>> aping the >>>> >> >>>>>>>>> MutableArrayArray# s API, but that gets really really >>>> dangerous! >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the >>>> >> >>>>>>>>> altar >>>> >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move >>>> them >>>> >> >>>>>>>>> and collect >>>> >> >>>>>>>>> them which rules out simpler Ptr and Addr based solutions. 
>>>> >> >>>>>>>>> >>>> >> >>>>>>>>> -Edward >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty >>>> >> >>>>>>>>> wrote: >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> That's an interesting idea. >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> Manuel >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> > Edward Kmett : >>>> >> >>>>>>>>> > >>>> >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# and >>>> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the >>>> >> >>>>>>>>> > ArrayArray# entries >>>> >> >>>>>>>>> > are all directly unlifted avoiding a level of indirection for >>>> >> >>>>>>>>> > the containing >>>> >> >>>>>>>>> > structure is amazing, but I can only currently use it if my >>>> >> >>>>>>>>> > leaf level data >>>> >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. It'd be >>>> >> >>>>>>>>> > nice to be >>>> >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down at >>>> >> >>>>>>>>> > the leaves to >>>> >> >>>>>>>>> > hold lifted contents. >>>> >> >>>>>>>>> > >>>> >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to >>>> >> >>>>>>>>> > access >>>> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do that >>>> >> >>>>>>>>> > if I tried to >>>> >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a >>>> >> >>>>>>>>> > ByteArray# >>>> >> >>>>>>>>> > anyways, so it isn't like there is a safety story preventing >>>> >> >>>>>>>>> > this. >>>> >> >>>>>>>>> > >>>> >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection >>>> >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I >>>> >> >>>>>>>>> > could shoehorn a >>>> >> >>>>>>>>> > number of them into ArrayArrays if this worked.
>>>> >> >>>>>>>>> > >>>> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of >>>> unnecessary >>>> >> >>>>>>>>> > indirection compared to c/java and this could reduce >>>> that pain >>>> >> >>>>>>>>> > to just 1 >>>> >> >>>>>>>>> > level of unnecessary indirection. >>>> >> >>>>>>>>> > >>>> >> >>>>>>>>> > -Edward >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> > _______________________________________________ >>>> >> >>>>>>>>> > ghc-devs mailing list >>>> >> >>>>>>>>> > ghc-devs at haskell.org >>>> >> >>>>>>>>> > >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> >>>> >> >>>>>>>>> _______________________________________________ >>>> >> >>>>>>>>> ghc-devs mailing list >>>> >> >>>>>>>>> ghc-devs at haskell.org >>>> >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>> >> >>>>>>> >>>> >> >>>>>>> >>>> >> >>>>> >>>> >> >>> >>>> >> >> >>>> >> > >>>> >> > >>>> >> > _______________________________________________ >>>> >> > ghc-devs mailing list >>>> >> > ghc-devs at haskell.org >>>> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>> >> > >>>> > >>>> > >>>> >>> >>> >> >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From voldermort at hotmail.com Sun Aug 30 07:36:04 2015 From: voldermort at hotmail.com (Harry .) Date: Sun, 30 Aug 2015 07:36:04 +0000 Subject: Planning for the 7.12 release Message-ID: What's the status of splitting up base? This has been spoken about a few times but I can't find any public activity from GHC or Libraries HQ. 
From iricanaycan at gmail.com Sun Aug 30 07:37:16 2015 From: iricanaycan at gmail.com (Aycan İrican) Date: Sun, 30 Aug 2015 10:37:16 +0300 Subject: Installing ghc-7.10.2 linux binary distro on SuSE In-Reply-To: References: <87vbbxent4.fsf@andromedae.feelingofgreen.ru> Message-ID: <5583E5ED-EFD4-40F6-85CC-3183ABDC859D@gmail.com> Hi Levent, For a quick fix, you may want to create a wrapper script which uses LD_PRELOAD to inject `pthread_setname_np` into your runtime. An example is given in this blog page: http://hackerboss.com/overriding-system-functions-for-fun-and-profit/ -aycan > On 30 Aug 2015, at 03:02, Levent Erkok wrote: > > I really like the idea of nix. Alas, generating native binaries that can run on SuSE without being in the nix environment is a requirement that's hard to let go. (Everyone in my group would have to start using nix, a tall order.) > > Thanks for the advice however, it can indeed come in handy for one-off trials if needed. In the meantime, I'm still looking for a binary-linux-distro that doesn't require the pthread_setname_np functionality, if anyone can point me in that direction. > > Thanks, > > -Levent. > > On Sat, Aug 29, 2015 at 3:23 PM, Kosyrev Serge <_deepfire at feelingofgreen.ru> wrote: > Levent Erkok writes: > > Hello all, > > > > I've been having a lot of trouble installing the binary distros on a > > SuSE machine. Unfortunately, I don't have root privileges and thus my > > options are rather limited. > > > > The problem seems to boil down to the use of the function > > pthread_setname_np. It appears the problem was noted before, and Simon > > Marlow added a corresponding configure check for platforms that do not > > have this function. See here: > > https://mail.haskell.org/pipermail/ghc-devs/2014-October/006707.html > > > > Alas, none of the binary distributions listed on > > https://www.haskell.org/ghc/download_ghc_7_10_2#binaries seem to be > > built against a system that does not have this function. So, I was
So, I was > > unable to install 7.10.2 successfully. > > > > Essentially, I'm looking for a binary distro on SuSE, or with a libc > > that doesn't have the GNU extensions such as pthread_setname_np; if > > anyone would be kind enough to put out such a binary distro, that'd > > really be appreciated. > > > > (Yes, I tried building from the source; but in the corporate > > environment with so many things controlled, that did not go very far.) > > You could try the Nix route, which, conceptually, would boil down to: > > 1. Installing the Nix package manager into your $HOME on the SuSE system > 2. Use Nix to install GHC > > Which expands to: > > 1. Following the instructions at: > > https://nixos.org/wiki/How_to_install_nix_in_home_%28on_another_distribution%29#PRoot_Installation > > 2. Invoking: > > nix-env -iA haskellPackages.ghc > > This would require only HTTP access, which, I presume, should be > available within the corporate environment. > > All the packages from Hackage can be had precompiled from Nixpkgs, > but that's slightly more involved and requires some reading: > > http://nixos.org/nixpkgs/manual/#users-guide-to-the-haskell-infrastructure > > Should you meet trouble, you can always seek help either at > nix-dev at lists.science.uu.nl , or on the #nixos/irc.freenode.net IRC > channel -- both have a vibrant nightlife^W Haskell community. > > -- > ? ???????e? / respectfully, > ??????? ?????? > -- > ?And those who were seen dancing were thought to be insane > by those who could not hear the music.? > ? Friedrich Wilhelm Nietzsche > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tuncer.ayaz at gmail.com Sun Aug 30 09:49:46 2015 From: tuncer.ayaz at gmail.com (Tuncer Ayaz) Date: Sun, 30 Aug 2015 11:49:46 +0200 Subject: Installing ghc-7.10.2 linux binary distro on SuSE In-Reply-To: References: Message-ID: On Sat, Aug 29, 2015 at 10:41 PM, Levent Erkok wrote: > Hello all, [...] > (Yes, I tried building from the source; but in the corporate > environment with so many things controlled, that did not go very > far.) What exactly failed? Have you tried using a same-distro-version SuSE chroot/container, where you install the SuSE-packaged GHC, for building 7.10.2? This is of course assuming that the autoconf check works. From erkokl at gmail.com Sun Aug 30 22:50:33 2015 From: erkokl at gmail.com (Levent Erkok) Date: Sun, 30 Aug 2015 15:50:33 -0700 Subject: Installing ghc-7.10.2 linux binary distro on SuSE In-Reply-To: <5583E5ED-EFD4-40F6-85CC-3183ABDC859D@gmail.com> References: <87vbbxent4.fsf@andromedae.feelingofgreen.ru> <5583E5ED-EFD4-40F6-85CC-3183ABDC859D@gmail.com> Message-ID: Thanks Aycan. The LD_PRELOAD solution did indeed work. I didn't know about that facility before, so I'm pleasantly surprised. One gotcha though: I had to add the "fake-library" into the runtime as well, with a command that looked like this: ar q libHSrts_thr.a ghcFakeLib.o where libHSrts_thr.a comes from the binary distro, and ghcFakeLib.o is the object file I got by defining pthread_setname_np and pthread_getname_np. (They do nothing but return 0.) Thanks, -Levent. On Sun, Aug 30, 2015 at 12:37 AM, Aycan İrican wrote: > Hi Levent, > > For a quick fix, you may want to create a wrapper script which uses > LD_PRELOAD to inject `pthread_setname_np` into your runtime. An example is > given in this blog page: > > http://hackerboss.com/overriding-system-functions-for-fun-and-profit/ > > -aycan > > > On 30 Aug 2015, at 03:02, Levent Erkok wrote: > > I really like the idea of nix.
Alas, generating native binaries that can > run on SuSE without being in the nix environment is a requirement that's > hard to let go. (Everyone in my group would have to start using nix, a tall > order.) > > Thanks for the advice however, it can indeed come in handy for one-off trials > if needed. In the meantime, I'm still looking for a binary-linux-distro > that doesn't require the pthread_setname_np functionality, if anyone can > point me in that direction. > > Thanks, > > -Levent. > > On Sat, Aug 29, 2015 at 3:23 PM, Kosyrev Serge <_deepfire at feelingofgreen.ru> wrote: > >> Levent Erkok writes: >> > Hello all, >> > >> > I've been having a lot of trouble installing the binary distros on a >> > SuSE machine. Unfortunately, I don't have root privileges and thus my >> > options are rather limited. >> > >> > The problem seems to boil down to the use of the function >> > pthread_setname_np. It appears the problem was noted before, and Simon >> > Marlow added a corresponding configure check for platforms that do not >> > have this function. See here: >> > https://mail.haskell.org/pipermail/ghc-devs/2014-October/006707.html >> > >> > Alas, none of the binary distributions listed on >> > https://www.haskell.org/ghc/download_ghc_7_10_2#binaries seem to be >> > built against a system that does not have this function. So, I was >> > unable to install 7.10.2 successfully. >> > >> > Essentially, I'm looking for a binary distro on SuSE, or with a libc >> > that doesn't have the GNU extensions such as pthread_setname_np; if >> > anyone would be kind enough to put out such a binary distro, that'd >> > really be appreciated. >> > >> > (Yes, I tried building from the source; but in the corporate >> > environment with so many things controlled, that did not go very far.) >> >> You could try the Nix route, which, conceptually, would boil down to: >> >> 1. Installing the Nix package manager into your $HOME on the SuSE system >> 2.
Use Nix to install GHC >> >> Which expands to: >> >> 1. Following the instructions at: >> >> >> https://nixos.org/wiki/How_to_install_nix_in_home_%28on_another_distribution%29#PRoot_Installation >> >> 2. Invoking: >> >> nix-env -iA haskellPackages.ghc >> >> This would require only HTTP access, which, I presume, should be >> available within the corporate environment. >> >> All the packages from Hackage can be had precompiled from Nixpkgs, >> but that's slightly more involved and requires some reading: >> >> >> http://nixos.org/nixpkgs/manual/#users-guide-to-the-haskell-infrastructure >> >> Should you meet trouble, you can always seek help either at >> nix-dev at lists.science.uu.nl, or on the #nixos/irc.freenode.net IRC >> channel -- both have a vibrant nightlife^W Haskell community. >> >> -- >> ? ???????e? / respectfully, >> ??????? ?????? >> -- >> ?And those who were seen dancing were thought to be insane >> by those who could not hear the music.? >> ? Friedrich Wilhelm Nietzsche >> > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kazu at iij.ad.jp Mon Aug 31 00:18:25 2015 From: kazu at iij.ad.jp (Kazu Yamamoto (=?iso-2022-jp?B?GyRCOzNLXE9CSScbKEI=?=)) Date: Mon, 31 Aug 2015 09:18:25 +0900 (JST) Subject: GHC 7.10 complie time regression In-Reply-To: <20150829085049.17bfe1a91b231a4a303c0fd5@mega-nerd.com> References: <20150827.102417.940015966115425781.kazu@iij.ad.jp> <831565db44da4eb3be569b9b346d8c54@DB4PR30MB030.064d.mgd.msft.net> <20150829085049.17bfe1a91b231a4a303c0fd5@mega-nerd.com> Message-ID: <20150831.091825.1286710541233390953.kazu@iij.ad.jp> Erik, >> no it's not expected to take "much longer". Can you make a ticket with >> a reproducible test case? 
>
> And make sure you are using ghc 7.10.2 and not 7.10.1 because 7.10.2
> had some significant fixes for these kinds of issues.

I'm certainly using GHC 7.10.2.

--Kazu

From kazu at iij.ad.jp Mon Aug 31 06:44:07 2015
From: kazu at iij.ad.jp (Kazu Yamamoto (山本和彦))
Date: Mon, 31 Aug 2015 15:44:07 +0900 (JST)
Subject: GHC 7.10 complie time regression
In-Reply-To: <831565db44da4eb3be569b9b346d8c54@DB4PR30MB030.064d.mgd.msft.net>
References: <20150827.102417.940015966115425781.kazu@iij.ad.jp> <831565db44da4eb3be569b9b346d8c54@DB4PR30MB030.064d.mgd.msft.net>
Message-ID: <20150831.154407.943240048487422744.kazu@iij.ad.jp>

Simon,

> no it's not expected to take "much longer". Can you make a ticket
> with a reproducible test case?

OK. Now filed:

https://ghc.haskell.org/trac/ghc/ticket/10818

--Kazu

From iricanaycan at gmail.com Mon Aug 31 09:44:45 2015
From: iricanaycan at gmail.com (Aycan iRiCAN)
Date: Mon, 31 Aug 2015 09:44:45 +0000
Subject: Installing ghc-7.10.2 linux binary distro on SuSE
In-Reply-To:
References: <87vbbxent4.fsf@andromedae.feelingofgreen.ru> <5583E5ED-EFD4-40F6-85CC-3183ABDC859D@gmail.com>
Message-ID:

I'm happy to see it worked. If you want to get rid of LD_PRELOAD, you may
use patchelf to modify your binary to inject your fakelib. See the
`--add-needed` option here:

https://github.com/NixOS/patchelf/blob/master/README
From corentin.dupont at gmail.com Mon Aug 31 13:18:15 2015
From: corentin.dupont at gmail.com (Corentin Dupont)
Date: Mon, 31 Aug 2015 15:18:15 +0200
Subject: Thread-safe GHC
Message-ID:

Hello,
I am wondering if GHC itself is thread-safe.
I am using Hint, and it reports that GHC is not thread-safe, and that I
can't safely run two instances of the interpreter simultaneously.
Is that still the case?
Thanks!
Corentin

---------- Forwarded message ----------
From: Daniel Gorín
Date: Thu, Aug 27, 2015 at 5:09 PM
Subject: Re: Thread-safe Hint
To: Corentin Dupont

Hi Corentin,

sorry for the late reply. Until relatively recently, the problem was
still present. But I too remember seeing something related to this issue
being fixed (iirc, the problem was the runtime linker, which used global
state), so perhaps it is already fixed in 7.10.

If you can verify this, it shouldn't be hard to show the error message
only on old versions of ghc. I'll be away for a couple of weeks, but if
you want to look into this and send a patch, I'll merge it when I return.

Cheers,
Daniel

> On 24 Aug 2015, at 10:43 am, Corentin Dupont wrote:
>
> Hello Daniel,
> I noticed the following message in Hint:
> This version of GHC is not thread-safe,can't safely run two instances of the interpreter simultaneously.
>
> Is it still the case with recent versions of GHC?
> It would be neat to be able to launch several instances of the
> interpreter. In my game Nomyx I have several "match-ups" going on, and
> having one instance of the interpreter for each would be nicer.
> Otherwise I am obliged to reset the interpreter each time I want to
> interpret something, which is time-consuming (2-3 seconds).
>
> Thanks,
> C
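As long as only one interpreter instance can run at a time, one possible workaround for the reset cost described above is to keep a single long-lived worker thread and route every match-up's requests through it. The following is only a minimal sketch under stated assumptions: it uses nothing but `base`, and `evalExpr` is a hypothetical stand-in for the real (expensive, non-reentrant) interpreter call, not part of Hint's API. The names `Request`, `startWorker`, `request` and `demo` are likewise made up for illustration.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forever)

-- | A request carries the source text and a box for the reply.
type Request = (String, MVar String)

-- | Hypothetical stand-in for the interpreter call (assumption, not Hint).
evalExpr :: String -> IO String
evalExpr src = pure ("evaluated: " ++ src)

-- | Start the single worker; it handles requests strictly one at a time,
-- so two evaluations never overlap.
startWorker :: IO (Chan Request)
startWorker = do
  queue <- newChan
  _ <- forkIO $ forever $ do
    (src, reply) <- readChan queue
    evalExpr src >>= putMVar reply
  pure queue

-- | Send one expression to the worker and block until it answers.
request :: Chan Request -> String -> IO String
request queue src = do
  reply <- newEmptyMVar
  writeChan queue (src, reply)
  takeMVar reply

-- | Two "match-ups" sharing the same worker.
demo :: IO [String]
demo = do
  queue <- startWorker
  mapM (request queue) ["1 + 1", "2 * 3"]
```

In a real program the worker loop would presumably live inside one interpreter session, so that the session is set up once rather than per request; whether that is safe with a given GHC version is exactly the open question in this thread.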
From erkokl at gmail.com Mon Aug 31 15:57:46 2015
From: erkokl at gmail.com (Levent Erkok)
Date: Mon, 31 Aug 2015 08:57:46 -0700
Subject: Installing ghc-7.10.2 linux binary distro on SuSE
In-Reply-To:
References: <87vbbxent4.fsf@andromedae.feelingofgreen.ru> <5583E5ED-EFD4-40F6-85CC-3183ABDC859D@gmail.com>
Message-ID:

Hi Aycan,

Indeed I was able to use patchelf, so I don't need the LD_PRELOAD trick
anymore either. Thanks for the pointer.

As a side note: While I'm impressed with the level of trickery available
to inject/remove/change arbitrary functionality in binaries, I'm also
surprised to see how easy it would be to insert Trojan horses on the fly
using these mechanisms as well. With the "Trusted Haskell" brand, it
almost raises the question of whether GHC should do a "self-check" to
make sure its binary hasn't been tampered with in this way. That would
make my life harder of course, but food for thought.

-Levent.
From erkokl at gmail.com Mon Aug 31 16:00:41 2015
From: erkokl at gmail.com (Levent Erkok)
Date: Mon, 31 Aug 2015 09:00:41 -0700
Subject: Installing ghc-7.10.2 linux binary distro on SuSE
In-Reply-To:
References:
Message-ID:

Tuncer:

The LD_PRELOAD trick and the patchelf magic solved the problem nicely.
I'd still prefer it if a SuSE binary distro were available from the
downloads page, for the good of the community.

When I tried to build from source, I was getting all sorts of error
messages even from the configure step, so I gave up fairly quickly. If it
were a machine I had root access to, I'd have given it more time; but
with only user-level access, I didn't pursue anything further.

From ben at well-typed.com Mon Aug 31 17:49:49 2015
From: ben at well-typed.com (Ben Gamari)
Date: Mon, 31 Aug 2015 19:49:49 +0200
Subject: Planning for the 7.12 release
In-Reply-To:
References: <87r3mo68t0.fsf@smart-cactus.org> <87h9nj4scu.fsf@feelingofgreen.ru> <87bndrvdnf.fsf@smart-cactus.org> <73f40469b133402ead3b0c09d5d1cd08@DB4PR30MB030.064d.mgd.msft.net>
Message-ID: <87mvx7e4aq.fsf@smart-cactus.org>

Andrey sent this report regarding the status of the new Shake build
system he is working on, but it seems it must have bounced from ghc-devs.
Hopefully this time it makes it.

Cheers,

- Ben

Andrey Mokhov writes:

> Hi all,
>
> I aim to release a first working prototype of the new build system before 7.12 comes out. It will have limited functionality, and will most likely be fragile until we adjust it to multiple working environments. (I take great care to translate all flags with appropriate conditions to the new build system, but mistakes/omissions are inevitable.)
>
> Do we want to include it in 7.12? I don't have a strong opinion on this but tend to agree with Niklas: we don't necessarily need to tie this to a GHC release, since it will only be used by a small number of early adopters first, and they will be able to simply clone the latest version of the build system from the GitHub repository.
>
>
> For the record, here is my current to-do list:
>
> * Build utils (ghc-cabal, etc). At the moment I rely on the old build system for this, hence the error shown by Ben.
>
> * Build rts.
>
> * Build auto-generated code (GHC/Prim.hs etc).
>
> * Release a first prototype and start fixing issues found by early adopters.
>
> * Add support for validation & testing.
>
> * Add support for cross compilation.
>
> * Write a tutorial for using/extending the build system.
>
>
> I'm currently on holiday, and September will be busy (teaching starts), but I believe a prototype will be ready by December.
>
>
> Once all of the above is complete we may consider putting the new build system into the GHC source tree, and all GHC devs will start migrating to it. I imagine the old build system will still be used for a while, so both will coexist.
>
> Cheers,
> Andrey
>
> P.S. Hope this reaches the ghc-devs mailing list this time...
>
From rrnewton at gmail.com Mon Aug 31 22:11:00 2015
From: rrnewton at gmail.com (Ryan Newton)
Date: Mon, 31 Aug 2015 22:11:00 +0000
Subject: ArrayArrays
In-Reply-To:
References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com>
Message-ID:

Dear Edward, Ryan Yates, and other interested parties --

So when should we meet up about this?

May I propose the Tues afternoon break for everyone at ICFP who is
interested in this topic? We can meet out in the coffee area and
congregate around Edward Kmett, who is tall and should be easy to find
;-).

I think Ryan is going to show us how to use his new primops for combined
array + other fields in one heap object?

On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett wrote:

> Without a custom primitive it doesn't help much there; you have to store the indirection to the mask.
>
> With a custom primitive it should cut the on-heap root-to-leaf path of everything in the HAMT in half. A shorter HashMap was actually one of the motivating factors for me doing this. It is rather astoundingly difficult to beat the performance of HashMap, so I had to start cheating pretty badly. ;)
>
> -Edward
>
> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell wrote:
>
>> I'd also be interested to chat at ICFP to see if I can use this for my HAMT implementation.
>>
>> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett wrote:
>>
>>> Sounds good to me. Right now I'm just hacking up composable accessors for "typed slots" in a fairly lens-like fashion, treating the set of slots I define and the 'new' function I build for the data type as its API, and building atop that. This could eventually graduate to Template Haskell, but I'm not entirely satisfied with the solution I have. I currently distinguish between what I'm calling "slots" (things that point directly to another SmallMutableArrayArray# sans wrapper) and "fields", which point directly to the usual Haskell data types, because unifying the two notions meant that I couldn't lift some coercions out "far enough" to make them vanish.
>>>
>>> I'll be happy to run through my current working set of issues in person and -- as things get nailed down further -- in a longer-lived medium than personal conversations. ;)
>>>
>>> -Edward
>>>
>>> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton wrote:
>>>
>>>> I'd also love to meet up at ICFP and discuss this. I think the array primops plus a TH layer that lets us (ab)use them many times without too much marginal cost sounds great. And I'd like to learn how we could be either early users of, or help with, this infrastructure.
>>>>
>>>> CC'ing in Ryan Scott and Omer Agacan, who may also be interested in dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student who is currently working on concurrent data structures in Haskell, but will not be at ICFP.
>>>>
>>>>
>>>> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote:
>>>>
>>>>> I completely agree. I would love to spend some time during ICFP and friends talking about what it could look like. My small array for STM changes for the RTS can be seen here [1]. It is on a branch somewhere between 7.8 and 7.10 and includes irrelevant STM bits and some confusing naming choices (sorry), but should cover all the details needed to implement it for a non-STM context. The biggest surprise for me was following small array too closely and having a word/byte offset mismatch [2].
>>>>>
>>>>> [1]: https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut
>>>>> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413
>>>>>
>>>>> Ryan
>>>>>
>>>>> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett wrote:
>>>>> > I'd love to have that last 10%, but it's a lot of work to get there and more importantly I don't know quite what it should look like.
>>>>> >
>>>>> > On the other hand, I do have a pretty good idea of how the primitives above could be banged out and tested in a long evening, well in time for 7.12. And as noted earlier, those remain useful even if a nicer typed version with an extra level of indirection to the sizes is built up after.
>>>>> >
>>>>> > The rest sounds like a good graduate student project for someone who has graduate students lying around. Maybe somebody at Indiana University who has an interest in type theory and parallelism can find us one. =)
>>>>> >
>>>>> > -Edward
>>>>> >
>>>>> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates wrote:
>>>>> >>
>>>>> >> I think from my perspective, the motivation for getting the type checker involved is primarily bringing this to the level where users could be expected to build these structures. It is reasonable to think that there are people who want to use STM (a context with mutation already) to implement a straightforward data structure that avoids the extra indirection penalty. There should be some places where knowing that things are field accesses rather than array indexing could be helpful, but I think GHC is good right now about handling constant offsets. In my code I don't do any bounds checking as I know I will only be accessing my arrays with constant indexes. I make wrappers for each field access and leave all the unsafe stuff in there. When things go wrong though, the compiler is no help. Maybe Template Haskell that generates the appropriate wrappers is the right direction to go.
>>>>> >> There is another benefit for me when working with these as arrays in that it is quite simple and direct (given the hoops already jumped through) to play with alignment. I can ensure two pointers are never on the same cache line by just spacing things out in the array.
>>>>> >>
>>>>> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett wrote:
>>>>> >> > They just segfault at this level. ;)
>>>>> >> >
>>>>> >> > Sent from my iPhone
>>>>> >> >
>>>>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton wrote:
>>>>> >> >
>>>>> >> > You presumably also save a bounds check on reads by hard-coding the sizes?
>>>>> >> >
>>>>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett wrote:
>>>>> >> >>
>>>>> >> >> Also there are 4 different "things" here, basically depending on two independent questions:
>>>>> >> >>
>>>>> >> >> a.) if you want to shove the sizes into the info table, and
>>>>> >> >> b.) if you want cardmarking.
>>>>> >> >>
>>>>> >> >> Versions with/without cardmarking for different sizes can be done pretty easily, but as noted, the infotable variants are pretty invasive.
>>>>> >> >>
>>>>> >> >> -Edward
>>>>> >> >>
>>>>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett wrote:
>>>>> >> >>>
>>>>> >> >>> Well, on the plus side you'd save 16 bytes per object, which adds up if they were small enough and there are enough of them. You get a bit better locality of reference in terms of what fits in the first cache line of them.
>>>>> >> >>>
>>>>> >> >>> -Edward
>>>>> >> >>>
>>>>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton <rrnewton at gmail.com> wrote:
>>>>> >> >>>>
>>>>> >> >>>> Yes. And for the short term I can imagine places we will settle with arrays even if it means tracking lengths unnecessarily and unsafeCoercing pointers whose types don't actually match their siblings.
>>>>> >> >>>>
>>>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed-size array objects *other* than using them to fake structs? (Much to derecommend, as you mentioned!)
>>>>> >> >>>>
>>>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett wrote:
>>>>> >> >>>>>
>>>>> >> >>>>> I think both are useful, but the one you suggest requires a lot more plumbing and doesn't subsume all of the use cases of the other.
>>>>> >> >>>>>
>>>>> >> >>>>> -Edward
>>>>> >> >>>>>
>>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton <rrnewton at gmail.com> wrote:
>>>>> >> >>>>>>
>>>>> >> >>>>>> So that primitive is an array-like thing (same pointed type, unbounded length) with extra payload.
>>>>> >> >>>>>>
>>>>> >> >>>>>> I can see how we can do without structs if we have arrays, especially with the extra payload at front. But wouldn't the general solution for structs be one that allows new user data type defs for # types?
>>>>> >> >>>>>>
>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett <ekmett at gmail.com> wrote:
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> Some form of MutableStruct# with a known number of words and a known number of pointers is basically what Ryan Yates was suggesting above, but where the word counts were stored in the objects themselves.
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd likely want to be something we build in addition to MutVar# rather than a replacement.
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build info tables that knew them, and typechecker support, for instance, it'd get rather invasive.
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> Also, a number of things that we can do with the 'sized' versions above, like working with evil unsized C-style arrays directly inline at the end of the structure, cease to be possible, so it isn't even a pure win if we did the engineering effort.
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> I think 90% of the needs I have are covered just by adding the one primitive. The last 10% gets pretty invasive.
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> -Edward
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton <rrnewton at gmail.com> wrote:
>>>>> >> >>>>>>>>
>>>>> >> >>>>>>>> I like the possibility of a general solution for mutable structs (like Ed said), and I'm trying to fully understand why it's hard.
>>>>> >> >>>>>>>>
>>>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of object identity problems. But what about directly supporting an extensible set of unlifted MutStruct# objects, generalizing (and even replacing) MutVar#? That may be too much work, but is it problematic otherwise?
>>>>> >> >>>>>>>>
>>>>> >> >>>>>>>> Needless to say, this is also critical if we ever want best-in-class lock-free mutable structures, just like their STM and sequential counterparts.
>>>>> >> >>>>>>>>
>>>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones wrote:
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a short article.
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe make a ticket for it.
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> Thanks
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> Simon
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com]
>>>>> >> >>>>>>>>> Sent: 27 August 2015 16:54
>>>>> >> >>>>>>>>> To: Simon Peyton Jones
>>>>> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs
>>>>> >> >>>>>>>>> Subject: Re: ArrayArrays
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It points directly to other unlifted ArrayArray#'s or ByteArray#'s.
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> While those live in #, they are garbage collected objects, so this all lives on the heap.
>>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when >>>>> it has >>>>> >> >>>>>>>>> to >>>>> >> >>>>>>>>> deal with nested arrays. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better >>>>> thing. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> The Problem >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> ----------------- >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Consider the scenario where you write a classic >>>>> doubly-linked >>>>> >> >>>>>>>>> list >>>>> >> >>>>>>>>> in Haskell. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL) >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3 >>>>> pointers >>>>> >> >>>>>>>>> on >>>>> >> >>>>>>>>> the heap. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) >>>>> ~> >>>>> >> >>>>>>>>> Maybe >>>>> >> >>>>>>>>> DLL ~> DLL >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> That is 3 levels of indirection. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with >>>>> >> >>>>>>>>> -funbox-strict-fields or UNPACK >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL >>>>> and >>>>> >> >>>>>>>>> worsening our representation. 
>>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> but now we're still stuck with a level of indirection >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> This means that every operation we perform on this >>>>> structure >>>>> >> >>>>>>>>> will >>>>> >> >>>>>>>>> be about half of the speed of an implementation in most >>>>> other >>>>> >> >>>>>>>>> languages >>>>> >> >>>>>>>>> assuming we're memory bound on loading things into cache! >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Making Progress >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> ---------------------- >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> I have been working on a number of data structures where >>>>> the >>>>> >> >>>>>>>>> indirection of going from something in * out to an object >>>>> in # >>>>> >> >>>>>>>>> which >>>>> >> >>>>>>>>> contains the real pointer to my target and coming back >>>>> >> >>>>>>>>> effectively doubles >>>>> >> >>>>>>>>> my runtime. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the >>>>> >> >>>>>>>>> MutVar# >>>>> >> >>>>>>>>> onto the mutable list when we dirty it. There is a well >>>>> defined >>>>> >> >>>>>>>>> write-barrier. 
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> I could change out the representation to use
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> I can just store two pointers in the MutableArray# every time, but
>>>>> >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the number of
>>>>> >> >>>>>>>>> distinct addresses in memory I touch on a walk of the DLL from 3 per
>>>>> >> >>>>>>>>> object to 2.
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to the array
>>>>> >> >>>>>>>>> object and then chase it to the next DLL and chase that to the next
>>>>> >> >>>>>>>>> array. I do get my two pointers together in memory though. I'm paying
>>>>> >> >>>>>>>>> for a card marking table as well, which I don't particularly need
>>>>> >> >>>>>>>>> with just two pointers, but we can shed that with the
>>>>> >> >>>>>>>>> "SmallMutableArray#" machinery added back in 7.10, which is just the
>>>>> >> >>>>>>>>> old array code as a new data type, which can speed things up a bit
>>>>> >> >>>>>>>>> when you don't have very big arrays:
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> But what if I wanted my object itself to live in # and have two
>>>>> >> >>>>>>>>> mutable fields and be able to share the same write barrier?
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array types.
>>>>> >> >>>>>>>>> What if we have one # -> * wrapper on the outside to deal with the
>>>>> >> >>>>>>>>> impedance mismatch between the imperative world and Haskell, and
>>>>> >> >>>>>>>>> then just let the ArrayArray#'s hold other arrayarrays?
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld)
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a special
>>>>> >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even abuse
>>>>> >> >>>>>>>>> pattern synonyms. Alternately I can exploit the internals further to
>>>>> >> >>>>>>>>> make this cheaper.
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and
>>>>> >> >>>>>>>>> writeMutableArrayArray# calls to directly access the preceding and
>>>>> >> >>>>>>>>> next entry in the linked list.
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a
>>>>> >> >>>>>>>>> strict world, and everything there lives in #.
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> next :: DLL -> IO DLL
>>>>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of
>>>>> >> >>>>>>>>>   (# s', n #) -> (# s', DLL n #)
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that code to
>>>>> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty easily
>>>>> >> >>>>>>>>> when they are known strict and you chain operations of this sort!
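For comparison, here is a minimal sketch of the IORef-based `DLL ... | Nil` representation described earlier in the message, using only `Data.IORef` from base. It is not the MutableArrayArray#-based encoding quoted above, whose `next` leaves out the array and slot arguments to the read primop. The field order (previous first, next second) and the helpers `newNode`, `prev`, `linkAfter`, `lengthFrom` and `demo` are assumptions introduced here for illustration.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.IORef

-- Strict IORef fields plus an explicit Nil terminator.
-- Assumption: the first field is the "previous" pointer, the second is "next".
data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil

-- Allocate a detached node whose neighbours are both Nil (hypothetical helper).
newNode :: IO DLL
newNode = DLL <$> newIORef Nil <*> newIORef Nil

-- Follow the forward pointer: DLL ~> MutVar# (inside the IORef) ~> DLL.
next :: DLL -> IO DLL
next Nil       = pure Nil
next (DLL _ n) = readIORef n

-- Follow the backward pointer.
prev :: DLL -> IO DLL
prev Nil       = pure Nil
prev (DLL p _) = readIORef p

-- Make b the successor of a and a the predecessor of b (hypothetical helper).
linkAfter :: DLL -> DLL -> IO ()
linkAfter a@(DLL _ aNext) b@(DLL bPrev _) = do
  writeIORef aNext b
  writeIORef bPrev a
linkAfter _ _ = pure ()

-- Count the nodes reachable by repeatedly following 'next'.
lengthFrom :: DLL -> IO Int
lengthFrom = go 0
  where
    go !acc Nil  = pure acc
    go !acc node = next node >>= go (acc + 1)

-- Build a three-node list, then measure it from the head and from the middle.
demo :: IO (Int, Int)
demo = do
  a <- newNode
  b <- newNode
  c <- newNode
  linkAfter a b
  linkAfter b c
  fromHead <- lengthFrom a
  fromMid  <- prev c >>= lengthFrom
  pure (fromHead, fromMid)

main :: IO ()
main = demo >>= print
```

Every `next` here still reads through the IORef's MutVar# before reaching the neighbouring node, which is the remaining level of indirection the message is trying to remove.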
>>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Cleaning it Up >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> ------------------ >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Now I have one outermost indirection pointing to an array >>>>> that >>>>> >> >>>>>>>>> points directly to other arrays. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> I'm stuck paying for a card marking table per object, but >>>>> I can >>>>> >> >>>>>>>>> fix >>>>> >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and >>>>> using a >>>>> >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me >>>>> store a >>>>> >> >>>>>>>>> mixture of >>>>> >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data >>>>> structure. >>>>> >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the >>>>> >> >>>>>>>>> existing >>>>> >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one >>>>> of the >>>>> >> >>>>>>>>> arguments it >>>>> >> >>>>>>>>> takes. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> This is almost ideal, but not quite. I often have fields >>>>> that >>>>> >> >>>>>>>>> would >>>>> >> >>>>>>>>> be best left unboxed. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can >>>>> currently >>>>> >> >>>>>>>>> at >>>>> >> >>>>>>>>> best point one of the entries of the SmallMutableArray# >>>>> at a >>>>> >> >>>>>>>>> boxed or at a >>>>> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the >>>>> int in >>>>> >> >>>>>>>>> question in >>>>> >> >>>>>>>>> there. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> e.g. 
if I were to implement a hash-array-mapped-trie I >>>>> need to >>>>> >> >>>>>>>>> store masks and administrivia as I walk down the tree. >>>>> Having to >>>>> >> >>>>>>>>> go off to >>>>> >> >>>>>>>>> the side costs me the entire win from avoiding the first >>>>> pointer >>>>> >> >>>>>>>>> chase. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could >>>>> >> >>>>>>>>> construct that had n words with unsafe access and m >>>>> pointers to >>>>> >> >>>>>>>>> other heap >>>>> >> >>>>>>>>> objects, one that could put itself on the mutable list >>>>> when any >>>>> >> >>>>>>>>> of those >>>>> >> >>>>>>>>> pointers changed then I could shed this last factor of >>>>> two in >>>>> >> >>>>>>>>> all >>>>> >> >>>>>>>>> circumstances. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Prototype >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> ------------- >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Over the last few days I've put together a small prototype >>>>> >> >>>>>>>>> implementation with a few non-trivial imperative data >>>>> structures >>>>> >> >>>>>>>>> for things >>>>> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem >>>>> and >>>>> >> >>>>>>>>> order-maintenance. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> https://github.com/ekmett/structs >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Notable bits: >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of >>>>> >> >>>>>>>>> link-cut >>>>> >> >>>>>>>>> trees in this style. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts >>>>> that >>>>> >> >>>>>>>>> make >>>>> >> >>>>>>>>> it go fast. 
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost all
>>>>> >> >>>>>>>>> the references to the LinkCut or Object data constructor get
>>>>> >> >>>>>>>>> optimized away, and we're left with beautiful strict code directly
>>>>> >> >>>>>>>>> mutating our underlying representation.
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a short
>>>>> >> >>>>>>>>> article.
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> -Edward
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones wrote:
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> Just to say that I have no idea what is going on in this thread.
>>>>> >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is there a ticket?
>>>>> >> >>>>>>>>> Is there a wiki page?
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would be a good
>>>>> >> >>>>>>>>> thing.
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> Simon
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Edward Kmett
>>>>> >> >>>>>>>>> Sent: 21 August 2015 05:25
>>>>> >> >>>>>>>>> To: Manuel M T Chakravarty
>>>>> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs
>>>>> >> >>>>>>>>> Subject: Re: ArrayArrays
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be
>>>>> >> >>>>>>>>> very handy as well.
>>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Consider right now if I have something like an >>>>> order-maintenance >>>>> >> >>>>>>>>> structure I have: >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) >>>>> {-# >>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s >>>>> >> >>>>>>>>> (Upper s)) >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) >>>>> {-# >>>>> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s >>>>> >> >>>>>>>>> (Lower s)) {-# >>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> The former contains, logically, a mutable integer and two >>>>> >> >>>>>>>>> pointers, >>>>> >> >>>>>>>>> one for forward and one for backwards. The latter is >>>>> basically >>>>> >> >>>>>>>>> the same >>>>> >> >>>>>>>>> thing with a mutable reference up pointing at the >>>>> structure >>>>> >> >>>>>>>>> above. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> On the heap this is an object that points to a structure >>>>> for the >>>>> >> >>>>>>>>> bytearray, and points to another structure for each >>>>> mutvar which >>>>> >> >>>>>>>>> each point >>>>> >> >>>>>>>>> to the other 'Upper' structure. So there is a level of >>>>> >> >>>>>>>>> indirection smeared >>>>> >> >>>>>>>>> over everything. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward >>>>> link >>>>> >> >>>>>>>>> from >>>>> >> >>>>>>>>> the structure below to the structure above. 
>>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Converted into ArrayArray#s I'd get >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, >>>>> and >>>>> >> >>>>>>>>> the >>>>> >> >>>>>>>>> next 2 slots pointing to the previous and next previous >>>>> objects, >>>>> >> >>>>>>>>> represented >>>>> >> >>>>>>>>> just as their MutableArrayArray#s. I can use >>>>> >> >>>>>>>>> sameMutableArrayArray# on these >>>>> >> >>>>>>>>> for object identity, which lets me check for the ends of >>>>> the >>>>> >> >>>>>>>>> lists by tying >>>>> >> >>>>>>>>> things back on themselves. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> and below that >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing >>>>> up to >>>>> >> >>>>>>>>> an >>>>> >> >>>>>>>>> upper structure. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> I can then write a handful of combinators for getting out >>>>> the >>>>> >> >>>>>>>>> slots >>>>> >> >>>>>>>>> in question, while it has gained a level of indirection >>>>> between >>>>> >> >>>>>>>>> the wrapper >>>>> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that >>>>> one can >>>>> >> >>>>>>>>> be basically >>>>> >> >>>>>>>>> erased by ghc. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Unlike before I don't have several separate objects on >>>>> the heap >>>>> >> >>>>>>>>> for >>>>> >> >>>>>>>>> each thing. I only have 2 now. 
The MutableArrayArray# for >>>>> the >>>>> >> >>>>>>>>> object itself, >>>>> >> >>>>>>>>> and the MutableByteArray# that it references to carry >>>>> around the >>>>> >> >>>>>>>>> mutable >>>>> >> >>>>>>>>> int. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> The only pain points are >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> 1.) the aforementioned limitation that currently prevents >>>>> me >>>>> >> >>>>>>>>> from >>>>> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array >>>>> into an >>>>> >> >>>>>>>>> ArrayArray >>>>> >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest >>>>> of >>>>> >> >>>>>>>>> Haskell, >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> and >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us >>>>> avoid the >>>>> >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 >>>>> pointers >>>>> >> >>>>>>>>> wide. Card >>>>> >> >>>>>>>>> marking doesn't help. >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> Alternately I could just try to do really evil things and >>>>> >> >>>>>>>>> convert >>>>> >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to >>>>> >> >>>>>>>>> unsafeCoerce my way >>>>> >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays >>>>> >> >>>>>>>>> directly into the >>>>> >> >>>>>>>>> SmallArray as slots, removing the limitation we see here >>>>> by >>>>> >> >>>>>>>>> aping the >>>>> >> >>>>>>>>> MutableArrayArray# s API, but that gets really really >>>>> dangerous! 
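The idea of using object identity to find the end of a list that has been tied back on itself can be sketched with ordinary IORefs, whose `Eq` instance compares references by identity rather than by contents. This is only an illustration under that substitution, not the MutableArrayArray# encoding discussed in the message; `Node`, `sameNode`, `singleton`, `insertAfter`, `ringToList` and `demo` are hypothetical names.

```haskell
import Data.IORef

-- A node in a circular singly-linked ring. Each node owns exactly one IORef,
-- so that reference can double as the node's identity.
data Node = Node
  { nodeVal  :: !Int
  , nodeNext :: !(IORef Node)
  }

-- Identity test: two nodes are the same object iff they own the same IORef.
sameNode :: Node -> Node -> Bool
sameNode a b = nodeNext a == nodeNext b

-- A one-element ring points back at itself.
singleton :: Int -> IO Node
singleton v = do
  r <- newIORef (error "singleton: unlinked") -- placeholder, overwritten below
  let n = Node v r
  writeIORef r n
  pure n

-- Splice a new value into the ring directly after the given node.
insertAfter :: Node -> Int -> IO Node
insertAfter n v = do
  old <- readIORef (nodeNext n)
  r   <- newIORef old
  let new = Node v r
  writeIORef (nodeNext n) new
  pure new

-- Walk the ring once, stopping when the next node is the starting node again.
ringToList :: Node -> IO [Int]
ringToList start = go start
  where
    go n = do
      nxt  <- readIORef (nodeNext n)
      rest <- if sameNode nxt start then pure [] else go nxt
      pure (nodeVal n : rest)

-- Build the ring 1 -> 2 -> 3 -> 1 and list it from two starting points.
demo :: IO ([Int], [Int])
demo = do
  a <- singleton 1
  b <- insertAfter a 2
  _ <- insertAfter b 3
  fromA <- ringToList a
  fromB <- ringToList b
  pure (fromA, fromB)

main :: IO ()
main = demo >>= print
```

No sentinel value or `Maybe` is needed to mark the end of the structure, because the walk terminates on an identity comparison.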
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the altar of
>>>>> >> >>>>>>>>> speed here, but I'd like to be able to let the GC move them and
>>>>> >> >>>>>>>>> collect them which rules out simpler Ptr and Addr based solutions.
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> -Edward
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty wrote:
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> That's an interesting idea.
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> Manuel
>>>>> >> >>>>>>>>>
>>>>> >> >>>>>>>>> > Edward Kmett :
>>>>> >> >>>>>>>>> >
>>>>> >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array# and
>>>>> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the
>>>>> >> >>>>>>>>> > ArrayArray# entries are all directly unlifted avoiding a level of
>>>>> >> >>>>>>>>> > indirection for the containing structure is amazing, but I can only
>>>>> >> >>>>>>>>> > currently use it if my leaf level data can be 100% unboxed and
>>>>> >> >>>>>>>>> > distributed among ByteArray#s. It'd be nice to be able to have the
>>>>> >> >>>>>>>>> > ability to put SmallArray# a stuff down at the leaves to hold
>>>>> >> >>>>>>>>> > lifted contents.
>>>>> >> >>>>>>>>> > >>>>> >> >>>>>>>>> > I accept fully that if I name the wrong type when I go >>>>> to >>>>> >> >>>>>>>>> > access >>>>> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd >>>>> do that >>>>> >> >>>>>>>>> > if i tried to >>>>> >> >>>>>>>>> > use one of the members that held a nested ArrayArray# >>>>> as a >>>>> >> >>>>>>>>> > ByteArray# >>>>> >> >>>>>>>>> > anyways, so it isn't like there is a safety story >>>>> preventing >>>>> >> >>>>>>>>> > this. >>>>> >> >>>>>>>>> > >>>>> >> >>>>>>>>> > I've been hunting for ways to try to kill the >>>>> indirection >>>>> >> >>>>>>>>> > problems I get with Haskell and mutable structures, and >>>>> I >>>>> >> >>>>>>>>> > could shoehorn a >>>>> >> >>>>>>>>> > number of them into ArrayArrays if this worked. >>>>> >> >>>>>>>>> > >>>>> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of >>>>> unnecessary >>>>> >> >>>>>>>>> > indirection compared to c/java and this could reduce >>>>> that pain >>>>> >> >>>>>>>>> > to just 1 >>>>> >> >>>>>>>>> > level of unnecessary indirection. 
>>>>> >> >>>>>>>>> > >>>>> >> >>>>>>>>> > -Edward >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> > _______________________________________________ >>>>> >> >>>>>>>>> > ghc-devs mailing list >>>>> >> >>>>>>>>> > ghc-devs at haskell.org >>>>> >> >>>>>>>>> > >>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> >>>>> >> >>>>>>>>> _______________________________________________ >>>>> >> >>>>>>>>> ghc-devs mailing list >>>>> >> >>>>>>>>> ghc-devs at haskell.org >>>>> >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>>> >> >>>>>>> >>>>> >> >>>>>>> >>>>> >> >>>>> >>>>> >> >>> >>>>> >> >> >>>>> >> > >>>>> >> > >>>>> >> > _______________________________________________ >>>>> >> > ghc-devs mailing list >>>>> >> > ghc-devs at haskell.org >>>>> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>>> >> > >>>>> > >>>>> > >>>>> >>>> >>>> >>> >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs at haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fryguybob at gmail.com Mon Aug 31 22:50:17 2015 From: fryguybob at gmail.com (Ryan Yates) Date: Mon, 31 Aug 2015 18:50:17 -0400 Subject: ArrayArrays In-Reply-To: References: <4DACFC45-0E7E-4B3F-8435-5365EC3F7749@cse.unsw.edu.au> <65158505c7be41afad85374d246b7350@DB4PR30MB030.064d.mgd.msft.net> <2FCB6298-A4FF-4F7B-8BF8-4880BB3154AB@gmail.com> Message-ID: Any time works for me. Ryan On Mon, Aug 31, 2015 at 6:11 PM, Ryan Newton wrote: > Dear Edward, Ryan Yates, and other interested parties -- > > So when should we meet up about this? > > May I propose the Tues afternoon break for everyone at ICFP who is > interested in this topic? We can meet out in the coffee area and congregate > around Edward Kmett, who is tall and should be easy to find ;-). 
> > I think Ryan is going to show us how to use his new primops for combined > array + other fields in one heap object? > > On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett wrote: >> >> Without a custom primitive it doesn't help much there, you have to store >> the indirection to the mask. >> >> With a custom primitive it should cut the on heap root-to-leaf path of >> everything in the HAMT in half. A shorter HashMap was actually one of the >> motivating factors for me doing this. It is rather astoundingly difficult to >> beat the performance of HashMap, so I had to start cheating pretty badly. ;) >> >> -Edward >> >> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell >> wrote: >>> >>> I'd also be interested to chat at ICFP to see if I can use this for my >>> HAMT implementation. >>> >>> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett wrote: >>>> >>>> Sounds good to me. Right now I'm just hacking up composable accessors >>>> for "typed slots" in a fairly lens-like fashion, and treating the set of >>>> slots I define and the 'new' function I build for the data type as its API, >>>> and build atop that. This could eventually graduate to template-haskell, but >>>> I'm not entirely satisfied with the solution I have. I currently distinguish >>>> between what I'm calling "slots" (things that point directly to another >>>> SmallMutableArrayArray# sans wrapper) and "fields" which point directly to >>>> the usual Haskell data types because unifying the two notions meant that I >>>> couldn't lift some coercions out "far enough" to make them vanish. >>>> >>>> I'll be happy to run through my current working set of issues in person >>>> and -- as things get nailed down further -- in a longer lived medium than in >>>> personal conversations. ;) >>>> >>>> -Edward >>>> >>>> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton wrote: >>>>> >>>>> I'd also love to meet up at ICFP and discuss this. 
>>>>> I think the array primops plus a TH layer that lets us (ab)use them
>>>>> many times without too much marginal cost sounds great. And I'd like to
>>>>> learn how we could be either early users of, or help with, this
>>>>> infrastructure.
>>>>>
>>>>> CC'ing in Ryan Scot and Omer Agacan who may also be interested in
>>>>> dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student
>>>>> who is currently working on concurrent data structures in Haskell, but will
>>>>> not be at ICFP.
>>>>>
>>>>>
>>>>> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates wrote:
>>>>>>
>>>>>> I completely agree. I would love to spend some time during ICFP and
>>>>>> friends talking about what it could look like. My small array for STM
>>>>>> changes for the RTS can be seen here [1]. It is on a branch somewhere
>>>>>> between 7.8 and 7.10 and includes irrelevant STM bits and some
>>>>>> confusing naming choices (sorry), but should cover all the details
>>>>>> needed to implement it for a non-STM context. The biggest surprise
>>>>>> for me was following small array too closely and having a word/byte
>>>>>> offset mismatch [2].
>>>>>>
>>>>>> [1]: https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut
>>>>>> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413
>>>>>>
>>>>>> Ryan
>>>>>>
>>>>>> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett wrote:
>>>>>> > I'd love to have that last 10%, but it's a lot of work to get there
>>>>>> > and more importantly I don't know quite what it should look like.
>>>>>> >
>>>>>> > On the other hand, I do have a pretty good idea of how the
>>>>>> > primitives above could be banged out and tested in a long evening,
>>>>>> > well in time for 7.12. And as noted earlier, those remain useful even
>>>>>> > if a nicer typed version with an extra level of indirection to the
>>>>>> > sizes is built up after.
>>>>>> >
>>>>>> > The rest sounds like a good graduate student project for someone who
>>>>>> > has graduate students lying around. Maybe somebody at Indiana
>>>>>> > University who has an interest in type theory and parallelism can
>>>>>> > find us one. =)
>>>>>> >
>>>>>> > -Edward
>>>>>> >
>>>>>> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates wrote:
>>>>>> >>
>>>>>> >> I think from my perspective, the motivation for getting the type
>>>>>> >> checker involved is primarily bringing this to the level where users
>>>>>> >> could be expected to build these structures. It is reasonable to
>>>>>> >> think that there are people who want to use STM (a context with
>>>>>> >> mutation already) to implement a straightforward data structure that
>>>>>> >> avoids extra indirection penalty. There should be some places where
>>>>>> >> knowing that things are field accesses rather than array indexing
>>>>>> >> could be helpful, but I think GHC is good right now about handling
>>>>>> >> constant offsets. In my code I don't do any bounds checking as I know
>>>>>> >> I will only be accessing my arrays with constant indexes. I make
>>>>>> >> wrappers for each field access and leave all the unsafe stuff in
>>>>>> >> there. When things go wrong though, the compiler is no help. Maybe
>>>>>> >> Template Haskell that generates the appropriate wrappers is the right
>>>>>> >> direction to go.
>>>>>> >> There is another benefit for me when working with these as arrays in
>>>>>> >> that it is quite simple and direct (given the hoops already jumped
>>>>>> >> through) to play with alignment. I can ensure two pointers are never
>>>>>> >> on the same cache-line by just spacing things out in the array.
>>>>>> >>
>>>>>> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett wrote:
>>>>>> >> > They just segfault at this level.
;) >>>>>> >> > >>>>>> >> > Sent from my iPhone >>>>>> >> > >>>>>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton >>>>>> >> > wrote: >>>>>> >> > >>>>>> >> > You presumably also save a bounds check on reads by hard-coding >>>>>> >> > the >>>>>> >> > sizes? >>>>>> >> > >>>>>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett >>>>>> >> > wrote: >>>>>> >> >> >>>>>> >> >> Also there are 4 different "things" here, basically depending on >>>>>> >> >> two >>>>>> >> >> independent questions: >>>>>> >> >> >>>>>> >> >> a.) if you want to shove the sizes into the info table, and >>>>>> >> >> b.) if you want cardmarking. >>>>>> >> >> >>>>>> >> >> Versions with/without cardmarking for different sizes can be >>>>>> >> >> done >>>>>> >> >> pretty >>>>>> >> >> easily, but as noted, the infotable variants are pretty >>>>>> >> >> invasive. >>>>>> >> >> >>>>>> >> >> -Edward >>>>>> >> >> >>>>>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett >>>>>> >> >> wrote: >>>>>> >> >>> >>>>>> >> >>> Well, on the plus side you'd save 16 bytes per object, which >>>>>> >> >>> adds up >>>>>> >> >>> if >>>>>> >> >>> they were small enough and there are enough of them. You get a >>>>>> >> >>> bit >>>>>> >> >>> better >>>>>> >> >>> locality of reference in terms of what fits in the first cache >>>>>> >> >>> line of >>>>>> >> >>> them. >>>>>> >> >>> >>>>>> >> >>> -Edward >>>>>> >> >>> >>>>>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton >>>>>> >> >>> >>>>>> >> >>> wrote: >>>>>> >> >>>> >>>>>> >> >>>> Yes. And for the short term I can imagine places we will >>>>>> >> >>>> settle with >>>>>> >> >>>> arrays even if it means tracking lengths unnecessarily and >>>>>> >> >>>> unsafeCoercing >>>>>> >> >>>> pointers whose types don't actually match their siblings. >>>>>> >> >>>> >>>>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed >>>>>> >> >>>> sized >>>>>> >> >>>> array >>>>>> >> >>>> objects *other* than using them to fake structs? 
>>>>>> >> >>>> (Much to derecommend, as you mentioned!)
>>>>>> >> >>>>
>>>>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett wrote:
>>>>>> >> >>>>>
>>>>>> >> >>>>> I think both are useful, but the one you suggest requires a lot more
>>>>>> >> >>>>> plumbing and doesn't subsume all of the use cases of the other.
>>>>>> >> >>>>>
>>>>>> >> >>>>> -Edward
>>>>>> >> >>>>>
>>>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton wrote:
>>>>>> >> >>>>>>
>>>>>> >> >>>>>> So that primitive is an array like thing (Same pointed type,
>>>>>> >> >>>>>> unbounded length) with extra payload.
>>>>>> >> >>>>>>
>>>>>> >> >>>>>> I can see how we can do without structs if we have arrays,
>>>>>> >> >>>>>> especially with the extra payload at front. But wouldn't the general
>>>>>> >> >>>>>> solution for structs be one that allows new user data type defs for
>>>>>> >> >>>>>> # types?
>>>>>> >> >>>>>>
>>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett wrote:
>>>>>> >> >>>>>>>
>>>>>> >> >>>>>>> Some form of MutableStruct# with a known number of words and a
>>>>>> >> >>>>>>> known number of pointers is basically what Ryan Yates was suggesting
>>>>>> >> >>>>>>> above, but where the word counts were stored in the objects
>>>>>> >> >>>>>>> themselves.
>>>>>> >> >>>>>>>
>>>>>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd likely
>>>>>> >> >>>>>>> want to be something we build in addition to MutVar# rather than a
>>>>>> >> >>>>>>> replacement.
>>>>>> >> >>>>>>> >>>>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build >>>>>> >> >>>>>>> info >>>>>> >> >>>>>>> tables that knew them, and typechecker support, for >>>>>> >> >>>>>>> instance, it'd >>>>>> >> >>>>>>> get >>>>>> >> >>>>>>> rather invasive. >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> Also, a number of things that we can do with the 'sized' >>>>>> >> >>>>>>> versions >>>>>> >> >>>>>>> above, like working with evil unsized c-style arrays >>>>>> >> >>>>>>> directly >>>>>> >> >>>>>>> inline at the >>>>>> >> >>>>>>> end of the structure cease to be possible, so it isn't even >>>>>> >> >>>>>>> a pure >>>>>> >> >>>>>>> win if we >>>>>> >> >>>>>>> did the engineering effort. >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> I think 90% of the needs I have are covered just by adding >>>>>> >> >>>>>>> the one >>>>>> >> >>>>>>> primitive. The last 10% gets pretty invasive. >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> -Edward >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> wrote: >>>>>> >> >>>>>>>> >>>>>> >> >>>>>>>> I like the possibility of a general solution for mutable >>>>>> >> >>>>>>>> structs >>>>>> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why >>>>>> >> >>>>>>>> it's hard. >>>>>> >> >>>>>>>> >>>>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of >>>>>> >> >>>>>>>> object >>>>>> >> >>>>>>>> identity problems. But what about directly supporting an >>>>>> >> >>>>>>>> extensible set of >>>>>> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even >>>>>> >> >>>>>>>> replacing) >>>>>> >> >>>>>>>> MutVar#? That >>>>>> >> >>>>>>>> may be too much work, but is it problematic otherwise? 
>>>>>> >> >>>>>>>> >>>>>> >> >>>>>>>> Needless to say, this is also critical if we ever want >>>>>> >> >>>>>>>> best in >>>>>> >> >>>>>>>> class >>>>>> >> >>>>>>>> lock-free mutable structures, just like their STM and >>>>>> >> >>>>>>>> sequential >>>>>> >> >>>>>>>> counterparts. >>>>>> >> >>>>>>>> >>>>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones >>>>>> >> >>>>>>>> wrote: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a >>>>>> >> >>>>>>>>> short >>>>>> >> >>>>>>>>> article. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, >>>>>> >> >>>>>>>>> and >>>>>> >> >>>>>>>>> maybe >>>>>> >> >>>>>>>>> make a ticket for it. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Thanks >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Simon >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com] >>>>>> >> >>>>>>>>> Sent: 27 August 2015 16:54 >>>>>> >> >>>>>>>>> To: Simon Peyton Jones >>>>>> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs >>>>>> >> >>>>>>>>> Subject: Re: ArrayArrays >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified >>>>>> >> >>>>>>>>> invariant. It >>>>>> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or >>>>>> >> >>>>>>>>> ByteArray#'s. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> While those live in #, they are garbage collected >>>>>> >> >>>>>>>>> objects, so >>>>>> >> >>>>>>>>> this >>>>>> >> >>>>>>>>> all lives on the heap. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when >>>>>> >> >>>>>>>>> it has >>>>>> >> >>>>>>>>> to >>>>>> >> >>>>>>>>> deal with nested arrays.
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better >>>>>> >> >>>>>>>>> thing. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> The Problem >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> ----------------- >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Consider the scenario where you write a classic >>>>>> >> >>>>>>>>> doubly-linked >>>>>> >> >>>>>>>>> list >>>>>> >> >>>>>>>>> in Haskell. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL)) (IORef (Maybe DLL)) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3 >>>>>> >> >>>>>>>>> pointers >>>>>> >> >>>>>>>>> on >>>>>> >> >>>>>>>>> the heap. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) >>>>>> >> >>>>>>>>> ~> >>>>>> >> >>>>>>>>> Maybe >>>>>> >> >>>>>>>>> DLL ~> DLL >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> That is 3 levels of indirection. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with >>>>>> >> >>>>>>>>> -funbox-strict-fields or UNPACK >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL >>>>>> >> >>>>>>>>> and >>>>>> >> >>>>>>>>> worsening our representation.
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> but now we're still stuck with a level of indirection >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> This means that every operation we perform on this >>>>>> >> >>>>>>>>> structure >>>>>> >> >>>>>>>>> will >>>>>> >> >>>>>>>>> be about half of the speed of an implementation in most >>>>>> >> >>>>>>>>> other >>>>>> >> >>>>>>>>> languages >>>>>> >> >>>>>>>>> assuming we're memory bound on loading things into cache! >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Making Progress >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> ---------------------- >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I have been working on a number of data structures where >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> indirection of going from something in * out to an object >>>>>> >> >>>>>>>>> in # >>>>>> >> >>>>>>>>> which >>>>>> >> >>>>>>>>> contains the real pointer to my target and coming back >>>>>> >> >>>>>>>>> effectively doubles >>>>>> >> >>>>>>>>> my runtime. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> MutVar# >>>>>> >> >>>>>>>>> onto the mutable list when we dirty it. There is a well >>>>>> >> >>>>>>>>> defined >>>>>> >> >>>>>>>>> write-barrier. 
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I could change out the representation to use >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I can just store two pointers in the MutableArray# every >>>>>> >> >>>>>>>>> time, >>>>>> >> >>>>>>>>> but >>>>>> >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the >>>>>> >> >>>>>>>>> amount of >>>>>> >> >>>>>>>>> distinct >>>>>> >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 >>>>>> >> >>>>>>>>> per >>>>>> >> >>>>>>>>> object to 2. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> array >>>>>> >> >>>>>>>>> object and then chase it to the next DLL and chase that >>>>>> >> >>>>>>>>> to the >>>>>> >> >>>>>>>>> next array. I >>>>>> >> >>>>>>>>> do get my two pointers together in memory though. 
I'm >>>>>> >> >>>>>>>>> paying for >>>>>> >> >>>>>>>>> a card >>>>>> >> >>>>>>>>> marking table as well, which I don't particularly need >>>>>> >> >>>>>>>>> with just >>>>>> >> >>>>>>>>> two >>>>>> >> >>>>>>>>> pointers, but we can shed that with the >>>>>> >> >>>>>>>>> "SmallMutableArray#" >>>>>> >> >>>>>>>>> machinery added >>>>>> >> >>>>>>>>> back in 7.10, which is just the old array code as a new >>>>>> >> >>>>>>>>> data >>>>>> >> >>>>>>>>> type, which can >>>>>> >> >>>>>>>>> speed things up a bit when you don't have very big >>>>>> >> >>>>>>>>> arrays: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> But what if I wanted my object itself to live in # and >>>>>> >> >>>>>>>>> have two >>>>>> >> >>>>>>>>> mutable fields and be able to share the same write >>>>>> >> >>>>>>>>> barrier? >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array >>>>>> >> >>>>>>>>> types. >>>>>> >> >>>>>>>>> What >>>>>> >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> impedance >>>>>> >> >>>>>>>>> mismatch between the imperative world and Haskell, and >>>>>> >> >>>>>>>>> then just >>>>>> >> >>>>>>>>> let the >>>>>> >> >>>>>>>>> ArrayArray#'s hold other arrayarrays? >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make be >>>>>> >> >>>>>>>>> a >>>>>> >> >>>>>>>>> special >>>>>> >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can >>>>>> >> >>>>>>>>> even >>>>>> >> >>>>>>>>> abuse pattern >>>>>> >> >>>>>>>>> synonyms.
Alternately I can exploit the internals further >>>>>> >> >>>>>>>>> to >>>>>> >> >>>>>>>>> make this >>>>>> >> >>>>>>>>> cheaper. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Then I can use the readMutableArrayArrayArray# and >>>>>> >> >>>>>>>>> writeMutableArrayArrayArray# calls to directly access the >>>>>> >> >>>>>>>>> preceding >>>>>> >> >>>>>>>>> and next >>>>>> >> >>>>>>>>> entry in the linked list. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' >>>>>> >> >>>>>>>>> into a >>>>>> >> >>>>>>>>> strict world, and everything there lives in #. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> next :: DLL -> IO DLL >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> -- assumes slot 1 of the array holds the 'next' pointer >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArrayArray# m 1# s >>>>>> >> >>>>>>>>> of >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> (# s', n #) -> (# s', DLL n #) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that >>>>>> >> >>>>>>>>> code to >>>>>> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed >>>>>> >> >>>>>>>>> pretty >>>>>> >> >>>>>>>>> easily when they >>>>>> >> >>>>>>>>> are known strict and you chain operations of this sort! >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Cleaning it Up >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> ------------------ >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Now I have one outermost indirection pointing to an array >>>>>> >> >>>>>>>>> that >>>>>> >> >>>>>>>>> points directly to other arrays.
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I'm stuck paying for a card marking table per object, but >>>>>> >> >>>>>>>>> I can >>>>>> >> >>>>>>>>> fix >>>>>> >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and >>>>>> >> >>>>>>>>> using a >>>>>> >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me >>>>>> >> >>>>>>>>> store a >>>>>> >> >>>>>>>>> mixture of >>>>>> >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data >>>>>> >> >>>>>>>>> structure. >>>>>> >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> existing >>>>>> >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one >>>>>> >> >>>>>>>>> of the >>>>>> >> >>>>>>>>> arguments it >>>>>> >> >>>>>>>>> takes. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> This is almost ideal, but not quite. I often have fields >>>>>> >> >>>>>>>>> that >>>>>> >> >>>>>>>>> would >>>>>> >> >>>>>>>>> be best left unboxed. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can >>>>>> >> >>>>>>>>> currently >>>>>> >> >>>>>>>>> at >>>>>> >> >>>>>>>>> best point one of the entries of the SmallMutableArray# >>>>>> >> >>>>>>>>> at a >>>>>> >> >>>>>>>>> boxed or at a >>>>>> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the >>>>>> >> >>>>>>>>> int in >>>>>> >> >>>>>>>>> question in >>>>>> >> >>>>>>>>> there. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I >>>>>> >> >>>>>>>>> need to >>>>>> >> >>>>>>>>> store masks and administrivia as I walk down the tree. 
>>>>>> >> >>>>>>>>> Having to >>>>>> >> >>>>>>>>> go off to >>>>>> >> >>>>>>>>> the side costs me the entire win from avoiding the first >>>>>> >> >>>>>>>>> pointer >>>>>> >> >>>>>>>>> chase. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we >>>>>> >> >>>>>>>>> could >>>>>> >> >>>>>>>>> construct that had n words with unsafe access and m >>>>>> >> >>>>>>>>> pointers to >>>>>> >> >>>>>>>>> other heap >>>>>> >> >>>>>>>>> objects, one that could put itself on the mutable list >>>>>> >> >>>>>>>>> when any >>>>>> >> >>>>>>>>> of those >>>>>> >> >>>>>>>>> pointers changed then I could shed this last factor of >>>>>> >> >>>>>>>>> two in >>>>>> >> >>>>>>>>> all >>>>>> >> >>>>>>>>> circumstances. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Prototype >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> ------------- >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Over the last few days I've put together a small >>>>>> >> >>>>>>>>> prototype >>>>>> >> >>>>>>>>> implementation with a few non-trivial imperative data >>>>>> >> >>>>>>>>> structures >>>>>> >> >>>>>>>>> for things >>>>>> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem >>>>>> >> >>>>>>>>> and >>>>>> >> >>>>>>>>> order-maintenance. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> https://github.com/ekmett/structs >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Notable bits: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation >>>>>> >> >>>>>>>>> of >>>>>> >> >>>>>>>>> link-cut >>>>>> >> >>>>>>>>> trees in this style. 
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts >>>>>> >> >>>>>>>>> that >>>>>> >> >>>>>>>>> make >>>>>> >> >>>>>>>>> it go fast. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core, >>>>>> >> >>>>>>>>> almost >>>>>> >> >>>>>>>>> all >>>>>> >> >>>>>>>>> the references to the LinkCut or Object data constructor >>>>>> >> >>>>>>>>> get >>>>>> >> >>>>>>>>> optimized away, >>>>>> >> >>>>>>>>> and we're left with beautiful strict code directly >>>>>> >> >>>>>>>>> mutating our >>>>>> >> >>>>>>>>> underlying >>>>>> >> >>>>>>>>> representation. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a >>>>>> >> >>>>>>>>> short >>>>>> >> >>>>>>>>> article. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> -Edward >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones >>>>>> >> >>>>>>>>> wrote: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Just to say that I have no idea what is going on in this >>>>>> >> >>>>>>>>> thread. >>>>>> >> >>>>>>>>> What is ArrayArray? What is the issue in general? Is >>>>>> >> >>>>>>>>> there a >>>>>> >> >>>>>>>>> ticket? Is >>>>>> >> >>>>>>>>> there a wiki page? >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> If it's important, an ab-initio wiki page + ticket would >>>>>> >> >>>>>>>>> be a >>>>>> >> >>>>>>>>> good >>>>>> >> >>>>>>>>> thing.
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Simon >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On >>>>>> >> >>>>>>>>> Behalf >>>>>> >> >>>>>>>>> Of >>>>>> >> >>>>>>>>> Edward Kmett >>>>>> >> >>>>>>>>> Sent: 21 August 2015 05:25 >>>>>> >> >>>>>>>>> To: Manuel M T Chakravarty >>>>>> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs >>>>>> >> >>>>>>>>> Subject: Re: ArrayArrays >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's >>>>>> >> >>>>>>>>> would be >>>>>> >> >>>>>>>>> very handy as well. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Consider right now if I have something like an >>>>>> >> >>>>>>>>> order-maintenance >>>>>> >> >>>>>>>>> structure I have: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) >>>>>> >> >>>>>>>>> {-# >>>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar >>>>>> >> >>>>>>>>> s >>>>>> >> >>>>>>>>> (Upper s)) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) >>>>>> >> >>>>>>>>> {-# >>>>>> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar >>>>>> >> >>>>>>>>> s >>>>>> >> >>>>>>>>> (Lower s)) {-# >>>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> The former contains, logically, a mutable integer and two >>>>>> >> >>>>>>>>> pointers, >>>>>> >> >>>>>>>>> one for forward and one for backwards. The latter is >>>>>> >> >>>>>>>>> basically >>>>>> >> >>>>>>>>> the same >>>>>> >> >>>>>>>>> thing with a mutable reference up pointing at the >>>>>> >> >>>>>>>>> structure >>>>>> >> >>>>>>>>> above. 
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> On the heap this is an object that points to a structure >>>>>> >> >>>>>>>>> for the >>>>>> >> >>>>>>>>> bytearray, and points to another structure for each >>>>>> >> >>>>>>>>> mutvar which >>>>>> >> >>>>>>>>> each point >>>>>> >> >>>>>>>>> to the other 'Upper' structure. So there is a level of >>>>>> >> >>>>>>>>> indirection smeared >>>>>> >> >>>>>>>>> over everything. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward >>>>>> >> >>>>>>>>> link >>>>>> >> >>>>>>>>> from >>>>>> >> >>>>>>>>> the structure below to the structure above. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Converted into ArrayArray#s I'd get >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, >>>>>> >> >>>>>>>>> and >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> next 2 slots pointing to the previous and next previous >>>>>> >> >>>>>>>>> objects, >>>>>> >> >>>>>>>>> represented >>>>>> >> >>>>>>>>> just as their MutableArrayArray#s. I can use >>>>>> >> >>>>>>>>> sameMutableArrayArray# on these >>>>>> >> >>>>>>>>> for object identity, which lets me check for the ends of >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> lists by tying >>>>>> >> >>>>>>>>> things back on themselves. 
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> and below that >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s) >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing >>>>>> >> >>>>>>>>> up to >>>>>> >> >>>>>>>>> an >>>>>> >> >>>>>>>>> upper structure. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I can then write a handful of combinators for getting out >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> slots >>>>>> >> >>>>>>>>> in question, while it has gained a level of indirection >>>>>> >> >>>>>>>>> between >>>>>> >> >>>>>>>>> the wrapper >>>>>> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that >>>>>> >> >>>>>>>>> one can >>>>>> >> >>>>>>>>> be basically >>>>>> >> >>>>>>>>> erased by ghc. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Unlike before I don't have several separate objects on >>>>>> >> >>>>>>>>> the heap >>>>>> >> >>>>>>>>> for >>>>>> >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> object itself, >>>>>> >> >>>>>>>>> and the MutableByteArray# that it references to carry >>>>>> >> >>>>>>>>> around the >>>>>> >> >>>>>>>>> mutable >>>>>> >> >>>>>>>>> int. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> The only pain points are >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> 1.) 
the aforementioned limitation that currently prevents >>>>>> >> >>>>>>>>> me >>>>>> >> >>>>>>>>> from >>>>>> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array >>>>>> >> >>>>>>>>> into an >>>>>> >> >>>>>>>>> ArrayArray >>>>>> >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest >>>>>> >> >>>>>>>>> of >>>>>> >> >>>>>>>>> Haskell, >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> and >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us >>>>>> >> >>>>>>>>> avoid the >>>>>> >> >>>>>>>>> card marking overhead. These objects are all small, 3-4 >>>>>> >> >>>>>>>>> pointers >>>>>> >> >>>>>>>>> wide. Card >>>>>> >> >>>>>>>>> marking doesn't help. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Alternately I could just try to do really evil things and >>>>>> >> >>>>>>>>> convert >>>>>> >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to >>>>>> >> >>>>>>>>> unsafeCoerce my way >>>>>> >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays >>>>>> >> >>>>>>>>> directly into the >>>>>> >> >>>>>>>>> SmallArray as slots, removing the limitation we see here >>>>>> >> >>>>>>>>> by >>>>>> >> >>>>>>>>> aping the >>>>>> >> >>>>>>>>> MutableArrayArray# s API, but that gets really really >>>>>> >> >>>>>>>>> dangerous! >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on >>>>>> >> >>>>>>>>> the >>>>>> >> >>>>>>>>> altar >>>>>> >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move >>>>>> >> >>>>>>>>> them >>>>>> >> >>>>>>>>> and collect >>>>>> >> >>>>>>>>> them which rules out simpler Ptr and Addr based >>>>>> >> >>>>>>>>> solutions. 
>>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> -Edward >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty >>>>>> >> >>>>>>>>> wrote: >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> That's an interesting idea. >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> Manuel >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> > Edward Kmett : >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > Would it be possible to add unsafe primops to add >>>>>> >> >>>>>>>>> > Array# and >>>>>> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that >>>>>> >> >>>>>>>>> > the >>>>>> >> >>>>>>>>> > ArrayArray# entries >>>>>> >> >>>>>>>>> > are all directly unlifted avoiding a level of >>>>>> >> >>>>>>>>> > indirection for >>>>>> >> >>>>>>>>> > the containing >>>>>> >> >>>>>>>>> > structure is amazing, but I can only currently use it >>>>>> >> >>>>>>>>> > if my >>>>>> >> >>>>>>>>> > leaf level data >>>>>> >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. >>>>>> >> >>>>>>>>> > It'd be >>>>>> >> >>>>>>>>> > nice to be >>>>>> >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff >>>>>> >> >>>>>>>>> > down at >>>>>> >> >>>>>>>>> > the leaves to >>>>>> >> >>>>>>>>> > hold lifted contents. >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > I accept fully that if I name the wrong type when I go >>>>>> >> >>>>>>>>> > to >>>>>> >> >>>>>>>>> > access >>>>>> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd >>>>>> >> >>>>>>>>> > do that >>>>>> >> >>>>>>>>> > if I tried to >>>>>> >> >>>>>>>>> > use one of the members that held a nested ArrayArray# >>>>>> >> >>>>>>>>> > as a >>>>>> >> >>>>>>>>> > ByteArray# >>>>>> >> >>>>>>>>> > anyways, so it isn't like there is a safety story >>>>>> >> >>>>>>>>> > preventing >>>>>> >> >>>>>>>>> > this.
>>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > I've been hunting for ways to try to kill the >>>>>> >> >>>>>>>>> > indirection >>>>>> >> >>>>>>>>> > problems I get with Haskell and mutable structures, and >>>>>> >> >>>>>>>>> > I >>>>>> >> >>>>>>>>> > could shoehorn a >>>>>> >> >>>>>>>>> > number of them into ArrayArrays if this worked. >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of >>>>>> >> >>>>>>>>> > unnecessary >>>>>> >> >>>>>>>>> > indirection compared to c/java and this could reduce >>>>>> >> >>>>>>>>> > that pain >>>>>> >> >>>>>>>>> > to just 1 >>>>>> >> >>>>>>>>> > level of unnecessary indirection. >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > -Edward >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> > _______________________________________________ >>>>>> >> >>>>>>>>> > ghc-devs mailing list >>>>>> >> >>>>>>>>> > ghc-devs at haskell.org >>>>>> >> >>>>>>>>> > >>>>>> >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> >>>>>> >> >>>>>>>>> _______________________________________________ >>>>>> >> >>>>>>>>> ghc-devs mailing list >>>>>> >> >>>>>>>>> ghc-devs at haskell.org >>>>>> >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>>>> >> >>>>>>> >>>>>> >> >>>>>>> >>>>>> >> >>>>> >>>>>> >> >>> >>>>>> >> >> >>>>>> >> > >>>>>> >> > >>>>>> >> > _______________________________________________ >>>>>> >> > ghc-devs mailing list >>>>>> >> > ghc-devs at haskell.org >>>>>> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>>>> >> > >>>>>> > >>>>>> > >>>>> >>>>> >>>> >>>> >>>> _______________________________________________ >>>> ghc-devs mailing list >>>> ghc-devs at haskell.org >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>> >>> >> > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > 
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >