[Haskell-cafe] Avoid sharing
Edward Z. Yang
ezyang at mit.edu
Mon Nov 7 21:32:50 UTC 2016
GHC is allowed to do common sub-expression elimination which
would cause sharing. For example:
module Opt where
{-# NOINLINE bar #-}
bar :: Int -> Int
bar x = x + 3
foo :: Int -> Int
foo s = bar s * bar s
Here's the core for 'foo':
foo =
\ (s_ayA :: Int) ->
case bar s_ayA of _ [Occ=Dead] { GHC.Types.I# x_aJa ->
GHC.Types.I# (GHC.Prim.*# x_aJa x_aJa)
}
bar is invoked only once.
In Michael's original example, CSE doesn't fire, probably because
the tuple constructor is lazy. I am not entirely certain what the
interaction here is.
Edward
Excerpts from Tom Ellis's message of 2016-11-07 21:10:41 +0000:
> I'm not going to contradict Edward's answer in practice, because he knows
> far more about the internals of GHC than I do. However, I will contradict
> it in theory.
>
> There is a straightforward operational semantics for Haskell programs as
> compiled by GHC via Core to STG (without any optimisations applied). I
> described this semantics in the following talk at Haskell eXchange 2016:
>
> https://skillsmatter.com/skillscasts/8726-haskell-programs-how-do-they-run#video
>
> and I wrote it up as an article here:
>
> http://h2.jaguarpaw.co.uk/posts/haskell-programs-how-do-they-run/
>
> Under that semantics, yes, 'someGenerator s' is bound twice and thus is not
> shared.
>
> Now, GHC may try to apply an "optimisation" and (accidentally?) introduce
> sharing. That would be a compiler bug, in my opinion. In fact, it would be
> a terrible blow to Haskell's future as a practical language, because we
> really need fine-grained control over sharing. So I really hope that Edward
> is wrong about this.
>
> Tom
>
> On Mon, Nov 07, 2016 at 12:56:02PM -0800, Edward Z. Yang wrote:
> > It's not guaranteed. Unfortunately there aren't really good ways
> > to avoid sharing; the general advice is to convert values into
> > functions, and apply them at the use site where sharing is OK.
> >
> > Unrelatedly, in your sample code, dropping 1000000000 entries
> > is not a good way to build a splittable RNG. Check out
> > http://publications.lib.chalmers.se/records/fulltext/183348/local_183348.pdf
> > and also its related work for some bettera pproaches.
> >
> > Edward
> >
> > Excerpts from Michael Roth's message of 2016-11-07 20:56:54 +0100:
> > > Hello! A short question, given:
> > >
> > >
> > > data Seed = ...
> > > data Value = ...
> > >
> > > someGenerator :: Seed -> [Value]
> > >
> > > createTwo :: Seed -> ([Value], [Value])
> > > createTwo s = (as, bs) where
> > > as = someGenerator s
> > > bs = drop 1000000000 (someGenerator s)
> > >
> > >
> > > Is it guaranteed that 'someGenerator s' is created twice and not shared
> > > between 'as' and 'bs'? Is this by language design? Are there any GHC
> > > options that change the behaviour?
More information about the Haskell-Cafe
mailing list