[Haskell-cafe] Re: FASTER primes

Sat Jan 16 15:37:56 EST 2010

Am Samstag 16 Januar 2010 18:53:33 schrieb Will Ness:
> Daniel Fischer <daniel.is.fischer <at> web.de> writes:
> > Am Donnerstag 14 Januar 2010 08:25:48 schrieb Will Ness:
> > > Daniel Fischer <daniel.is.fischer <at> web.de> writes:
> > > > Am Mittwoch 13 Januar 2010 10:43:42 schrieb Heinrich Apfelmus:
> > > > > I wonder whether it's really the liveness of  pair  in
> > > > >
> > > > >   mergeSP (a,b) pair
> > > > >      = let sm = spMerge b (fst pair)
> > > > >        in (a ++ fst sm, merge (snd sm) (snd pair))
> > > > >
> > > > > that is responsible for the space leak, ...
> > > >
> > > > I think that is responsible. At least that's how I understand the
> > > > core:
> > > >
> > > > mergeSP (a,b) ~(c,d) = (a ++ bc, merge b' d)
> > > >    where
> > > >       (bc, b') = spMerge b c
> > > >       spMerge ...
> > >
> > > That is equivalent to
> > >
> > >   first (a++) . second (`merge`d) $ spMerge b c
> > >
> > > and Daniel's fix is equivalent to
> > >
> > >   first (a++) $ spMerge b c d
>
> That should've been
>
>   mergeSP (a,b) p = first(a++) . second(`merge`snd p) $ spMerge b (fst
> p)
>
> and
>
>   mergeSP (a,b) p = first(a++) $ spMerge b (fst p) (snd p)
>
>
> The code fragments you've posted are essentially
>
>
>   mergeSP (a,b) p = let res = case p of (c,_) ->
>                                 case spMerge b c of (# x,y #) ->
>                                   (x,y)
>     in
>        (# (++) a (case res of (bc,_)-> bc) ,
>           case res of (_,b') ->
>             case p of (_,d) -> merge b' d  #)
>
> and
>
>   mergeSP (a,b) p = let res = case p of (c,d) ->
>                                 case spMerge b c d of (# x,y #) ->
>                                   (x,y)
>     in
>        (# (++) a (case res of (bc,_)-> bc) ,
>           case res of (_,b') -> b'         #)
>
>
> This looks like Haskell to me, with many detailes explicitely written
> out, probaly serving as immediate input to the compiler - not its
> output. So it can't say to us much about how this is actually
> implemented on the lower level. (?)

It's 'core', the intermediate language GHC uses and does its 
transformations upon. It's more or less a slimmed down version of Haskell.

That came from -ddump-simpl, thus it's the output of the simplifier, after 
numerous transformation/optimisation passes. I think that is then fairly 
directly translated to assembly (via Cmm), the STG to STG and Cmm passes do 
not much optimisation anymore (I may be wrong, what know I about the 
compiler. However, I've found that the -ddump-simpl output corresponds well 
to the actual behaviour whenever I look.).

I find that much more redable than the -fext-core output 
(http://www.haskell.org/ghc/docs/6.12.1/html/users_guide/ext-core.html 
"GHC can dump its optimized intermediate code (said to be in “Core” format) 
to a file as a side-effect of compilation."), which says the same, only 
more verbose and less readable.

Of course, if I could read assembly, that would exactly reveal what 
happens.

>
> Your theory would certainly hold if the translation was done one-to-one
> without any further code rearrangements. But it could've been further
> re-arranged by the compiler at some later stage (is there one?) into an
> equivalent of, e.g.
>
>
>   mergeSP (a,b) p = let (bc,b') = case p of (c,_) ->
>                                     case spMerge b c of (x,y) ->
>                                       (x,y)
>     in
>        (# (++) a bc ,
>           case p of (_,d) -> merge b' d  #)
>
>
> and further,
>
>
>   mergeSP (a,b) p = let (c,d)   = case p of (x,y) -> (x,y)
>                         (bc,b') = case spMerge b c of (x,y) ->
>                                       (x,y)
>     in
>        (# (++) a bc , merge b' d  #)
>
>
> could it? This would take hold on /d/ and /c/ at the same time, right?

It would. But AFAICT, after the simplifier is done, no such rearrangements 
occur anymore.

>
> What is that code that you've shown exactly? At which stage is it
> produced and is it subject to any further manipulation?

I'm no GHC expert either, so I don't know what it does when exactly.
But after parsing and desugaring, there come a few iterations of the 
simplifier, intermingled with specialising, demand-analysis, CSE, let-
floating, worker-wrapper-split, ... .
At the end of all that, the Tidy Core is generated (part of which I 
posted). What happens from then on, well ...

> I apologise if
> these are obvious questions, I don't know anything about GHC. I also
> don't know what (# x,y #) means?

Unboxed tuple (pair in this case). That means, have the components for 
themselves, don't wrap them in a (,) constructor.

>
> One thing seems certain - we should not hold explicit references to same
> entities in different parts of our code, to avoid space leaks with more
> confidence.

Except when it's good to share ;)

> To make code look as much tail-recursive as possible, so to speak.

Tail recursion isn't really important (for GHC at least, I think for lazy 
languages in general), due to different evaluation models (cf. 
http://www.haskell.org/pipermail/haskell-cafe/2009-March/058607.html and 
the thread starting with 
http://www.haskell.org/pipermail/haskell-cafe/2007-May/025570.html).

>
> Does that make sense?

In general, yes.

>
> Anyway that was a very educational (for me) and fruitful discussion, and
> I greatly appreciate your help, and fixing and improving of the code.
>
> Thanks!

You're welcome.