[Haskell-cafe] STG optimization

Tue Jul 20 21:20:59 EDT 2004

Hi,

A few years ago, I wrote a paper about compiliation LFL onto JVM based
on the STG machine. In the paper, I proposed a different presentation of
the STG machine, for example, using a single stack and showing updating
explicitly. Although my proposal did not add any optimization to
the original STG machine, I believe that it is easier one to start with.

See Section 2 of the following paper                              
   http://www.jaist.ac.jp/~khchoi/doc/flops2001.pdf 

Best regards,

Kwanghoon Choi

On Tue, 20 Jul 2004 7253146 at informatica.alumnos.uma.es wrote:

> Hi. I have been reading recently the STG machine paper by Simon Peyton-Jones. 
> It is relatively easy to follow the explanation, though the formal description 
> of operational semantics is hard to understand. In order to fully understand 
> it, I have started to implement a C-based mini STG machine, implementing the 
> obvious compilation schemes according to the STG paper. Of course, I have not 
> implemented most of optimizations: environment management is very rudimentary, 
> it has three stacks (pointer args, primitive args and continuations) for 
> simplicity, etc. At the moment I have a beatiful tree-reduction STG machine.
> 
> So far so good. Nonetheless, some doubts have arisen when I have started to try 
> to add updating. In the paper, implementation for "update in place" is 
> extensively discussed, although it is said it could greatly increase the memory 
> space usage. ¿Has this risk been measured? I think deforestation can minimize 
> the risk; however, as far as I know, deforestation is only applied to lists.
> 
> But the STG paper ignores an optimization readyly available to me: whenever a 
> constructor is allocated (constructor's register allocation aside, since the 
> push/enter - eval/apply paper says GHC no longer supports it), the STG machine 
> (the one described by the paper) takes heap space for it, although an update 
> (indirection or copying) is going to be done most of times. This leads to 
> potentially bad heap space usage and more garbage collections.
> 
> I think update frames may be detected by the constructor allocation code, so it 
> could determine whether to allocate a new constructor or build it over the 
> closure to be updated. This could be achieved by marking continuations and 
> update frames in the stack, so constructor allocation code could distinguish 
> between them. This is inconvenient because common continuations should remove 
> the mark too, so I do not like it.
> 
> Another (my favourite) similar solution should be introduce a new stack only 
> for update frames, but still pushing code labels for updating in the 
> continuation stack (which could still be merged with the primitive args stack): 
> this way, constructor's code can still simply jump to the next "continuation" 
> (the update code should be responsible of popping the update stack), and 
> constructor allocation code can simply test if the "continuation" on top of the 
> stack is a real continuation, so it can safely take newly allocated heap space 
> for the constructor, or it is an update (perhaps by testing for equality with 
> the top of the update stack), so the code should test if the constructor 
> closure can be overwritten over the closure to be updated.
> 
> To ilustrate my thoughts:
> 
> \begin{code}
> 
> some_func <some_args> = let b = <some_expression> in
>   SomeConstructor { b, some_stuff }
> 
> some_other_func <some_other_args> = let a = some_func { some_other_stuff }
> in case a of <some_alternatives>
> 
> \end{code}
> 
> In the above piece of core STG language, when the "a" closure is entered and 
> finally "SomeConstructor" must be allocated, the code of "some_func" should 
> check if the continuation leads to an update (in this case, an update of "a") 
> or an continuation. In this case it leads to an update, so then the code should 
> check if the update closure ("a") is large enough, and, in that case, build the 
> constructor closure over it.
> 
> Maybe this technique is irrelevant for performance, too expensive to implement, 
> or having a separated update stack is too inconvenient, or these effects are 
> achieved by other ways?
> 
> Thanks in advance
> 
> Jose David
> 
> 
> 
> ---------------------------------------------
> Este mensaje lo ha enviado un Alumno de la Universidad de Malaga.
> http://www.alumnos.uma.es/
> 
> 
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>