Non-updateable thunks

Simon Marlow marlowsd at gmail.com
Fri Aug 3 10:28:13 CEST 2012


On 01/08/2012 11:38, Joachim Breitner wrote:
> Hello,
>
> I’m still working on issues of performance vs. sharing; I must assume
> some of the people here on the list must have seen my "dup"-paper¹ as
> referees.
>
> I’m now wondering about an approach where the compiler (either
> automatically or by user annotation; I’ll leave that question for later)
> would mark some thunks as reentrant, i.e. simply skip the blackholing
> and update frame pushing. A quick test showed that this should work
> quite well, take the usual example:
>
>          import System.Environment
>          main = do
>              a <- getArgs
>              let n = length a
>              print n
>              let l = [n..30000000]
>              print $ last l + last l
>
> This obviously leaks memory:
>
>          $ ./Test +RTS -t
>          0
>          60000000
>          <<ghc: 2400054760 bytes, 4596 GCs, 169560494/935354240 avg/max
>          bytes residency (11 samples), 2121M in use, 0.00 INIT (0.00
>          elapsed), 0.63 MUT (0.63 elapsed), 4.28 GC (4.29 elapsed) :ghc>>
>
>
> I then modified the assembly (a crude but effective way of testing
> this ;-)) to not push a stack frame:
>
> $ diff -u Test.s Test-modified.s
> --- Test.s	2012-08-01 11:30:00.000000000 +0200
> +++ Test-modified.s	2012-08-01 11:29:40.000000000 +0200
> @@ -56,20 +56,20 @@
>   	leaq -40(%rbp),%rax
>   	cmpq %r15,%rax
>   	jb .LcpZ
> -	addq $16,%r12
> -	cmpq 144(%r13),%r12
> -	ja .Lcq1
> -	movq $stg_upd_frame_info,-16(%rbp)
> -	movq %rbx,-8(%rbp)
> +	//addq $16,%r12
> +	//cmpq 144(%r13),%r12
> +	//ja .Lcq1
> +	//movq $stg_upd_frame_info,-16(%rbp)
> +	//movq %rbx,-8(%rbp)
>   	movq $ghczmprim_GHCziTypes_Izh_con_info,-8(%r12)
>   	movq $30000000,0(%r12)
>   	leaq -7(%r12),%rax
> -	movq %rax,-24(%rbp)
> +	movq %rax,-8(%rbp)
>   	movq 16(%rbx),%rax
> -	movq %rax,-32(%rbp)
> -	movq $stg_ap_pp_info,-40(%rbp)
> +	movq %rax,-16(%rbp)
> +	movq $stg_ap_pp_info,-24(%rbp)
>   	movl $base_GHCziEnum_zdfEnumInt_closure,%r14d
> -	addq $-40,%rbp
> +	addq $-24,%rbp
>   	jmp base_GHCziEnum_enumFromTo_info
>   .Lcq1:
>   	movq $16,192(%r13)
>
> Now it runs fast and slim (and did not crash on the first try, which I
> find surprising after hand-modifying the assembly code):
>
>          $ ./Test +RTS -t
>          0
>          60000000
>          <<ghc: 4800054840 bytes, 9192 GCs, 28632/28632 avg/max bytes
>          residency (1 samples), 1M in use, 0.00 INIT (0.00 elapsed), 0.73
>          MUT (0.73 elapsed), 0.04 GC (0.04 elapsed) :ghc>>
>
>
> My question is: Has anybody worked in that direction? And are there any
> fundamental problems with the current RTS implementation and such
> closures?

Long ago GHC had an "update analyser" that would detect some thunks
that would never be re-entered and omit the update frame on them. I
wrote a paper about this many years ago, and other people were working
on similar ideas, some using types (e.g. linear types); search for
"update avoidance". As I understand it, you want to omit some updates
in order to avoid space leaks, which is slightly different.
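
To make the distinction concrete, here is a minimal source-level sketch
of the two behaviours, assuming nothing about the RTS. The names
sumLastsShared, sumLastsRecomputed and mkList are hypothetical. The
unit-argument function only approximates a non-updated thunk, and with
optimisation enabled GHC's full-laziness transformation may float the
list back out and restore sharing (compiling with -fno-full-laziness is
one way to prevent that).

```haskell
-- Shared: `l` is an ordinary updatable thunk, so the list it evaluates to
-- stays reachable until the second traversal has finished.
-- Both functions assume n <= m; otherwise `last` fails on an empty list.
sumLastsShared :: Int -> Int -> Int
sumLastsShared n m =
  let l = [n .. m]
  in last l + last l

-- Recomputed: `mkList` is a function, so nothing is updated with the
-- evaluated list; each call builds a fresh list that can be collected
-- as it is consumed (provided the optimiser does not re-share it).
sumLastsRecomputed :: Int -> Int -> Int
sumLastsRecomputed n m =
  let mkList () = [n .. m]
  in last (mkList ()) + last (mkList ())
```

Both functions return the same value; the difference is that the
second trades extra allocation and work for lower residency, which is
the same trade visible in the two +RTS -t outputs quoted above.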

The StgSyn abstract syntax has an UpdateFlag on each StgRhs which lets
you turn off the update, and I believe the code generator will respect
it, although it is never actually turned off at the moment.
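
As a rough illustration of the idea, the following toy model shows what
a pass that flips such a flag could look like. It is a sketch under
stated assumptions, not GHC's code: the Rhs type, pushesUpdateFrame and
markNonUpdatable are hypothetical, and the three constructor names are
recalled from GHC sources of that era rather than checked against them.

```haskell
-- Toy stand-in for the flag carried by each STG right-hand side.
-- Constructor names are an assumption, recalled from GHC's StgSyn.
data UpdateFlag = ReEntrant | Updatable | SingleEntry
  deriving (Eq, Show)

-- Hypothetical, heavily simplified right-hand side: a binder name and its flag.
data Rhs = Rhs { rhsName :: String, rhsFlag :: UpdateFlag }
  deriving (Eq, Show)

-- In this model only Updatable closures get an update frame when entered.
pushesUpdateFrame :: Rhs -> Bool
pushesUpdateFrame rhs = rhsFlag rhs == Updatable

-- Hypothetical pass: turn off the update for the named bindings,
-- leaving every other binding untouched.
markNonUpdatable :: [String] -> [Rhs] -> [Rhs]
markNonUpdatable names = map mark
  where
    mark rhs
      | rhsName rhs `elem` names, rhsFlag rhs == Updatable =
          rhs { rhsFlag = ReEntrant }
      | otherwise = rhs
```

In GHCi, map pushesUpdateFrame (markNonUpdatable ["l"] [Rhs "l"
Updatable, Rhs "n" Updatable]) shows which bindings would still be
updated in this model.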

Cheers,
	Simon


