"+RTS -A" parameter and CPU cache size

Bulat Ziganshin bulat.ziganshin at gmail.com
Fri Jun 16 08:27:17 EDT 2006


Hello Simon,

Friday, June 16, 2006, 3:48:06 PM, you wrote:

>> char *ghc_rts_opts = "-A10m";
> Do you have some evidence that -A10m is a good default?  Better than
> -A6m, or -A16m, for example?  GHC currently runs with -H6m by default.

of course, "-A6m" is no worse than "-A10m". but "-H" is not a good
alternative, because once memory usage grows above 6 mb, ghc goes back
to doing a minor GC after each 256 kb allocated. with "-A6m" the number
of minor GCs is cut dramatically
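
To make the difference concrete, here is a rough sketch of the arithmetic. It assumes the simplest possible model, one minor GC each time the allocation area fills up, and ignores everything else the RTS does. The function name and the 1.5 gb total allocation figure are made up for illustration; they are not measurements.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model (an assumption, not the actual RTS logic):
 * one minor GC happens each time the allocation area fills up,
 * so the count is total allocation divided by the area size. */
uint64_t minor_gc_count(uint64_t total_alloc_bytes, uint64_t area_bytes)
{
    assert(area_bytes > 0);
    return total_alloc_bytes / area_bytes;
}
```

Under this model, a program that allocates 1.5 gb in total does 6144 minor GCs with a 256 kb allocation area and 256 with a 6 mb one.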

i just quickly tested a non-optimized compilation of my program with
"-A6m" and with "-H6m" (ghc 6.4.2.20060609): wall clock time of
compilation was 63 seconds with "-A6m" against 76 seconds with "-H6m",
while memory usage was 70 mb against 68 mb

btw, now i use "+RTS -c" in my compilations - it works fine. i'm happy :)

>> 2) i propose to write "L2 cache size detection" code and use it in GHC
>> 6.6 RTS to setup initial value of "-A" option. in order to allow program
>> tune itself to any cpu architecture, with cache sizes ranging from
>> 128kb to 4mb. this will allow low-level cpus to run significantly
>> faster on some algorithms (up to 2x, as i said above) and can give
>> 5-10% speedup for high-level cpus, that is also not so bad :)

> That sounds like a good plan.  I've been experimenting with PAPI 
> recently (http://icl.cs.utk.edu/papi/) which can tell you the size of 
> your caches, but it requires kernel patches.

for me, Windows compatibility is the most important issue :)
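
A minimal sketch of what such detection could look like without PAPI. Everything here is an assumption for illustration, not existing RTS code: the function names are hypothetical, the Windows branch uses GetLogicalProcessorInformation (only present on newer Windows versions), the other branch uses the glibc-specific sysconf(_SC_LEVEL2_CACHE_SIZE), and the 256 kb fallback simply mirrors the minor GC interval mentioned above. The 128 kb to 4 mb clamp is the cache size range from the proposal.

```c
#include <stdlib.h>

#if defined(_WIN32)
#include <windows.h>
#else
#include <unistd.h>
#endif

/* Returns the L2 cache size in bytes, or 0 if it cannot be determined. */
long detect_l2_cache_bytes(void)
{
#if defined(_WIN32)
    DWORD len = 0;
    long size = 0;
    SYSTEM_LOGICAL_PROCESSOR_INFORMATION *buf;

    /* The first call fails on purpose and reports the buffer size needed. */
    GetLogicalProcessorInformation(NULL, &len);
    if (len == 0)
        return 0;
    buf = malloc(len);
    if (buf == NULL)
        return 0;
    if (GetLogicalProcessorInformation(buf, &len)) {
        DWORD i, n = len / sizeof *buf;
        for (i = 0; i < n; i++)
            if (buf[i].Relationship == RelationCache && buf[i].Cache.Level == 2)
                size = (long)buf[i].Cache.Size;
    }
    free(buf);
    return size;
#elif defined(_SC_LEVEL2_CACHE_SIZE)
    /* glibc extension; may report 0 or -1 when the size is unknown. */
    long v = sysconf(_SC_LEVEL2_CACHE_SIZE);
    return v > 0 ? v : 0;
#else
    return 0;
#endif
}

/* Picks an allocation area size from the detected L2 size:
 * hypothetical 256 kb fallback when detection fails, otherwise
 * the cache size clamped to the 128 kb .. 4 mb range. */
long choose_alloc_area(long l2_bytes)
{
    const long min_area = 128L * 1024;
    const long max_area = 4L * 1024 * 1024;

    if (l2_bytes <= 0)
        return 256L * 1024;
    if (l2_bytes < min_area)
        return min_area;
    if (l2_bytes > max_area)
        return max_area;
    return l2_bytes;
}
```

The result of choose_alloc_area(detect_l2_cache_bytes()) would then be used as the initial "-A" value. Whether the allocation area should be the whole L2 size or some fraction of it is something that needs measuring; the sketch just takes the cache size as is.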



-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin at gmail.com


