newtypes and optimization

Simon Peyton-Jones simonpj at microsoft.com
Thu Dec 13 05:31:13 EST 2007


| However when I do this:
|
| > newtype Quaternion = Q (Vec4 Double)
|
| Everything is ruined. Functions like peek and vadd are no longer inlined,
| intermediate linked lists are created all over the place. The Quaternion
| Storable instance looks like this

This turns out to be a performance bug in 6.8 that I fixed a couple of weeks ago in the HEAD, but didn't merge into the 6.8 branch.  (Implication constraints aren't getting INLINE pragmas.)
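
For readers without the original VecMath.hs to hand, here is a minimal sketch of the kind of code being discussed: a newtype whose Storable instance just delegates to the instance of the type it wraps. Only the newtype declaration comes from the quoted message. The Vec4 definition, both instances and the roundTrip helper are assumptions made up for illustration, and the INLINE pragmas only show where a user would normally ask for inlining; they are not the compiler-side fix described above.

```haskell
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (castPtr)
import Foreign.Storable (Storable (..))

-- Hypothetical stand-in for the Vec4 type of the original VecMath module.
data Vec4 a = Vec4 !a !a !a !a
  deriving (Eq, Show)

-- Helper used only to name the element type; its result is never evaluated.
elemOf :: Vec4 a -> a
elemOf _ = undefined

instance Storable a => Storable (Vec4 a) where
  sizeOf v = 4 * sizeOf (elemOf v)
  alignment v = alignment (elemOf v)
  {-# INLINE peek #-}
  peek p = do
    let q = castPtr p
    a <- peekElemOff q 0
    b <- peekElemOff q 1
    c <- peekElemOff q 2
    d <- peekElemOff q 3
    return (Vec4 a b c d)
  {-# INLINE poke #-}
  poke p (Vec4 a b c d) = do
    let q = castPtr p
    pokeElemOff q 0 a
    pokeElemOff q 1 b
    pokeElemOff q 2 c
    pokeElemOff q 3 d

-- The newtype from the quoted message.
newtype Quaternion = Q (Vec4 Double)
  deriving (Eq, Show)

-- The instance simply forwards to the Vec4 Double instance.
instance Storable Quaternion where
  sizeOf _ = sizeOf (undefined :: Vec4 Double)
  alignment _ = alignment (undefined :: Vec4 Double)
  {-# INLINE peek #-}
  peek p = fmap Q (peek (castPtr p))
  {-# INLINE poke #-}
  poke p (Q v) = poke (castPtr p) v

-- Write a quaternion to temporary memory and read it back.
roundTrip :: Quaternion -> IO Quaternion
roundTrip q = alloca $ \p -> poke p q >> peek p
```

Since a newtype has no runtime representation, one would hope the Quaternion version compiles to the same code as using Vec4 Double directly; comparing the "bytes allocated in the heap" line of +RTS -sstderr for the two variants is the check used in the runs below.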

With the HEAD we get the results below, which should make you happy.  The HEAD allocates only about 9 kbytes with both -DSLOW and -DFAST, whereas 6.8 allocates about 21 kbytes with -DFAST (and is off the map with -DSLOW).

I guess we should get this patch into the 6.8 branch.

Simon


$gpj --make -O2 -DFAST Test -o Test-fast
[1 of 2] Compiling VecMath          ( VecMath.hs, VecMath.o )
NOTE: Simplifier still going after 4 iterations; bailing out.  Size = 7311
[2 of 2] Compiling Main             ( Test.hs, Test.o )
Linking Test-fast ...
bash-3.1$ rm -f *.o
bash-3.1$ $gpj --make -O2 -DSLOW Test -o Test-slow
[1 of 2] Compiling VecMath          ( VecMath.hs, VecMath.o )
NOTE: Simplifier still going after 4 iterations; bailing out.  Size = 7311
[2 of 2] Compiling Main             ( Test.hs, Test.o )
Linking Test-slow ...
bash-3.1$ time ./Test-fast +RTS -sstderr
./Test-fast +RTS -sstderr
      9,432 bytes allocated in the heap
        552 bytes copied during GC (scavenged)
          0 bytes copied during GC (not scavenged)
     32,768 bytes maximum residency (1 sample(s))

          1 collections in generation 0 (  0.00s)
          1 collections in generation 1 (  0.00s)

          1 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    5.88s  (  6.87s elapsed)
  GC    time    0.00s  (  0.01s elapsed)
  EXIT  time    0.00s  (  0.01s elapsed)
  Total time    5.88s  (  6.88s elapsed)

  %GC time       0.0%  (0.1% elapsed)

  Alloc rate    1,605 bytes per MUT second

  Productivity 100.0% of total user, 85.4% of total elapsed


real    0m6.973s
user    0m5.880s
sys     0m0.956s
bash-3.1$ time ./Test-slow +RTS -sstderr
./Test-slow +RTS -sstderr
      9,432 bytes allocated in the heap
        552 bytes copied during GC (scavenged)
          0 bytes copied during GC (not scavenged)
     32,768 bytes maximum residency (1 sample(s))

          1 collections in generation 0 (  0.00s)
          1 collections in generation 1 (  0.00s)

          1 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    5.90s  (  6.83s elapsed)
  GC    time    0.00s  (  0.03s elapsed)
  EXIT  time    0.00s  (  0.03s elapsed)
  Total time    5.90s  (  6.86s elapsed)

  %GC time       0.0%  (0.4% elapsed)

  Alloc rate    1,597 bytes per MUT second

  Productivity 100.0% of total user, 86.0% of total elapsed


real    0m6.958s
user    0m5.904s
sys     0m1.004s
bash-3.1$

ghc --make -O2 -DFAST Test -o Test-682
[1 of 2] Compiling VecMath          ( VecMath.hs, VecMath.o )
[2 of 2] Compiling Main             ( Test.hs, Test.o )
Linking Test-682 ...
bash-3.1$ time ./Test-682 +RTS -sstderr
./Test-682 +RTS -sstderr
     21,752 bytes allocated in the heap
        552 bytes copied during GC (scavenged)
          0 bytes copied during GC (not scavenged)
     32,768 bytes maximum residency (1 sample(s))

          1 collections in generation 0 (  0.00s)
          1 collections in generation 1 (  0.00s)

          1 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    5.77s  (  6.69s elapsed)
  GC    time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    5.77s  (  6.69s elapsed)

  %GC time       0.0%  (0.0% elapsed)

  Alloc rate    3,770 bytes per MUT second

  Productivity 100.0% of total user, 86.2% of total elapsed


real    0m6.787s
user    0m5.768s
sys     0m1.016s
bash-3.1$

