Unpacking multi-type constructors

Wed Jul 22 13:56:39 EDT 2009

Some very rough benchmarks: folding over these two implementations of a
binary tree
data Node1 a = Node1 a (Maybe (Node1 a)) (Maybe (Node1 a))

data Node2 a = Node2A a | Node2B a (Node2 a) | Node2C a (Node2 a) | Node2D a
(Node2 a) (Node2 a)

with the latter of Simon's examples gives the following measurements when
performing 100 traversals over randomly generated trees of size 3000:
                min    mean    +/-sd  median    max
Packed:    12.001  16.001   4.000  16.001  24.002
Unpacked:  12.000  13.144   1.952  12.001  16.001

(I'm not entirely sure what I'm doing, as this is my first contribution to
the compiler proper, and pointers as to how to obtain more convincing
benchmarks would be appreciated.)

My motivation is more philosophical than anything else, though:

   - The behavior of unpacking several multi-constructor types can be highly
   useful to the programmer, and is also (precisely because of the exponential
   growth) difficult and time-consuming for the programmer to duplicate by
   hand.
   - Currently, UNPACK pragmas have *no effect* when used on
   multi-constructor types, and give no warnings or other hints that they are
   having no effect.  There is currently no reason for a programmer to use an
   UNPACK pragma on a multi-constructor type, so I would not expect to see code
   bloat occurring in older programs.  The primary exception to this would be
   in programmers that use the (already regarded as a sledgehammer)
   -funbox-strict-fields option.
   - As it stands, it is already the programmer's burden to know how many
   constructors an UNPACK'd type has, because the pragma will only have any
   effect if it is a single-constructor type.  The effect of the proposal is to
   attach consequences to UNPACKing multi-constructor types, and placing the
   burden on the programmer to be sure that exponentially great code expansion
   does not get out of hand.

The first two bullets are really the ones that grate on me -- that use of
the UNPACK pragma in this fashion on multi-constructor types could be
seriously useful to a programmer, but currently has no effect, and as a
result, I would expect that implementing this proposal wouldn't cause
problems in old programs and could be useful in new ones.

I think I'm making sense.  Would anyone else care to chime in?

Louis Wasserman
wasserman.louis at gmail.com

On Tue, Jul 21, 2009 at 6:38 PM, Don Stewart <dons at galois.com> wrote:

> Might be interesting to try the transformation manually and benchmark?
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/glasgow-haskell-users/attachments/20090722/841cbf21/attachment.html