efficiency question

Keith Wansbrough Keith.Wansbrough@cl.cam.ac.uk
Mon, 11 Feb 2002 14:34:23 +0000


Hal Daume III <hdaume@ISI.EDU> writes:

> So instaed of storing 1 and 2 in the list, it stores pointers to 
> them.  The reason it does this is so that it makes it easy to generate    
> code for polymorhpic functions.  (,) being boxed means that instead of    
> storing a pair of element, you're storing a pair of pointers.  That means
> you get worse performance because you have to chase more pointers.

Not quite: (,) being boxed means that the *tuple* is boxed, not that 
its elements are boxed.  Only boxed things can live on the heap.  
Unboxed things cannot live on the heap, but can live in registers.

Ints are represented in GHC by the data type

data Int = I# Int#

where Int# is the type of *unboxed integers*.  The I# constructor forms 
the box that allows the Int# to live on the heap (since it can't live 
on the heap on its own, only in a register):

  +----+----+
  | I# | 42 |   is a heap object of (boxed) type Int.
  +----+----+

Now the representation you want for lists is something like:

data MyList = Nil | Cons Int# MyList

which would have elements like this:

  +------+---+---+  +------+---+---+  +------+---+---+  +-----+
  | Cons | 1 | *--->| Cons | 1 | *--->| Cons | 1 | *--->| Nil |
  +------+---+---+  +------+---+---+  +------+---+---+  +-----+

like you want.

You can't have an unboxed list, since things of unbounded size must
live on the heap and unboxed lists can't live on the heap.


I hope this helps, and isn't just more confusing.

In summary: (,) being boxed means that *it lives on the heap*, in its
own box.  If you want the *elements* to be stored directly in their
cells, without a box, then you should make the elements have unboxed
type, not the container.

HTH.

--KW 8-)
-- 
Keith Wansbrough <kw217@cl.cam.ac.uk>
http://www.cl.cam.ac.uk/users/kw217/
University of Cambridge Computer Laboratory.