Working character by character in Haskell

19 Oct 2001 11:17:33 +0200

"Simon Marlow" <simonmar@microsoft.com> writes:

> Well, in Haskell each character of the string takes 20 bytes: 12 bytes
> for the list cell, and 8 bytes for the character itself 

Why does a list cell consume as much as 12 bytes?  Two pointers (data
and next) and a 32-bit tag field, perhaps?  And a 64-bit minimum
allocation quantity, accounting for the 8-byte character?

Isn't it possible to optimize this, e.g. by embedding small data
directly in the cons cell?  21 bits for a Unicode character should
leave enough bits for tagging, shouldn't it?

(Since I'm halfway planning to use Haskell next Spring to process long
lists of data with a small set of values (for instance:

        data Base = A | C | G | T

) I'm curious about the performance.)
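
To make the comparison concrete, here is a minimal sketch of one alternative
to a plain list: packing the values into an unboxed array with one byte per
element. It assumes GHC with the array library (Data.Array.Unboxed); the
helper names packBases and baseAt are made up for illustration, and the
derived Enum instance is only there to map constructors to small integers.
I have not measured any of this.

```haskell
import Data.Array.Unboxed (UArray, listArray, bounds, (!))
import Data.Word (Word8)

data Base = A | C | G | T
  deriving (Show, Eq, Enum, Bounded)

-- Hypothetical helper: store each base as one byte in an unboxed array,
-- so there is no per-element list cell or pointer.
packBases :: [Base] -> UArray Int Word8
packBases bs =
  listArray (0, length bs - 1) (map (fromIntegral . fromEnum) bs)

-- Hypothetical helper: read the base at index i back out of the array.
-- Assumes the array was built by packBases, so every byte is in 0..3.
baseAt :: UArray Int Word8 -> Int -> Base
baseAt arr i = toEnum (fromIntegral (arr ! i))

main :: IO ()
main = do
  let packed = packBases [G, A, T, T, A, C, A]
  -- Round-trip: unpack every element and print the result.
  print (map (baseAt packed) [0 .. snd (bounds packed)])
```

The trade-off is that the array is strict and fixed in size, so the laziness
and cheap consing of an ordinary [Base] are lost. Two bits per base would be
enough in principle, but a byte per element keeps the indexing simple.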

-kzm
-- 
If I haven't seen further, it is by standing in the footprints of giants