Why are strings linked lists?

Koen Claessen koen at cs.chalmers.se
Fri Nov 28 12:10:46 EST 2003


Wolfgang Jeltsch wrote:

 | > 1.  Today I spend a few hours trying to track down a memory leak.  It
 | >     turns out I just didn't realize how much space a string takes up.
 | >     On my machine "replicate 5000000 'a'" will use 90MB of space!
 |
 | You have to take into account that Chars (in GHC) take 4
 | bytes of memory because they denote Unicode codepoints.
 | 5,000,000 times 4 bytes is already 20 MB. (The rest is
 | only a constant factor. ;-))

You have to realize that the space usage does not
(necessarily) come from duplicating the character 'a'.
In a lazy implementation, the representation of that
character will be shared by all elements in the list.

So, what is happening that there is 1 cell in the heap
containing the representation of 'a', and then a linked list
of length 5000000, where each element points to that cell.

Just my 2 öre.

/Koen


More information about the Haskell mailing list