Why are strings linked lists?

Wolfgang Jeltsch wolfgang at jeltsch.net
Fri Nov 28 12:37:30 EST 2003


Am Freitag, 28. November 2003 12:10 schrieb Koen Claessen:
> Wolfgang Jeltsch wrote:
>  | > 1.  Today I spend a few hours trying to track down a memory leak.  It
>  | >     turns out I just didn't realize how much space a string takes up.
>  | >     On my machine "replicate 5000000 'a'" will use 90MB of space!
>  |
>  | You have to take into account that Chars (in GHC) take 4
>  | bytes of memory because they denote Unicode codepoints.
>  | 5,000,000 times 4 bytes is already 20 MB. (The rest is
>  | only a constant factor. ;-))
>
> You have to realize that the space usage does not
> (necessarily) come from duplicating the character 'a'.
> In a lazy implementation, the representation of that
> character will be shared by all elements in the list.
>
> So, what is happening that there is 1 cell in the heap
> containing the representation of 'a', and then a linked list
> of length 5000000, where each element points to that cell.

Yes, you're right. But if you choose the array alternative, you cannot use 
sharing and would, therefore, still need 20 MB.

> Just my 2 öre.
>
> /Koen

Wolfgang



More information about the Haskell mailing list