[Haskell-cafe] Re: sort and lazyness (?)

Luke Palmer lrpalmer at gmail.com
Fri Dec 19 10:12:02 EST 2008


On Fri, Dec 19, 2008 at 7:58 AM, Daniel Kraft <d at domob.eu> wrote:

>
> Not having looked at your code, I think you are benefiting from
>> fusion/deforestation in the first three main functions. In this case,
>> although you may appear to be evaluating the entire list, in fact the
>> list elements can be discarded as they are generated, so functions like
>> length and reverse can run using constant space, rather than O(n) space.
>>
>
> How does reverse work in constant space?  At the moment I can't imagine it
> doing so; that's why I tried it, but of course you could be right.


No, you are correct, reverse is not constant space.  However, as Duncan
explained, reverse does not force any elements of the list, so even if you
had a list of elements which consumed 1Mb each (when fully evaluated), they
would not be forced so the memory would look exactly the same.


>
>  The sort function, however, requires that the entire list is retained,
>> hence greater memory usage. I also think you are optimistic in the
>> memory requirements of your 3 million element list. A list of Ints will
>> take a lot more than 4 bytes per element (on 32-bit machines) because
>> there's overhead for the list pointers, plus possibly the boxes for the
>> Ints themselves. I think there are 3 machine words for each list entry
>> (pointer to this element, pointer to next element, info-table pointer),
>> and 2 words for each Int, but I'm just guessing:
>> http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/HeapObje
>> cts
>>
>
> Of course that's the case, but the list being 3 million elements, and not,
> say 100 (which would still fit into memory for a simple C array of ints) I
> thought would make it possible.  Otherwise, how can one handle such amounts
> in data anyway?  Only using arrays?


Well, if you know the size beforehand and you know the whole thing needs to
fit into memory at the same time, an array is usually a better choice than a
list.  Lists are more like loops -- i.e. control flow rather than data,
whereas Arrays are definitely data.  I realize the imprecision of that
statement...

However, I am not sure what all this jabber about swapping is.  28 bytes/elt
* 3,000,000 elts = 84 Mb, which easily fits into a modern machine's memory.

There are a lot of traps for the unwary in memory performance though.
Depending on how things are defined, you may be computing *too* lazily,
building up thunks when you should just be crunching numbers.  Still,
swapping on this 3,000,000 element list is absurd, and we should look closer
into your example.  Post the rest (eg. the instances?)?

Luke


>
>  You might get some mileage by suggesting to GHC that your Fraction type
>> is strict e.g.
>>
>>  data Fraction = Fraction !Int !Int
>>>
>>
>> which might persuade it to unbox the Ints, giving some space savings.
>>
>
> I already tried so, but this doesn't change anything to the performance.  I
> will however try now to use the provided rational type, maybe this helps.
>
> Thanks for the answers,
> Daniel
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20081219/b2456f11/attachment.htm


More information about the Haskell-Cafe mailing list