[Haskell-cafe] Re: "no-coding" functional data structures via lazyness

Tue Jul 10 05:47:46 EDT 2007

Dave Bayer wrote:
> Learning Haskell, the Prelude.ShowS type stood out as odd, exploiting
> the implementation of lazy evaluation to avoid explicitly writing an
> efficient concatenable list data structure.

ShowS has nothing to do with lazy evaluation, the same trick can be done
in a call-by-value language. The idea is to represent a string xs as a
context (xs ++ •).

>> merge xs@(x:xt) ys@(y:yt) = case compare x y of
>>     LT -> x : (merge xt ys)
>>     EQ -> x : (merge xt yt)
>>     GT -> y : (merge xs yt)
>>
>> diff xs@(x:xt) ys@(y:yt) = case compare x y of
>>     LT -> x : (diff xt ys)
>>     EQ -> diff xt yt
>>     GT -> diff xs yt
>>
>> merge' (x:xt) ys = x : (merge xt ys)
>>
>> primes = ps ++ (diff ns $ foldr1 merge' $ map f $ tail primes)
>>     where ps  = [2,3,5]
>>           ns  = [7,9..]
>>           f p = [ m*p | m <- [p,p+2..]]
> 
> The code is very fast for its size; I haven't seen Haskell code posted
> on the web that comes close, and it is faster than any of my other tries
> (I posted this code to http://www.haskell.org/haskellwiki/Prime_numbers).
> Effectively, it steals a heap data structure out of thin air by
exploiting the
> implementation of lazy evaluation. It would seem that GHC's core data
> structures are coded closer to the machine that anything I can write
> _in_ Haskell. So much for studying how to explicitly write a good heap!

While your observation that merge may create an implicit heap is true,
it doesn't happen in your code :) When unfolding the foldr1, we get
something like

  2:.. `merge'` (3:.. `merge'` (5:.. `merge1` (...)))

i.e. just a linear chain of merges. Retrieving the least element is
linear time in the worst case. This shape will not change with
subsequent reductions of  merge. In other words, it's the responsibility
of  fold  to build a heap. Mergesort shows how a fold can build a heap:

  http://thread.gmane.org/gmane.comp.lang.haskell.general/15007

For  primes , the heap shape has to be chosen carefully in order to
ensure termination. It's the same problem that forces you to use  foldr1
merge'  instead of  foldr1 merge .

There's also a long thread about prime sieves

  http://thread.gmane.org/gmane.comp.lang.haskell.cafe/19699

Regards,
apfelmus