[Haskell-beginners] Re: Appending to a list

Mon Feb 9 05:36:51 EST 2009

Francesco Bochicchio wrote:
> Heinrich Apfelmus wrote:
>> 
>> No,  reverse  has nothing to do with laziness and still costs O(n)
>> time. It's just that building the list in linear time and then
>> reversing it in linear time is cheaper than building the list in
>> quadratic time.
> 
> I see. Thanks for the information. I keep thinking to haskell lists
> as they are 'generators', but this is not always the case ... and I
> realize now that there is no way to reverse the list without
> expanding it first.

Well, they are 'generators' from an operational point of view, in that
elements are generated on demand. But other than that, they are just a tree

     :
    / \
   1   :
      / \
     2   :
        / \
       3  ...

>> Now, let's reformulate it so that the result won't be built in
>> reverse. The key is to eliminate the unnecessary accumulating
>> parameter and work with the result of the recursive call to  groups
>> directly:
>> 
>>    groups p xs = let (g,rest) = span p xs in
>>        case rest of
>>            r:est -> let (gs,seps) = groups p est in (g:gs,r:seps)
>>            []    -> ([g], [])
>> 
>> I'd say that this is the canonical implementation.
> 
> The reason I used accumulators is that I try to teach myself to
> always write tail-recursive functions. I have been bitten by stack
> exhaustion in one of my first exercises (working on large amount of
> data), so I thought that non tail-recursive functions should be
> always avoided. But using accumulators often leads to appending to
> lists, so one has to find the balance between the two things...

Due to lazy evaluation, something that looks tail recursive is not
necessarily tail recursive at all; I strongly advise to unlearn your
habit. For example, the version of  groups  that uses an accumulator is
less efficient that the one without.

In general, a lazy approach to performance is best: just use the
simplest possible implementation and postpone performance considerations
to when the program actually takes ages to compute the result.

Concerning stack exhaustion and large amounts of data, there is one very
common pattern worth knowing, namely foldr vs. foldl' , see also

  http://en.wikibooks.org/wiki/Haskell/Performance_Introduction#Space
  http://book.realworldhaskell.org/read/functional-programming.html

Other than that, there is no way around understanding Haskell's
evaluation model properly.

Regards,
apfelmus

-- 
http://apfelmus.nfshost.com