[Haskell-beginners] Re: Appending to a list

Sun Feb 8 05:57:47 EST 2009

Francesco Bochicchio wrote:
> But, as you said, appending is expensive in haskell

We can say how expensive. Namely, fully evaluating

   xs ++ ys

takes O(length xs) time. Always appending an element at the end will
lead to something like

   (..((([1] ++ [2]) ++ [3]) ++ [4]) ++ ..) ++ [n]

which will take O(n^2) time instead of linear time. This is probably
best demonstrated with the following two implementations of  reverse

   reverse1 []     = []
   reverse1 (x:xs) = reverse1 xs ++ [x]

   reverse2 xs     = go [] xs
       where
       go ys []     = ys
       go ys (x:xs) = go (x:ys) xs

The first runs in quadratic time while second runs in linear time.

> so instead I build the lists using (:), which gives me the list in
> reverse order, and then I use the 'final step' of the function - the
> base case of the recursion - to reverse the result. If I understand
> laziness, this should be a low-cost operation because it does not
> actual build a reversed list, but only make the list to be
> 'consumed' from the tail.

No,  reverse  has nothing to do with laziness and still costs O(n) time.
It's just that building the list in linear time and then reversing it in
linear time is cheaper than building the list in quadratic time.

> Here is the code I was referring to:
>
> groups :: (a->Bool)-> [a] -> ([[a]],[a])
> groups f lst = groups_ f ([],[]) lst where
>     groups_ f (gs,seps) [] = (reverse gs, reverse seps)
>     groups_ f (gs,seps) [a] = if (f a)
>                               then ( reverse ( [a]:gs), reverse seps )
>                               else ( reverse gs , reverse (a:seps) )
>     groups_ f (gs, seps) lst = groups_ f (g:gs, r:seps) est
>         where
>           (g, (r:est)) = groupWhile f lst

While using  reverse  instead of appending at the end of the list is a
good idea, not building the list in reverse at all is much better.

First, let me restate your code as

    groups p xs = go ([],[]) xs
        where
        go (gs,seps) xs = let (g,rest) = span p xs in
            case rest of
                r:est -> go (g:gs, r:seps) est
                []    -> (reverse (g:gs), reverse seps)

The  groupWhile  function can be found in the Prelude, it's called  span
. This function  groups  differs from yours, for instance we have

  groups odd [2,2,2] == ([[],[],[],[]],[2,2,2])
  groups odd [1,2]   == ([[1],[]],[2])
  groups odd [1,1]   == ([[1,1]],[])

By the way, your code crashes in the last case.

Now, let's reformulate it so that the result won't be built in reverse.
The key is to eliminate the unnecessary accumulating parameter and work
with the result of the recursive call to  groups  directly:

    groups p xs = let (g,rest) = span p xs in
        case rest of
            r:est -> let (gs,seps) = groups p est in (g:gs,r:seps)
            []    -> ([g], [])

I'd say that this is the canonical implementation.

In case you wrote this function to split strings, have a look at

  http://hackage.haskell.org/cgi-bin/hackage-scripts/package/split

Regards,
apfelmus

-- 
http://apfelmus.nfshost.com