[Haskell-beginners] Re: Appending to a list
Heinrich Apfelmus
apfelmus at quantentunnel.de
Sun Feb 8 05:57:47 EST 2009
Francesco Bochicchio wrote:
> But, as you said, appending is expensive in haskell
We can say how expensive. Namely, fully evaluating
xs ++ ys
takes O(length xs) time. Always appending an element at the end will
lead to something like
(..((([1] ++ [2]) ++ [3]) ++ [4]) ++ ..) ++ [n]
which will take O(n^2) time instead of linear time. This is probably
best demonstrated with the following two implementations of reverse
reverse1 [] = []
reverse1 (x:xs) = reverse1 xs ++ [x]
reverse2 xs = go [] xs
where
go ys [] = ys
go ys (x:xs) = go (x:ys) xs
The first runs in quadratic time while second runs in linear time.
> so instead I build the lists using (:), which gives me the list in
> reverse order, and then I use the 'final step' of the function - the
> base case of the recursion - to reverse the result. If I understand
> laziness, this should be a low-cost operation because it does not
> actual build a reversed list, but only make the list to be
> 'consumed' from the tail.
No, reverse has nothing to do with laziness and still costs O(n) time.
It's just that building the list in linear time and then reversing it in
linear time is cheaper than building the list in quadratic time.
> Here is the code I was referring to:
>
> groups :: (a->Bool)-> [a] -> ([[a]],[a])
> groups f lst = groups_ f ([],[]) lst where
> groups_ f (gs,seps) [] = (reverse gs, reverse seps)
> groups_ f (gs,seps) [a] = if (f a)
> then ( reverse ( [a]:gs), reverse seps )
> else ( reverse gs , reverse (a:seps) )
> groups_ f (gs, seps) lst = groups_ f (g:gs, r:seps) est
> where
> (g, (r:est)) = groupWhile f lst
While using reverse instead of appending at the end of the list is a
good idea, not building the list in reverse at all is much better.
First, let me restate your code as
groups p xs = go ([],[]) xs
where
go (gs,seps) xs = let (g,rest) = span p xs in
case rest of
r:est -> go (g:gs, r:seps) est
[] -> (reverse (g:gs), reverse seps)
The groupWhile function can be found in the Prelude, it's called span
. This function groups differs from yours, for instance we have
groups odd [2,2,2] == ([[],[],[],[]],[2,2,2])
groups odd [1,2] == ([[1],[]],[2])
groups odd [1,1] == ([[1,1]],[])
By the way, your code crashes in the last case.
Now, let's reformulate it so that the result won't be built in reverse.
The key is to eliminate the unnecessary accumulating parameter and work
with the result of the recursive call to groups directly:
groups p xs = let (g,rest) = span p xs in
case rest of
r:est -> let (gs,seps) = groups p est in (g:gs,r:seps)
[] -> ([g], [])
I'd say that this is the canonical implementation.
In case you wrote this function to split strings, have a look at
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/split
Regards,
apfelmus
--
http://apfelmus.nfshost.com
More information about the Beginners
mailing list