[Haskell-beginners] Merge Sort Stack Overflow

Mon Sep 16 20:34:34 CEST 2013

On 16 September 2013 18:35, Florian Lammel <mail at florianlammel.com> wrote:
> Hi,

Hi Florian,

>
> I've just started learning Haskell and am trying to implement some basic
> algorithms.

I'm a beginner in Haskell as well so my advice may be corrected by
someone else :)

> My shot at merge sort looks like this:
>
> mergeSort :: Ord a => [a] -> [a]
> mergeSort [] = []
> mergeSort [x] = [x]
> mergeSort xs = merge (mergeSort as) (mergeSort bs)
>     where
>         (as, bs) = splitInHalf xs
>
> splitInHalf :: [a] -> ([a], [a])
> splitInHalf [] = ([], [])
> splitInHalf [x] = ([x], [])
> splitInHalf (x:y:xys) = (x:xs, y:ys)
>     where (xs, ys) = splitInHalf xys
>
> merge :: Ord a => [a] -> [a] -> [a]
> merge xs [] = xs
> merge [] ys = ys
> merge (x:xs) (y:ys) = if x < y
>                       then x:(merge xs (y:ys))
>                       else y:(merge ys (x:xs))
>
> As far as I can tell, my implementation is more or less the same as on
> rosetta code [0], literate programs [1] and several stack overflow questions
> [2][3].

Well actually the Rosetta code link shows 3 different implementations
and yours is the first of the three. Did you try the second one (the
bottom up merge)? This is also what's used in the answer to the SO
post [3] and according to the author there it's roughly what's used by
ghc for Data.List.sort so it's presumably a good algorithm.

> The algorithm works, but for large lists (for example, 500k random numbers),
> I get a stack overflow.

I'm not sure how this gets optimised but your call-stack is
essentially as long as the input list. That's only going to work with
such big inputs if the recursion is somehow optimised which I think
happens because of lazy evaluation. However this depends on your merge
function pulling a similar number of items from the 'as' list as from
the 'bs' list. Otherwise (I think) you'll have a whole load of
splitInHalf frames in the stack and maybe that gives your overflow.

I may be *entirely* wrong but I think the problem comes from returning
two lists from a function as you do in splitInHalf but then not
consuming the same number of items from each at any one time. I don't
think this can be optimised in quite the same lazy manner. It would be
fine if you consumed from both lists in unison as in e.g. a merge-sum
algorithm.

> So my question is: How can I change my code so that it works on larger
> inputs? Or, more generally, how can I write recursive algorithms in
> functional languages that require more nested function calls than fit in the
> stack?

Have you tried the other algorithm from e..g Rosetta code? That one
looks like it would work better in a lazy evaluation context since it
just works sequentially on a list of lists.

Oscar