[Haskell-cafe] Honest constant-space mergesort

Frank Staals frank at fstaals.net
Sat Jan 12 21:59:33 UTC 2019


Ryan Reich <ryan.reich at gmail.com> writes:

> I am, for entertainment, trying to figure out how to implement a
> constant-space bottom-up mergesort in Haskell.  This is quite easy to at
> least describe imperatively: break the list into ordered blocks, merge
> adjacent ones, and repeat with double the length until there's only one
> block and you're done.  For a linked list like [a] in Haskell, this should
> be constant-space.  Unfortunately, it is not.
>
> You can definitely express the above algorithm nicely if you organize those
> blocks into an [[a]], but then you are creating O(n) new lists.  Even if
> you go to extremes to keep the blocks concatenated, though, you still have
> the issue that every time you merge two lists, you are creating a new list
> and that involves the allocator.  (Note that the implementation of
> Data.List.sort does this.)
>
> The C equivalent would proceed by just overwriting the linking pointers in
> each node, without allocating new nodes to do so, but of course, that's not
> going to work in Haskell.  I considered using something like STRef to
> represent the links (so, departing from the plain [a] type) but even then,
> if you overwrite the contents of one of those, the garbage collector still
> needs to know.  This is better than allocating more space, but it is not
> necessary because the algorithm never actually makes any of the nodes
> unreachable.  Also, I understand that using a lot of STRefs is bad for
> efficiency.
>
> Is there any way to do this in (GHC) Haskell and get the algorithm down to
> the bare minimum?  If there is, is it any different than just "writing C in
> Haskell"?  That is, can one coerce GHC into emitting the mutable-link
> operations and not doing collections, from high-level code?  Although I've
> definitely seen examples of algorithms that compile down to such unboxed
> depths, this one has been really resistant to improvement.
>
> Thanks,
> Ryan Reich
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.

If you want your sort function to have type: Ord a => [a] -> [a] then I
would think that you'd need to at least n new space anyway (where n is
the length of the list), since you cannot actually overwrite the input
list anyway. 

--

- Frank


More information about the Haskell-Cafe mailing list