[Haskell-cafe] Re: External Sort and unsafeInterleaveIO
apfelmus
apfelmus at quantentunnel.de
Tue Jul 17 14:30:12 EDT 2007
Ben wrote:
> a haskell newbie here, searching for comments and wisdom on my code.
>
> i had a project to try to implement "external sort" in haskell as a
> learning exercise. (external sort is sorting a list that is too large
> to fit in main memory, by sorting in chunks, spooling to disk, and
> then merging. more properly there probably should be multiple stages,
> but for simplicity i'm doing a one-stage external sort.)
>
> i'd appreciate any testing, comments and suggestions from the haskell
> gods out there.
I'm not a god but I like it very much :) Especially because it's
laziness in action.
> blocks <- mapM (\x -> do {bs <- hGetContentsWithCursor h x;
> return $ Bin.decode bs}) idx
(Minuscule cosmetics:
blocks <- mapM ((liftM Bin.decode) . hGetContentsWithCursor h) idx
)
> Merging chunks. K-way merge (and in fact external sort in general) is
> detailed in Knuth, where he recommends tournament trees. The easiest
> thing is to probably use one of Okasaki's heaps. I'll use splay
> heaps, because I don't know any better.
>
> It would be better if I changed Ord for blocks to only check the first
> element.
>
>> kMerge :: (Ord a) => [[a]] -> [a]
>> kMerge [] = []
>> kMerge l =
>> let h = Splay.fromSeq l in
>> kM (Splay.minElem h) (Splay.deleteMin h)
>> where
>> kM :: (Ord a) => [a] -> Splay.Heap [a] -> [a]
>> kM l h
>> | h == Splay.empty = l
>> | otherwise =
>> let next = Splay.minElem h
>> (f, b) = span (\x -> x <= head next) l
>> in
>> f ++ (kM next (if null b then Splay.deleteMin h
>> else (Splay.insert b $ Splay.deleteMin h)))
>>
>> kMergeSort :: (Ord a) => [a] -> [a]
>> kMergeSort l = kMerge $ blockify blocksize l
Oh, I would have expected a lazy mergesort here. Internally, this will
work like a tournament heap. See also
http://article.gmane.org/gmane.comp.lang.haskell.cafe/24180
Regards,
apfelmus
More information about the Haskell-Cafe
mailing list