[Haskell-cafe] excercise - a completely lazy sorting algorithm
Petr Pudlak
deb at pudlak.name
Fri Jul 17 14:56:49 EDT 2009
Hi all,
I apologize that I didn't react to your posts, I was on a vacation. (BTW, if
you ever come to Slovakia, I strongly recommend visiting Mala (Lesser) Fatra
mountains. IMHO it's more beautiful than more-known Tatra mountains.)
Thanks for your interest and many intriguing ideas. Especially, I like
cata-/ana-/hylo-morphisms, it looks to me as a very useful concept to learn.
I hope I'll manage to create my own version of the sorting algorithm based on
your advices. Maybe I'll also try to do some real benchmarks, if I have time.
-Petr
On Tue, Jul 07, 2009 at 02:49:08AM +0200, Matthias Görgens wrote:
> The "sorted array of bags of unsorted input" is a nice idea. However,
> you have to use the data structure in a single-threaded [1] fashion to
> obtain the claimed bounds.
>
> Here's a pure solution that uses amortization and laziness.
>
> > import qualified Data.Sequence as S
> > import Data.Sequence ((><))
> > import Data.Foldable
> > import Data.Monoid
>
> Suppose we have a function to find the the median of a list, and
> partition it into three sublists: Smaller than the median, equal to
> the media, larger than the median. That function should run in linear
> time.
>
> > partitionOnMedian :: forall a. (Ord a) => (S.Seq a) -> BTreeRaw a (S.Seq a)
>
> where the following data structure holds the sublists and some
> bookkeeping information:
>
> > data BTreeRaw a m = Leaf
> > | Node {cmp::(a->Ordering)
> > , lN :: Int
> > , less::m
> > , eq :: (S.Seq a)
> > , gN :: Int
> > , greater::m
> > }
>
> where 'lN' and 'gN' are the length of 'less' and 'greater'.
>
> We can make BTreeRaw a functor:
>
> > instance Functor (BTreeRaw a) where
> > fmap f Leaf = Leaf
> > fmap f (Node c lN l e gN g) = Node c lN (f l) e gN (f g)
>
> Now using a fixed-point construction we can bootstrap a sorting
> algorithm from partitionOnMedian:
>
> > data Fix m = Fix {unfix :: (m (Fix m))}
> > type BTree a = Fix (BTreeRaw a)
>
> > treeSort :: forall a. (Ord a) => S.Seq a -> BTree a
> > treeSort = Fix . helper . partitionOnMedian
> > where helper = fmap (Fix . helper . partitionOnMedian)
>
> Now treeSort produces the thunk of a balanced binary search tree. Of
> course we can get a sorted list out of it (forcing the whole
> structure):
>
> > flatten :: BTree a -> S.Seq a
> > flatten (Fix Leaf) = S.empty
> > flatten (Fix (Node _ lN l e gN g)) = flatten l >< e >< flatten g
>
> > mySort = flatten . treeSort
>
> But we can also get elements efficently, forcing only a linear amount
> of comparisions in the worst case:
>
> > index :: BTree a -> Int -> a
> > index (Fix Leaf) _ = error "tried to get an element of Leaf"
> > index (Fix (Node lN l e gN g)) i | i < lN
> > = index l i
> > | i - lN < S.length e
> > = S.index e (i-lN)
> > | i - lN - S.length e < gN
> > = index g (i - lN - S.length e)
> > | i - lN - S.length e - gN >= 0
> > = error "index out of bounds"
>
> Although we do have to force comparisions only once every time we
> touch the same element in the tree, we do still have to traverse the
> tree (in logarithmic time).
>
> If you want linear time access on first touch of an element and
> constant time access afterwards us toArray:
>
> > toArray :: (IA.IArray a t) => Fix (BTreeRaw t) -> a Int t
> > toArray tree = IA.listArray (0,maxI) (map (index tree) [0..maxI])
> > where size (Fix Leaf) = 0
> > size (Fix (Node lN _ e gN _)) = lN + S.length e + gN
> > maxI = size tree - 1
>
> [1] Single-Threaded in the sense of Okasaki's use of the word.
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
More information about the Haskell-Cafe
mailing list