GHC Data.List.sort performance question
Don Stewart
dons at galois.com
Mon Jan 14 18:34:11 EST 2008
mdg:
> Hello,
>
> By a rather indirect route, I discovered that I obtain an almost
> factor of two improvement in performance in Data.List.sort if I make
> one small change in the implementation of the function merge which
> supports mergesort and hence sortBy and sort. Admittedly, the
> improvement was only noticeable to me when sorting for example one
> million random Int. The current code in libraries/base/Data/List.hs
> for merge is
>
> \begin{code}
>
> merge :: (a -> a -> Ordering) -> [a] -> [a] -> [a]
> merge cmp xs [] = xs
> merge cmp [] ys = ys
> merge cmp (x:xs) (y:ys)
> = case x `cmp` y of
> GT -> y : merge cmp (x:xs) ys
> _ -> x : merge cmp xs (y:ys)
>
> \end{code}
>
> and all that I did was
>
> \begin{code}
>
> merge :: (a -> a -> Ordering) -> [a] -> [a] -> [a]
> merge cmp [] ys = ys
> merge cmp xs [] = xs
> merge cmp (x:xs) (y:ys)
> = case x `cmp` y of
> GT -> y : merge cmp (x:xs) ys
> _ -> x : merge cmp xs (y:ys)
>
> \end{code}
>
> that is, I swapped the order of the first two lines. By the way, the
> second version is much less likely to overflow the stack than the
> first version.
>
> Can some confirm this? If you confirm it, can someone explain to me
> why one obtains this performance improvement? I currently just do not
> grasp the point.
>
> Thanks,
> - Marcus
Ian, you looked at sort most recently. What to check the generated code
(or rerun your benchmarks?)
-- Don
More information about the Glasgow-haskell-users
mailing list