GHC Data.List.sort performance question

Bernie Pope bjpop at csse.unimelb.edu.au
Wed Jan 16 00:02:05 EST 2008


Hi again,

Please ignore my last post.

I just realised that the problem was in the way I was testing sort. I  
printed the length of the list out to make sure the spine was  
evaluated, but the problem is that the elements of the list are  
thunks which depend on each other in a left-right manner, the  
rightmost ones becoming very big. It is effectively the same issue  
as: sort $ take x [1..] that Bertram Felgenhauer identified much  
earlier.

Apologies for the noise.

Cheers,
Bernie.

On 16/01/2008, at 12:40 PM, Bernie Pope wrote:

> Hi all,
>
> I was looking into this issue as well, by coincidence.
>
> I haven't figured out what is wrong, but I believe Hugs (September  
> 2006) exhibits similar behaviour, so it is not just GHC.
>
> I also noticed that in the third equation of merge_pairs, swapping  
> the order of the arguments to merge also avoids the stack overflow:
>
> Original third equation:
>
>   merge_pairs cmp (xs:ys:xss)
>       = merge cmp xs ys : merge_pairs cmp xss
>
> Swapped version:
>
>    merge_pairs cmp (xs:ys:xss)
>        = merge cmp ys xs : merge_pairs cmp xss
>                                ^^^^^
>
> And for reference, here is how I was testing it:
>
>    main = do
>       args <- getArgs
>       let size = (read $ head args) :: Int
>       main' size
>
>    main' size = do
>       gen <- getStdGen
>       let rs = (randoms gen) :: [Int]
>       let list = take size rs
>       print $ length list
>       print $ sort list
>
> I'm keen to find out what is going on here.
>
> Cheers,
> Bernie.
>
> On 16/01/2008, at 2:43 AM, Simon Peyton-Jones wrote:
>
>> Weird.  I see no difference in the compiled code (GHC HEAD).
>>
>> Simon
>>
>> | -----Original Message-----
>> | From: glasgow-haskell-users-bounces at haskell.org [mailto:glasgow- 
>> haskell-users-bounces at haskell.org] On Behalf Of
>> | Marcus D. Gabriel
>> | Sent: 14 January 2008 21:02
>> | To: glasgow-haskell-users at haskell.org
>> | Subject: GHC Data.List.sort performance question
>> |
>> | Hello,
>> |
>> | By a rather indirect route, I discovered that I obtain an almost
>> | factor of two improvement in performance in Data.List.sort if I  
>> make
>> | one small change in the implementation of the function merge which
>> | supports mergesort and hence sortBy and sort.  Admittedly, the
>> | improvement was only noticeable to me when sorting for example one
>> | million random Int.  The current code in libraries/base/Data/ 
>> List.hs
>> | for merge is
>> |
>> | \begin{code}
>> |
>> | merge :: (a -> a -> Ordering) -> [a] -> [a] -> [a]
>> | merge cmp xs [] = xs
>> | merge cmp [] ys = ys
>> | merge cmp (x:xs) (y:ys)
>> |  = case x `cmp` y of
>> |         GT -> y : merge cmp (x:xs)   ys
>> |         _  -> x : merge cmp    xs (y:ys)
>> |
>> | \end{code}
>> |
>> | and all that I did was
>> |
>> | \begin{code}
>> |
>> | merge :: (a -> a -> Ordering) -> [a] -> [a] -> [a]
>> | merge cmp [] ys = ys
>> | merge cmp xs [] = xs
>> | merge cmp (x:xs) (y:ys)
>> |  = case x `cmp` y of
>> |         GT -> y : merge cmp (x:xs)   ys
>> |         _  -> x : merge cmp    xs (y:ys)
>> |
>> | \end{code}
>> |
>> | that is, I swapped the order of the first two lines.  By the  
>> way, the
>> | second version is much less likely to overflow the stack than the
>> | first version.
>> |
>> | Can some confirm this?  If you confirm it, can someone explain  
>> to me
>> | why one obtains this performance improvement?  I currently just  
>> do not
>> | grasp the point.
>> |
>> | Thanks,
>> | - Marcus
>> |
>> | --
>> |   Marcus D. Gabriel, Ph.D.                    Email:     
>> mdg at wanadoo.fr
>> |   213 ter, rue de Mulhouse                      Tel:  
>> +33.3.89.69.05.06
>> |   F68300 Saint Louis  FRANCE               Portable:  
>> +33.6.34.56.07.75
>> |
>> |
>> | _______________________________________________
>> | Glasgow-haskell-users mailing list
>> | Glasgow-haskell-users at haskell.org
>> | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>> _______________________________________________
>> Glasgow-haskell-users mailing list
>> Glasgow-haskell-users at haskell.org
>> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users at haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users



More information about the Glasgow-haskell-users mailing list