Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.
Siddhanathan Shanmugam
siddhanathan+eml at gmail.com
Mon Mar 27 16:53:10 UTC 2017
We can improve things a bit further by forcing evaluation (with seq) along
the way appropriately.
gregSortBy cmp [] = []
gregSortBy cmp xs = head $ until (null.tail) reduce (pair xs)
where
pair (x:y:t) | x `cmp` y == GT = [y, x] : pair t
| otherwise = [x, y] : pair t
pair [x] = [[x]]
pair [] = []
reduce (v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t
where v' = merge v w `seq` merge v w
x' = merge x y `seq` merge x y
reduce (x:y:t) = merge x y `seq` merge x y : reduce t
reduce xs = xs
merge xs [] = xs
merge [] ys = ys
merge xs@(x:xs') ys@(y:ys')
| x `cmp` y == GT = y : merge xs ys'
| otherwise = x : merge xs' ys
gSortBy cmp = mergeAll . sequences
where
sequences (a:b:xs)
| a `cmp` b == GT = descending b [a] xs
| otherwise = ascending b (a:) xs
sequences xs = [xs]
descending a as (b:bs)
| a `cmp` b == GT = descending b (a:as) bs
descending a as bs = (a:as) `seq` (a:as) : sequences bs
ascending a as (b:bs)
| a `cmp` b /= GT = ascending b (as . (a:)) bs
ascending a as bs = as [a] `seq` as [a] : sequences bs
mergeAll [x] = x
mergeAll xs = mergeAll (mergePairs xs)
mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs
mergePairs xs = xs
merge as@(a:as') bs@(b:bs')
| a `cmp` b == GT = b : merge as bs'
| otherwise = a : merge as' bs
merge [] bs = bs
merge as [] = as
*Before the change:*
benchmarking random ints/ghc
time 3.687 s (3.541 s .. NaN s)
1.000 R² (1.000 R² .. 1.000 R²)
mean 3.691 s (3.669 s .. 3.705 s)
std dev 21.45 ms (0.0 s .. 24.76 ms)
variance introduced by outliers: 19% (moderately inflated)
benchmarking random ints/greg
time 2.648 s (2.482 s .. 2.822 s)
0.999 R² (0.998 R² .. 1.000 R²)
mean 2.704 s (2.670 s .. 2.736 s)
std dev 52.68 ms (0.0 s .. 54.49 ms)
variance introduced by outliers: 19% (moderately inflated)
benchmarking random ints/gSort
time 2.733 s (2.682 s .. 2.758 s)
1.000 R² (1.000 R² .. 1.000 R²)
mean 2.707 s (2.689 s .. 2.718 s)
std dev 16.84 ms (0.0 s .. 19.20 ms)
variance introduced by outliers: 19% (moderately inflated)
*After the change:*
benchmarking random ints/greg
time 2.576 s (2.548 s .. 2.628 s)
1.000 R² (1.000 R² .. 1.000 R²)
mean 2.590 s (2.578 s .. 2.599 s)
std dev 12.99 ms (0.0 s .. 14.89 ms)
variance introduced by outliers: 19% (moderately inflated)
benchmarking random ints/gSort
time 2.538 s (2.412 s .. 2.627 s)
1.000 R² (0.999 R² .. 1.000 R²)
mean 2.543 s (2.517 s .. 2.560 s)
std dev 26.16 ms (0.0 s .. 30.21 ms)
variance introduced by outliers: 19% (moderately inflated)
On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan Shanmugam <
siddhanathan+eml at gmail.com> wrote:
> Theoretically, we could do better. We currently only exploit monotonic
> runs in merge sort, but we could also exploit bitonic runs:
>
> dlist as = as [] `seq` as []
>
> sequences [] = [[]]
> sequences [a] = [[a]]
> sequences (a:xs) = bitonic a a (a:) xs
>
> bitonic min max as (b:bs)
> | b `cmp` max /= LT = bitonic min b (as . (b:)) bs
> | b `cmp` min /= GT = bitonic b max ((b:) . as) bs
> | otherwise = dlist as : sequences (b:bs)
> bitonic _ _ as [] = [dlist as]
>
>
> The constant factors here might be too high to notice the difference
> though.
>
>
> > However, still my version is more laziness-friendly, i.e. it requires
> fewer
> > comparisons to get the
> > N smallest elements of a list (see
> > https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
> ).
> >
> > I wonder if this might not be a more useful trait than being able to sort
> > already sorted lists super fast.
>
> This comes down to a discussion of merge sort vs natural merge sort.
>
> Data.List.sort is an implementation of a variant of merge sort called
> natural merge sort. The algorithm is linearithmic in the worst case, but
> linear in the best case (already sorted list).
>
>
>
> On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <greg7mdp at gmail.com>
> wrote:
>
>> Thanks again @Siddhanathan! Looks like your gSort fixes the main issue
>> with
>> Data.List.sort().
>>
>> I have updated the test programs in https://github.com/greg7mdp/ghc-sort
>> to
>> include your new version.
>>
>> Here are the results (your new version looks like a definite improvement
>> vs
>> the current GHC one):
>>
>> input GHC sort My Orig proposal gSort
>> ------------------------------------------------------------
>> ----------------
>> ---
>> sorted ints (ascending) 151 456 148
>> sorted ints (descending) 152 466 155
>> random ints 2732 2006 2004
>> random strings 6564 5549 5528
>>
>>
>> So replacing the current GHC version with gSort is a no brainer, as it is
>> better in all regards.
>>
>> However, still my version is more laziness-friendly, i.e. it requires
>> fewer
>> comparisons to get the
>> N smallest elements of a list (see
>> https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>>
>> I wonder if this might not be a more useful trait than being able to sort
>> already sorted lists super fast.
>>
>> Thanks,
>>
>> greg
>>
>> ________________________________
>>
>> From: siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] On Behalf Of
>> Siddhanathan Shanmugam
>> Sent: Sunday, March 26, 2017 1:05 PM
>> To: Gregory Popovitch
>> Cc: Haskell Libraries
>> Subject: Re: Proposal: a new implementation for Data.List.sort and
>> Data.List.sortBy, which has better performance characteristics and is more
>> laziness-friendly.
>>
>>
>> Interesting. You are right, performance for sorting random lists has
>> priority over performance for sorting already-sorted lists.
>>
>> Ignore the numbers for my previous version. Can you compare GHC's sort,
>> your
>> proposal, and gSort below?
>>
>>
>> gSort :: Ord a => [a] -> [a]
>> gSort = gSortBy compare
>> gSortBy cmp = mergeAll . sequences
>> where
>> sequences (a:b:xs)
>> | a `cmp` b == GT = descending b [a] xs
>> | otherwise = ascending b (a:) xs
>> sequences xs = [xs]
>>
>>
>> descending a as (b:bs)
>> | a `cmp` b == GT = descending b (a:as) bs
>> descending a as bs = (a:as) : sequences bs
>>
>>
>> ascending a as (b:bs)
>> | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
>> ascending a as bs = as [a] `seq` as [a] : sequences bs
>>
>>
>> mergeAll [x] = x
>> mergeAll xs = mergeAll (mergePairs xs)
>>
>>
>> mergePairs (a:b:xs) = merge a b : mergePairs xs
>> mergePairs xs = xs
>>
>>
>> merge as@(a:as') bs@(b:bs')
>> | a `cmp` b == GT = b : merge as bs'
>> | otherwise = a : merge as' bs
>> merge [] bs = bs
>> merge as [] = as
>>
>>
>> Thanks,
>> Sid
>>
>>
>> On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <greg7mdp at gmail.com>
>> wrote:
>>
>>
>> Thank you @Siddhanathan! I welcome any improvement you may make,
>> as
>> I said I
>> am very far from a Haskell expert.
>>
>> I just tested your change with my test project
>> (https://github.com/greg7mdp/ghc-sort
>> <https://github.com/greg7mdp/ghc-sort> )
>> and here are my results (mean times in ms):
>>
>> input GHC sort Orig proposal
>> your
>> change
>>
>> ------------------------------------------------------------
>> ----------------
>> ---
>> sorted ints (ascending) 153 467
>> 139
>> sorted ints (descending) 152 472
>> 599
>> random ints 2824 2077
>> 2126
>> random strings 6564 5613
>> 5983
>>
>> Your change is a definite improvement for sorted integers in
>> ascending
>> order, but is worse for other cases.
>>
>> Is there a real need to optimize the sort for already sorted list?
>> Of course
>> it should not be a degenerate
>> case and take longer than sorting random numbers, but this is not
>> the case
>> here. Sorting already sorted
>> lists is, even with my version, over 4 times faster than sorting
>> random
>> lists. This sounds perfectly
>> acceptable to me, and I feel that trying to optimize this specific
>> case
>> further, if it comes at the
>> detriment of the general case, is not desirable.
>>
>> Thanks,
>>
>> greg
>>
>> ________________________________
>>
>> From: siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] On
>> Behalf Of
>> Siddhanathan Shanmugam
>> Sent: Sunday, March 26, 2017 11:41 AM
>> To: Gregory Popovitch
>> Cc: Haskell Libraries
>> Subject: Re: Proposal: a new implementation for Data.List.sort and
>> Data.List.sortBy, which has better performance characteristics and
>> is more
>> laziness-friendly.
>>
>>
>>
>> Thank you! This identifies a space leak in base which went
>> unnoticed
>> for 7
>> years.
>>
>> Your implementation can be improved further. Instead of splitting
>> into
>> pairs, you could instead split into lists of sorted sublists by
>> replacing
>> the pairs function with the following
>>
>> pair = foldr f []
>> where
>> f x [] = [[x]]
>> f x (y:ys)
>> | x `cmp` head y == LT = (x:y):ys
>> | otherwise = [x]:y:ys
>>
>> This should give you the same performance improvements for sorting
>> random
>> lists, but better performance while sorting ascending lists.
>>
>> The version in base takes it one step further by using a DList to
>> handle the
>> descending case efficiently as well, except there's a space leak
>> right now
>> because of which it is slower.
>>
>> On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
>> <greg7mdp at gmail.com>
>> wrote:
>>
>>
>>
>> Motivation:
>> ----------
>>
>> Data.List.sort is a very important functionality in
>> Haskell.
>> I
>> believe that
>> the proposed implementation is:
>>
>> - significantly faster than the current implementation on
>> unsorted
>> lists,
>> typically 14% to 27% faster
>> - more laziness-friendly, i.e.:
>> take 3 $ sort l
>> will require significantly less comparisons than the
>> current
>> implementation
>>
>> Proposed Implementation
>> -----------------------
>>
>> sort :: (Ord a) => [a] -> [a]
>> sort = sortBy compare
>>
>> sortBy cmp [] = []
>> sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
>> where
>> pair (x:y:t) | x `cmp` y == GT = [y, x] : pair t
>> | otherwise = [x, y] : pair t
>> pair [x] = [[x]]
>> pair [] = []
>>
>> reduce (v:w:x:y:t) = merge v' x' : reduce t
>> where v' = merge v w
>> x' = merge x y
>>
>> reduce (x:y:t) = merge x y : reduce t
>> reduce xs = xs
>>
>> merge xs [] = xs
>> merge [] ys = ys
>> merge xs@(x:xs') ys@(y:ys')
>> | x `cmp` y == GT = y : merge xs ys'
>> | otherwise = x : merge xs' ys
>>
>>
>> Effect and Interactions
>> -----------------------
>>
>> I have a stack project with a criterion test for this new
>> implementation,
>> available at https://github.com/greg7mdp/ghc-sort
>> <https://github.com/greg7mdp/ghc-sort>
>>
>> <https://github.com/greg7mdp/ghc-sort
>> <https://github.com/greg7mdp/ghc-sort> > .
>> I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and
>> had the
>> following
>> results:
>>
>> - sorting of random lists of integers is 27% faster
>> - sorting of random lists of strings is 14% faster
>> - sorting of already sorted lists is significantly slower,
>> but still
>> much
>> faster than sorting random lists
>> - proposed version is more laziness friendly. For example
>> this
>> version of
>> sortBy requires 11 comparisons to find
>> the smallest element of a 15 element list, while the
>> default
>> Data.List.sortBy requires 15 comparisons.
>> (see
>>
>>
>> https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
>> <https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>
>>
>> <https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
>> <https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>
>> >
>> )
>>
>>
>>
>> Test results
>> ------------
>>
>> Criterion output (descending/ascending results are for
>> already
>> sorted
>> lists).
>> I barely understand what Criterion does, and I am puzzled
>> with the
>> various
>> "T" output - maybe there is a bug in my bench code:
>>
>> vagrant at vagrant-ubuntu-trusty-64:/vagrant$ stack exec
>> ghc-sort
>> benchmarking ascending ints/ghc
>> TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
>> 160.6 ms
>> (153.4
>> ms .. 167.8 ms)
>> 0.997 R² (0.986 R² .. 1.000 R²)
>> mean 161.7 ms (158.3 ms .. 165.9 ms)
>> std dev 5.210 ms (3.193 ms .. 7.006 ms)
>> variance introduced by outliers: 12% (moderately inflated)
>>
>> benchmarking ascending ints/greg
>> TTTTTTTTTTTTTTTTtime 473.8 ms (398.6 ms
>> ..
>> 554.9
>> ms)
>> 0.996 R² (0.987 R² .. 1.000 R²)
>> mean 466.2 ms (449.0 ms .. 475.0 ms)
>> std dev 14.94 ms (0.0 s .. 15.29 ms)
>> variance introduced by outliers: 19% (moderately inflated)
>>
>> benchmarking descending ints/ghc
>> TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
>> 165.1 ms
>> (148.2
>> ms .. 178.2 ms)
>> 0.991 R² (0.957 R² .. 1.000 R²)
>> mean 158.7 ms (154.0 ms .. 164.3 ms)
>> std dev 7.075 ms (4.152 ms .. 9.903 ms)
>> variance introduced by outliers: 12% (moderately inflated)
>>
>> benchmarking descending ints/greg
>> TTTTTTTTTTTTTTTTtime 471.7 ms (419.8 ms
>> ..
>> 508.3
>> ms)
>> 0.999 R² (0.995 R² .. 1.000 R²)
>> mean 476.0 ms (467.5 ms .. 480.0 ms)
>> std dev 7.447 ms (67.99 as .. 7.865 ms)
>> variance introduced by outliers: 19% (moderately inflated)
>>
>> benchmarking random ints/ghc
>> TTTTTTTTTTTTTTTTtime 2.852 s (2.564 s
>> ..
>> 3.019 s)
>> 0.999 R² (0.997 R² .. 1.000 R²)
>> mean 2.812 s (2.785 s .. 2.838 s)
>> std dev 44.06 ms (543.9 as .. 44.97 ms)
>> variance introduced by outliers: 19% (moderately inflated)
>>
>> benchmarking random ints/greg
>> TTTTTTTTTTTTTTTTtime 2.032 s (1.993 s
>> ..
>> 2.076 s)
>> 1.000 R² (1.000 R² .. 1.000 R²)
>> mean 2.028 s (2.019 s .. 2.033 s)
>> std dev 7.832 ms (0.0 s .. 8.178 ms)
>> variance introduced by outliers: 19% (moderately inflated)
>>
>> benchmarking shakespeare/ghc
>> TTTTTTTTTTTTTTTTtime 6.504 s (6.391 s
>> ..
>> 6.694 s)
>> 1.000 R² (1.000 R² .. 1.000 R²)
>> mean 6.499 s (6.468 s .. 6.518 s)
>> std dev 28.85 ms (0.0 s .. 32.62 ms)
>> variance introduced by outliers: 19% (moderately inflated)
>>
>> benchmarking shakespeare/greg
>> TTTTTTTTTTTTTTTTtime 5.560 s (5.307 s
>> ..
>> 5.763 s)
>> 1.000 R² (0.999 R² .. 1.000 R²)
>> mean 5.582 s (5.537 s .. 5.607 s)
>> std dev 39.30 ms (0.0 s .. 43.49 ms)
>> variance introduced by outliers: 19% (moderately inflated)
>>
>>
>> Costs and Drawbacks
>> -------------------
>>
>> The only cost I see is the reduced performance when
>> sorting
>> already
>> sorted
>> lists. However, since this remains quite efficient, indeed
>> over 4
>> times
>> faster than sorting unsorted lists, I think it is an
>> acceptable
>> tradeoff.
>>
>> Final note
>> ----------
>>
>> My Haskell is very rusty. I worked on this a couple years
>> ago when I
>> was
>> learning Haskell, and meant to propose it to the Haskell
>> community,
>> but
>> never got to it at the time.
>>
>> _______________________________________________
>> Libraries mailing list
>> Libraries at haskell.org
>> http://mail.haskell.org/cgi-bi
>> n/mailman/listinfo/libraries
>> <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>
>>
>> <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>> <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20170327/b4c69d63/attachment-0001.html>
More information about the Libraries
mailing list