Proposal: a new implementation for Data.List.sort and Data.List.sortBy, which has better performance characteristics and is more laziness-friendly.

Sun Mar 26 20:54:18 UTC 2017

Theoretically, we could do better. We currently only exploit monotonic runs
in merge sort, but we could also exploit bitonic runs:

    dlist as = as [] `seq` as []

    sequences [] = [[]]
    sequences [a] = [[a]]
    sequences (a:xs) = bitonic a a (a:) xs

    bitonic min max as (b:bs)
      | b `cmp` max /= LT = bitonic min b   (as . (b:)) bs
      | b `cmp` min /= GT = bitonic b   max ((b:) . as) bs
      | otherwise = dlist as : sequences (b:bs)
    bitonic _ _ as [] = [dlist as]

The constant factors here might be too high to notice the difference though.

> However, still my version is more laziness-friendly, i.e. it requires
fewer
> comparisons to get the
> N smallest elements of a list (see
> https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>
> I wonder if this might not be a more useful trait than being able to sort
> already sorted lists super fast.

This comes down to a discussion of merge sort vs natural merge sort.

Data.List.sort is an implementation of a variant of merge sort called
natural merge sort. The algorithm is linearithmic in the worst case, but
linear in the best case (already sorted list).

On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <greg7mdp at gmail.com>
wrote:

> Thanks again @Siddhanathan! Looks like your gSort fixes the main issue with
> Data.List.sort().
>
> I have updated the test programs in https://github.com/greg7mdp/ghc-sort
> to
> include your new version.
>
> Here are the results (your new version looks like a definite improvement vs
> the current GHC one):
>
> input                        GHC sort         My Orig proposal     gSort
> ------------------------------------------------------------
> ----------------
> ---
> sorted ints (ascending)      151               456                  148
> sorted ints (descending)     152               466                  155
> random ints                 2732              2006                 2004
> random strings              6564              5549                 5528
>
>
> So replacing the current GHC version with gSort is a no brainer, as it is
> better in all regards.
>
> However, still my version is more laziness-friendly, i.e. it requires fewer
> comparisons to get the
> N smallest elements of a list (see
> https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>
> I wonder if this might not be a more useful trait than being able to sort
> already sorted lists super fast.
>
> Thanks,
>
> greg
>
> ________________________________
>
> From: siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] On Behalf Of
> Siddhanathan Shanmugam
> Sent: Sunday, March 26, 2017 1:05 PM
> To: Gregory Popovitch
> Cc: Haskell Libraries
> Subject: Re: Proposal: a new implementation for Data.List.sort and
> Data.List.sortBy, which has better performance characteristics and is more
> laziness-friendly.
>
>
> Interesting. You are right, performance for sorting random lists has
> priority over performance for sorting already-sorted lists.
>
> Ignore the numbers for my previous version. Can you compare GHC's sort,
> your
> proposal, and gSort below?
>
>
> gSort :: Ord a => [a] -> [a]
> gSort = gSortBy compare
> gSortBy cmp = mergeAll . sequences
>   where
>     sequences (a:b:xs)
>       | a `cmp` b == GT = descending b [a]  xs
>       | otherwise       = ascending  b (a:) xs
>     sequences xs = [xs]
>
>
>     descending a as (b:bs)
>       | a `cmp` b == GT = descending b (a:as) bs
>     descending a as bs  = (a:as) : sequences bs
>
>
>     ascending a as (b:bs)
>       | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
>     ascending a as bs   = as [a] `seq` as [a] : sequences bs
>
>
>     mergeAll [x] = x
>     mergeAll xs  = mergeAll (mergePairs xs)
>
>
>     mergePairs (a:b:xs) = merge a b : mergePairs xs
>     mergePairs xs       = xs
>
>
>     merge as@(a:as') bs@(b:bs')
>       | a `cmp` b == GT = b : merge as  bs'
>       | otherwise       = a : merge as' bs
>     merge [] bs         = bs
>     merge as []         = as
>
>
> Thanks,
> Sid
>
>
> On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <greg7mdp at gmail.com>
> wrote:
>
>
>         Thank you @Siddhanathan! I welcome any improvement you may make, as
> I said I
>         am very far from a Haskell expert.
>
>         I just tested your change with my test project
>         (https://github.com/greg7mdp/ghc-sort
> <https://github.com/greg7mdp/ghc-sort> )
>         and here are my results (mean times in ms):
>
>         input                        GHC sort          Orig proposal
> your
>         change
>
> ------------------------------------------------------------
> ----------------
>         ---
>         sorted ints (ascending)      153               467
> 139
>         sorted ints (descending)     152               472
> 599
>         random ints                 2824              2077
> 2126
>         random strings              6564              5613
> 5983
>
>         Your change is a definite improvement for sorted integers in
> ascending
>         order, but is worse for other cases.
>
>         Is there a real need to optimize the sort for already sorted list?
> Of course
>         it should not be a degenerate
>         case and take longer than sorting random numbers, but this is not
> the case
>         here. Sorting already sorted
>         lists is, even with my version, over 4 times faster than sorting
> random
>         lists. This sounds perfectly
>         acceptable to me, and I feel that trying to optimize this specific
> case
>         further, if it comes at the
>         detriment of the general case, is not desirable.
>
>         Thanks,
>
>         greg
>
>         ________________________________
>
>         From: siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] On
> Behalf Of
>         Siddhanathan Shanmugam
>         Sent: Sunday, March 26, 2017 11:41 AM
>         To: Gregory Popovitch
>         Cc: Haskell Libraries
>         Subject: Re: Proposal: a new implementation for Data.List.sort and
>         Data.List.sortBy, which has better performance characteristics and
> is more
>         laziness-friendly.
>
>
>
>         Thank you! This identifies a space leak in base which went
> unnoticed
> for 7
>         years.
>
>         Your implementation can be improved further. Instead of splitting
> into
>         pairs, you could instead split into lists of sorted sublists by
> replacing
>         the pairs function with the following
>
>             pair = foldr f []
>               where
>                 f x [] = [[x]]
>                 f x (y:ys)
>                   | x `cmp` head y == LT = (x:y):ys
>                   | otherwise            = [x]:y:ys
>
>         This should give you the same performance improvements for sorting
> random
>         lists, but better performance while sorting ascending lists.
>
>         The version in base takes it one step further by using a DList to
> handle the
>         descending case efficiently as well, except there's a space leak
> right now
>         because of which it is slower.
>
>         On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
> <greg7mdp at gmail.com>
>         wrote:
>
>
>
>                 Motivation:
>                 ----------
>
>                 Data.List.sort is a very important functionality in
> Haskell.
> I
>         believe that
>                 the proposed implementation is:
>
>                 - significantly faster than the current implementation on
> unsorted
>         lists,
>                 typically 14% to 27% faster
>                 - more laziness-friendly, i.e.:
>                     take 3 $ sort l
>                   will require significantly less comparisons than the
> current
>                 implementation
>
>                 Proposed Implementation
>                 -----------------------
>
>                 sort :: (Ord a) => [a] -> [a]
>                 sort =  sortBy compare
>
>                 sortBy cmp [] = []
>                 sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
>                   where
>                     pair (x:y:t) | x `cmp` y == GT  = [y, x] : pair t
>                                  | otherwise        = [x, y] : pair t
>                     pair [x] = [[x]]
>                     pair []  = []
>
>                     reduce (v:w:x:y:t) = merge v' x' : reduce t
>                                          where v' = merge v w
>                                                x' = merge x y
>
>                     reduce (x:y:t) = merge x y : reduce t
>                     reduce xs      = xs
>
>                     merge xs []           = xs
>                     merge []  ys          = ys
>                     merge xs@(x:xs') ys@(y:ys')
>                          | x `cmp` y == GT  = y : merge xs  ys'
>                          | otherwise        = x : merge xs' ys
>
>
>                 Effect and Interactions
>                 -----------------------
>
>                 I have a stack project with a criterion test for this new
>         implementation,
>                 available at https://github.com/greg7mdp/ghc-sort
> <https://github.com/greg7mdp/ghc-sort>
>
>         <https://github.com/greg7mdp/ghc-sort
> <https://github.com/greg7mdp/ghc-sort> > .
>                 I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and
> had the
>         following
>                 results:
>
>                 - sorting of random lists of integers is 27% faster
>                 - sorting of random lists of strings is 14% faster
>                 - sorting of already sorted lists is significantly slower,
> but still
>         much
>                 faster than sorting random lists
>                 - proposed version is more laziness friendly. For example
> this
>         version of
>                 sortBy requires 11 comparisons to find
>                   the smallest element of a 15 element list, while the
> default
>                 Data.List.sortBy requires 15 comparisons.
>                   (see
>
>
> https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
> <https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>
>
> <https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
> <https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs>
> >
> )
>
>
>
>                 Test results
>                 ------------
>
>                 Criterion output (descending/ascending results are for
> already
>         sorted
>                 lists).
>                 I barely understand what Criterion does, and I am puzzled
> with the
>         various
>                 "T" output - maybe there is a bug in my bench code:
>
>                 vagrant at vagrant-ubuntu-trusty-64:/vagrant$ stack exec
> ghc-sort
>                 benchmarking ascending ints/ghc
>                 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
> 160.6 ms
>         (153.4
>                 ms .. 167.8 ms)
>                                      0.997 R²   (0.986 R² .. 1.000 R²)
>                 mean                 161.7 ms   (158.3 ms .. 165.9 ms)
>                 std dev              5.210 ms   (3.193 ms .. 7.006 ms)
>                 variance introduced by outliers: 12% (moderately inflated)
>
>                 benchmarking ascending ints/greg
>                 TTTTTTTTTTTTTTTTtime                 473.8 ms   (398.6 ms
> ..
> 554.9
>         ms)
>                                      0.996 R²   (0.987 R² .. 1.000 R²)
>                 mean                 466.2 ms   (449.0 ms .. 475.0 ms)
>                 std dev              14.94 ms   (0.0 s .. 15.29 ms)
>                 variance introduced by outliers: 19% (moderately inflated)
>
>                 benchmarking descending ints/ghc
>                 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
> 165.1 ms
>         (148.2
>                 ms .. 178.2 ms)
>                                      0.991 R²   (0.957 R² .. 1.000 R²)
>                 mean                 158.7 ms   (154.0 ms .. 164.3 ms)
>                 std dev              7.075 ms   (4.152 ms .. 9.903 ms)
>                 variance introduced by outliers: 12% (moderately inflated)
>
>                 benchmarking descending ints/greg
>                 TTTTTTTTTTTTTTTTtime                 471.7 ms   (419.8 ms
> ..
> 508.3
>         ms)
>                                      0.999 R²   (0.995 R² .. 1.000 R²)
>                 mean                 476.0 ms   (467.5 ms .. 480.0 ms)
>                 std dev              7.447 ms   (67.99 as .. 7.865 ms)
>                 variance introduced by outliers: 19% (moderately inflated)
>
>                 benchmarking random ints/ghc
>                 TTTTTTTTTTTTTTTTtime                 2.852 s    (2.564 s ..
> 3.019 s)
>                                      0.999 R²   (0.997 R² .. 1.000 R²)
>                 mean                 2.812 s    (2.785 s .. 2.838 s)
>                 std dev              44.06 ms   (543.9 as .. 44.97 ms)
>                 variance introduced by outliers: 19% (moderately inflated)
>
>                 benchmarking random ints/greg
>                 TTTTTTTTTTTTTTTTtime                 2.032 s    (1.993 s ..
> 2.076 s)
>                                      1.000 R²   (1.000 R² .. 1.000 R²)
>                 mean                 2.028 s    (2.019 s .. 2.033 s)
>                 std dev              7.832 ms   (0.0 s .. 8.178 ms)
>                 variance introduced by outliers: 19% (moderately inflated)
>
>                 benchmarking shakespeare/ghc
>                 TTTTTTTTTTTTTTTTtime                 6.504 s    (6.391 s ..
> 6.694 s)
>                                      1.000 R²   (1.000 R² .. 1.000 R²)
>                 mean                 6.499 s    (6.468 s .. 6.518 s)
>                 std dev              28.85 ms   (0.0 s .. 32.62 ms)
>                 variance introduced by outliers: 19% (moderately inflated)
>
>                 benchmarking shakespeare/greg
>                 TTTTTTTTTTTTTTTTtime                 5.560 s    (5.307 s ..
> 5.763 s)
>                                      1.000 R²   (0.999 R² .. 1.000 R²)
>                 mean                 5.582 s    (5.537 s .. 5.607 s)
>                 std dev              39.30 ms   (0.0 s .. 43.49 ms)
>                 variance introduced by outliers: 19% (moderately inflated)
>
>
>                 Costs and Drawbacks
>                 -------------------
>
>                 The only cost I see is the reduced performance when sorting
> already
>         sorted
>                 lists. However, since this remains quite efficient, indeed
> over 4
>         times
>                 faster than sorting unsorted lists, I think it is an
> acceptable
>         tradeoff.
>
>                 Final note
>                 ----------
>
>                 My Haskell is very rusty. I worked on this a couple years
> ago when I
>         was
>                 learning Haskell, and meant to propose it to the Haskell
> community,
>         but
>                 never got to it at the time.
>
>                 _______________________________________________
>                 Libraries mailing list
>                 Libraries at haskell.org
>                 http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
> <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>
>
>         <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
> <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >
>
>
>
>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20170326/611aa479/attachment-0001.html>