fix for Data.List.sortBy
Siddhanathan Shanmugam
siddhanathan+eml at gmail.com
Tue Mar 28 06:44:42 UTC 2017
Turns out we don't need seq at all. A simple refactoring of the merge
function does the trick equally well.
mergePairs (a:b:xs) = merge id a b : mergePairs xs
mergePairs xs       = xs

merge f as@(a:as') bs@(b:bs')
  | a `cmp` b == GT = merge (f . (b:)) as bs'
  | otherwise       = merge (f . (a:)) as' bs
merge f [] bs = f bs
merge f as [] = f as
This variant is 10% faster in my tests.
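
For completeness, here is how the refactored merge could slot into the rest of
the sort. This is only a sketch: the sequences/ascending/descending/mergeAll
scaffolding is borrowed from the gSortBy variants quoted below (with the seqs
dropped, per the above), and sortBy' is just an illustrative name.

sortBy' :: (a -> a -> Ordering) -> [a] -> [a]
sortBy' cmp = mergeAll . sequences
  where
    sequences (a:b:xs)
      | a `cmp` b == GT = descending b [a] xs
      | otherwise       = ascending b (a:) xs
    sequences xs = [xs]

    descending a as (b:bs)
      | a `cmp` b == GT = descending b (a:as) bs
    descending a as bs = (a:as) : sequences bs

    ascending a as (b:bs)
      | a `cmp` b /= GT = ascending b (as . (a:)) bs
    ascending a as bs = as [a] : sequences bs

    mergeAll [x] = x
    mergeAll xs  = mergeAll (mergePairs xs)

    mergePairs (a:b:xs) = merge id a b : mergePairs xs
    mergePairs xs       = xs

    -- the refactored merge: the already-merged prefix is accumulated in f
    -- (a difference list) and flushed once either input runs out
    merge f as@(a:as') bs@(b:bs')
      | a `cmp` b == GT = merge (f . (b:)) as bs'
      | otherwise       = merge (f . (a:)) as' bs
    merge f [] bs = f bs
    merge f as [] = f as
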
On Mon, Mar 27, 2017 at 5:49 PM, Dan Burton <danburton.email at gmail.com>
wrote:
> Does this rely on Common Subexpression Elimination optimization in order
> to work? Would it work more reliably if the `seq`-ed expression were
> let-bound?
>
I don't think it relies heavily on CSE. The seqs are there to avoid a
cascading series of thunk evaluations. Using let expressions doesn't seem
to affect the benchmarks.
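
For concreteness, this is what the let-bound spelling would look like, as a
minimal self-contained snippet (mergePairsLet and mergeOrd are illustrative
names, specialised to compare so the fragment stands alone):

mergePairsLet :: Ord a => [[a]] -> [[a]]
mergePairsLet (a:b:xs) = let m = mergeOrd a b in m `seq` (m : mergePairsLet xs)
mergePairsLet xs       = xs

-- an ordinary two-way merge, specialised to compare for this example
mergeOrd :: Ord a => [a] -> [a] -> [a]
mergeOrd as@(a:as') bs@(b:bs')
  | a > b     = b : mergeOrd as bs'
  | otherwise = a : mergeOrd as' bs
mergeOrd [] bs = bs
mergeOrd as [] = as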
>
> -- Dan Burton
>
> On Mon, Mar 27, 2017 at 5:41 PM, David Feuer <david.feuer at gmail.com>
> wrote:
>
>> The first seq is useless: constructor application is never suspended. I
>> haven't had a chance to look at the rest yet.
>>
>> On Mar 27, 2017 7:59 PM, "Gregory Popovitch" <greg7mdp at gmail.com> wrote:
>>
>>> Sid,
>>>
>>> I'd be delighted to submit the patch, as long as I have permission
>>> (which I probably don't), you feel confident about the change, and maybe
>>> a couple of other people agree.
>>>
>>> Here is the proposed change. Tests show a significant speed improvement
>>> (30%) when sorting lists of random numbers, and the same efficiency for
>>> sorting already sorted lists (both ascending and descending).
>>>
>>>
>>> Thanks,
>>>
>>> greg
>>>
>>> ------------------------------
>>> From: siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] On Behalf Of Siddhanathan Shanmugam
>>> Sent: Monday, March 27, 2017 6:53 PM
>>> To: Gregory Popovitch
>>> Subject: RE: Proposal: a new implementation for Data.List.sort and
>>> Data.List.sortBy, which has better performance characteristics and is more
>>> laziness-friendly.
>>>
>>> Since I don't see any regressions, this doesn't really need CLC
>>> approval. The changes are also small enough that a Github PR may be
>>> accepted (otherwise, the change goes in via Phabricator).
>>>
>>> Are you interested in implementing this patch? If yes, a standard Github
>>> PR should be fine. Right now gSort is a three-line change, I think. The
>>> change will be made in ghc/libraries/base/Data/OldList.hs in the ghc/ghc
>>> repo on Github.
>>>
>>> I'm hoping for some more comments from other Haskellers, before pushing
>>> for this change in base. I feel like we may be missing a potential
>>> optimization that someone else might spot. So probably going to wait a few
>>> days.
>>>
>>>
>>> On Mar 27, 2017 11:43 AM, "Gregory Popovitch" <greg7mdp at gmail.com>
>>> wrote:
>>>
>>> Hi Sid,
>>>
>>> Thanks, glad you looked into that. My understanding of the Haskell
>>> execution model is really poor, so I can't say one way or the other, but I
>>> felt that laziness ought to be considered as well, and I'm glad it was :-)
>>>
>>> So in conclusion it looks like we have a winner with your latest
>>> gSortBy version. How do we get this pushed to the GHC library?
>>>
>>> Thanks,
>>>
>>> greg
>>>
>>> ------------------------------
>>> From: siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] On Behalf Of Siddhanathan Shanmugam
>>> Sent: Monday, March 27, 2017 2:12 PM
>>> To: Gregory Popovitch
>>> Subject: Re: Proposal: a new implementation for Data.List.sort and
>>> Data.List.sortBy, which has better performance characteristics and is more
>>> laziness-friendly.
>>>
>>> Hi Greg,
>>>
>>> On Mon, Mar 27, 2017 at 10:19 AM, Gregory Popovitch <greg7mdp at gmail.com>
>>> wrote:
>>>
>>>> Unfortunately, this optimization makes the sort less lazy, so doing
>>>> something like:
>>>>
>>>> take 4 $ sort l
>>>>
>>>> requires more sorting of the list l with this change. I'm not sure it
>>>> is a good tradeoff.
>>>>
>>>> This can be verified with:
>>>> https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs
>>>>
>>>
>>> I think you're running without optimizations turned on. It is lazy in my
>>> case.
>>>
>>> Also, the difference should be negligible (if any at all). Here's an
>>> example of the list being sorted:
>>>
>>> [11,4,6,8,2,5,1,7,9,55,11,3]
>>> ...
>>> [[4,11],[6,8],[2,5],[1,7,9,55],[3,11],[]]
>>> ...
>>> [[1,2,4,5,6,7,8,9,11,55],[3,11]]
>>> * 1 3
>>> * 2 3
>>> * 4 3
>>> * 4 11
>>> [1,2,3,4]
>>>
>>> The operations saved are only in the last merge; it's only lazy at this
>>> step.
>>>
>>> So we save at most one traversal of the list, which is not too expensive
>>> since our worst-case bound is O(n log n) anyway.
>>>
>>> This means the asymptotic performance should be identical, regardless of
>>> the number of comparisons saved. Of course, you do get better constants,
>>> but I would be surprised if those constants translated to significantly
>>> better performance for a reasonably sized list.
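>>>
>>> For anyone wanting to reproduce this kind of trace, here is a rough sketch
>>> of the idea behind sort_with_trace.hs (instrumenting the library sortBy
>>> for illustration; the linked script instruments the candidate
>>> implementations instead):
>>>
>>> import Data.List (sortBy)
>>> import Debug.Trace (trace)
>>>
>>> -- a comparator that logs every comparison it performs, so you can see
>>> -- how many comparisons `take 4 . sortBy tracingCompare` actually forces
>>> tracingCompare :: (Show a, Ord a) => a -> a -> Ordering
>>> tracingCompare x y = trace ("* " ++ show x ++ " " ++ show y) (compare x y)
>>>
>>> main :: IO ()
>>> main = print (take 4 (sortBy tracingCompare [11,4,6,8,2,5,1,7,9,55,11,3]))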
>>>
>>>
>>>>
>>>> I do agree that it would be nice to have a more serious validation
>>>> test suite.
>>>>
>>>> Thanks,
>>>>
>>>> greg
>>>>
>>>> ------------------------------
>>>> From: siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] On Behalf Of Siddhanathan Shanmugam
>>>> Sent: Monday, March 27, 2017 12:53 PM
>>>> To: Gregory Popovitch
>>>> Cc: Haskell Libraries
>>>> Subject: Re: Proposal: a new implementation for Data.List.sort and
>>>> Data.List.sortBy, which has better performance characteristics and is more
>>>> laziness-friendly.
>>>>
>>>> We can improve things a bit further by forcing evaluation (with seq)
>>>> along the way appropriately.
>>>>
>>>>
>>>>
>>>> gregSortBy cmp [] = []
>>>> gregSortBy cmp xs = head $ until (null.tail) reduce (pair xs)
>>>>   where
>>>>     pair (x:y:t) | x `cmp` y == GT = [y, x] : pair t
>>>>                  | otherwise       = [x, y] : pair t
>>>>     pair [x] = [[x]]
>>>>     pair []  = []
>>>>
>>>>     reduce (v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t
>>>>       where v' = merge v w `seq` merge v w
>>>>             x' = merge x y `seq` merge x y
>>>>
>>>>     reduce (x:y:t) = merge x y `seq` merge x y : reduce t
>>>>     reduce xs = xs
>>>>
>>>>     merge xs [] = xs
>>>>     merge [] ys = ys
>>>>     merge xs@(x:xs') ys@(y:ys')
>>>>       | x `cmp` y == GT = y : merge xs ys'
>>>>       | otherwise       = x : merge xs' ys
>>>>
>>>>
>>>> gSortBy cmp = mergeAll . sequences
>>>>   where
>>>>     sequences (a:b:xs)
>>>>       | a `cmp` b == GT = descending b [a] xs
>>>>       | otherwise       = ascending b (a:) xs
>>>>     sequences xs = [xs]
>>>>
>>>>     descending a as (b:bs)
>>>>       | a `cmp` b == GT = descending b (a:as) bs
>>>>     descending a as bs = (a:as) `seq` (a:as) : sequences bs
>>>>
>>>>     ascending a as (b:bs)
>>>>       | a `cmp` b /= GT = ascending b (as . (a:)) bs
>>>>     ascending a as bs = as [a] `seq` as [a] : sequences bs
>>>>
>>>>     mergeAll [x] = x
>>>>     mergeAll xs  = mergeAll (mergePairs xs)
>>>>
>>>>     mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs
>>>>     mergePairs xs       = xs
>>>>
>>>>     merge as@(a:as') bs@(b:bs')
>>>>       | a `cmp` b == GT = b : merge as bs'
>>>>       | otherwise       = a : merge as' bs
>>>>     merge [] bs = bs
>>>>     merge as [] = as
>>>>
>>>>
>>>>
>>>>
>>>> Before the change:
>>>>
>>>> benchmarking random ints/ghc
>>>> time 3.687 s (3.541 s .. NaN s)
>>>> 1.000 R² (1.000 R² .. 1.000 R²)
>>>> mean 3.691 s (3.669 s .. 3.705 s)
>>>> std dev 21.45 ms (0.0 s .. 24.76 ms)
>>>> variance introduced by outliers: 19% (moderately inflated)
>>>>
>>>> benchmarking random ints/greg
>>>> time 2.648 s (2.482 s .. 2.822 s)
>>>> 0.999 R² (0.998 R² .. 1.000 R²)
>>>> mean 2.704 s (2.670 s .. 2.736 s)
>>>> std dev 52.68 ms (0.0 s .. 54.49 ms)
>>>> variance introduced by outliers: 19% (moderately inflated)
>>>>
>>>> benchmarking random ints/gSort
>>>> time 2.733 s (2.682 s .. 2.758 s)
>>>> 1.000 R² (1.000 R² .. 1.000 R²)
>>>> mean 2.707 s (2.689 s .. 2.718 s)
>>>> std dev 16.84 ms (0.0 s .. 19.20 ms)
>>>> variance introduced by outliers: 19% (moderately inflated)
>>>>
>>>> After the change:
>>>>
>>>> benchmarking random ints/greg
>>>> time 2.576 s (2.548 s .. 2.628 s)
>>>> 1.000 R² (1.000 R² .. 1.000 R²)
>>>> mean 2.590 s (2.578 s .. 2.599 s)
>>>> std dev 12.99 ms (0.0 s .. 14.89 ms)
>>>> variance introduced by outliers: 19% (moderately inflated)
>>>>
>>>> benchmarking random ints/gSort
>>>> time 2.538 s (2.412 s .. 2.627 s)
>>>> 1.000 R² (0.999 R² .. 1.000 R²)
>>>> mean 2.543 s (2.517 s .. 2.560 s)
>>>> std dev 26.16 ms (0.0 s .. 30.21 ms)
>>>> variance introduced by outliers: 19% (moderately inflated)
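>>>>
>>>> (For context, numbers like these come from a criterion harness roughly
>>>> along the following lines; the real benchmark code is in the
>>>> greg7mdp/ghc-sort repo, and the two placeholder definitions below stand
>>>> in for the gregSortBy/gSortBy from this thread.)
>>>>
>>>> import Criterion.Main (bench, bgroup, defaultMain, nf)
>>>> import Data.List (sort)
>>>> import System.Random (newStdGen, randomRs)
>>>>
>>>> -- placeholders: substitute the sorts defined earlier in this thread
>>>> gregSort, gSort :: Ord a => [a] -> [a]
>>>> gregSort = sort
>>>> gSort    = sort
>>>>
>>>> main :: IO ()
>>>> main = do
>>>>   g <- newStdGen
>>>>   let xs = take 1000000 (randomRs (0, 1000000) g) :: [Int]  -- size is arbitrary here
>>>>   defaultMain
>>>>     [ bgroup "random ints"
>>>>         [ bench "ghc"   (nf sort xs)
>>>>         , bench "greg"  (nf gregSort xs)
>>>>         , bench "gSort" (nf gSort xs)
>>>>         ]
>>>>     ]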
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan Shanmugam <
>>>> siddhanathan+eml at gmail.com> wrote:
>>>>
>>>>> Theoretically, we could do better. We currently only exploit monotonic
>>>>> runs in merge sort, but we could also exploit bitonic runs:
>>>>>
>>>>> dlist as = as [] `seq` as []
>>>>>
>>>>> sequences []     = [[]]
>>>>> sequences [a]    = [[a]]
>>>>> sequences (a:xs) = bitonic a a (a:) xs
>>>>>
>>>>> bitonic min max as (b:bs)
>>>>>   | b `cmp` max /= LT = bitonic min b (as . (b:)) bs
>>>>>   | b `cmp` min /= GT = bitonic b max ((b:) . as) bs
>>>>>   | otherwise         = dlist as : sequences (b:bs)
>>>>> bitonic _ _ as [] = [dlist as]
>>>>>
>>>>>
>>>>> The constant factors here might be too high to notice the difference
>>>>> though.
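>>>>>
>>>>> Wired into the usual mergeAll skeleton, the idea would look roughly like
>>>>> this (a sketch only; bsortBy is an illustrative name, and the seq on the
>>>>> materialised run is left out here):
>>>>>
>>>>> bsortBy :: (a -> a -> Ordering) -> [a] -> [a]
>>>>> bsortBy cmp = mergeAll . sequences
>>>>>   where
>>>>>     dlist as = as []
>>>>>
>>>>>     sequences []     = [[]]
>>>>>     sequences [a]    = [[a]]
>>>>>     sequences (a:xs) = bitonic a a (a:) xs
>>>>>
>>>>>     -- grow a run while each new element extends it at either end:
>>>>>     -- elements >= the current max are appended, elements <= the
>>>>>     -- current min are prepended, so the run stays sorted
>>>>>     bitonic mn mx as (b:bs)
>>>>>       | b `cmp` mx /= LT = bitonic mn b (as . (b:)) bs
>>>>>       | b `cmp` mn /= GT = bitonic b mx ((b:) . as) bs
>>>>>       | otherwise        = dlist as : sequences (b:bs)
>>>>>     bitonic _ _ as [] = [dlist as]
>>>>>
>>>>>     mergeAll [x] = x
>>>>>     mergeAll xs  = mergeAll (mergePairs xs)
>>>>>
>>>>>     mergePairs (a:b:xs) = merge a b : mergePairs xs
>>>>>     mergePairs xs       = xs
>>>>>
>>>>>     merge as@(a:as') bs@(b:bs')
>>>>>       | a `cmp` b == GT = b : merge as bs'
>>>>>       | otherwise       = a : merge as' bs
>>>>>     merge [] bs = bs
>>>>>     merge as [] = as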
>>>>>
>>>>>
>>>>> > However, my version is still more laziness-friendly, i.e. it requires
>>>>> > fewer comparisons to get the N smallest elements of a list (see
>>>>> > https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>>>>> >
>>>>> > I wonder if this might not be a more useful trait than being able to
>>>>> > sort already sorted lists super fast.
>>>>>
>>>>> This comes down to a discussion of merge sort vs natural merge sort.
>>>>>
>>>>> Data.List.sort is an implementation of a variant of merge sort called
>>>>> natural merge sort. The algorithm is linearithmic in the worst case, but
>>>>> linear in the best case (already sorted list).
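>>>>>
>>>>> A tiny self-contained illustration of the difference (runs and pairs are
>>>>> stand-ins for the sequences/pair functions in this thread, specialised
>>>>> to Ord): natural merge sort first detects existing runs, so a sorted
>>>>> input collapses to a single run, while the pairwise split always leaves
>>>>> about log n rounds of merging.
>>>>>
>>>>> runs :: Ord a => [a] -> [[a]]
>>>>> runs []     = []
>>>>> runs (x:xs) = go [x] xs
>>>>>   where
>>>>>     go acc (y:ys)
>>>>>       | y >= head acc = go (y:acc) ys           -- extend the ascending run
>>>>>       | otherwise     = reverse acc : runs (y:ys)
>>>>>     go acc [] = [reverse acc]
>>>>>
>>>>> pairs :: Ord a => [a] -> [[a]]
>>>>> pairs (x:y:t) = (if x <= y then [x,y] else [y,x]) : pairs t
>>>>> pairs [x]     = [[x]]
>>>>> pairs []      = []
>>>>>
>>>>> -- ghci> runs [1,2,3,4,5]
>>>>> -- [[1,2,3,4,5]]
>>>>> -- ghci> pairs [1,2,3,4,5]
>>>>> -- [[1,2],[3,4],[5]]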
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <
>>>>> greg7mdp at gmail.com> wrote:
>>>>>
>>>>>> Thanks again @Siddhanathan! Looks like your gSort fixes the main
>>>>>> issue with
>>>>>> Data.List.sort().
>>>>>>
>>>>>> I have updated the test programs in https://github.com/greg7mdp/ghc-sort
>>>>>> to include your new version.
>>>>>>
>>>>>> Here are the results (your new version looks like a definite
>>>>>> improvement vs
>>>>>> the current GHC one):
>>>>>>
>>>>>> input                       GHC sort   My Orig proposal   gSort
>>>>>> ----------------------------------------------------------------
>>>>>> sorted ints (ascending)          151                456     148
>>>>>> sorted ints (descending)         152                466     155
>>>>>> random ints                     2732               2006    2004
>>>>>> random strings                  6564               5549    5528
>>>>>>
>>>>>>
>>>>>> So replacing the current GHC version with gSort is a no-brainer, as it
>>>>>> is better in all regards.
>>>>>>
>>>>>> However, my version is still more laziness-friendly, i.e. it requires
>>>>>> fewer comparisons to get the N smallest elements of a list (see
>>>>>> https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs).
>>>>>>
>>>>>> I wonder if this might not be a more useful trait than being able to
>>>>>> sort already sorted lists super fast.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> greg
>>>>>>
>>>>>> ________________________________
>>>>>>
>>>>>> From: siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] On Behalf Of Siddhanathan Shanmugam
>>>>>> Sent: Sunday, March 26, 2017 1:05 PM
>>>>>> To: Gregory Popovitch
>>>>>> Cc: Haskell Libraries
>>>>>> Subject: Re: Proposal: a new implementation for Data.List.sort and
>>>>>> Data.List.sortBy, which has better performance characteristics and is more
>>>>>> laziness-friendly.
>>>>>>
>>>>>>
>>>>>> Interesting. You are right, performance for sorting random lists has
>>>>>> priority over performance for sorting already-sorted lists.
>>>>>>
>>>>>> Ignore the numbers for my previous version. Can you compare GHC's
>>>>>> sort, your
>>>>>> proposal, and gSort below?
>>>>>>
>>>>>>
>>>>>> gSort :: Ord a => [a] -> [a]
>>>>>> gSort = gSortBy compare
>>>>>>
>>>>>> gSortBy cmp = mergeAll . sequences
>>>>>>   where
>>>>>>     sequences (a:b:xs)
>>>>>>       | a `cmp` b == GT = descending b [a] xs
>>>>>>       | otherwise       = ascending b (a:) xs
>>>>>>     sequences xs = [xs]
>>>>>>
>>>>>>     descending a as (b:bs)
>>>>>>       | a `cmp` b == GT = descending b (a:as) bs
>>>>>>     descending a as bs = (a:as) : sequences bs
>>>>>>
>>>>>>     ascending a as (b:bs)
>>>>>>       | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
>>>>>>     ascending a as bs = as [a] `seq` as [a] : sequences bs
>>>>>>
>>>>>>     mergeAll [x] = x
>>>>>>     mergeAll xs  = mergeAll (mergePairs xs)
>>>>>>
>>>>>>     mergePairs (a:b:xs) = merge a b : mergePairs xs
>>>>>>     mergePairs xs       = xs
>>>>>>
>>>>>>     merge as@(a:as') bs@(b:bs')
>>>>>>       | a `cmp` b == GT = b : merge as bs'
>>>>>>       | otherwise       = a : merge as' bs
>>>>>>     merge [] bs = bs
>>>>>>     merge as [] = as
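>>>>>>
>>>>>> (As an aside, the as accumulator in ascending is a difference list: the
>>>>>> run is kept as a function that prepends its elements, so extending it
>>>>>> is O(1) and the whole run is materialised once with as [a]. A tiny
>>>>>> standalone illustration, with made-up names:)
>>>>>>
>>>>>> type DList a = [a] -> [a]
>>>>>>
>>>>>> snocD :: DList a -> a -> DList a
>>>>>> snocD as a = as . (a:)     -- O(1) "append at the end"
>>>>>>
>>>>>> toListD :: DList a -> [a]
>>>>>> toListD as = as []         -- materialise the accumulated run
>>>>>>
>>>>>> -- ghci> toListD (snocD (snocD (1:) 2) 3)
>>>>>> -- [1,2,3]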
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Sid
>>>>>>
>>>>>>
>>>>>> On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <
>>>>>> greg7mdp at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> Thank you @Siddhanathan! I welcome any improvement you may make; as I
>>>>>> said, I am very far from a Haskell expert.
>>>>>>
>>>>>> I just tested your change with my test project
>>>>>> (https://github.com/greg7mdp/ghc-sort) and here are my results (mean
>>>>>> times in ms):
>>>>>>
>>>>>> input                       GHC sort   Orig proposal   your change
>>>>>> -------------------------------------------------------------------
>>>>>> sorted ints (ascending)          153             467           139
>>>>>> sorted ints (descending)         152             472           599
>>>>>> random ints                     2824            2077          2126
>>>>>> random strings                  6564            5613          5983
>>>>>>
>>>>>> Your change is a definite improvement for sorted integers in ascending
>>>>>> order, but is worse for the other cases.
>>>>>>
>>>>>> Is there a real need to optimize the sort for already-sorted lists? Of
>>>>>> course it should not be a degenerate case and take longer than sorting
>>>>>> random numbers, but that is not the case here. Sorting already-sorted
>>>>>> lists is, even with my version, over 4 times faster than sorting random
>>>>>> lists. This sounds perfectly acceptable to me, and I feel that trying
>>>>>> to optimize this specific case further, if it comes at the detriment of
>>>>>> the general case, is not desirable.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> greg
>>>>>>
>>>>>> ________________________________
>>>>>>
>>>>>> From: siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] On Behalf Of Siddhanathan Shanmugam
>>>>>> Sent: Sunday, March 26, 2017 11:41 AM
>>>>>> To: Gregory Popovitch
>>>>>> Cc: Haskell Libraries
>>>>>> Subject: Re: Proposal: a new implementation for Data.List.sort and
>>>>>> Data.List.sortBy, which has better performance characteristics and is more
>>>>>> laziness-friendly.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thank you! This identifies a space leak in base which went unnoticed
>>>>>> for 7 years.
>>>>>>
>>>>>> Your implementation can be improved further. Instead of splitting into
>>>>>> pairs, you could split into lists of sorted sublists by replacing the
>>>>>> pair function with the following:
>>>>>>
>>>>>> pair = foldr f []
>>>>>>   where
>>>>>>     f x [] = [[x]]
>>>>>>     f x (y:ys)
>>>>>>       | x `cmp` head y == LT = (x:y) : ys
>>>>>>       | otherwise            = [x] : y : ys
>>>>>>
>>>>>> This should give you the same performance improvements for sorting
>>>>>> random lists, but better performance while sorting ascending lists.
>>>>>>
>>>>>> The version in base takes it one step further by using a DList to
>>>>>> handle the descending case efficiently as well, except there's a space
>>>>>> leak right now because of which it is slower.
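>>>>>>
>>>>>> (For reference, a sketch of how the foldr-based pair above could be
>>>>>> dropped into the proposed sortBy; sortByRuns is just an illustrative
>>>>>> name:)
>>>>>>
>>>>>> sortByRuns :: (a -> a -> Ordering) -> [a] -> [a]
>>>>>> sortByRuns cmp [] = []
>>>>>> sortByRuns cmp xs = head $ until (null . tail) reduce (pair xs)
>>>>>>   where
>>>>>>     -- split into maximal ascending runs instead of fixed pairs
>>>>>>     pair = foldr f []
>>>>>>       where
>>>>>>         f x []     = [[x]]
>>>>>>         f x (y:ys)
>>>>>>           | x `cmp` head y == LT = (x:y) : ys
>>>>>>           | otherwise            = [x] : y : ys
>>>>>>
>>>>>>     reduce (v:w:x:y:t) = merge v' x' : reduce t
>>>>>>       where v' = merge v w
>>>>>>             x' = merge x y
>>>>>>     reduce (x:y:t) = merge x y : reduce t
>>>>>>     reduce xs = xs
>>>>>>
>>>>>>     merge xs [] = xs
>>>>>>     merge [] ys = ys
>>>>>>     merge xs@(x:xs') ys@(y:ys')
>>>>>>       | x `cmp` y == GT = y : merge xs ys'
>>>>>>       | otherwise       = x : merge xs' ys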
>>>>>>
>>>>>> On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch <greg7mdp at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Motivation:
>>>>>> -----------
>>>>>>
>>>>>> Data.List.sort is a very important piece of functionality in Haskell.
>>>>>> I believe that the proposed implementation is:
>>>>>>
>>>>>> - significantly faster than the current implementation on unsorted
>>>>>>   lists, typically 14% to 27% faster
>>>>>> - more laziness-friendly, i.e.:
>>>>>>       take 3 $ sort l
>>>>>>   will require significantly fewer comparisons than the current
>>>>>>   implementation
>>>>>>
>>>>>> Proposed Implementation
>>>>>> -----------------------
>>>>>>
>>>>>> sort :: (Ord a) => [a] -> [a]
>>>>>> sort = sortBy compare
>>>>>>
>>>>>> sortBy cmp [] = []
>>>>>> sortBy cmp xs = head $ until (null.tail) reduce (pair xs)
>>>>>>   where
>>>>>>     pair (x:y:t) | x `cmp` y == GT = [y, x] : pair t
>>>>>>                  | otherwise       = [x, y] : pair t
>>>>>>     pair [x] = [[x]]
>>>>>>     pair []  = []
>>>>>>
>>>>>>     reduce (v:w:x:y:t) = merge v' x' : reduce t
>>>>>>       where v' = merge v w
>>>>>>             x' = merge x y
>>>>>>
>>>>>>     reduce (x:y:t) = merge x y : reduce t
>>>>>>     reduce xs = xs
>>>>>>
>>>>>>     merge xs [] = xs
>>>>>>     merge [] ys = ys
>>>>>>     merge xs@(x:xs') ys@(y:ys')
>>>>>>       | x `cmp` y == GT = y : merge xs ys'
>>>>>>       | otherwise       = x : merge xs' ys
>>>>>>
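>>>>>> (A quick, illustrative GHCi sanity check with the definitions above in
>>>>>> scope:)
>>>>>>
>>>>>> ghci> sort [11,4,6,8,2,5,1,7,9,55,11,3]
>>>>>> [1,2,3,4,5,6,7,8,9,11,11,55]
>>>>>> ghci> take 3 (sort [11,4,6,8,2,5,1,7,9,55,11,3])
>>>>>> [1,2,3]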
>>>>>>
>>>>>> Effect and Interactions
>>>>>> -----------------------
>>>>>>
>>>>>> I have a stack project with a criterion test for this new
>>>>>> implementation, available at https://github.com/greg7mdp/ghc-sort.
>>>>>> I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and had the
>>>>>> following results:
>>>>>>
>>>>>> - sorting of random lists of integers is 27% faster
>>>>>> - sorting of random lists of strings is 14% faster
>>>>>> - sorting of already sorted lists is significantly slower, but still
>>>>>>   much faster than sorting random lists
>>>>>> - the proposed version is more laziness-friendly. For example, this
>>>>>>   version of sortBy requires 11 comparisons to find the smallest
>>>>>>   element of a 15-element list, while the default Data.List.sortBy
>>>>>>   requires 15 comparisons (see
>>>>>>   https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Test results
>>>>>> ------------
>>>>>>
>>>>>> Criterion output (descending/ascending results are for already sorted
>>>>>> lists). I barely understand what Criterion does, and I am puzzled by
>>>>>> the various "T" output - maybe there is a bug in my bench code:
>>>>>>
>>>>>> vagrant at vagrant-ubuntu-trusty-64:/vagrant$ stack exec ghc-sort
>>>>>> benchmarking ascending ints/ghc
>>>>>> TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime  160.6 ms   (153.4 ms .. 167.8 ms)
>>>>>>                      0.997 R²   (0.986 R² .. 1.000 R²)
>>>>>> mean                 161.7 ms   (158.3 ms .. 165.9 ms)
>>>>>> std dev              5.210 ms   (3.193 ms .. 7.006 ms)
>>>>>> variance introduced by outliers: 12% (moderately inflated)
>>>>>>
>>>>>> benchmarking ascending ints/greg
>>>>>> TTTTTTTTTTTTTTTTtime   473.8 ms   (398.6 ms .. 554.9 ms)
>>>>>>                      0.996 R²   (0.987 R² .. 1.000 R²)
>>>>>> mean                 466.2 ms   (449.0 ms .. 475.0 ms)
>>>>>> std dev              14.94 ms   (0.0 s .. 15.29 ms)
>>>>>> variance introduced by outliers: 19% (moderately inflated)
>>>>>>
>>>>>> benchmarking descending ints/ghc
>>>>>> TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime  165.1 ms   (148.2 ms .. 178.2 ms)
>>>>>>                      0.991 R²   (0.957 R² .. 1.000 R²)
>>>>>> mean                 158.7 ms   (154.0 ms .. 164.3 ms)
>>>>>> std dev              7.075 ms   (4.152 ms .. 9.903 ms)
>>>>>> variance introduced by outliers: 12% (moderately inflated)
>>>>>>
>>>>>> benchmarking descending ints/greg
>>>>>> TTTTTTTTTTTTTTTTtime   471.7 ms   (419.8 ms .. 508.3 ms)
>>>>>>                      0.999 R²   (0.995 R² .. 1.000 R²)
>>>>>> mean                 476.0 ms   (467.5 ms .. 480.0 ms)
>>>>>> std dev              7.447 ms   (67.99 as .. 7.865 ms)
>>>>>> variance introduced by outliers: 19% (moderately inflated)
>>>>>>
>>>>>> benchmarking random ints/ghc
>>>>>> TTTTTTTTTTTTTTTTtime   2.852 s    (2.564 s .. 3.019 s)
>>>>>>                      0.999 R²   (0.997 R² .. 1.000 R²)
>>>>>> mean                 2.812 s    (2.785 s .. 2.838 s)
>>>>>> std dev              44.06 ms   (543.9 as .. 44.97 ms)
>>>>>> variance introduced by outliers: 19% (moderately inflated)
>>>>>>
>>>>>> benchmarking random ints/greg
>>>>>> TTTTTTTTTTTTTTTTtime   2.032 s    (1.993 s .. 2.076 s)
>>>>>>                      1.000 R²   (1.000 R² .. 1.000 R²)
>>>>>> mean                 2.028 s    (2.019 s .. 2.033 s)
>>>>>> std dev              7.832 ms   (0.0 s .. 8.178 ms)
>>>>>> variance introduced by outliers: 19% (moderately inflated)
>>>>>>
>>>>>> benchmarking shakespeare/ghc
>>>>>> TTTTTTTTTTTTTTTTtime   6.504 s    (6.391 s .. 6.694 s)
>>>>>>                      1.000 R²   (1.000 R² .. 1.000 R²)
>>>>>> mean                 6.499 s    (6.468 s .. 6.518 s)
>>>>>> std dev              28.85 ms   (0.0 s .. 32.62 ms)
>>>>>> variance introduced by outliers: 19% (moderately inflated)
>>>>>>
>>>>>> benchmarking shakespeare/greg
>>>>>> TTTTTTTTTTTTTTTTtime   5.560 s    (5.307 s .. 5.763 s)
>>>>>>                      1.000 R²   (0.999 R² .. 1.000 R²)
>>>>>> mean                 5.582 s    (5.537 s .. 5.607 s)
>>>>>> std dev              39.30 ms   (0.0 s .. 43.49 ms)
>>>>>> variance introduced by outliers: 19% (moderately inflated)
>>>>>>
>>>>>>
>>>>>> Costs and Drawbacks
>>>>>> -------------------
>>>>>>
>>>>>> The only cost I see is the reduced performance when sorting already
>>>>>> sorted lists. However, since this remains quite efficient, indeed over
>>>>>> 4 times faster than sorting unsorted lists, I think it is an acceptable
>>>>>> tradeoff.
>>>>>>
>>>>>> Final note
>>>>>> ----------
>>>>>>
>>>>>> My Haskell is very rusty. I worked on this a couple of years ago when I
>>>>>> was learning Haskell, and meant to propose it to the Haskell community,
>>>>>> but never got to it at the time.
>>>>>>
>>>>>> _______________________________________________
>>>>>> Libraries mailing list
>>>>>> Libraries at haskell.org
>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> _______________________________________________
>>> Libraries mailing list
>>> Libraries at haskell.org
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>
>>>
>> _______________________________________________
>> Libraries mailing list
>> Libraries at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>>
>
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>