fix for Data.List.sortBy
Dan Burton
danburton.email at gmail.com
Tue Mar 28 00:49:36 UTC 2017
Does this rely on Common Subexpression Elimination optimization in order to
work? Would it work more reliably if the `seq`-ed expression were
let-bound?
-- Dan Burton
On Mon, Mar 27, 2017 at 5:41 PM, David Feuer <david.feuer at gmail.com> wrote:
> The first seq is useless: constructor application is never suspended. I
> haven't had a chance to look at the rest yet.
>
> On Mar 27, 2017 7:59 PM, "Gregory Popovitch" <greg7mdp at gmail.com> wrote:
>
>> Sid,
>>
>> I'd be delighted to submit the patch, as long as I have permission
>> (which I probably don't), you feel confident about the change and maybe a
>> couple of other people agree.
>>
>> Here is the proposed change. Tests shows significant speed improvement
>> (30%) when sorting lists of random numbers, and same efficiency for sorting
>> already sorted lists (both ascending and descending).
>>
>>
>> Thanks,
>>
>> greg
>>
>> ------------------------------
>> *From:* siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] *On
>> Behalf Of *Siddhanathan Shanmugam
>> *Sent:* Monday, March 27, 2017 6:53 PM
>> *To:* Gregory Popovitch
>> *Subject:* RE: Proposal: a new implementation for Data.List.sort and
>> Data.List.sortBy, which has better performance characteristics and is more
>> laziness-friendly.
>>
>> Since I don't see any regressions, this doesn't really need CLC approval.
>> The changes are also small enough that a Github PR may be accepted
>> (otherwise, the change goes in via Phabricator).
>>
>> Are you interested in implementing this patch? If yes, a standard Github
>> PR should be fine. Right now gSort is a three line change I think. It will
>> be changed in ghc/libraries/base/Data/OldList.hs on the ghc/ghc repo on
>> Github.
>>
>> I'm hoping for some more comments from other Haskellers, before pushing
>> for this change in base. I feel like we may be missing a potential
>> optimization that someone else might spot. So probably going to wait a few
>> days.
>>
>>
>> On Mar 27, 2017 11:43 AM, "Gregory Popovitch" <greg7mdp at gmail.com> wrote:
>>
>> Hi Sid,
>>
>> Thanks, glad you looked into that. My understanding of the Haskell
>> execution model is really poor, so I can't say one way or the other, but I
>> felt that laziness ought to be considered as well, and I'm glad it was :-)
>>
>> So in conclusion it looks like we have a winner with your latest gSortBy
>> version. How do we get this pushed to the GHC library?
>>
>> Thanks,
>>
>> greg
>>
>> ------------------------------
>> *From:* siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] *On
>> Behalf Of *Siddhanathan Shanmugam
>> *Sent:* Monday, March 27, 2017 2:12 PM
>> *To:* Gregory Popovitch
>>
>> *Subject:* Re: Proposal: a new implementation for Data.List.sort and
>> Data.List.sortBy, which has better performance characteristics and is more
>> laziness-friendly.
>>
>> Hi Greg,
>>
>> On Mon, Mar 27, 2017 at 10:19 AM, Gregory Popovitch <greg7mdp at gmail.com>
>> wrote:
>>
>>> Unfortunately, this optimization makes the sort less lazy, so doing
>>> something like:
>>>
>>> take 4 $ sort l
>>>
>>> requires more sorting of the list l with this change. I'm not sure it is
>>> a good tradeoff.
>>>
>>> This can be verified with: https://github.com/greg7mdp/gh
>>> c-sort/blob/master/src/sort_with_trace.hs
>>>
>>
>> I think you're running without optimizations turned on. It is lazy in my
>> case.
>>
>> Also, the difference should be negligible (if any at all). Here's an
>> example of the list being sorted:
>>
>> [11,4,6,8,2,5,1,7,9,55,11,3]
>> ...
>> [[4,11],[6,8],[2,5],[1,7,9,55],[3,11],[]]
>> ...
>> [[1,2,4,5,6,7,8,9,11,55],[3,11]]
>> * 1 3
>> * 2 3
>> * 4 3
>> * 4 11
>> [1,2,3,4]
>>
>> The number of operations saved is only in the last merge. It's only lazy
>> at this step.
>>
>> So we save at most one traversal of the list, which is not too expensive
>> since our worst case bounds is O(n log n) anyway.
>>
>> This should mean that the asymptotic performance should be identical,
>> regardless of the number of comparisons saved. Of course, you do get better
>> constants, but I would be surprised if those constants translated to
>> significantly better performance for a reasonable size list.
>>
>>
>>>
>>> I do agree that it would be nice to have a more serious validation
>>> test suite.
>>>
>>> Thanks,
>>>
>>> greg
>>>
>>> ------------------------------
>>> *From:* siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] *On
>>> Behalf Of *Siddhanathan Shanmugam
>>> *Sent:* Monday, March 27, 2017 12:53 PM
>>>
>>> *To:* Gregory Popovitch
>>> *Cc:* Haskell Libraries
>>> *Subject:* Re: Proposal: a new implementation for Data.List.sort and
>>> Data.List.sortBy, which has better performance characteristics and is more
>>> laziness-friendly.
>>>
>>> We can improve things a bit further by forcing evaluation (with seq)
>>> along the way appropriately.
>>>
>>>
>>>
>>> gregSortBy cmp [] = []
>>> gregSortBy cmp xs = head $ until (null.tail) reduce (pair xs)
>>> where
>>> pair (x:y:t) | x `cmp` y == GT = [y, x] : pair t
>>> | otherwise = [x, y] : pair t
>>> pair [x] = [[x]]
>>> pair [] = []
>>>
>>> reduce (v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t
>>> where v' = merge v w `seq` merge v w
>>> x' = merge x y `seq` merge x y
>>>
>>> reduce (x:y:t) = merge x y `seq` merge x y : reduce t
>>> reduce xs = xs
>>>
>>> merge xs [] = xs
>>> merge [] ys = ys
>>> merge xs@(x:xs') ys@(y:ys')
>>> | x `cmp` y == GT = y : merge xs ys'
>>> | otherwise = x : merge xs' ys
>>>
>>>
>>> gSortBy cmp = mergeAll . sequences
>>> where
>>> sequences (a:b:xs)
>>> | a `cmp` b == GT = descending b [a] xs
>>> | otherwise = ascending b (a:) xs
>>> sequences xs = [xs]
>>>
>>>
>>> descending a as (b:bs)
>>> | a `cmp` b == GT = descending b (a:as) bs
>>> descending a as bs = (a:as) `seq` (a:as) : sequences bs
>>>
>>>
>>> ascending a as (b:bs)
>>> | a `cmp` b /= GT = ascending b (as . (a:)) bs
>>> ascending a as bs = as [a] `seq` as [a] : sequences bs
>>>
>>>
>>> mergeAll [x] = x
>>> mergeAll xs = mergeAll (mergePairs xs)
>>>
>>>
>>> mergePairs (a:b:xs) = merge a b `seq` merge a b : mergePairs xs
>>> mergePairs xs = xs
>>>
>>>
>>> merge as@(a:as') bs@(b:bs')
>>> | a `cmp` b == GT = b : merge as bs'
>>> | otherwise = a : merge as' bs
>>> merge [] bs = bs
>>> merge as [] = as
>>>
>>>
>>>
>>>
>>> *Before the change:*
>>>
>>> benchmarking random ints/ghc
>>> time 3.687 s (3.541 s .. NaN s)
>>> 1.000 R² (1.000 R² .. 1.000 R²)
>>> mean 3.691 s (3.669 s .. 3.705 s)
>>> std dev 21.45 ms (0.0 s .. 24.76 ms)
>>> variance introduced by outliers: 19% (moderately inflated)
>>>
>>> benchmarking random ints/greg
>>> time 2.648 s (2.482 s .. 2.822 s)
>>> 0.999 R² (0.998 R² .. 1.000 R²)
>>> mean 2.704 s (2.670 s .. 2.736 s)
>>> std dev 52.68 ms (0.0 s .. 54.49 ms)
>>> variance introduced by outliers: 19% (moderately inflated)
>>>
>>> benchmarking random ints/gSort
>>> time 2.733 s (2.682 s .. 2.758 s)
>>> 1.000 R² (1.000 R² .. 1.000 R²)
>>> mean 2.707 s (2.689 s .. 2.718 s)
>>> std dev 16.84 ms (0.0 s .. 19.20 ms)
>>> variance introduced by outliers: 19% (moderately inflated)
>>>
>>> *After the change:*
>>>
>>> benchmarking random ints/greg
>>> time 2.576 s (2.548 s .. 2.628 s)
>>> 1.000 R² (1.000 R² .. 1.000 R²)
>>> mean 2.590 s (2.578 s .. 2.599 s)
>>> std dev 12.99 ms (0.0 s .. 14.89 ms)
>>> variance introduced by outliers: 19% (moderately inflated)
>>>
>>> benchmarking random ints/gSort
>>> time 2.538 s (2.412 s .. 2.627 s)
>>> 1.000 R² (0.999 R² .. 1.000 R²)
>>> mean 2.543 s (2.517 s .. 2.560 s)
>>> std dev 26.16 ms (0.0 s .. 30.21 ms)
>>> variance introduced by outliers: 19% (moderately inflated)
>>>
>>>
>>>
>>>
>>>
>>> On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan Shanmugam <
>>> siddhanathan+eml at gmail.com> wrote:
>>>
>>>> Theoretically, we could do better. We currently only exploit monotonic
>>>> runs in merge sort, but we could also exploit bitonic runs:
>>>>
>>>> dlist as = as [] `seq` as []
>>>>
>>>> sequences [] = [[]]
>>>> sequences [a] = [[a]]
>>>> sequences (a:xs) = bitonic a a (a:) xs
>>>>
>>>> bitonic min max as (b:bs)
>>>> | b `cmp` max /= LT = bitonic min b (as . (b:)) bs
>>>> | b `cmp` min /= GT = bitonic b max ((b:) . as) bs
>>>> | otherwise = dlist as : sequences (b:bs)
>>>> bitonic _ _ as [] = [dlist as]
>>>>
>>>>
>>>> The constant factors here might be too high to notice the difference
>>>> though.
>>>>
>>>>
>>>> > However, still my version is more laziness-friendly, i.e. it requires
>>>> fewer
>>>> > comparisons to get the
>>>> > N smallest elements of a list (see
>>>> > https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_
>>>> with_trace.hs).
>>>> >
>>>> > I wonder if this might not be a more useful trait than being able to
>>>> sort
>>>> > already sorted lists super fast.
>>>>
>>>> This comes down to a discussion of merge sort vs natural merge sort.
>>>>
>>>> Data.List.sort is an implementation of a variant of merge sort called
>>>> natural merge sort. The algorithm is linearithmic in the worst case, but
>>>> linear in the best case (already sorted list).
>>>>
>>>>
>>>>
>>>> On Sun, Mar 26, 2017 at 10:47 AM, Gregory Popovitch <greg7mdp at gmail.com
>>>> > wrote:
>>>>
>>>>> Thanks again @Siddhanathan! Looks like your gSort fixes the main issue
>>>>> with
>>>>> Data.List.sort().
>>>>>
>>>>> I have updated the test programs in https://github.com/greg7mdp/gh
>>>>> c-sort to
>>>>> include your new version.
>>>>>
>>>>> Here are the results (your new version looks like a definite
>>>>> improvement vs
>>>>> the current GHC one):
>>>>>
>>>>> input GHC sort My Orig proposal
>>>>> gSort
>>>>> ------------------------------------------------------------
>>>>> ----------------
>>>>> ---
>>>>> sorted ints (ascending) 151 456 148
>>>>> sorted ints (descending) 152 466 155
>>>>> random ints 2732 2006 2004
>>>>> random strings 6564 5549 5528
>>>>>
>>>>>
>>>>> So replacing the current GHC version with gSort is a no brainer, as it
>>>>> is
>>>>> better in all regards.
>>>>>
>>>>> However, still my version is more laziness-friendly, i.e. it requires
>>>>> fewer
>>>>> comparisons to get the
>>>>> N smallest elements of a list (see
>>>>> https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_wi
>>>>> th_trace.hs).
>>>>>
>>>>> I wonder if this might not be a more useful trait than being able to
>>>>> sort
>>>>> already sorted lists super fast.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> greg
>>>>>
>>>>> ________________________________
>>>>>
>>>>> From: siddhanathan at gmail.com [mailto:siddhanathan at gmail.com] On
>>>>> Behalf Of
>>>>> Siddhanathan Shanmugam
>>>>> Sent: Sunday, March 26, 2017 1:05 PM
>>>>> To: Gregory Popovitch
>>>>> Cc: Haskell Libraries
>>>>> Subject: Re: Proposal: a new implementation for Data.List.sort and
>>>>> Data.List.sortBy, which has better performance characteristics and is
>>>>> more
>>>>> laziness-friendly.
>>>>>
>>>>>
>>>>> Interesting. You are right, performance for sorting random lists has
>>>>> priority over performance for sorting already-sorted lists.
>>>>>
>>>>> Ignore the numbers for my previous version. Can you compare GHC's
>>>>> sort, your
>>>>> proposal, and gSort below?
>>>>>
>>>>>
>>>>> gSort :: Ord a => [a] -> [a]
>>>>> gSort = gSortBy compare
>>>>> gSortBy cmp = mergeAll . sequences
>>>>> where
>>>>> sequences (a:b:xs)
>>>>> | a `cmp` b == GT = descending b [a] xs
>>>>> | otherwise = ascending b (a:) xs
>>>>> sequences xs = [xs]
>>>>>
>>>>>
>>>>> descending a as (b:bs)
>>>>> | a `cmp` b == GT = descending b (a:as) bs
>>>>> descending a as bs = (a:as) : sequences bs
>>>>>
>>>>>
>>>>> ascending a as (b:bs)
>>>>> | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
>>>>> ascending a as bs = as [a] `seq` as [a] : sequences bs
>>>>>
>>>>>
>>>>> mergeAll [x] = x
>>>>> mergeAll xs = mergeAll (mergePairs xs)
>>>>>
>>>>>
>>>>> mergePairs (a:b:xs) = merge a b : mergePairs xs
>>>>> mergePairs xs = xs
>>>>>
>>>>>
>>>>> merge as@(a:as') bs@(b:bs')
>>>>> | a `cmp` b == GT = b : merge as bs'
>>>>> | otherwise = a : merge as' bs
>>>>> merge [] bs = bs
>>>>> merge as [] = as
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Sid
>>>>>
>>>>>
>>>>> On Sun, Mar 26, 2017 at 9:19 AM, Gregory Popovitch <greg7mdp at gmail.com
>>>>> >
>>>>> wrote:
>>>>>
>>>>>
>>>>> Thank you @Siddhanathan! I welcome any improvement you may
>>>>> make, as
>>>>> I said I
>>>>> am very far from a Haskell expert.
>>>>>
>>>>> I just tested your change with my test project
>>>>> (https://github.com/greg7mdp/ghc-sort
>>>>> <https://github.com/greg7mdp/ghc-sort> )
>>>>> and here are my results (mean times in ms):
>>>>>
>>>>> input GHC sort Orig proposal
>>>>> your
>>>>> change
>>>>>
>>>>> ------------------------------------------------------------
>>>>> ----------------
>>>>> ---
>>>>> sorted ints (ascending) 153 467
>>>>> 139
>>>>> sorted ints (descending) 152 472
>>>>> 599
>>>>> random ints 2824 2077
>>>>> 2126
>>>>> random strings 6564 5613
>>>>> 5983
>>>>>
>>>>> Your change is a definite improvement for sorted integers in
>>>>> ascending
>>>>> order, but is worse for other cases.
>>>>>
>>>>> Is there a real need to optimize the sort for already sorted
>>>>> list?
>>>>> Of course
>>>>> it should not be a degenerate
>>>>> case and take longer than sorting random numbers, but this is
>>>>> not
>>>>> the case
>>>>> here. Sorting already sorted
>>>>> lists is, even with my version, over 4 times faster than
>>>>> sorting
>>>>> random
>>>>> lists. This sounds perfectly
>>>>> acceptable to me, and I feel that trying to optimize this
>>>>> specific
>>>>> case
>>>>> further, if it comes at the
>>>>> detriment of the general case, is not desirable.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> greg
>>>>>
>>>>> ________________________________
>>>>>
>>>>> From: siddhanathan at gmail.com [mailto:siddhanathan at gmail.com]
>>>>> On
>>>>> Behalf Of
>>>>> Siddhanathan Shanmugam
>>>>> Sent: Sunday, March 26, 2017 11:41 AM
>>>>> To: Gregory Popovitch
>>>>> Cc: Haskell Libraries
>>>>> Subject: Re: Proposal: a new implementation for Data.List.sort
>>>>> and
>>>>> Data.List.sortBy, which has better performance characteristics
>>>>> and
>>>>> is more
>>>>> laziness-friendly.
>>>>>
>>>>>
>>>>>
>>>>> Thank you! This identifies a space leak in base which went
>>>>> unnoticed
>>>>> for 7
>>>>> years.
>>>>>
>>>>> Your implementation can be improved further. Instead of
>>>>> splitting
>>>>> into
>>>>> pairs, you could instead split into lists of sorted sublists by
>>>>> replacing
>>>>> the pairs function with the following
>>>>>
>>>>> pair = foldr f []
>>>>> where
>>>>> f x [] = [[x]]
>>>>> f x (y:ys)
>>>>> | x `cmp` head y == LT = (x:y):ys
>>>>> | otherwise = [x]:y:ys
>>>>>
>>>>> This should give you the same performance improvements for
>>>>> sorting
>>>>> random
>>>>> lists, but better performance while sorting ascending lists.
>>>>>
>>>>> The version in base takes it one step further by using a DList
>>>>> to
>>>>> handle the
>>>>> descending case efficiently as well, except there's a space
>>>>> leak
>>>>> right now
>>>>> because of which it is slower.
>>>>>
>>>>> On Sun, Mar 26, 2017 at 7:21 AM, Gregory Popovitch
>>>>> <greg7mdp at gmail.com>
>>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> Motivation:
>>>>> ----------
>>>>>
>>>>> Data.List.sort is a very important functionality in
>>>>> Haskell.
>>>>> I
>>>>> believe that
>>>>> the proposed implementation is:
>>>>>
>>>>> - significantly faster than the current implementation
>>>>> on
>>>>> unsorted
>>>>> lists,
>>>>> typically 14% to 27% faster
>>>>> - more laziness-friendly, i.e.:
>>>>> take 3 $ sort l
>>>>> will require significantly less comparisons than the
>>>>> current
>>>>> implementation
>>>>>
>>>>> Proposed Implementation
>>>>> -----------------------
>>>>>
>>>>> sort :: (Ord a) => [a] -> [a]
>>>>> sort = sortBy compare
>>>>>
>>>>> sortBy cmp [] = []
>>>>> sortBy cmp xs = head $ until (null.tail) reduce (pair
>>>>> xs)
>>>>> where
>>>>> pair (x:y:t) | x `cmp` y == GT = [y, x] : pair t
>>>>> | otherwise = [x, y] : pair t
>>>>> pair [x] = [[x]]
>>>>> pair [] = []
>>>>>
>>>>> reduce (v:w:x:y:t) = merge v' x' : reduce t
>>>>> where v' = merge v w
>>>>> x' = merge x y
>>>>>
>>>>> reduce (x:y:t) = merge x y : reduce t
>>>>> reduce xs = xs
>>>>>
>>>>> merge xs [] = xs
>>>>> merge [] ys = ys
>>>>> merge xs@(x:xs') ys@(y:ys')
>>>>> | x `cmp` y == GT = y : merge xs ys'
>>>>> | otherwise = x : merge xs' ys
>>>>>
>>>>>
>>>>> Effect and Interactions
>>>>> -----------------------
>>>>>
>>>>> I have a stack project with a criterion test for this
>>>>> new
>>>>> implementation,
>>>>> available at https://github.com/greg7mdp/ghc-sort
>>>>> <https://github.com/greg7mdp/ghc-sort>
>>>>>
>>>>> <https://github.com/greg7mdp/ghc-sort
>>>>> <https://github.com/greg7mdp/ghc-sort> > .
>>>>> I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2,
>>>>> and
>>>>> had the
>>>>> following
>>>>> results:
>>>>>
>>>>> - sorting of random lists of integers is 27% faster
>>>>> - sorting of random lists of strings is 14% faster
>>>>> - sorting of already sorted lists is significantly
>>>>> slower,
>>>>> but still
>>>>> much
>>>>> faster than sorting random lists
>>>>> - proposed version is more laziness friendly. For
>>>>> example
>>>>> this
>>>>> version of
>>>>> sortBy requires 11 comparisons to find
>>>>> the smallest element of a 15 element list, while the
>>>>> default
>>>>> Data.List.sortBy requires 15 comparisons.
>>>>> (see
>>>>>
>>>>>
>>>>> https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_wi
>>>>> th_trace.hs
>>>>> <https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_w
>>>>> ith_trace.hs>
>>>>>
>>>>> <https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_w
>>>>> ith_trace.hs
>>>>> <https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_w
>>>>> ith_trace.hs> >
>>>>> )
>>>>>
>>>>>
>>>>>
>>>>> Test results
>>>>> ------------
>>>>>
>>>>> Criterion output (descending/ascending results are for
>>>>> already
>>>>> sorted
>>>>> lists).
>>>>> I barely understand what Criterion does, and I am
>>>>> puzzled
>>>>> with the
>>>>> various
>>>>> "T" output - maybe there is a bug in my bench code:
>>>>>
>>>>> vagrant at vagrant-ubuntu-trusty-64:/vagrant$ stack exec
>>>>> ghc-sort
>>>>> benchmarking ascending ints/ghc
>>>>> TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
>>>>> 160.6 ms
>>>>> (153.4
>>>>> ms .. 167.8 ms)
>>>>> 0.997 R² (0.986 R² .. 1.000 R²)
>>>>> mean 161.7 ms (158.3 ms .. 165.9 ms)
>>>>> std dev 5.210 ms (3.193 ms .. 7.006 ms)
>>>>> variance introduced by outliers: 12% (moderately
>>>>> inflated)
>>>>>
>>>>> benchmarking ascending ints/greg
>>>>> TTTTTTTTTTTTTTTTtime 473.8 ms (398.6
>>>>> ms ..
>>>>> 554.9
>>>>> ms)
>>>>> 0.996 R² (0.987 R² .. 1.000 R²)
>>>>> mean 466.2 ms (449.0 ms .. 475.0 ms)
>>>>> std dev 14.94 ms (0.0 s .. 15.29 ms)
>>>>> variance introduced by outliers: 19% (moderately
>>>>> inflated)
>>>>>
>>>>> benchmarking descending ints/ghc
>>>>> TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTtime
>>>>> 165.1 ms
>>>>> (148.2
>>>>> ms .. 178.2 ms)
>>>>> 0.991 R² (0.957 R² .. 1.000 R²)
>>>>> mean 158.7 ms (154.0 ms .. 164.3 ms)
>>>>> std dev 7.075 ms (4.152 ms .. 9.903 ms)
>>>>> variance introduced by outliers: 12% (moderately
>>>>> inflated)
>>>>>
>>>>> benchmarking descending ints/greg
>>>>> TTTTTTTTTTTTTTTTtime 471.7 ms (419.8
>>>>> ms ..
>>>>> 508.3
>>>>> ms)
>>>>> 0.999 R² (0.995 R² .. 1.000 R²)
>>>>> mean 476.0 ms (467.5 ms .. 480.0 ms)
>>>>> std dev 7.447 ms (67.99 as .. 7.865 ms)
>>>>> variance introduced by outliers: 19% (moderately
>>>>> inflated)
>>>>>
>>>>> benchmarking random ints/ghc
>>>>> TTTTTTTTTTTTTTTTtime 2.852 s (2.564
>>>>> s ..
>>>>> 3.019 s)
>>>>> 0.999 R² (0.997 R² .. 1.000 R²)
>>>>> mean 2.812 s (2.785 s .. 2.838 s)
>>>>> std dev 44.06 ms (543.9 as .. 44.97 ms)
>>>>> variance introduced by outliers: 19% (moderately
>>>>> inflated)
>>>>>
>>>>> benchmarking random ints/greg
>>>>> TTTTTTTTTTTTTTTTtime 2.032 s (1.993
>>>>> s ..
>>>>> 2.076 s)
>>>>> 1.000 R² (1.000 R² .. 1.000 R²)
>>>>> mean 2.028 s (2.019 s .. 2.033 s)
>>>>> std dev 7.832 ms (0.0 s .. 8.178 ms)
>>>>> variance introduced by outliers: 19% (moderately
>>>>> inflated)
>>>>>
>>>>> benchmarking shakespeare/ghc
>>>>> TTTTTTTTTTTTTTTTtime 6.504 s (6.391
>>>>> s ..
>>>>> 6.694 s)
>>>>> 1.000 R² (1.000 R² .. 1.000 R²)
>>>>> mean 6.499 s (6.468 s .. 6.518 s)
>>>>> std dev 28.85 ms (0.0 s .. 32.62 ms)
>>>>> variance introduced by outliers: 19% (moderately
>>>>> inflated)
>>>>>
>>>>> benchmarking shakespeare/greg
>>>>> TTTTTTTTTTTTTTTTtime 5.560 s (5.307
>>>>> s ..
>>>>> 5.763 s)
>>>>> 1.000 R² (0.999 R² .. 1.000 R²)
>>>>> mean 5.582 s (5.537 s .. 5.607 s)
>>>>> std dev 39.30 ms (0.0 s .. 43.49 ms)
>>>>> variance introduced by outliers: 19% (moderately
>>>>> inflated)
>>>>>
>>>>>
>>>>> Costs and Drawbacks
>>>>> -------------------
>>>>>
>>>>> The only cost I see is the reduced performance when
>>>>> sorting
>>>>> already
>>>>> sorted
>>>>> lists. However, since this remains quite efficient,
>>>>> indeed
>>>>> over 4
>>>>> times
>>>>> faster than sorting unsorted lists, I think it is an
>>>>> acceptable
>>>>> tradeoff.
>>>>>
>>>>> Final note
>>>>> ----------
>>>>>
>>>>> My Haskell is very rusty. I worked on this a couple
>>>>> years
>>>>> ago when I
>>>>> was
>>>>> learning Haskell, and meant to propose it to the
>>>>> Haskell
>>>>> community,
>>>>> but
>>>>> never got to it at the time.
>>>>>
>>>>> _______________________________________________
>>>>> Libraries mailing list
>>>>> Libraries at haskell.org
>>>>> http://mail.haskell.org/cgi-bi
>>>>> n/mailman/listinfo/libraries
>>>>> <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries>
>>>>>
>>>>> <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>> <http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries> >
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> _______________________________________________
>> Libraries mailing list
>> Libraries at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>>
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20170327/2007e594/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Outlook.jpg
Type: image/jpeg
Size: 113533 bytes
Desc: not available
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20170327/2007e594/attachment-0001.jpg>
More information about the Libraries
mailing list