<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content="text/html; charset=iso-8859-1" http-equiv=Content-Type>
<META name=GENERATOR content="MSHTML 11.00.10570.1001"></HEAD>
<BODY>
<DIV dir=ltr align=left><FONT color=#000080 face=Calibri>
<DIV dir=ltr align=left><SPAN class=297205711-28032017><FONT color=#000080
face=Calibri>Sid, this new version (diff below) is really fast when sorting
random ints, but slower when sorting strings:</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=297205711-28032017><FONT color=#000080
face=Calibri></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=297205711-28032017><FONT color=#000080
face="Courier New">input
GHC sort Orig
proposal
gSort<BR>-------------------------------------------------------------------------<BR>sorted
ints (ascending)
151
460
147<BR>sorted ints (descending)
151
467
171<BR>random
ints
2771
2010
1365<BR>random
strings
6542
5524
5991</FONT></SPAN></DIV>
<DIV> </DIV>
<DIV><FONT color=#000080 face=Calibri></FONT> </DIV>
<DIV><FONT color=#000080 face=Calibri><IMG
src="cid:165160212@28032017-0395"></FONT></DIV>
<DIV><FONT color=#000080 face=Calibri></FONT> </DIV>
<DIV><SPAN class=297205711-28032017><FONT color=#000080
face=Calibri>Thanks,</FONT></SPAN></DIV>
<DIV><SPAN class=297205711-28032017><FONT color=#000080
face=Calibri></FONT></SPAN> </DIV>
<DIV><SPAN class=297205711-28032017><FONT color=#000080
face=Calibri>greg</FONT></SPAN></DIV></FONT></DIV><BR>
<DIV lang=en-us class=OutlookMessageHeader dir=ltr align=left>
<HR tabIndex=-1>
<FONT size=2 face=Tahoma><B>From:</B> siddhanathan@gmail.com
[mailto:siddhanathan@gmail.com] <B>On Behalf Of </B>Siddhanathan
Shanmugam<BR><B>Sent:</B> Tuesday, March 28, 2017 2:45 AM<BR><B>To:</B> Dan
Burton<BR><B>Cc:</B> Haskell Libraries; Gregory Popovitch<BR><B>Subject:</B> Re:
fix for Data.List.sortBy<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV dir=ltr>Turns out we don't need seq at all. A simple refactoring of the
merge function does the trick equally well.
<DIV><BR></DIV>
<DIV>
<DIV><FONT face="monospace, monospace"> mergePairs (a:b:xs) = merge
id a b : mergePairs xs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> mergePairs xs
= xs</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"> merge f as@(a:as')
bs@(b:bs')</FONT></DIV>
<DIV><FONT face="monospace, monospace"> | a `cmp` b == GT =
merge (f.(b:)) as bs'</FONT></DIV>
<DIV><FONT face="monospace, monospace"> | otherwise
= merge (f.(a:)) as' bs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> merge f [] bs
= f bs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> merge f as []
= f as</FONT></DIV></DIV>
<DIV><BR></DIV>
<DIV>This variant is 10% faster in my tests.</DIV>
<DIV><BR></DIV>
<DIV class=gmail_extra><BR>
<DIV class=gmail_quote>On Mon, Mar 27, 2017 at 5:49 PM, Dan Burton <SPAN
dir=ltr><<A href="mailto:danburton.email@gmail.com"
target=_blank>danburton.email@gmail.com</A>></SPAN> wrote:<BR>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex">
<DIV dir=ltr>Does this rely on Common Subexpression Elimination optimization
in order to work? Would it work more reliably if the `seq`-ed expression were
let-bound? </DIV></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>I don't think it relies heavily on CSE. The seq's are there to avoid a
cascading series of thunk evaluations. Using let expressions doesn't seem to
affect the benchmarks.</DIV>
<DIV> </DIV>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex">
<DIV class=gmail_extra><SPAN class=HOEnZb><FONT color=#888888><BR clear=all>
<DIV>
<DIV class=m_4214469927093090448gmail_signature
data-smartmail="gmail_signature">-- Dan Burton</DIV></DIV></FONT></SPAN>
<DIV>
<DIV class=h5><BR>
<DIV class=gmail_quote>On Mon, Mar 27, 2017 at 5:41 PM, David Feuer <SPAN
dir=ltr><<A href="mailto:david.feuer@gmail.com"
target=_blank>david.feuer@gmail.com</A>></SPAN> wrote:<BR>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex">
<DIV>The first seq is useless: constructor application is never suspended. I
haven't had a chance to look at the rest yet.</DIV>
<DIV class=m_4214469927093090448HOEnZb>
<DIV class=m_4214469927093090448h5>
<DIV class=gmail_extra><BR>
<DIV class=gmail_quote>On Mar 27, 2017 7:59 PM, "Gregory Popovitch" <<A
href="mailto:greg7mdp@gmail.com" target=_blank>greg7mdp@gmail.com</A>>
wrote:<BR type="attribution">
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex"><U></U>
<DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT
face=Calibri><FONT color=#000080><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140755023923-27032017>Sid,
</SPAN></FONT></FONT></SPAN></DIV>
<DIV dir=ltr align=left>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT
face=Calibri><FONT color=#000080><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140755023923-27032017></SPAN></FONT></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT
face=Calibri><FONT color=#000080><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140755023923-27032017></SPAN>I'd
be delighted to submit<SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140755023923-27032017>
</SPAN>the patch, as long as I have permission (which I probably don't),
you feel confident about the change and maybe a couple of other people
agree.</FONT></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT
color=#000080 face=Calibri></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT
face=Calibri><FONT color=#000080>Here is the proposed change. Tests shows
significant speed improvement (30%) when sorting lists of random numbers,
and same efficiency for sorting already sorted lists (both ascending and
descending).<SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140523135523-27032017> </SPAN></FONT></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT
face=Calibri><FONT color=#000080><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140523135523-27032017></SPAN></FONT></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT
color=#000080 face=Calibri>Thanks,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT
color=#000080 face=Calibri></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT
color=#000080 face=Calibri>greg</FONT></SPAN></DIV></DIV><BR>
<DIV lang=en-us
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140OutlookMessageHeader
dir=ltr align=left>
<HR>
<FONT size=2 face=Tahoma><B>From:</B> <A
href="mailto:siddhanathan@gmail.com"
target=_blank>siddhanathan@gmail.com</A> [mailto:<A
href="mailto:siddhanathan@gmail.com"
target=_blank>siddhanathan@gmail.com</A><WBR>] <B>On Behalf Of
</B>Siddhanathan Shanmugam<BR><B>Sent:</B> Monday, March 27, 2017 6:53
PM<BR><B>To:</B> Gregory Popovitch<BR><B>Subject:</B> RE: Proposal: a new
implementation for Data.List.sort and Data.List.sortBy, which has better
performance characteristics and is more
laziness-friendly.<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV>
<DIV>Since I don't see any regressions, this doesn't really need CLC
approval. The changes are also small enough that a Github PR may be
accepted (otherwise, the change goes in via Phabricator).<BR></DIV>
<DIV><BR></DIV>
<DIV>Are you interested in implementing this patch? If yes, a standard
Github PR should be fine. Right now gSort is a three line change I think.
It will be changed in ghc/libraries/base/Data/OldLis<WBR>t.hs on the
ghc/ghc repo on Github.</DIV>
<DIV><BR></DIV>
<DIV><SPAN style="FONT-FAMILY: sans-serif">I'm hoping for some more
comments from other Haskellers, before pushing for this change in base. I
feel like we may be missing a potential optimization that someone else
might spot. So probably going to wait a few
days. </SPAN><BR></DIV><BR>
<DIV class=gmail_extra><BR>
<DIV class=gmail_quote>On Mar 27, 2017 11:43 AM, "Gregory Popovitch"
<<A href="mailto:greg7mdp@gmail.com"
target=_blank>greg7mdp@gmail.com</A>> wrote:<BR type="attribution">
<BLOCKQUOTE
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140quote
style="PADDING-LEFT: 1ex; BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex"><U></U>
<DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT
color=#000080 face=Calibri>Hi Sid,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT
color=#000080 face=Calibri></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT
color=#000080 face=Calibri>Thanks, glad you looked into that. My
understanding of the Haskell execution model is really poor, so
I can't say one way or the other, but I felt that laziness
ought to be considered as well, and I'm glad it was
:-)</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT
color=#000080 face=Calibri></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT
color=#000080 face=Calibri>So in conclusion it looks like we
have a winner with your latest gSortBy version. How do we get this
pushed to the GHC library?</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT
color=#000080 face=Calibri></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT
color=#000080 face=Calibri>Thanks,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT
color=#000080 face=Calibri></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT
color=#000080 face=Calibri>greg</FONT></SPAN></DIV><BR>
<DIV lang=en-us
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236OutlookMessageHeader
dir=ltr align=left>
<HR>
<FONT size=2 face=Tahoma>
<DIV
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140quoted-text><B>From:</B>
<A href="mailto:siddhanathan@gmail.com"
target=_blank>siddhanathan@gmail.com</A> [mailto:<A
href="mailto:siddhanathan@gmail.com"
target=_blank>siddhanathan@gmail.com</A><WBR>] <B>On Behalf Of
</B>Siddhanathan Shanmugam<BR></DIV><B>Sent:</B> Monday, March 27, 2017
2:12 PM<BR><B>To:</B> Gregory Popovitch
<DIV
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140elided-text><BR><B>Subject:</B>
Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is
more laziness-friendly.<BR></DIV></FONT><BR></DIV>
<DIV
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140elided-text>
<DIV></DIV>
<DIV dir=ltr>Hi Greg,<BR>
<DIV class=gmail_extra><BR>
<DIV class=gmail_quote>On Mon, Mar 27, 2017 at 10:19 AM, Gregory
Popovitch <SPAN dir=ltr><<A href="mailto:greg7mdp@gmail.com"
target=_blank>greg7mdp@gmail.com</A>></SPAN> wrote:<BR>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; BORDER-LEFT: rgb(204,204,204) 1px solid; MARGIN: 0px 0px 0px 0.8ex"><U></U>
<DIV>
<DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017>Unfortunately,
this optimization makes the sort less lazy, so doing something
like:</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017>take 4 $
sort l</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017>requires
more sorting of the list l with this change. I'm not sure it is a good
tradeoff.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017>This
can be verified with: <A
href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs"
target=_blank>https://github.com/greg7mdp/gh<WBR>c-sort/blob/master/src/sort_wi<WBR>th_trace.hs</A></SPAN></FONT></DIV></DIV></BLOCKQUOTE>
<DIV><BR></DIV>
<DIV>I think you're running without optimizations turned on. It is lazy
in my case.</DIV>
<DIV><BR></DIV>
<DIV>Also, the difference should be negligible (if any at all). Here's
an example of the list being sorted:</DIV>
<DIV><BR></DIV>
<DIV>
<DIV>[11,4,6,8,2,5,1,7,9,55,11,3]</DIV></DIV>
<DIV>...</DIV>
<DIV>[[4,11],[6,8],[2,5],[1,7,9,55]<WBR>,[3,11],[]]<BR></DIV>
<DIV>...</DIV>
<DIV>
<DIV>[[1,2,4,5,6,7,8,9,11,55],[3,11<WBR>]]</DIV>
<DIV> * 1 3</DIV>
<DIV> * 2 3</DIV>
<DIV> * 4 3</DIV>
<DIV> * 4 11</DIV>
<DIV>[1,2,3,4]</DIV></DIV>
<DIV><BR></DIV>
<DIV>The number of operations saved is only in the last merge. It's only
lazy at this step.</DIV>
<DIV><BR></DIV>
<DIV>So we save at most one traversal of the list, which is not too
expensive since our worst case bounds is O(n log n) anyway.</DIV>
<DIV><BR></DIV>
<DIV>This should mean that the asymptotic performance should be
identical, regardless of the number of comparisons saved. Of course, you
do get better constants, but I would be surprised if those constants
translated to significantly better performance for a reasonable size
list.</DIV>
<DIV> </DIV>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; BORDER-LEFT: rgb(204,204,204) 1px solid; MARGIN: 0px 0px 0px 0.8ex">
<DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017><FONT
color=#000080 face=Calibri></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017> <FONT
color=#000080 face=Calibri>I do agree that it would be nice to have a
more serious validation test suite.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017><FONT
color=#000080 face=Calibri></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017><FONT
color=#000080 face=Calibri>Thanks,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017><FONT
color=#000080 face=Calibri></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017><FONT
color=#000080 face=Calibri>greg</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017></SPAN> </DIV>
<DIV dir=ltr align=left>
<HR>
</DIV>
<DIV dir=ltr align=left><FONT size=2 face=Tahoma><SPAN
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-><B>From:</B>
<A href="mailto:siddhanathan@gmail.com"
target=_blank>siddhanathan@gmail.com</A> [mailto:<A
href="mailto:siddhanathan@gmail.com"
target=_blank>siddhanathan@gmail.com</A><WBR>] <B>On Behalf Of
</B>Siddhanathan Shanmugam<BR></SPAN><B>Sent:</B> Monday, March 27,
2017 12:53 PM
<DIV>
<DIV
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-h5><BR><B>To:</B>
Gregory Popovitch<BR><B>Cc:</B> Haskell Libraries<BR><B>Subject:</B>
Re: Proposal: a new implementation for Data.List.sort and
Data.List.sortBy, which has better performance characteristics and is
more laziness-friendly.<BR></DIV></DIV></FONT><BR></DIV>
<DIV>
<DIV
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-h5>
<DIV></DIV>
<DIV dir=ltr>
<DIV>We can improve things a bit further by forcing evaluation (with
seq) along the way appropriately.<BR></DIV>
<DIV><BR></DIV>
<DIV><BR></DIV>
<DIV><BR></DIV>
<DIV>
<DIV><SPAN style="FONT-FAMILY: monospace,monospace">gregSortBy cmp []
= []</SPAN><BR></DIV>
<DIV><FONT face="monospace, monospace">gregSortBy cmp xs = head $
until (null.tail) reduce (pair xs)</FONT></DIV>
<DIV><FONT face="monospace, monospace"> where</FONT></DIV>
<DIV><FONT face="monospace, monospace"> pair (x:y:t) | x
`cmp` y == GT = [y, x] : pair t</FONT></DIV>
<DIV><FONT face="monospace, monospace">
| otherwise
= [x, y] : pair t</FONT></DIV>
<DIV><FONT face="monospace, monospace"> pair [x] =
[[x]]</FONT></DIV>
<DIV><FONT face="monospace, monospace"> pair [] =
[]</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"> reduce
(v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t</FONT></DIV>
<DIV><FONT face="monospace, monospace">
where v'
= merge v w `seq` merge v w</FONT></DIV>
<DIV><FONT face="monospace, monospace">
x' = merge x y `seq` merge x y</FONT></DIV>
<DIV><FONT face="monospace, monospace">
</FONT></DIV>
<DIV><FONT face="monospace, monospace"> reduce (x:y:t) =
merge x y `seq` merge x y : reduce t</FONT></DIV>
<DIV><FONT face="monospace, monospace"> reduce xs
= xs</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"> merge xs []
= xs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> merge [] ys
= ys</FONT></DIV>
<DIV><FONT face="monospace, monospace"> merge xs@(x:xs')
ys@(y:ys') </FONT></DIV>
<DIV><FONT face="monospace, monospace">
| x `cmp` y == GT = y : merge xs ys'</FONT></DIV>
<DIV><FONT face="monospace, monospace">
| otherwise = x : merge xs'
ys</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><SPAN style="FONT-FAMILY: monospace,monospace">gSortBy cmp =
mergeAll . sequences</SPAN><BR></DIV>
<DIV><FONT face="monospace, monospace"> where</FONT></DIV>
<DIV><FONT face="monospace, monospace"> sequences
(a:b:xs)</FONT></DIV>
<DIV><FONT face="monospace, monospace"> | a `cmp`
b == GT = descending b [a] xs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> |
otherwise = ascending b (a:)
xs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> sequences xs =
[xs]</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"> descending a as
(b:bs)</FONT></DIV>
<DIV><FONT face="monospace, monospace"> | a `cmp`
b == GT = descending b (a:as) bs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> descending a as
bs = (a:as) `seq` (a:as) : sequences bs</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"> ascending a as
(b:bs)</FONT></DIV>
<DIV><FONT face="monospace, monospace"> | a `cmp`
b /= GT = ascending b (as . (a:)) bs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> ascending a as bs
= as [a] `seq` as [a] : sequences bs</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"> mergeAll [x] =
x</FONT></DIV>
<DIV><FONT face="monospace, monospace"> mergeAll xs
= mergeAll (mergePairs xs)</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"> mergePairs
(a:b:xs) = merge a b `seq` merge a b : mergePairs xs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> mergePairs xs
= xs</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"> merge as@(a:as')
bs@(b:bs')</FONT></DIV>
<DIV><FONT face="monospace, monospace"> | a `cmp`
b == GT = b : merge as bs'</FONT></DIV>
<DIV><FONT face="monospace, monospace"> |
otherwise = a : merge as' bs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> merge [] bs
= bs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> merge as []
= as</FONT></DIV></DIV>
<DIV><BR></DIV>
<DIV><BR></DIV>
<DIV><BR></DIV>
<DIV><BR></DIV>
<DIV><B>Before the change:</B></DIV>
<DIV><BR></DIV>
<DIV>
<DIV><FONT face="monospace, monospace">benchmarking random
ints/ghc</FONT></DIV>
<DIV><FONT face="monospace, monospace">time
3.687 s (3.541 s ..
NaN s)</FONT></DIV>
<DIV><FONT face="monospace, monospace">
1.000 Rē (1.000
Rē .. 1.000 Rē)</FONT></DIV>
<DIV><FONT face="monospace, monospace">mean
3.691 s (3.669 s ..
3.705 s)</FONT></DIV>
<DIV><FONT face="monospace, monospace">std dev
21.45 ms (0.0 s .. 24.76
ms)</FONT></DIV>
<DIV><FONT face="monospace, monospace">variance introduced by
outliers: 19% (moderately inflated)</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace">benchmarking random
ints/greg</FONT></DIV>
<DIV><FONT face="monospace, monospace">time
2.648 s (2.482 s ..
2.822 s)</FONT></DIV>
<DIV><FONT face="monospace, monospace">
0.999 Rē (0.998
Rē .. 1.000 Rē)</FONT></DIV>
<DIV><FONT face="monospace, monospace">mean
2.704 s (2.670 s ..
2.736 s)</FONT></DIV>
<DIV><FONT face="monospace, monospace">std dev
52.68 ms (0.0 s .. 54.49
ms)</FONT></DIV>
<DIV><FONT face="monospace, monospace">variance introduced by
outliers: 19% (moderately inflated)</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace">benchmarking random
ints/gSort</FONT></DIV>
<DIV><FONT face="monospace, monospace">time
2.733 s (2.682 s ..
2.758 s)</FONT></DIV>
<DIV><FONT face="monospace, monospace">
1.000 Rē (1.000
Rē .. 1.000 Rē)</FONT></DIV>
<DIV><FONT face="monospace, monospace">mean
2.707 s (2.689 s ..
2.718 s)</FONT></DIV>
<DIV><FONT face="monospace, monospace">std dev
16.84 ms (0.0 s .. 19.20
ms)</FONT></DIV>
<DIV><FONT face="monospace, monospace">variance introduced by
outliers: 19% (moderately inflated)</FONT></DIV></DIV>
<DIV><BR></DIV>
<DIV><B>After the change:</B></DIV>
<DIV><BR></DIV>
<DIV>
<DIV><FONT face="monospace, monospace">benchmarking random
ints/greg</FONT></DIV>
<DIV><FONT face="monospace, monospace">time
2.576 s (2.548 s ..
2.628 s)</FONT></DIV>
<DIV><FONT face="monospace, monospace">
1.000 Rē (1.000
Rē .. 1.000 Rē)</FONT></DIV>
<DIV><FONT face="monospace, monospace">mean
2.590 s (2.578 s ..
2.599 s)</FONT></DIV>
<DIV><FONT face="monospace, monospace">std dev
12.99 ms (0.0 s .. 14.89
ms)</FONT></DIV>
<DIV><FONT face="monospace, monospace">variance introduced by
outliers: 19% (moderately inflated)</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace">benchmarking random
ints/gSort</FONT></DIV>
<DIV><FONT face="monospace, monospace">time
2.538 s (2.412 s ..
2.627 s)</FONT></DIV>
<DIV><FONT face="monospace, monospace">
1.000 Rē (0.999
Rē .. 1.000 Rē)</FONT></DIV>
<DIV><FONT face="monospace, monospace">mean
2.543 s (2.517 s ..
2.560 s)</FONT></DIV>
<DIV><FONT face="monospace, monospace">std dev
26.16 ms (0.0 s .. 30.21
ms)</FONT></DIV>
<DIV><FONT face="monospace, monospace">variance introduced by
outliers: 19% (moderately inflated)</FONT></DIV></DIV>
<DIV><BR></DIV>
<DIV><BR></DIV>
<DIV><BR></DIV>
<DIV><BR></DIV></DIV>
<DIV class=gmail_extra><BR>
<DIV class=gmail_quote>On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan
Shanmugam <SPAN dir=ltr><<A
href="mailto:siddhanathan+eml@gmail.com"
target=_blank>siddhanathan+eml@gmail.com</A>></SPAN> wrote:<BR>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; BORDER-LEFT: rgb(204,204,204) 1px solid; MARGIN: 0px 0px 0px 0.8ex">
<DIV dir=ltr>
<DIV>
<DIV>Theoretically, we could do better. We currently only exploit
monotonic runs in merge sort, but we could also exploit bitonic
runs:</DIV>
<DIV><BR></DIV>
<DIV>
<DIV><FONT face="monospace, monospace"> dlist as = as
[] `seq` as []<BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"> sequences [] =
[[]]</FONT></DIV>
<DIV><FONT face="monospace, monospace"> sequences [a] =
[[a]]</FONT></DIV>
<DIV><FONT face="monospace, monospace"> sequences
(a:xs) = bitonic a a (a:) xs</FONT></DIV>
<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>
<DIV><FONT face="monospace, monospace"> bitonic min max
as (b:bs)</FONT></DIV>
<DIV><FONT face="monospace, monospace"> | b
`cmp` max /= LT = bitonic min b (as . (b:)) bs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> | b
`cmp` min /= GT = bitonic b max ((b:) . as) bs</FONT></DIV>
<DIV><FONT face="monospace, monospace"> |
otherwise = dlist as : sequences (b:bs)</FONT></DIV>
<DIV><FONT face="monospace, monospace"> bitonic _ _ as
[] = [dlist as]</FONT></DIV>
<DIV><BR></DIV></DIV></DIV>
<DIV><BR></DIV>
<DIV>The constant factors here might be too high to notice the
difference though.</DIV><SPAN>
<DIV><BR></DIV>
<DIV><SPAN style="FONT-SIZE: 12px"><BR></SPAN></DIV>
<DIV><SPAN style="FONT-SIZE: 12px">> However, still my version is
more laziness-friendly, i.e. it requires fewer</SPAN><BR></DIV>
<DIV><SPAN style="FONT-SIZE: 12px">> comparisons to get
the</SPAN><BR style="FONT-SIZE: 12px"><SPAN
style="FONT-SIZE: 12px">> N smallest elements of a
list </SPAN><SPAN
style="FONT-SIZE: 12px">(see</SPAN><BR></DIV>> <A
style="FONT-SIZE: 12px"
href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs"
rel=noreferrer
target=_blank>https://github.com/greg7mdp/<WBR>ghc-sort/blob/master/src/sort_<WBR>with_trace.hs</A><SPAN
style="FONT-SIZE: 12px">).</SPAN>
<DIV><SPAN style="FONT-SIZE: 12px">></SPAN></DIV>
<DIV><SPAN style="FONT-SIZE: 12px">> I wonder if this might not
be a more useful trait than being able to sort</SPAN><BR></DIV>
<DIV><SPAN style="FONT-SIZE: 12px">> already sorted lists super
fast.</SPAN><BR></DIV>
<DIV><SPAN style="FONT-SIZE: 12px"><BR></SPAN></DIV></SPAN>
<DIV><SPAN style="FONT-SIZE: 12px">This comes down to a discussion
of merge sort vs natural merge sort.</SPAN></DIV>
<DIV><BR></DIV>
<DIV>Data.List.sort is an implementation of a variant of merge sort
called natural merge sort. The algorithm is linearithmic in the
worst case, but linear in the best case (already sorted list).</DIV>
<DIV><BR></DIV>
<DIV><BR></DIV></DIV>
<DIV
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618HOEnZb>
<DIV
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618h5>
<DIV class=gmail_extra><BR>
<DIV class=gmail_quote>On Sun, Mar 26, 2017 at 10:47 AM, Gregory
Popovitch <SPAN dir=ltr><<A href="mailto:greg7mdp@gmail.com"
target=_blank>greg7mdp@gmail.com</A>></SPAN> wrote:<BR>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; BORDER-LEFT: rgb(204,204,204) 1px solid; MARGIN: 0px 0px 0px 0.8ex">Thanks
again @Siddhanathan! Looks like your gSort fixes the main issue
with<BR>Data.List.sort().<BR><BR>I have updated the test programs
in <A href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer
target=_blank>https://github.com/greg7mdp/gh<WBR>c-sort</A>
to<BR>include your new version.<BR><BR>Here are the results (your
new version looks like a definite improvement vs<BR>the current
GHC one):<BR><BR>input
GHC sort
My Orig proposal
gSort<BR>------------------------------<WBR>------------------------------<WBR>----------------<BR>---<BR>sorted
ints (ascending) 151
456
148<BR>sorted ints
(descending) 152
466
155<BR>random ints
2732
2006
2004<BR>random strings
6564
5549
5528<BR><BR><BR>So replacing the current GHC
version with gSort is a no brainer, as it is<BR>better in all
regards.<BR><BR>However, still my version is more
laziness-friendly, i.e. it requires fewer<BR>comparisons to get
the<BR>N smallest elements of a list (see<BR><A
href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs"
rel=noreferrer
target=_blank>https://github.com/greg7mdp/gh<WBR>c-sort/blob/master/src/sort_wi<WBR>th_trace.hs</A>).<BR><BR>I
wonder if this might not be a more useful trait than being able to
sort<BR>already sorted lists super
fast.<BR><SPAN><BR>Thanks,<BR><BR>greg<BR><BR>______________________________<WBR>__<BR><BR>From:
<A href="mailto:siddhanathan@gmail.com"
target=_blank>siddhanathan@gmail.com</A> [mailto:<A
href="mailto:siddhanathan@gmail.com"
target=_blank>siddhanathan@gmail.com</A><WBR>] On Behalf
Of<BR>Siddhanathan Shanmugam<BR></SPAN>Sent: Sunday, March 26,
2017 1:05 PM<BR><SPAN>To: Gregory Popovitch<BR>Cc: Haskell
Libraries<BR>Subject: Re: Proposal: a new implementation for
Data.List.sort and<BR>Data.List.sortBy, which has better
performance characteristics and is
more<BR>laziness-friendly.<BR><BR><BR></SPAN>
<DIV>
<DIV
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618m_6283889194055629001h5>Interesting.
You are right, performance for sorting random lists
has<BR>priority over performance for sorting already-sorted
lists.<BR><BR>Ignore the numbers for my previous version. Can you
compare GHC's sort, your<BR>proposal, and gSort
below?<BR><BR><BR>gSort :: Ord a => [a] -> [a]<BR>gSort =
gSortBy compare<BR>gSortBy cmp = mergeAll . sequences<BR>
where<BR> sequences (a:b:xs)<BR>
| a `cmp` b == GT = descending b [a] xs<BR>
| otherwise = ascending b
(a:) xs<BR> sequences xs = [xs]<BR><BR><BR>
descending a as (b:bs)<BR> | a `cmp` b
== GT = descending b (a:as) bs<BR> descending a as
bs = (a:as) : sequences bs<BR><BR><BR>
ascending a as (b:bs)<BR> | a `cmp` b /= GT =
ascending b (\ys -> as (a:ys)) bs<BR> ascending a
as bs = as [a] `seq` as [a] : sequences
bs<BR><BR><BR> mergeAll [x] = x<BR>
mergeAll xs = mergeAll (mergePairs xs)<BR><BR><BR>
mergePairs (a:b:xs) = merge a b : mergePairs xs<BR>
mergePairs xs =
xs<BR><BR><BR> merge as@(a:as') bs@(b:bs')<BR>
| a `cmp` b == GT = b : merge as bs'<BR>
| otherwise = a : merge
as' bs<BR> merge [] bs
= bs<BR> merge as []
= as<BR><BR><BR>Thanks,<BR>Sid<BR><BR><BR>On Sun, Mar 26,
2017 at 9:19 AM, Gregory Popovitch <<A
href="mailto:greg7mdp@gmail.com"
target=_blank>greg7mdp@gmail.com</A>><BR>wrote:<BR><BR><BR>
Thank you @Siddhanathan! I welcome any
improvement you may make, as<BR>I said I<BR>
am very far from a Haskell expert.<BR><BR>
I just tested your change with my test
project<BR> (<A
href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer
target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort</A><BR></DIV></DIV><<A
href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer
target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort</A>>
)<BR>
<DIV>
<DIV
class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618m_6283889194055629001h5>
and here are my results (mean times in
ms):<BR><BR> input
GHC
sort Orig
proposal<BR>your<BR>
change<BR><BR>------------------------------<WBR>------------------------------<WBR>----------------<BR>
---<BR> sorted
ints (ascending) 153
467<BR>139<BR>
sorted ints (descending)
152
472<BR>599<BR> random ints
2824
2077<BR>2126<BR>
random strings
6564
5613<BR>5983<BR><BR> Your change
is a definite improvement for sorted integers
in<BR>ascending<BR> order, but is worse
for other cases.<BR><BR> Is there a
real need to optimize the sort for already sorted list?<BR>Of
course<BR> it should not be a
degenerate<BR> case and take longer
than sorting random numbers, but this is not<BR>the case<BR>
here. Sorting already sorted<BR>
lists is, even with my version, over 4 times faster
than sorting<BR>random<BR> lists. This
sounds perfectly<BR> acceptable to me,
and I feel that trying to optimize this specific<BR>case<BR>
further, if it comes at the<BR>
detriment of the general case, is not
desirable.<BR><BR>
Thanks,<BR><BR> greg<BR><BR>
______________________________<WBR>__<BR><BR>
From: <A href="mailto:siddhanathan@gmail.com"
target=_blank>siddhanathan@gmail.com</A> [mailto:<A
href="mailto:siddhanathan@gmail.com"
target=_blank>siddhanathan@gmail.com</A><WBR>] On<BR>Behalf
Of<BR> Siddhanathan Shanmugam<BR>
Sent: Sunday, March 26, 2017 11:41
AM<BR> To: Gregory Popovitch<BR>
Cc: Haskell Libraries<BR>
Subject: Re: Proposal: a new implementation for
Data.List.sort and<BR>
Data.List.sortBy, which has better performance characteristics
and<BR>is more<BR>
laziness-friendly.<BR><BR><BR><BR>
Thank you! This identifies a space leak in base which went
unnoticed<BR>for 7<BR>
years.<BR><BR> Your implementation can
be improved further. Instead of splitting<BR>into<BR>
pairs, you could instead split into lists of sorted
sublists by<BR>replacing<BR> the pairs
function with the following<BR><BR>
pair = foldr f []<BR>
where<BR>
f x [] = [[x]]<BR>
f x (y:ys)<BR>
| x `cmp` head y == LT =
(x:y):ys<BR>
| otherwise
= [x]:y:ys<BR><BR> This should give you
the same performance improvements for sorting<BR>random<BR>
lists, but better performance while sorting
ascending lists.<BR><BR> The version in
base takes it one step further by using a DList to<BR>handle
the<BR> descending case efficiently as
well, except there's a space leak<BR>right now<BR>
because of which it is slower.<BR><BR>
On Sun, Mar 26, 2017 at 7:21 AM, Gregory
Popovitch<BR><<A href="mailto:greg7mdp@gmail.com"
target=_blank>greg7mdp@gmail.com</A>><BR>
wrote:<BR><BR><BR><BR>
Motivation:<BR>
----------<BR><BR>
Data.List.sort is a very
important functionality in Haskell.<BR>I<BR>
believe that<BR>
the proposed implementation is:<BR><BR>
- significantly faster
than the current implementation on<BR>unsorted<BR>
lists,<BR>
typically 14% to 27% faster<BR>
- more laziness-friendly,
i.e.:<BR>
take 3 $ sort l<BR>
will require significantly less
comparisons than the<BR>current<BR>
implementation<BR><BR>
Proposed
Implementation<BR>
-----------------------<BR><BR>
sort :: (Ord a) => [a] ->
[a]<BR>
sort = sortBy compare<BR><BR>
sortBy cmp [] = []<BR>
sortBy cmp xs = head $
until (null.tail) reduce (pair xs)<BR>
where<BR>
pair (x:y:t) | x
`cmp` y == GT = [y, x] : pair t<BR>
| otherwise
= [x, y] : pair t<BR>
pair [x] = [[x]]<BR>
pair [] = []<BR><BR>
reduce (v:w:x:y:t) = merge v'
x' : reduce t<BR>
where v' = merge v w<BR>
x' = merge x y<BR><BR>
reduce (x:y:t) = merge x y : reduce t<BR>
reduce xs
= xs<BR><BR>
merge xs []
= xs<BR>
merge [] ys
= ys<BR>
merge xs@(x:xs')
ys@(y:ys')<BR>
| x `cmp` y == GT =
y : merge xs ys'<BR>
|
otherwise = x : merge xs'
ys<BR><BR><BR>
Effect and Interactions<BR>
-----------------------<BR><BR>
I have a stack
project with a criterion test for this new<BR>
implementation,<BR>
available at <A
href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer
target=_blank>https://github.com/greg7mdp/gh<WBR>c-sort</A><BR><<A
href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer
target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort</A>><BR><BR>
<<A
href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer
target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort</A><BR><<A
href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer
target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort</A>>
> .<BR>
I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and<BR>had
the<BR> following<BR>
results:<BR><BR>
- sorting of
random lists of integers is 27% faster<BR>
- sorting of random lists of
strings is 14% faster<BR>
- sorting of already sorted lists is significantly
slower,<BR>but still<BR> much<BR>
faster than
sorting random lists<BR>
- proposed version is more laziness friendly. For
example<BR>this<BR> version
of<BR>
sortBy requires 11 comparisons to find<BR>
the smallest element of
a 15 element list, while the<BR>default<BR>
Data.List.sortBy requires 15
comparisons.<BR>
(see<BR><BR><BR><A
href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs"
rel=noreferrer
target=_blank>https://github.com/greg7mdp/gh<WBR>c-sort/blob/master/src/sort_wi<WBR>th_trace.hs</A><BR><<A
href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs"
rel=noreferrer
target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort/blob/master/src/sort_w<WBR>ith_trace.hs</A>><BR><BR><<A
href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs"
rel=noreferrer
target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort/blob/master/src/sort_w<WBR>ith_trace.hs</A><BR><<A
href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs"
rel=noreferrer
target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort/blob/master/src/sort_w<WBR>ith_trace.hs</A>>
><BR>)<BR><BR><BR><BR>
Test results<BR>
------------<BR><BR>
Criterion output
(descending/ascending results are for<BR>already<BR>
sorted<BR>
lists).<BR>
I barely understand what Criterion does, and I am
puzzled<BR>with the<BR>
various<BR>
"T" output - maybe there is a bug in my bench code:<BR><BR>
vagrant@vagrant-ubuntu-trusty-<WBR>64:/vagrant$ stack
exec<BR>ghc-sort<BR>
benchmarking ascending ints/ghc<BR>
TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT<WBR>TTTTTTTtime<BR>160.6
ms<BR> (153.4<BR>
ms .. 167.8 ms)<BR>
0.997 Rē (0.986 Rē .. 1.000 Rē)<BR>
mean
161.7 ms
(158.3 ms .. 165.9 ms)<BR>
std dev
5.210 ms (3.193 ms .. 7.006
ms)<BR>
variance introduced by outliers: 12% (moderately
inflated)<BR><BR>
benchmarking ascending ints/greg<BR>
TTTTTTTTTTTTTTTTtime
473.8
ms (398.6 ms ..<BR>554.9<BR>
ms)<BR>
0.996 Rē (0.987 Rē .. 1.000
Rē)<BR>
mean
466.2 ms (449.0 ms .. 475.0 ms)<BR>
std dev
14.94 ms (0.0 s ..
15.29 ms)<BR>
variance introduced by outliers: 19% (moderately
inflated)<BR><BR>
benchmarking descending ints/ghc<BR>
TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT<WBR>TTTTTTTtime<BR>165.1
ms<BR> (148.2<BR>
ms .. 178.2 ms)<BR>
0.991 Rē (0.957 Rē .. 1.000 Rē)<BR>
mean
158.7 ms
(154.0 ms .. 164.3 ms)<BR>
std dev
7.075 ms (4.152 ms .. 9.903
ms)<BR>
variance introduced by outliers: 12% (moderately
inflated)<BR><BR>
benchmarking descending ints/greg<BR>
TTTTTTTTTTTTTTTTtime
471.7
ms (419.8 ms ..<BR>508.3<BR>
ms)<BR>
0.999 Rē (0.995 Rē .. 1.000
Rē)<BR>
mean
476.0 ms (467.5 ms .. 480.0 ms)<BR>
std dev
7.447 ms (67.99 as
.. 7.865 ms)<BR>
variance introduced by outliers: 19% (moderately
inflated)<BR><BR>
benchmarking random ints/ghc<BR>
TTTTTTTTTTTTTTTTtime
2.852 s
(2.564 s ..<BR>3.019 s)<BR>
0.999 Rē
(0.997 Rē .. 1.000 Rē)<BR>
mean
2.812 s (2.785 s .. 2.838
s)<BR> std
dev 44.06 ms
(543.9 as .. 44.97 ms)<BR>
variance introduced by outliers: 19%
(moderately inflated)<BR><BR>
benchmarking random ints/greg<BR>
TTTTTTTTTTTTTTTTtime
2.032 s (1.993 s ..<BR>2.076
s)<BR>
1.000 Rē (1.000 Rē .. 1.000 Rē)<BR>
mean
2.028 s
(2.019 s .. 2.033 s)<BR>
std dev
7.832 ms (0.0 s .. 8.178 ms)<BR>
variance
introduced by outliers: 19% (moderately inflated)<BR><BR>
benchmarking
shakespeare/ghc<BR>
TTTTTTTTTTTTTTTTtime
6.504 s (6.391 s
..<BR>6.694 s)<BR>
1.000 Rē (1.000 Rē .. 1.000
Rē)<BR>
mean
6.499 s (6.468 s .. 6.518 s)<BR>
std dev
28.85 ms (0.0 s ..
32.62 ms)<BR>
variance introduced by outliers: 19% (moderately
inflated)<BR><BR>
benchmarking shakespeare/greg<BR>
TTTTTTTTTTTTTTTTtime
5.560
s (5.307 s ..<BR>5.763 s)<BR>
1.000 Rē
(0.999 Rē .. 1.000 Rē)<BR>
mean
5.582 s (5.537 s .. 5.607
s)<BR> std
dev 39.30 ms
(0.0 s .. 43.49 ms)<BR>
variance introduced by outliers: 19%
(moderately inflated)<BR><BR><BR>
Costs and Drawbacks<BR>
-------------------<BR><BR>
The only cost I see is the reduced
performance when sorting<BR>already<BR>
sorted<BR>
lists. However, since this remains quite efficient, indeed<BR>over
4<BR> times<BR>
faster than sorting unsorted
lists, I think it is an<BR>acceptable<BR>
tradeoff.<BR><BR>
Final note<BR>
----------<BR><BR>
My Haskell is very rusty. I worked on
this a couple years<BR>ago when I<BR>
was<BR>
learning Haskell, and meant to propose it to the
Haskell<BR>community,<BR> but<BR>
never got to it
at the time.<BR><BR>
______________________________<WBR>_________________<BR>
Libraries mailing
list<BR> <A
href="mailto:Libraries@haskell.org"
target=_blank>Libraries@haskell.org</A><BR>
<A
href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries"
rel=noreferrer
target=_blank>http://mail.haskell.org/cgi-bi<WBR>n/mailman/listinfo/libraries</A><BR><<A
href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries"
rel=noreferrer
target=_blank>http://mail.haskell.org/cgi-b<WBR>in/mailman/listinfo/libraries</A>><BR><BR></DIV></DIV>
<<A
href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries"
rel=noreferrer
target=_blank>http://mail.haskell.org/cgi-b<WBR>in/mailman/listinfo/libraries</A><BR><<A
href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries"
rel=noreferrer
target=_blank>http://mail.haskell.org/cgi-b<WBR>in/mailman/listinfo/libraries</A>>
><BR><BR><BR><BR><BR><BR><BR><BR><BR></BLOCKQUOTE></DIV><BR></DIV></DIV></DIV></BLOCKQUOTE></DIV><BR></DIV></DIV></DIV></DIV></BLOCKQUOTE></DIV><BR></DIV></DIV></DIV></DIV></BLOCKQUOTE></DIV><BR></DIV></DIV></DIV><BR>______________________________<WBR>_________________<BR>Libraries
mailing list<BR><A href="mailto:Libraries@haskell.org"
target=_blank>Libraries@haskell.org</A><BR><A
href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries"
rel=noreferrer
target=_blank>http://mail.haskell.org/cgi-bi<WBR>n/mailman/listinfo/libraries</A><BR><BR></BLOCKQUOTE></DIV></DIV></DIV></DIV><BR>______________________________<WBR>_________________<BR>Libraries
mailing list<BR><A href="mailto:Libraries@haskell.org"
target=_blank>Libraries@haskell.org</A><BR><A
href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries"
rel=noreferrer
target=_blank>http://mail.haskell.org/cgi-bi<WBR>n/mailman/listinfo/libraries</A><BR><BR></BLOCKQUOTE></DIV><BR></DIV></DIV></DIV><BR>______________________________<WBR>_________________<BR>Libraries
mailing list<BR><A
href="mailto:Libraries@haskell.org">Libraries@haskell.org</A><BR><A
href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries"
rel=noreferrer
target=_blank>http://mail.haskell.org/cgi-<WBR>bin/mailman/listinfo/libraries</A><BR><BR></BLOCKQUOTE></DIV><BR></DIV></DIV></BODY></HTML>