<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META content="text/html; charset=iso-8859-1" http-equiv=Content-Type>

<META name=GENERATOR content="MSHTML 11.00.10570.1001"></HEAD>

<BODY>

<DIV dir=ltr align=left><FONT color=#000080 face=Calibri>

<DIV dir=ltr align=left><SPAN class=297205711-28032017><FONT color=#000080 

face=Calibri>Sid, this new version (diff below) is really fast when sorting 

random ints, but slower when sorting strings:</FONT></SPAN></DIV>

<DIV dir=ltr align=left><SPAN class=297205711-28032017><FONT color=#000080 

face=Calibri></FONT></SPAN> </DIV>

<DIV dir=ltr align=left><SPAN class=297205711-28032017><FONT color=#000080 

face="Courier New">input                        

GHC sort          Orig 

proposal        

gSort<BR>-------------------------------------------------------------------------<BR>sorted 

ints (ascending)      

151               

460                  

147<BR>sorted ints (descending)     

151               

467                  

171<BR>random 

ints                 

2771              

2010                 

1365<BR>random 

strings              

6542              

5524                 

5991</FONT></SPAN></DIV>

<DIV> </DIV>

<DIV><FONT color=#000080 face=Calibri></FONT> </DIV>

<DIV><FONT color=#000080 face=Calibri><IMG 

src="cid:165160212@28032017-0395"></FONT></DIV>

<DIV><FONT color=#000080 face=Calibri></FONT> </DIV>

<DIV><SPAN class=297205711-28032017><FONT color=#000080 

face=Calibri>Thanks,</FONT></SPAN></DIV>

<DIV><SPAN class=297205711-28032017><FONT color=#000080 

face=Calibri></FONT></SPAN> </DIV>

<DIV><SPAN class=297205711-28032017><FONT color=#000080 

face=Calibri>greg</FONT></SPAN></DIV></FONT></DIV><BR>

<DIV lang=en-us class=OutlookMessageHeader dir=ltr align=left>

<HR tabIndex=-1>

<FONT size=2 face=Tahoma><B>From:</B> siddhanathan@gmail.com 

[mailto:siddhanathan@gmail.com] <B>On Behalf Of </B>Siddhanathan 

Shanmugam<BR><B>Sent:</B> Tuesday, March 28, 2017 2:45 AM<BR><B>To:</B> Dan 

Burton<BR><B>Cc:</B> Haskell Libraries; Gregory Popovitch<BR><B>Subject:</B> Re: 

fix for Data.List.sortBy<BR></FONT><BR></DIV>

<DIV></DIV>

<DIV dir=ltr>Turns out we don't need seq at all. A simple refactoring of the 

merge function does the trick equally well.

<DIV><BR></DIV>

<DIV>

<DIV><FONT face="monospace, monospace">    mergePairs (a:b:xs) = merge 

id a b : mergePairs xs</FONT></DIV>

<DIV><FONT face="monospace, monospace">    mergePairs xs     

  = xs</FONT></DIV>

<DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

<DIV><FONT face="monospace, monospace">    merge f as@(a:as') 

bs@(b:bs')</FONT></DIV>

<DIV><FONT face="monospace, monospace">      | a `cmp` b == GT = 

merge (f.(b:)) as  bs'</FONT></DIV>

<DIV><FONT face="monospace, monospace">      | otherwise   

    = merge (f.(a:)) as' bs</FONT></DIV>

<DIV><FONT face="monospace, monospace">    merge f [] bs     

  = f bs</FONT></DIV>

<DIV><FONT face="monospace, monospace">    merge f as []     

  = f as</FONT></DIV></DIV>

<DIV><BR></DIV>

<DIV>This variant is 10% faster in my tests.</DIV>

<DIV><BR></DIV>

<DIV class=gmail_extra><BR>

<DIV class=gmail_quote>On Mon, Mar 27, 2017 at 5:49 PM, Dan Burton <SPAN 

dir=ltr><<A href="mailto:danburton.email@gmail.com" 

target=_blank>danburton.email@gmail.com</A>></SPAN> wrote:<BR>

<BLOCKQUOTE class=gmail_quote 

style="PADDING-LEFT: 1ex; BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex">

  <DIV dir=ltr>Does this rely on Common Subexpression Elimination optimization 

  in order to work? Would it work more reliably if the `seq`-ed expression were 

  let-bound? </DIV></BLOCKQUOTE>

<DIV><BR></DIV>

<DIV>I don't think it relies heavily on CSE. The seq's are there to avoid a 

cascading series of thunk evaluations. Using let expressions doesn't seem to 

affect the benchmarks.</DIV>

<DIV> </DIV>

<BLOCKQUOTE class=gmail_quote 

style="PADDING-LEFT: 1ex; BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex">

  <DIV class=gmail_extra><SPAN class=HOEnZb><FONT color=#888888><BR clear=all>

  <DIV>

  <DIV class=m_4214469927093090448gmail_signature 

  data-smartmail="gmail_signature">-- Dan Burton</DIV></DIV></FONT></SPAN>

  <DIV>

  <DIV class=h5><BR>

  <DIV class=gmail_quote>On Mon, Mar 27, 2017 at 5:41 PM, David Feuer <SPAN 

  dir=ltr><<A href="mailto:david.feuer@gmail.com" 

  target=_blank>david.feuer@gmail.com</A>></SPAN> wrote:<BR>

  <BLOCKQUOTE class=gmail_quote 

  style="PADDING-LEFT: 1ex; BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex">

    <DIV>The first seq is useless: constructor application is never suspended. I 

    haven't had a chance to look at the rest yet.</DIV>

    <DIV class=m_4214469927093090448HOEnZb>

    <DIV class=m_4214469927093090448h5>

    <DIV class=gmail_extra><BR>

    <DIV class=gmail_quote>On Mar 27, 2017 7:59 PM, "Gregory Popovitch" <<A 

    href="mailto:greg7mdp@gmail.com" target=_blank>greg7mdp@gmail.com</A>> 

    wrote:<BR type="attribution">

    <BLOCKQUOTE class=gmail_quote 

    style="PADDING-LEFT: 1ex; BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex"><U></U>

      <DIV>

      <DIV dir=ltr align=left><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT 

      face=Calibri><FONT color=#000080><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140755023923-27032017>Sid, 

      </SPAN></FONT></FONT></SPAN></DIV>

      <DIV dir=ltr align=left>

      <DIV dir=ltr align=left><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT 

      face=Calibri><FONT color=#000080><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140755023923-27032017></SPAN></FONT></FONT></SPAN> </DIV>

      <DIV dir=ltr align=left><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT 

      face=Calibri><FONT color=#000080><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140755023923-27032017></SPAN>I'd 

      be delighted to submit<SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140755023923-27032017> 

      </SPAN>the patch, as long as I have permission (which I probably don't), 

      you feel confident about the change and maybe a couple of other people 

      agree.</FONT></FONT></SPAN></DIV>

      <DIV dir=ltr align=left><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT 

      color=#000080 face=Calibri></FONT></SPAN> </DIV>

      <DIV dir=ltr align=left><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT 

      face=Calibri><FONT color=#000080>Here is the proposed change. Tests shows 

      significant speed improvement (30%) when sorting lists of random numbers, 

      and same efficiency for sorting already sorted lists (both ascending and 

      descending).<SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140523135523-27032017> </SPAN></FONT></FONT></SPAN></DIV>

      <DIV dir=ltr align=left><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT 

      face=Calibri><FONT color=#000080><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140523135523-27032017></SPAN></FONT></FONT></SPAN> </DIV>

      <DIV dir=ltr align=left><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT 

      color=#000080 face=Calibri>Thanks,</FONT></SPAN></DIV>

      <DIV dir=ltr align=left><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT 

      color=#000080 face=Calibri></FONT></SPAN> </DIV>

      <DIV dir=ltr align=left><SPAN 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140146023323-27032017><FONT 

      color=#000080 face=Calibri>greg</FONT></SPAN></DIV></DIV><BR>

      <DIV lang=en-us 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140OutlookMessageHeader 

      dir=ltr align=left>

      <HR>

      <FONT size=2 face=Tahoma><B>From:</B> <A 

      href="mailto:siddhanathan@gmail.com" 

      target=_blank>siddhanathan@gmail.com</A> [mailto:<A 

      href="mailto:siddhanathan@gmail.com" 

      target=_blank>siddhanathan@gmail.com</A><WBR>] <B>On Behalf Of 

      </B>Siddhanathan Shanmugam<BR><B>Sent:</B> Monday, March 27, 2017 6:53 

      PM<BR><B>To:</B> Gregory Popovitch<BR><B>Subject:</B> RE: Proposal: a new 

      implementation for Data.List.sort and Data.List.sortBy, which has better 

      performance characteristics and is more 

      laziness-friendly.<BR></FONT><BR></DIV>

      <DIV></DIV>

      <DIV>

      <DIV>Since I don't see any regressions, this doesn't really need CLC 

      approval. The changes are also small enough that a Github PR may be 

      accepted (otherwise, the change goes in via Phabricator).<BR></DIV>

      <DIV><BR></DIV>

      <DIV>Are you interested in implementing this patch? If yes, a standard 

      Github PR should be fine. Right now gSort is a three line change I think. 

      It will be changed in ghc/libraries/base/Data/OldLis<WBR>t.hs on the 

      ghc/ghc repo on Github.</DIV>

      <DIV><BR></DIV>

      <DIV><SPAN style="FONT-FAMILY: sans-serif">I'm hoping for some more 

      comments from other Haskellers, before pushing for this change in base. I 

      feel like we may be missing a potential optimization that someone else 

      might spot. So probably going to wait a few 

      days. </SPAN><BR></DIV><BR>

      <DIV class=gmail_extra><BR>

      <DIV class=gmail_quote>On Mar 27, 2017 11:43 AM, "Gregory Popovitch" 

      <<A href="mailto:greg7mdp@gmail.com" 

      target=_blank>greg7mdp@gmail.com</A>> wrote:<BR type="attribution">

      <BLOCKQUOTE 

      class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140quote 

      style="PADDING-LEFT: 1ex; BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex"><U></U>

        <DIV>

        <DIV dir=ltr align=left><SPAN 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT 

        color=#000080 face=Calibri>Hi Sid,</FONT></SPAN></DIV>

        <DIV dir=ltr align=left><SPAN 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT 

        color=#000080 face=Calibri></FONT></SPAN> </DIV>

        <DIV dir=ltr align=left><SPAN 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT 

        color=#000080 face=Calibri>Thanks, glad you looked into that. My 

        understanding of the Haskell execution model is really poor, so 

        I can't say one way or the other, but I felt that laziness 

        ought to be considered as well, and I'm glad it was 

        :-)</FONT></SPAN></DIV>

        <DIV dir=ltr align=left><SPAN 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT 

        color=#000080 face=Calibri></FONT></SPAN> </DIV>

        <DIV dir=ltr align=left><SPAN 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT 

        color=#000080 face=Calibri>So  in conclusion it looks like we 

        have a winner with your latest gSortBy version. How do we get this 

        pushed to the GHC library?</FONT></SPAN></DIV>

        <DIV dir=ltr align=left><SPAN 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT 

        color=#000080 face=Calibri></FONT></SPAN> </DIV>

        <DIV dir=ltr align=left><SPAN 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT 

        color=#000080 face=Calibri>Thanks,</FONT></SPAN></DIV>

        <DIV dir=ltr align=left><SPAN 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT 

        color=#000080 face=Calibri></FONT></SPAN> </DIV>

        <DIV dir=ltr align=left><SPAN 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236217273418-27032017><FONT 

        color=#000080 face=Calibri>greg</FONT></SPAN></DIV><BR>

        <DIV lang=en-us 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236OutlookMessageHeader 

        dir=ltr align=left>

        <HR>

        <FONT size=2 face=Tahoma>

        <DIV 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140quoted-text><B>From:</B> 

        <A href="mailto:siddhanathan@gmail.com" 

        target=_blank>siddhanathan@gmail.com</A> [mailto:<A 

        href="mailto:siddhanathan@gmail.com" 

        target=_blank>siddhanathan@gmail.com</A><WBR>] <B>On Behalf Of 

        </B>Siddhanathan Shanmugam<BR></DIV><B>Sent:</B> Monday, March 27, 2017 

        2:12 PM<BR><B>To:</B> Gregory Popovitch 

        <DIV 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140elided-text><BR><B>Subject:</B> 

        Re: Proposal: a new implementation for Data.List.sort and 

        Data.List.sortBy, which has better performance characteristics and is 

        more laziness-friendly.<BR></DIV></FONT><BR></DIV>

        <DIV 

        class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140elided-text>

        <DIV></DIV>

        <DIV dir=ltr>Hi Greg,<BR>

        <DIV class=gmail_extra><BR>

        <DIV class=gmail_quote>On Mon, Mar 27, 2017 at 10:19 AM, Gregory 

        Popovitch <SPAN dir=ltr><<A href="mailto:greg7mdp@gmail.com" 

        target=_blank>greg7mdp@gmail.com</A>></SPAN> wrote:<BR>

        <BLOCKQUOTE class=gmail_quote 

        style="PADDING-LEFT: 1ex; BORDER-LEFT: rgb(204,204,204) 1px solid; MARGIN: 0px 0px 0px 0.8ex"><U></U>

          <DIV>

          <DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017>Unfortunately, 

          this optimization makes the sort less lazy, so doing something 

          like:</SPAN></FONT></DIV>

          <DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017></SPAN></FONT> </DIV>

          <DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017>take 4 $ 

          sort l</SPAN></FONT></DIV>

          <DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017></SPAN></FONT> </DIV>

          <DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017>requires 

          more sorting of the list l with this change. I'm not sure it is a good 

          tradeoff.</SPAN></FONT></DIV>

          <DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017></SPAN></FONT> </DIV>

          <DIV dir=ltr align=left><FONT color=#000080 face=Calibri><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017>This 

          can be verified with: <A 

          href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs" 

          target=_blank>https://github.com/greg7mdp/gh<WBR>c-sort/blob/master/src/sort_wi<WBR>th_trace.hs</A></SPAN></FONT></DIV></DIV></BLOCKQUOTE>

        <DIV><BR></DIV>

        <DIV>I think you're running without optimizations turned on. It is lazy 

        in my case.</DIV>

        <DIV><BR></DIV>

        <DIV>Also, the difference should be negligible (if any at all). Here's 

        an example of the list being sorted:</DIV>

        <DIV><BR></DIV>

        <DIV>

        <DIV>[11,4,6,8,2,5,1,7,9,55,11,3]</DIV></DIV>

        <DIV>...</DIV>

        <DIV>[[4,11],[6,8],[2,5],[1,7,9,55]<WBR>,[3,11],[]]<BR></DIV>

        <DIV>...</DIV>

        <DIV>

        <DIV>[[1,2,4,5,6,7,8,9,11,55],[3,11<WBR>]]</DIV>

        <DIV> * 1 3</DIV>

        <DIV> * 2 3</DIV>

        <DIV> * 4 3</DIV>

        <DIV> * 4 11</DIV>

        <DIV>[1,2,3,4]</DIV></DIV>

        <DIV><BR></DIV>

        <DIV>The number of operations saved is only in the last merge. It's only 

        lazy at this step.</DIV>

        <DIV><BR></DIV>

        <DIV>So we save at most one traversal of the list, which is not too 

        expensive since our worst case bounds is O(n log n) anyway.</DIV>

        <DIV><BR></DIV>

        <DIV>This should mean that the asymptotic performance should be 

        identical, regardless of the number of comparisons saved. Of course, you 

        do get better constants, but I would be surprised if those constants 

        translated to significantly better performance for a reasonable size 

        list.</DIV>

        <DIV> </DIV>

        <BLOCKQUOTE class=gmail_quote 

        style="PADDING-LEFT: 1ex; BORDER-LEFT: rgb(204,204,204) 1px solid; MARGIN: 0px 0px 0px 0.8ex">

          <DIV>

          <DIV dir=ltr align=left><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017><FONT 

          color=#000080 face=Calibri></FONT></SPAN> </DIV>

          <DIV dir=ltr align=left><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017> <FONT 

          color=#000080 face=Calibri>I do agree that it would be nice to have a 

          more serious validation test suite.</FONT></SPAN></DIV>

          <DIV dir=ltr align=left><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017><FONT 

          color=#000080 face=Calibri></FONT></SPAN> </DIV>

          <DIV dir=ltr align=left><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017><FONT 

          color=#000080 face=Calibri>Thanks,</FONT></SPAN></DIV>

          <DIV dir=ltr align=left><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017><FONT 

          color=#000080 face=Calibri></FONT></SPAN> </DIV>

          <DIV dir=ltr align=left><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017><FONT 

          color=#000080 face=Calibri>greg</FONT></SPAN></DIV>

          <DIV dir=ltr align=left><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618136311517-27032017></SPAN> </DIV>

          <DIV dir=ltr align=left>

          <HR>

          </DIV>

          <DIV dir=ltr align=left><FONT size=2 face=Tahoma><SPAN 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-><B>From:</B> 

          <A href="mailto:siddhanathan@gmail.com" 

          target=_blank>siddhanathan@gmail.com</A> [mailto:<A 

          href="mailto:siddhanathan@gmail.com" 

          target=_blank>siddhanathan@gmail.com</A><WBR>] <B>On Behalf Of 

          </B>Siddhanathan Shanmugam<BR></SPAN><B>Sent:</B> Monday, March 27, 

          2017 12:53 PM 

          <DIV>

          <DIV 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-h5><BR><B>To:</B> 

          Gregory Popovitch<BR><B>Cc:</B> Haskell Libraries<BR><B>Subject:</B> 

          Re: Proposal: a new implementation for Data.List.sort and 

          Data.List.sortBy, which has better performance characteristics and is 

          more laziness-friendly.<BR></DIV></DIV></FONT><BR></DIV>

          <DIV>

          <DIV 

          class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-h5>

          <DIV></DIV>

          <DIV dir=ltr>

          <DIV>We can improve things a bit further by forcing evaluation (with 

          seq) along the way appropriately.<BR></DIV>

          <DIV><BR></DIV>

          <DIV><BR></DIV>

          <DIV><BR></DIV>

          <DIV>

          <DIV><SPAN style="FONT-FAMILY: monospace,monospace">gregSortBy cmp [] 

          = []</SPAN><BR></DIV>

          <DIV><FONT face="monospace, monospace">gregSortBy cmp xs = head $ 

          until (null.tail) reduce (pair xs)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">  where</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    pair (x:y:t) | x 

          `cmp` y == GT  = [y, x] : pair t</FONT></DIV>

          <DIV><FONT face="monospace, monospace">        

                   | otherwise       

           = [x, y] : pair t</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    pair [x] = 

          [[x]]</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    pair []  = 

          []</FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace">    reduce 

          (v:w:x:y:t) = merge v' x' `seq` merge v' x' : reduce t</FONT></DIV>

          <DIV><FONT face="monospace, monospace">        

                           where v' 

          = merge v w `seq` merge v w</FONT></DIV>

          <DIV><FONT face="monospace, monospace">        

             x' = merge x y `seq` merge x y</FONT></DIV>

          <DIV><FONT face="monospace, monospace">        

           </FONT></DIV>

          <DIV><FONT face="monospace, monospace">    reduce (x:y:t) = 

          merge x y `seq` merge x y : reduce t</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    reduce xs   

             = xs</FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace">    merge xs [] 

                    = xs</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    merge []  ys 

                   = ys</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    merge xs@(x:xs') 

          ys@(y:ys') </FONT></DIV>

          <DIV><FONT face="monospace, monospace">        

           | x `cmp` y == GT  = y : merge xs  ys'</FONT></DIV>

          <DIV><FONT face="monospace, monospace">        

           | otherwise        = x : merge xs' 

          ys</FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><SPAN style="FONT-FAMILY: monospace,monospace">gSortBy cmp = 

          mergeAll . sequences</SPAN><BR></DIV>

          <DIV><FONT face="monospace, monospace">  where</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    sequences 

          (a:b:xs)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">      | a `cmp` 

          b == GT = descending b [a]  xs</FONT></DIV>

          <DIV><FONT face="monospace, monospace">      | 

          otherwise       = ascending  b (a:) 

xs</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    sequences xs = 

          [xs]</FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace">    descending a as 

          (b:bs)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">      | a `cmp` 

          b == GT = descending b (a:as) bs</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    descending a as 

          bs  = (a:as) `seq` (a:as) : sequences bs</FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace">    ascending a as 

          (b:bs)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">      | a `cmp` 

          b /= GT = ascending b (as . (a:)) bs</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    ascending a as bs 

            = as [a] `seq` as [a] : sequences bs</FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace">    mergeAll [x] = 

          x</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    mergeAll xs 

           = mergeAll (mergePairs xs)</FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace">    mergePairs 

          (a:b:xs) = merge a b `seq` merge a b : mergePairs xs</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    mergePairs xs 

                = xs</FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace">    merge as@(a:as') 

          bs@(b:bs')</FONT></DIV>

          <DIV><FONT face="monospace, monospace">      | a `cmp` 

          b == GT = b : merge as  bs'</FONT></DIV>

          <DIV><FONT face="monospace, monospace">      | 

          otherwise       = a : merge as' bs</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    merge [] bs 

                  = bs</FONT></DIV>

          <DIV><FONT face="monospace, monospace">    merge as [] 

                  = as</FONT></DIV></DIV>

          <DIV><BR></DIV>

          <DIV><BR></DIV>

          <DIV><BR></DIV>

          <DIV><BR></DIV>

          <DIV><B>Before the change:</B></DIV>

          <DIV><BR></DIV>

          <DIV>

          <DIV><FONT face="monospace, monospace">benchmarking random 

          ints/ghc</FONT></DIV>

          <DIV><FONT face="monospace, monospace">time       

                    3.687 s    (3.541 s .. 

          NaN s)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">        

                       1.000 R�   (1.000 

          R� .. 1.000 R�)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">mean       

                    3.691 s    (3.669 s .. 

          3.705 s)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">std dev       

                 21.45 ms   (0.0 s .. 24.76 

          ms)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">variance introduced by 

          outliers: 19% (moderately inflated)</FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace">benchmarking random 

          ints/greg</FONT></DIV>

          <DIV><FONT face="monospace, monospace">time       

                    2.648 s    (2.482 s .. 

          2.822 s)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">        

                       0.999 R�   (0.998 

          R� .. 1.000 R�)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">mean       

                    2.704 s    (2.670 s .. 

          2.736 s)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">std dev       

                 52.68 ms   (0.0 s .. 54.49 

          ms)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">variance introduced by 

          outliers: 19% (moderately inflated)</FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace">benchmarking random 

          ints/gSort</FONT></DIV>

          <DIV><FONT face="monospace, monospace">time       

                    2.733 s    (2.682 s .. 

          2.758 s)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">        

                       1.000 R�   (1.000 

          R� .. 1.000 R�)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">mean       

                    2.707 s    (2.689 s .. 

          2.718 s)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">std dev       

                 16.84 ms   (0.0 s .. 19.20 

          ms)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">variance introduced by 

          outliers: 19% (moderately inflated)</FONT></DIV></DIV>

          <DIV><BR></DIV>

          <DIV><B>After the change:</B></DIV>

          <DIV><BR></DIV>

          <DIV>

          <DIV><FONT face="monospace, monospace">benchmarking random 

          ints/greg</FONT></DIV>

          <DIV><FONT face="monospace, monospace">time       

                    2.576 s    (2.548 s .. 

          2.628 s)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">        

                       1.000 R�   (1.000 

          R� .. 1.000 R�)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">mean       

                    2.590 s    (2.578 s .. 

          2.599 s)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">std dev       

                 12.99 ms   (0.0 s .. 14.89 

          ms)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">variance introduced by 

          outliers: 19% (moderately inflated)</FONT></DIV>

          <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

          <DIV><FONT face="monospace, monospace">benchmarking random 

          ints/gSort</FONT></DIV>

          <DIV><FONT face="monospace, monospace">time       

                    2.538 s    (2.412 s .. 

          2.627 s)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">        

                       1.000 R�   (0.999 

          R� .. 1.000 R�)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">mean       

                    2.543 s    (2.517 s .. 

          2.560 s)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">std dev       

                 26.16 ms   (0.0 s .. 30.21 

          ms)</FONT></DIV>

          <DIV><FONT face="monospace, monospace">variance introduced by 

          outliers: 19% (moderately inflated)</FONT></DIV></DIV>

          <DIV><BR></DIV>

          <DIV><BR></DIV>

          <DIV><BR></DIV>

          <DIV><BR></DIV></DIV>

          <DIV class=gmail_extra><BR>

          <DIV class=gmail_quote>On Sun, Mar 26, 2017 at 1:54 PM, Siddhanathan 

          Shanmugam <SPAN dir=ltr><<A 

          href="mailto:siddhanathan+eml@gmail.com" 

          target=_blank>siddhanathan+eml@gmail.com</A>></SPAN> wrote:<BR>

          <BLOCKQUOTE class=gmail_quote 

          style="PADDING-LEFT: 1ex; BORDER-LEFT: rgb(204,204,204) 1px solid; MARGIN: 0px 0px 0px 0.8ex">

            <DIV dir=ltr>

            <DIV>

            <DIV>Theoretically, we could do better. We currently only exploit 

            monotonic runs in merge sort, but we could also exploit bitonic 

            runs:</DIV>

            <DIV><BR></DIV>

            <DIV>

            <DIV><FONT face="monospace, monospace">    dlist as = as 

            [] `seq` as []<BR></FONT></DIV>

            <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

            <DIV><FONT face="monospace, monospace">    sequences [] = 

            [[]]</FONT></DIV>

            <DIV><FONT face="monospace, monospace">    sequences [a] = 

            [[a]]</FONT></DIV>

            <DIV><FONT face="monospace, monospace">    sequences 

            (a:xs) = bitonic a a (a:) xs</FONT></DIV>

            <DIV><FONT face="monospace, monospace"><BR></FONT></DIV>

            <DIV><FONT face="monospace, monospace">    bitonic min max 

            as (b:bs)</FONT></DIV>

            <DIV><FONT face="monospace, monospace">      | b 

            `cmp` max /= LT = bitonic min b   (as . (b:)) bs</FONT></DIV>

            <DIV><FONT face="monospace, monospace">      | b 

            `cmp` min /= GT = bitonic b   max ((b:) . as) bs</FONT></DIV>

            <DIV><FONT face="monospace, monospace">      | 

            otherwise = dlist as : sequences (b:bs)</FONT></DIV>

            <DIV><FONT face="monospace, monospace">    bitonic _ _ as 

            [] = [dlist as]</FONT></DIV>

            <DIV><BR></DIV></DIV></DIV>

            <DIV><BR></DIV>

            <DIV>The constant factors here might be too high to notice the 

            difference though.</DIV><SPAN>

            <DIV><BR></DIV>

            <DIV><SPAN style="FONT-SIZE: 12px"><BR></SPAN></DIV>

            <DIV><SPAN style="FONT-SIZE: 12px">> However, still my version is 

            more laziness-friendly, i.e. it requires fewer</SPAN><BR></DIV>

            <DIV><SPAN style="FONT-SIZE: 12px">> comparisons to get 

            the</SPAN><BR style="FONT-SIZE: 12px"><SPAN 

            style="FONT-SIZE: 12px">> N smallest elements of a 

            list </SPAN><SPAN 

            style="FONT-SIZE: 12px">(see</SPAN><BR></DIV>> <A 

            style="FONT-SIZE: 12px" 

            href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs" 

            rel=noreferrer 

            target=_blank>https://github.com/greg7mdp/<WBR>ghc-sort/blob/master/src/sort_<WBR>with_trace.hs</A><SPAN 

            style="FONT-SIZE: 12px">).</SPAN> 

            <DIV><SPAN style="FONT-SIZE: 12px">></SPAN></DIV>

            <DIV><SPAN style="FONT-SIZE: 12px">> I wonder if this might not 

            be a more useful trait than being able to sort</SPAN><BR></DIV>

            <DIV><SPAN style="FONT-SIZE: 12px">> already sorted lists super 

            fast.</SPAN><BR></DIV>

            <DIV><SPAN style="FONT-SIZE: 12px"><BR></SPAN></DIV></SPAN>

            <DIV><SPAN style="FONT-SIZE: 12px">This comes down to a discussion 

            of merge sort vs natural merge sort.</SPAN></DIV>

            <DIV><BR></DIV>

            <DIV>Data.List.sort is an implementation of a variant of merge sort 

            called natural merge sort. The algorithm is linearithmic in the 

            worst case, but linear in the best case (already sorted list).</DIV>

            <DIV><BR></DIV>

            <DIV><BR></DIV></DIV>

            <DIV 

            class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618HOEnZb>

            <DIV 

            class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618h5>

            <DIV class=gmail_extra><BR>

            <DIV class=gmail_quote>On Sun, Mar 26, 2017 at 10:47 AM, Gregory 

            Popovitch <SPAN dir=ltr><<A href="mailto:greg7mdp@gmail.com" 

            target=_blank>greg7mdp@gmail.com</A>></SPAN> wrote:<BR>

            <BLOCKQUOTE class=gmail_quote 

            style="PADDING-LEFT: 1ex; BORDER-LEFT: rgb(204,204,204) 1px solid; MARGIN: 0px 0px 0px 0.8ex">Thanks 

              again @Siddhanathan! Looks like your gSort fixes the main issue 

              with<BR>Data.List.sort().<BR><BR>I have updated the test programs 

              in <A href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer 

              target=_blank>https://github.com/greg7mdp/gh<WBR>c-sort</A> 

              to<BR>include your new version.<BR><BR>Here are the results (your 

              new version looks like a definite improvement vs<BR>the current 

              GHC one):<BR><BR>input            

                          GHC sort    

                   My Orig proposal    

               gSort<BR>------------------------------<WBR>------------------------------<WBR>----------------<BR>---<BR>sorted 

              ints (ascending)      151      

                       456        

                        148<BR>sorted ints 

              (descending)     152        

                     466          

                      155<BR>random ints      

                         2732      

                      2006          

                     2004<BR>random strings    

                        6564        

                    5549            

                   5528<BR><BR><BR>So replacing the current GHC 

              version with gSort is a no brainer, as it is<BR>better in all 

              regards.<BR><BR>However, still my version is more 

              laziness-friendly, i.e. it requires fewer<BR>comparisons to get 

              the<BR>N smallest elements of a list (see<BR><A 

              href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs" 

              rel=noreferrer 

              target=_blank>https://github.com/greg7mdp/gh<WBR>c-sort/blob/master/src/sort_wi<WBR>th_trace.hs</A>).<BR><BR>I 

              wonder if this might not be a more useful trait than being able to 

              sort<BR>already sorted lists super 

              fast.<BR><SPAN><BR>Thanks,<BR><BR>greg<BR><BR>______________________________<WBR>__<BR><BR>From: 

              <A href="mailto:siddhanathan@gmail.com" 

              target=_blank>siddhanathan@gmail.com</A> [mailto:<A 

              href="mailto:siddhanathan@gmail.com" 

              target=_blank>siddhanathan@gmail.com</A><WBR>] On Behalf 

              Of<BR>Siddhanathan Shanmugam<BR></SPAN>Sent: Sunday, March 26, 

              2017 1:05 PM<BR><SPAN>To: Gregory Popovitch<BR>Cc: Haskell 

              Libraries<BR>Subject: Re: Proposal: a new implementation for 

              Data.List.sort and<BR>Data.List.sortBy, which has better 

              performance characteristics and is 

              more<BR>laziness-friendly.<BR><BR><BR></SPAN>

              <DIV>

              <DIV 

              class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618m_6283889194055629001h5>Interesting. 

              You are right, performance for sorting random lists 

              has<BR>priority over performance for sorting already-sorted 

              lists.<BR><BR>Ignore the numbers for my previous version. Can you 

              compare GHC's sort, your<BR>proposal, and gSort 

              below?<BR><BR><BR>gSort :: Ord a => [a] -> [a]<BR>gSort = 

              gSortBy compare<BR>gSortBy cmp = mergeAll . sequences<BR>  

              where<BR>    sequences (a:b:xs)<BR>      

              | a `cmp` b == GT = descending b [a]  xs<BR>    

                | otherwise       = ascending  b 

              (a:) xs<BR>    sequences xs = [xs]<BR><BR><BR>  

                descending a as (b:bs)<BR>      | a `cmp` b 

              == GT = descending b (a:as) bs<BR>    descending a as 

              bs  = (a:as) : sequences bs<BR><BR><BR>    

              ascending a as (b:bs)<BR>      | a `cmp` b /= GT = 

              ascending b (\ys -> as (a:ys)) bs<BR>    ascending a 

              as bs   = as [a] `seq` as [a] : sequences 

              bs<BR><BR><BR>    mergeAll [x] = x<BR>    

              mergeAll xs  = mergeAll (mergePairs xs)<BR><BR><BR>  

                mergePairs (a:b:xs) = merge a b : mergePairs xs<BR>  

                mergePairs xs       = 

              xs<BR><BR><BR>    merge as@(a:as') bs@(b:bs')<BR>  

                  | a `cmp` b == GT = b : merge as  bs'<BR>  

                  | otherwise       = a : merge 

              as' bs<BR>    merge [] bs        

               = bs<BR>    merge as []        

               = as<BR><BR><BR>Thanks,<BR>Sid<BR><BR><BR>On Sun, Mar 26, 

              2017 at 9:19 AM, Gregory Popovitch <<A 

              href="mailto:greg7mdp@gmail.com" 

              target=_blank>greg7mdp@gmail.com</A>><BR>wrote:<BR><BR><BR>  

                    Thank you @Siddhanathan! I welcome any 

              improvement you may make, as<BR>I said I<BR>      

                am very far from a Haskell expert.<BR><BR>    

                  I just tested your change with my test 

              project<BR>        (<A 

              href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer 

              target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort</A><BR></DIV></DIV><<A 

              href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer 

              target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort</A>> 

              )<BR>

              <DIV>

              <DIV 

              class=m_4214469927093090448m_-8278973055558469163m_8686323323360374140m_-3409671060788413236gmail-m_-5012754740519492618m_6283889194055629001h5>  

                    and here are my results (mean times in 

              ms):<BR><BR>        input      

                                GHC 

              sort          Orig 

              proposal<BR>your<BR>        

              change<BR><BR>------------------------------<WBR>------------------------------<WBR>----------------<BR>  

                    ---<BR>        sorted 

              ints (ascending)      153      

                       467<BR>139<BR>    

                  sorted ints (descending)    

               152              

               472<BR>599<BR>        random ints  

                             2824  

                          2077<BR>2126<BR>  

                    random strings        

                    6564            

                5613<BR>5983<BR><BR>        Your change 

              is a definite improvement for sorted integers 

              in<BR>ascending<BR>        order, but is worse 

              for other cases.<BR><BR>        Is there a 

              real need to optimize the sort for already sorted list?<BR>Of 

              course<BR>        it should not be a 

              degenerate<BR>        case and take longer 

              than sorting random numbers, but this is not<BR>the case<BR>  

                    here. Sorting already sorted<BR>    

                  lists is, even with my version, over 4 times faster 

              than sorting<BR>random<BR>        lists. This 

              sounds perfectly<BR>        acceptable to me, 

              and I feel that trying to optimize this specific<BR>case<BR>  

                    further, if it comes at the<BR>    

                  detriment of the general case, is not 

              desirable.<BR><BR>        

              Thanks,<BR><BR>        greg<BR><BR>  

              ______________________________<WBR>__<BR><BR>      

                From: <A href="mailto:siddhanathan@gmail.com" 

              target=_blank>siddhanathan@gmail.com</A> [mailto:<A 

              href="mailto:siddhanathan@gmail.com" 

              target=_blank>siddhanathan@gmail.com</A><WBR>] On<BR>Behalf 

              Of<BR>        Siddhanathan Shanmugam<BR>  

                    Sent: Sunday, March 26, 2017 11:41 

              AM<BR>        To: Gregory Popovitch<BR>  

                    Cc: Haskell Libraries<BR>      

                Subject: Re: Proposal: a new implementation for 

              Data.List.sort and<BR>        

              Data.List.sortBy, which has better performance characteristics 

              and<BR>is more<BR>        

              laziness-friendly.<BR><BR><BR><BR>        

              Thank you! This identifies a space leak in base which went 

              unnoticed<BR>for 7<BR>        

              years.<BR><BR>        Your implementation can 

              be improved further. Instead of splitting<BR>into<BR>    

                  pairs, you could instead split into lists of sorted 

              sublists by<BR>replacing<BR>        the pairs 

              function with the following<BR><BR>        

                  pair = foldr f []<BR>        

                    where<BR>          

                    f x [] = [[x]]<BR>        

                      f x (y:ys)<BR>      

                          | x `cmp` head y == LT = 

              (x:y):ys<BR>              

                  | otherwise            

              = [x]:y:ys<BR><BR>        This should give you 

              the same performance improvements for sorting<BR>random<BR>  

                    lists, but better performance while sorting 

              ascending lists.<BR><BR>        The version in 

              base takes it one step further by using a DList to<BR>handle 

              the<BR>        descending case efficiently as 

              well, except there's a space leak<BR>right now<BR>    

                  because of which it is slower.<BR><BR>    

                  On Sun, Mar 26, 2017 at 7:21 AM, Gregory 

              Popovitch<BR><<A href="mailto:greg7mdp@gmail.com" 

              target=_blank>greg7mdp@gmail.com</A>><BR>      

                wrote:<BR><BR><BR><BR>          

                    Motivation:<BR>        

                      ----------<BR><BR>      

                        Data.List.sort is a very 

              important functionality in Haskell.<BR>I<BR>      

                believe that<BR>            

                  the proposed implementation is:<BR><BR>    

                          - significantly faster 

              than the current implementation on<BR>unsorted<BR>    

                  lists,<BR>            

                  typically 14% to 27% faster<BR>      

                        - more laziness-friendly, 

              i.e.:<BR>                

                  take 3 $ sort l<BR>        

                        will require significantly less 

              comparisons than the<BR>current<BR>        

                      implementation<BR><BR>    

                          Proposed 

              Implementation<BR>              

                -----------------------<BR><BR>        

                      sort :: (Ord a) => [a] -> 

              [a]<BR>                

              sort =  sortBy compare<BR><BR>        

                      sortBy cmp [] = []<BR>    

                          sortBy cmp xs = head $ 

              until (null.tail) reduce (pair xs)<BR>        

                        where<BR>      

                            pair (x:y:t) | x 

              `cmp` y == GT  = [y, x] : pair t<BR>      

                       | otherwise      

                = [x, y] : pair t<BR>          

                        pair [x] = [[x]]<BR>  

              pair []  = []<BR><BR>          

                        reduce (v:w:x:y:t) = merge v' 

              x' : reduce t<BR>              

                       where v' = merge v w<BR>  

                       x' = merge x y<BR><BR>  

              reduce (x:y:t) = merge x y : reduce t<BR>      

                            reduce xs  

                  = xs<BR><BR>          

                        merge xs []      

                   = xs<BR>          

                        merge []  ys    

                    = ys<BR>          

                        merge xs@(x:xs') 

              ys@(y:ys')<BR>              

                         | x `cmp` y == GT  = 

              y : merge xs  ys'<BR>          

                             | 

              otherwise        = x : merge xs' 

              ys<BR><BR><BR>              

                Effect and Interactions<BR>        

                      -----------------------<BR><BR>  

                            I have a stack 

              project with a criterion test for this new<BR>      

                implementation,<BR>          

                    available at <A 

              href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer 

              target=_blank>https://github.com/greg7mdp/gh<WBR>c-sort</A><BR><<A 

              href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer 

              target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort</A>><BR><BR>  

                    <<A 

              href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer 

              target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort</A><BR><<A 

              href="https://github.com/greg7mdp/ghc-sort" rel=noreferrer 

              target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort</A>> 

              > .<BR>                

              I ran the tests on an Ubuntu 14.0.2 VM and GHC 8.0.2, and<BR>had 

              the<BR>        following<BR>    

                          results:<BR><BR>  

                            - sorting of 

              random lists of integers is 27% faster<BR>      

                        - sorting of random lists of 

              strings is 14% faster<BR>            

                  - sorting of already sorted lists is significantly 

              slower,<BR>but still<BR>        much<BR>  

                            faster than 

              sorting random lists<BR>            

                  - proposed version is more laziness friendly. For 

              example<BR>this<BR>        version 

              of<BR>                

              sortBy requires 11 comparisons to find<BR>      

                          the smallest element of 

              a 15 element list, while the<BR>default<BR>      

                        Data.List.sortBy requires 15 

              comparisons.<BR>              

                  (see<BR><BR><BR><A 

              href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs" 

              rel=noreferrer 

              target=_blank>https://github.com/greg7mdp/gh<WBR>c-sort/blob/master/src/sort_wi<WBR>th_trace.hs</A><BR><<A 

              href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs" 

              rel=noreferrer 

              target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort/blob/master/src/sort_w<WBR>ith_trace.hs</A>><BR><BR><<A 

              href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs" 

              rel=noreferrer 

              target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort/blob/master/src/sort_w<WBR>ith_trace.hs</A><BR><<A 

              href="https://github.com/greg7mdp/ghc-sort/blob/master/src/sort_with_trace.hs" 

              rel=noreferrer 

              target=_blank>https://github.com/greg7mdp/g<WBR>hc-sort/blob/master/src/sort_w<WBR>ith_trace.hs</A>> 

              ><BR>)<BR><BR><BR><BR>            

                  Test results<BR>          

                    ------------<BR><BR>      

                        Criterion output 

              (descending/ascending results are for<BR>already<BR>    

                  sorted<BR>            

                  lists).<BR>            

                  I barely understand what Criterion does, and I am 

              puzzled<BR>with the<BR>        

              various<BR>                

              "T" output - maybe there is a bug in my bench code:<BR><BR>  

              vagrant@vagrant-ubuntu-trusty-<WBR>64:/vagrant$ stack 

              exec<BR>ghc-sort<BR>            

                  benchmarking ascending ints/ghc<BR>    

              TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT<WBR>TTTTTTTtime<BR>160.6 

              ms<BR>        (153.4<BR>      

                        ms .. 167.8 ms)<BR>  

               0.997 R�   (0.986 R� .. 1.000 R�)<BR>    

                          mean      

                         161.7 ms  

               (158.3 ms .. 165.9 ms)<BR>          

                    std dev          

                  5.210 ms   (3.193 ms .. 7.006 

              ms)<BR>                

              variance introduced by outliers: 12% (moderately 

              inflated)<BR><BR>              

                benchmarking ascending ints/greg<BR>      

                        TTTTTTTTTTTTTTTTtime  

                             473.8 

              ms   (398.6 ms ..<BR>554.9<BR>      

                ms)<BR>              

                   0.996 R�   (0.987 R� .. 1.000 

              R�)<BR>                

              mean                

               466.2 ms   (449.0 ms .. 475.0 ms)<BR>    

                          std dev    

                        14.94 ms   (0.0 s .. 

              15.29 ms)<BR>              

                variance introduced by outliers: 19% (moderately 

              inflated)<BR><BR>              

                benchmarking descending ints/ghc<BR>      

              TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT<WBR>TTTTTTTtime<BR>165.1 

              ms<BR>        (148.2<BR>      

                        ms .. 178.2 ms)<BR>  

               0.991 R�   (0.957 R� .. 1.000 R�)<BR>    

                          mean      

                         158.7 ms  

               (154.0 ms .. 164.3 ms)<BR>          

                    std dev          

                  7.075 ms   (4.152 ms .. 9.903 

              ms)<BR>                

              variance introduced by outliers: 12% (moderately 

              inflated)<BR><BR>              

                benchmarking descending ints/greg<BR>      

                        TTTTTTTTTTTTTTTTtime  

                             471.7 

              ms   (419.8 ms ..<BR>508.3<BR>      

                ms)<BR>              

                   0.999 R�   (0.995 R� .. 1.000 

              R�)<BR>                

              mean                

               476.0 ms   (467.5 ms .. 480.0 ms)<BR>    

                          std dev    

                        7.447 ms   (67.99 as 

              .. 7.865 ms)<BR>              

                variance introduced by outliers: 19% (moderately 

              inflated)<BR><BR>              

                benchmarking random ints/ghc<BR>        

                      TTTTTTTTTTTTTTTTtime    

                           2.852 s  

                (2.564 s ..<BR>3.019 s)<BR>        

                         0.999 R�  

               (0.997 R� .. 1.000 R�)<BR>          

                    mean            

                   2.812 s    (2.785 s .. 2.838 

              s)<BR>                std 

              dev              44.06 ms  

               (543.9 as .. 44.97 ms)<BR>          

                    variance introduced by outliers: 19% 

              (moderately inflated)<BR><BR>          

                    benchmarking random ints/greg<BR>  

              TTTTTTTTTTTTTTTTtime            

                   2.032 s    (1.993 s ..<BR>2.076 

              s)<BR>                

                 1.000 R�   (1.000 R� .. 1.000 R�)<BR>  

                            mean    

                           2.028 s  

                (2.019 s .. 2.033 s)<BR>          

                    std dev          

                  7.832 ms   (0.0 s .. 8.178 ms)<BR>  

                            variance 

              introduced by outliers: 19% (moderately inflated)<BR><BR>  

                            benchmarking 

              shakespeare/ghc<BR>            

                  TTTTTTTTTTTTTTTTtime        

                       6.504 s    (6.391 s 

              ..<BR>6.694 s)<BR>              

                   1.000 R�   (1.000 R� .. 1.000 

              R�)<BR>                

              mean                

               6.499 s    (6.468 s .. 6.518 s)<BR>    

                          std dev    

                        28.85 ms   (0.0 s .. 

              32.62 ms)<BR>              

                variance introduced by outliers: 19% (moderately 

              inflated)<BR><BR>              

                benchmarking shakespeare/greg<BR>      

                        TTTTTTTTTTTTTTTTtime  

                             5.560 

              s    (5.307 s ..<BR>5.763 s)<BR>      

                           1.000 R�  

               (0.999 R� .. 1.000 R�)<BR>          

                    mean            

                   5.582 s    (5.537 s .. 5.607 

              s)<BR>                std 

              dev              39.30 ms  

               (0.0 s .. 43.49 ms)<BR>          

                    variance introduced by outliers: 19% 

              (moderately inflated)<BR><BR><BR>        

                      Costs and Drawbacks<BR>    

              -------------------<BR><BR>          

                    The only cost I see is the reduced 

              performance when sorting<BR>already<BR>        

              sorted<BR>                

              lists. However, since this remains quite efficient, indeed<BR>over 

              4<BR>        times<BR>      

                        faster than sorting unsorted 

              lists, I think it is an<BR>acceptable<BR>      

                tradeoff.<BR><BR>            

                  Final note<BR>          

                    ----------<BR><BR>        

                      My Haskell is very rusty. I worked on 

              this a couple years<BR>ago when I<BR>        

              was<BR>                

              learning Haskell, and meant to propose it to the 

              Haskell<BR>community,<BR>        but<BR>  

                            never got to it 

              at the time.<BR><BR>            

              ______________________________<WBR>_________________<BR>  

                            Libraries mailing 

              list<BR>                <A 

              href="mailto:Libraries@haskell.org" 

              target=_blank>Libraries@haskell.org</A><BR>      

                        <A 

              href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries" 

              rel=noreferrer 

              target=_blank>http://mail.haskell.org/cgi-bi<WBR>n/mailman/listinfo/libraries</A><BR><<A 

              href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries" 

              rel=noreferrer 

              target=_blank>http://mail.haskell.org/cgi-b<WBR>in/mailman/listinfo/libraries</A>><BR><BR></DIV></DIV>  

                    <<A 

              href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries" 

              rel=noreferrer 

              target=_blank>http://mail.haskell.org/cgi-b<WBR>in/mailman/listinfo/libraries</A><BR><<A 

              href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries" 

              rel=noreferrer 

              target=_blank>http://mail.haskell.org/cgi-b<WBR>in/mailman/listinfo/libraries</A>> 

              ><BR><BR><BR><BR><BR><BR><BR><BR><BR></BLOCKQUOTE></DIV><BR></DIV></DIV></DIV></BLOCKQUOTE></DIV><BR></DIV></DIV></DIV></DIV></BLOCKQUOTE></DIV><BR></DIV></DIV></DIV></DIV></BLOCKQUOTE></DIV><BR></DIV></DIV></DIV><BR>______________________________<WBR>_________________<BR>Libraries 

      mailing list<BR><A href="mailto:Libraries@haskell.org" 

      target=_blank>Libraries@haskell.org</A><BR><A 

      href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries" 

      rel=noreferrer 

      target=_blank>http://mail.haskell.org/cgi-bi<WBR>n/mailman/listinfo/libraries</A><BR><BR></BLOCKQUOTE></DIV></DIV></DIV></DIV><BR>______________________________<WBR>_________________<BR>Libraries 

    mailing list<BR><A href="mailto:Libraries@haskell.org" 

    target=_blank>Libraries@haskell.org</A><BR><A 

    href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries" 

    rel=noreferrer 

    target=_blank>http://mail.haskell.org/cgi-bi<WBR>n/mailman/listinfo/libraries</A><BR><BR></BLOCKQUOTE></DIV><BR></DIV></DIV></DIV><BR>______________________________<WBR>_________________<BR>Libraries 

  mailing list<BR><A 

  href="mailto:Libraries@haskell.org">Libraries@haskell.org</A><BR><A 

  href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries" 

  rel=noreferrer 

  target=_blank>http://mail.haskell.org/cgi-<WBR>bin/mailman/listinfo/libraries</A><BR><BR></BLOCKQUOTE></DIV><BR></DIV></DIV></BODY></HTML>