[Haskell-cafe] natural mergesort in Data.List

Tue Sep 18 14:38:14 UTC 2018

Dear David,

thanks for the pointer.

Btw: I was able to modify my complexity proof so that "sequences" is no
longer required to only contain non-empty lists. Sorry for the noise.

But maybe such a "sequences" is not entirely useless: Did I understand
correctly that what you are saying is that knowing that all elements in
"sequences" are non-empty lists might have some impact on performance
due to being able to use "NonEmpty a"?

cheers

chris

On 09/18/2018 03:18 PM, David Feuer wrote:
> data NonEmpty a = a :| [a]
> 
> It's a nonempty list, defined in Data.List.NonEmpty.
> 
> On Tue, Sep 18, 2018, 9:14 AM Christian Sternagel <c.sternagel at gmail.com
> <mailto:c.sternagel at gmail.com>> wrote:
> 
>     Dear David,
> 
>     I am guaranteeing (since I proved it in Isabelle/HOL) that the following
>     version of "sequences" does not contain empty lists in its result (I am
>     copying from my Isabelle formalization, in order to avoid typos) ;)
> 
>     fun sequences :: "'a list ⇒ 'a list list"
>       and asc :: "'a ⇒ ('a list ⇒ 'a list) ⇒ 'a list ⇒ 'a list list"
>       and desc :: "'a ⇒ 'a list ⇒ 'a list ⇒ 'a list list"
>       where
>         "sequences (a # b # xs) =
>           (if key a > key b then desc b [a] xs else asc b ((#) a) xs)"
>       | "sequences [x] = [[x]]"
>       | "sequences [] = []"
>       | "asc a as (b # bs) =
>           (if ¬ key a > key b then asc b (λys. as (a # ys)) bs
>           else as [a] # sequences (b # bs))"
>       | "asc a as [] = [as [a]]"
>       | "desc a as (b # bs) =
>           (if key a > key b then desc b (a # as) bs
>           else (a # as) # sequences (b # bs))"
>       | "desc a as [] = [a # as]"
> 
>     The "key" function is an implicit first parameter for each of
>     "sequences", "asc", and "desc" above. The fact that I am using a "key"
>     function instead of a comparator is due to Isabelle/HOL's standard
>     library. Also, there are no pattern guards in Isabelle/HOL. Anyway, it
>     should be relatively straight-forward to translate these functions into
>     Haskell.
> 
>     Another thing: I just realized that "merge_pairs" in my formalization
>     also differs from "mergePairs", since with the changed "sequences" it
>     might actually get an empty list as input, in which case the current
>     "mergePairs" wouldn't terminate at all.
> 
>     So for those who are interested, the full definition of mergesort can be
>     found here
> 
> 
>     https://devel.isa-afp.org/browser_info/current/AFP/Efficient-Mergesort/Efficient_Mergesort.html
> 
>     where mergesort is called "msort_key".
> 
>     cheers
> 
>     chris
> 
>     Btw: What is "NonEmpty x"?
> 
>     On 09/18/2018 11:55 AM, David Feuer wrote:
>     > If you're guaranteeing that the result won't contain empty lists, it
>     > would be worth benchmarking the effect of using NonEmpty x instead of
>     > [x] in that spot.
>     >
>     > On Tue, Sep 18, 2018, 4:21 AM Christian Sternagel
>     <c.sternagel at gmail.com <mailto:c.sternagel at gmail.com>
>     > <mailto:c.sternagel at gmail.com <mailto:c.sternagel at gmail.com>>> wrote:
>     >
>     >     Dear Cafe,
>     >
>     >     some years ago I formalized the mergesort implementation [1] from
>     >
>     >
>     >   
>      http://hackage.haskell.org/package/base-4.11.1.0/docs/src/Data.OldList.html#sort
>     >
>     >     (as far as I can tell it didn't change since 2012) in the proof
>     >     assistant Isabelle/HOL [2].
>     >
>     >     More specifically, I proved its functional correctness (the
>     result is
>     >     sorted and contains all elements of the input with exactly the
>     same
>     >     multiplicities) and that it is a stable sorting algorithm.
>     >
>     >     Very recently I also formalized a complexity result in
>     Isabelle/HOL,
>     >     namely that the number of comparisons is bounded from above by
>     >
>     >       n + n * ⌈log 2 n⌉
>     >
>     >     for lists of length n.
>     >
>     >     For this proof I had to change the definition of "sequences",
>     >     "ascending", and "descending" slightly.
>     >
>     >     Now here is my question: Does anyone now of reasons why the
>     current
>     >     implementation of "sequences" is allowed to produce spurious
>     empty lists
>     >     in its result? The version I used in my formalization only
>     differs in
>     >     the following three spots:
>     >
>     >       sequences [x] = [[x]] -- this is the only important change,
>     since
>     >       sequences [] = []     -- then the result does not contain
>     empty lists
>     >
>     >     instead of
>     >
>     >       sequences xs = [xs]
>     >
>     >     and
>     >
>     >       ascending a as [] = let !x = as [a] in [x]
>     >
>     >     instead of
>     >
>     >       ascending a as bs = let !x = as [a] in x : sequences bs
>     >
>     >     and
>     >
>     >       descending a as []  = [a:as]
>     >
>     >     instead of
>     >
>     >       descending a as bs = (a:as) : sequences bs
>     >
>     >     [1] https://www.isa-afp.org/entries/Efficient-Mergesort.html
>     >     [2] http://isabelle.in.tum.de/
>     >     _______________________________________________
>     >     Haskell-Cafe mailing list
>     >     To (un)subscribe, modify options or view archives go to:
>     >     http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>     >     Only members subscribed via the mailman list are allowed to post.
>     >
>