Proposal: make minimumBy/maximumBy go through foldl', not foldr1

Joachim Breitner mail at joachim-breitner.de
Sun Feb 7 10:44:40 UTC 2016


Dear List,

the following discussion stalled:

Am Donnerstag, den 29.10.2015, 17:34 +0200 schrieb Roman Cheplyaka:
> I just realized (thanks to this reddit thread[1]) that minimumBy is now
> implemented through foldr1. Previously (before FTP), minimumBy from Data.List
> used foldl1, while minimumBy from Data.Foldable used foldr1.
> 
> I think that the way these were merged was a mistake.
> 
> Let's admit that the absolute majority of use cases for minimumBy are numbers,
> for which the comparison functions are strict in both arguments. Even for
> something like Bool, where comparison could potentially be in one of the
> arguments, the standard compare method is derived and is strict.
> 
> Now, for Foldable, there isn't necesarily a connection between the direction and
> the way fold is performed. For example, for Data.Map.Map both folds are
> tail-recursive. Yet foldr is non-tail-recursive at least for:
> 
> - good old lists
> - anything that uses the same convention for foldl/foldr as lists
> - anything that defines its foldable instance through conversion to lists
> - anything that uses the default implementation of foldr/foldr1
> 
> I also put a simple microbenchmark[2] to show the difference in performance;
> on it, the proposed implementation is more than 20x faster than the current one.
> 
> [1]: https://www.reddit.com/r/haskell/comments/3qpefo/a_very_unfortunate_error_message/
> [2]: http://lpaste.net/714261956202070016

see https://mail.haskell.org/pipermail/libraries/2015-October/026430.html and
https://ghc.haskell.org/trac/ghc/ticket/10830.

It seems we have the choice of either
 1 keeping this as they are, with Data.List.maximumBy having changed behaviour
   for the worse from 7.8 to 7.10
 2 changing the behavior of Foldable.maximumBy (from foldr to foldl)
   and changing the behavior of Data.List.maximumBy back to what it was before,
   and arguably the “better” definition.
 3 adding minimumBy/maximumBy as a class method and changing only the list instance.

This also raises the question of whether using foldl over foldl' is the
right thing to do, as I was under the impression that foldl will (at
least for lists)  always just accumulate a large heap of thunks, and
one would always want to use foldl' instead. I postulate that if foldl'
had been in the report, then it would have been used for maximumBy.

So this yields option 4:

 4 changing maximumBy to use foldl'


BTW, why is there no fold' method in Foldable that leaves the
associativity (left, right, or mixed in case of a tree) to the Foldable
instance, and just implements the “best” strictly accumulating fold for
the given data structure?


Greetings,
Joachim

-- 
Joachim “nomeata” Breitner
  mail at joachim-breitner.dehttp://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de  • GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata at debian.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20160207/2d74dbea/attachment.sig>


More information about the Libraries mailing list