Proposal: make minimumBy/maximumBy go through foldl', not foldr1

Joachim Breitner mail at joachim-breitner.de
Wed Feb 10 20:46:20 UTC 2016


Hi,

Am Mittwoch, den 10.02.2016, 20:07 +0000 schrieb Lana Black:
> I tested maximumBy with foldl, and it runs in constant memory for
> lists,
> so extra strictness won't have any benefits in this particular case. The
> following snippet clearly shows it.
> 
> -----
> main = print $ maximumBy compare [1..100000000000]
> 
> 
> -- Copied from Data.Foldable with foldl1 replacing foldr1.
> maximumBy :: Foldable t => (a -> a -> Ordering) -> t a -> a
> maximumBy cmp = foldl1 max'
>   where max' x y = case cmp x y of
>                           GT -> x
>                           _  -> y

Thanks for testing, but I’d be careful with small examples. It is not
unlikely that GHC can “fully grasp” them and do wonders, possibly even
fusing the lists, but with larger examples we’d get bad behavior.

And indeed, with "-O0", or with NOINLINE maximumBy, I very quickly fill
my memory.

Too bad there is no foldl1' to easily verify that with that, we get the
desired behavior even without -O.

Now this gets interesting. I’m wondering if there is a good way of
implementing foldl1', so I looked at the default implementation of
foldl1, which is:

    foldl1 :: (a -> a -> a) -> t a -> a
    foldl1 f xs = fromMaybe (error "foldl1: empty structure")
                    (foldl mf Nothing xs)
      where
        mf m y = Just (case m of
                         Nothing -> y
                         Just x  -> f x y)

This implements foldl1 via foldl and an accumulator _wrapped in a Maybe
and case-analized in every step_. I sincerely hope that every instance
overrides this by an more efficient version. And at least those
Foldable instances with more than one element in Data.Foldable do...

Nevertheless, for the question of memory usage, a generic definition
will do. And indeed, using

    foldl1' :: Foldable t => (a -> a -> a) -> t a -> a
    foldl1' f xs = fromMaybe (error "foldl1': empty structure")
                    (foldl' mf Nothing xs)
      where
        mf Nothing y = Just y
        mf (Just x) y = x `seq` Just (f x y)

in your example, I get constant memory consumption as expected.

Greetings,
Joachim

-- 
Joachim “nomeata” Breitner
  mail at joachim-breitner.dehttps://www.joachim-breitner.de/
  XMPP: nomeata at joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F
  Debian Developer: nomeata at debian.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20160210/ef956353/attachment.sig>


More information about the Libraries mailing list