Proposal: Non-allocating way to iterate over a Data.Map: traverseWithKey_

Ryan Newton rrnewton at gmail.com
Mon Jul 29 20:47:13 CEST 2013


Shachaf, I checked and Milan's commits that improve traverseWithKey were
already incorporated when I ran my tests above.  The extra speedup is good
but doesn't change the O(1) vs. O(N) allocation situation.

Ok, so the discussion period for this one is over two weeks.  I don't
believe there were any real objections to traverseWithKey_, rather there
were several questions raised about whether the non-allocating behavior
could be accomplished in other ways, which, alas didn't pan out.

Thus if there are no other objections, can someone merge pull request #30?

   https://github.com/haskell/containers/pull/30

P.S. I've been doing a lot of performance-oriented monadic programming with
large data structures (for the LVar project), and this Map issue is only
one of several places where containers API's force you to go through lists
or to otherwise allocate.  I think a continuing audit would be good and
will make other suggestions and pull requests as I come to them.


On Fri, Jul 5, 2013 at 3:51 AM, Shachaf Ben-Kiki <shachaf at gmail.com> wrote:

> On Tue, Jul 2, 2013 at 1:46 PM, Ryan Newton <rrnewton at gmail.com> wrote:
> >
> >>     foldMapWithKey :: Monoid r => (k -> a -> r) -> M.Map k a -> r
> >>     foldMapWithKey f = getConst . M.traverseWithKey (\k x -> Const (f k
> >> x))
> >
> >
> > Shachaf, thanks for fleshing it out.  I updated the test with your
> version
> > as well:
> >
> >    https://gist.github.com/rrnewton/5912908
> >
> > In short, it performs exactly the same as the foldrWithKey version, as
> you
> > pointed out (32M allocation).
> >
> > In both cases, using first class monadic/applicative values seems to foil
> > GHC.
> >
> > And btw, these do show the scaling you would expect, on 2M elements, it
> > allocates 64MB, 4M -> 128MB, and so on, whereas the traverseWithKey_
> version
> > allocates a constant amount.
> >
> >   -Ryan
> >
>
> If you're actually benchmarking it, you should note a few things:
>
> * As I mentioned, lens's actual implementation of this function is
> slightly different:
>
>     -- The argument 'a' of the result should not be used!
>     newtype Traversed a f = Traversed { getTraversed :: f a }
>     instance Applicative f => Monoid (Traversed a f) where
>       mempty = Traversed (pure (error "Traversed: value used"))
>       Traversed ma `mappend` Traversed mb = Traversed (ma *> mb)
>
> with one "void" applied at the very end of the fold, instead of one in
> each step. This may or may not be better; it should probably be
> measured.
>
> * Following a discussion with Milan off-list, he implemented a simple
> optimization in traverseWithKey which might have a significant impact.
> See
> https://github.com/haskell/containers/commit/4d24ff5d08f0bb27ca73a9888286d6426149515b
> . It should probably be considered in these benchmarks.
>
> * I haven't thought it through, but it's possible that using a
> "difference monoid" -- i.e. folding with Endo, the way
> Data.Foldable.foldr is implemented by default -- would also be useful
> to measure, to compare with the existing foldrWithKey.
>
>     Shachaf
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/libraries/attachments/20130729/f08e4d65/attachment.htm>


More information about the Libraries mailing list