[Haskell-cafe] mapM is supralinear?
lennart at augustsson.net
Mon Sep 26 19:00:47 CEST 2011
You seem to ignore garbage collection.
On Sat, Sep 24, 2011 at 6:40 AM, Arseniy Alekseyev <
arseniy.alekseyev at gmail.com> wrote:
> > Apparently it doesn't, and it seems to be fixed now.
> Does anyone know what exactly the bug was? Because this seems like a
> serious bug to me. I've run into it myself today and wasn't happy.
> Linear algorithms should work in linear time however much memory they
> allocate (modulo cache thrashing of course). Existence of people
> claiming otherwise surprises me!
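[A hedged sketch of the point at issue, not from the original thread: mapM_ streams its input so only an accumulator stays live, while mapM would first build the entire million-element result list; with a copying collector every GC pass traverses that growing live set, which is one way a linear algorithm's wall-clock time can turn supralinear.]

```haskell
import Data.IORef

-- Streaming with mapM_: only the IORef accumulator is live at any
-- point, so the GC's live set stays constant. Replacing this with
-- `xs <- mapM ...` would retain the whole result list instead.
main :: IO ()
main = do
  ref <- newIORef (0 :: Int)
  mapM_ (\x -> modifyIORef' ref (+ x)) [1 .. 1000000]
  readIORef ref >>= print  -- prints 500000500000
```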
> On 22 September 2011 01:05, John Lato <jwlato at gmail.com> wrote:
> > On Wed, Sep 21, 2011 at 1:57 PM, Tim Docker <tim at dockerz.net> wrote:
> >> On 09/09/2011, at 8:19 PM, John Lato wrote:
> >>> Agreed. Whenever I'd like to use mapM (or any other function for
> >>> which a *M_ is available), I've found the following rules helpful:
> >>> 1. If I can guarantee the list is short (~ n <= 20), go ahead and
> >>> use mapM.
> >>> 2. Otherwise use mapM_, foldM_, or foldM if a real reduction is
> >>> possible (i.e. not "foldM snocM ").
> >>> Step 2 sometimes requires changing my design, but it's always been
> >>> for the better. `mapM_` tends to require more pipeline composition,
> >>> so it's leveraging the language's strengths.
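[A minimal sketch of rule 2, added for illustration: foldM with a genuine reduction (a strict sum), as opposed to the "foldM snocM" antipattern, which merely rebuilds the list and keeps every element live until the end.]

```haskell
import Control.Monad (foldM)

-- A real reduction: the strict accumulator is the only live state,
-- so nothing from the input list needs to be retained.
main :: IO ()
main = do
  total <- foldM (\acc x -> pure $! acc + x) 0 [1 .. 100000 :: Int]
  print total  -- prints 5000050000
```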
> >> This thread is really interesting - it relates directly to problems I am
> >> currently having with mapM over large lists (see the thread "stack
> >> overflow ...").
> >> Can you explain what you mean by "mapM_ tends to require more pipeline
> >> composition"?
> >> In what way is it leveraging the language strengths?
> > Hmm, that is suitably cryptic. One way to think of it is an inversion
> > of control. Instead of operating on whole collections of things in a
> > monad, you specify monadic actions (pipelines) which are applied
> > sequentially to each input.
> > Here's a simple example. Suppose you have a bunch of data serialized
> > to files, and you want to read each file into a data structure, apply
> > some process based upon the last file's data, and write out the output
> > to new files. One way to do that would look like:
> > do
> >   dats <- mapM readMyData files
> >   let pairs = zip (mempty:dats) dats
> >   zipWithM_ (\(last, this) fname -> writeMyData (update last this) fname)
> >             pairs newFiles
> > However, you could also put everything into a single monadic
> > operation, like this:
> > do
> >   foldM_ (\last (infile, outfile) -> do
> >            this <- readMyData infile
> >            writeMyData (update last this) outfile
> >            return this)
> >          mempty
> >          (zip files newFiles)
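[A self-contained version of the foldM_ pipeline above, added for illustration: readMyData, writeMyData, and update are in-memory stand-ins for the file IO in the original sketch, so it runs anywhere.]

```haskell
import Control.Monad (foldM_)
import Data.IORef

main :: IO ()
main = do
  out <- newIORef ([] :: [(String, Int)])
      -- Stand-ins for the hypothetical helpers in the thread:
  let readMyData infile     = pure (length infile)           -- fake parse
      writeMyData v outfile = modifyIORef' out ((outfile, v) :)
      update lastV thisV    = lastV + thisV                  -- fake process
      files    = ["a.dat", "bb.dat", "ccc.dat"]
      newFiles = ["a.out", "b.out", "c.out"]
  -- One control function drives the whole pipeline; only the previous
  -- file's value (lastV) is carried between iterations.
  foldM_ (\lastV (infile, outfile) -> do
            thisV <- readMyData infile
            writeMyData (update lastV thisV) outfile
            return thisV)
         (0 :: Int)
         (zip files newFiles)
  readIORef out >>= print . reverse
  -- prints [("a.out",5),("b.out",11),("c.out",13)]
```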
> > The first interleaves control (mapM, zipWithM_) with monadic actions
> > (file IO), whereas the second only has one control function (foldM_)
> > which completely processes one input. I say this is more pipeline
> > composition because you have to create an entire pipeline from input
> > to output, which is then sequentially fed inputs by the control
> > function.
> > I say this leverages Haskell's strengths because it's quite easy to
> > compose functions and monadic actions in Haskell. It also tends to be
> > garbage-collector friendly. I also find it much easier to reason
> > about space usage. You don't need to worry if part of a list is being
> > retained, because the full list of data doesn't appear anywhere. If
> > you need to access prior elements they're specified explicitly so you
> > know exactly how much data you're holding on to.
> > My perspective might be warped by my work on iteratees, but I find
> > this a very natural approach.
> > John L.
> > _______________________________________________
> > Haskell-Cafe mailing list
> > Haskell-Cafe at haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe