[Haskell-cafe] mapM is supralinear?

Arseniy Alekseyev arseniy.alekseyev at gmail.com
Tue Sep 27 00:14:13 CEST 2011


Garbage collection takes amortized O(1) per allocation, doesn't it?

On 26 September 2011 18:00, Lennart Augustsson <lennart at augustsson.net> wrote:
> You seem to ignore garbage collection.
>
> On Sat, Sep 24, 2011 at 6:40 AM, Arseniy Alekseyev
> <arseniy.alekseyev at gmail.com> wrote:
>>
>> > Apparently it doesn't, and it seems to be fixed now.
>>
>> Does anyone know what exactly the bug was? Because this seems like a
>> serious bug to me. I've run into it myself today and wasn't happy.
>> Linear algorithms should work in linear time however much memory they
>> allocate (modulo cache thrashing of course). Existence of people
>> claiming otherwise surprises me!
>>
>>
>> On 22 September 2011 01:05, John Lato <jwlato at gmail.com> wrote:
>> > On Wed, Sep 21, 2011 at 1:57 PM, Tim Docker <tim at dockerz.net> wrote:
>> >>
>> >> On 09/09/2011, at 8:19 PM, John Lato wrote:
>> >>
>> >>> Agreed.  Whenever I'd like to use mapM (or any other function for
>> >>> which a *M_ is available), I've found the following rules helpful:
>> >>>
>> >>> 1.  If I can guarantee the list is short (~ n<=20), go ahead and use
>> >>> mapM
>> >>> 2.  Otherwise use mapM_, foldM_, or foldM if a real reduction is
>> >>> possible (i.e. not "foldM snocM []").
>> >>>
>> >>> Step 2 sometimes requires changing my design, but it's always been for
>> >>> the better.  `mapM_` tends to require more pipeline composition, so
>> >>> it's leveraging the language's strengths.
>> >>
>> >> This thread is really interesting - it relates directly to problems I
>> >> am
>> >> currently
>> >> having with mapM over large lists (see the thread "stack overflow
>> >> pain").
>> >>
>> >> Can you explain what you mean by "mapM_ tends to require more pipeline
>> >> composition"?
>> >> In what way is it leveraging the language strengths?
>> >
>> > Hmm, that is suitably cryptic.  One way to think of it is an inversion
>> > of control.  Instead of operating on whole collections of things in a
>> > monad, you specify monadic actions (pipelines) which are applied
>> > sequentially to each input.
>> >
>> > Here's a simple example.  Suppose you have a bunch of data serialized
>> > to files, and you want to read each file into a data structure, apply
>> > some process based upon the last file's data, and write out the output
>> > to new files.  One way to do that would look like:
>> >
>> > do
>> >    dats <- mapM readMyData files
>> >    let pairs = zip (mempty:dats) dats
>> >    zipWithM_ (\(last, this) fname -> writeMyData (update last this)
>> > fname) pairs newFiles
>> >
>> > However, you could also put everything into a single monadic
>> > operation, like this
>> >
>> > do
>> >    foldM_ (\last (infile, outfile) -> do
>> >                                                    this <- readMyData
>> > infile
>> >                                                    writeMyData
>> > (update last this) outfile
>> >                                                    return this
>> >               )
>> >               mempty
>> >               (zip files newFiles)
>> >
>> > The first interleaves control (mapM, zipWIthM_) with monadic actions
>> > (file IO), whereas the second only has one control function (foldM_)
>> > which completely processes one input.  I say this is more pipeline
>> > composition because you have to create an entire pipeline from input
>> > to output, which is then sequentially fed inputs by the control
>> > function.
>> >
>> > I say this leverages Haskell's strengths because it's quite easy to
>> > compose functions and monadic actions in Haskell.  It also tends to be
>> > garbage-collector friendly.  I also find it much easier to reason
>> > about space usage.  You don't need to worry if part of a list is being
>> > retained, because the full list of data doesn't appear anywhere.  If
>> > you need to access prior elements they're specified explicitly so you
>> > know exactly how much data you're holding on to.
>> >
>> > My perspective might be warped by my work on iteratees, but I find
>> > this a very natural approach.
>> >
>> > John L.
>> >
>> > _______________________________________________
>> > Haskell-Cafe mailing list
>> > Haskell-Cafe at haskell.org
>> > http://www.haskell.org/mailman/listinfo/haskell-cafe
>> >
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>



More information about the Haskell-Cafe mailing list