[Haskell-cafe] Parallel parsing & multicore
Don Stewart
dons at galois.com
Wed Sep 9 12:33:15 EDT 2009
ekmett:
>
>
> On Wed, Sep 9, 2009 at 12:16 PM, Don Stewart <dons at galois.com> wrote:
>
> ekmett:
> > Hi Anakim,
> >
> > Nice to see someone else working in this space.
> >
> > I have also been working on a set of parallel parsing techniques, which
> > can use small Parsec parsers for local context sensitivity.
> >
> > See the second set of slides in
> > http://comonad.com/reader/2009/iteratees-parsec-and-monoid/ for an
> > overview of how I'm doing something similar to feed Parsec independent
> > chunks. Note that this approach bypasses the need for a separate
> > sequential scan, which otherwise floods your cache, and lets you get
> > closer to the performance limit imposed by Amdahl's law.
> >
> > The code in the second set of slides can be adapted to your case: load
> > everything into a lazy bytestring or a fingertree of strict bytestrings,
> > then, for each strict bytestring chunk in parallel, scan it for the
> > first newline and start an iteratee-based Parsec parser from that
> > point. I use the iteratee-based Parsec parsers so that when I glue the
> > partial parses together I can feed the unparsed data on the left side
> > of the first newline in each chunk to the parser I'm joining on the
> > left. I provide a monoid for gluing these partial parses together,
> > which encapsulates this behavior.
>
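For concreteness, here is a rough, untested sketch of that chunk-and-glue
scheme, with B.split '\n' standing in for the iteratee-based Parsec parsers
from the slides, today's parallel/deepseq spellings (rdeepseq for the old
rnf strategy), and made-up names throughout:

    import qualified Data.ByteString.Char8      as B
    import qualified Data.ByteString.Lazy.Char8 as BL
    import Control.DeepSeq             (NFData (..))
    import Control.Parallel.Strategies (parMap, rdeepseq)

    -- A partial parse of one chunk: either the chunk held no newline at
    -- all, or it splits into the unparsed prefix before the first
    -- newline, the complete records, and the unparsed trailing suffix.
    data Partial
      = NoNewline !B.ByteString
      | Chunk !B.ByteString [B.ByteString] !B.ByteString

    instance NFData Partial where
      rnf (NoNewline b)  = rnf b
      rnf (Chunk p rs s) = rnf p `seq` rnf rs `seq` rnf s

    -- The gluing monoid: the left side's dangling suffix and the right
    -- side's dangling prefix meet at the right side's first newline,
    -- and so form one complete record between them.
    instance Semigroup Partial where
      NoNewline a     <> NoNewline b     = NoNewline (a `B.append` b)
      NoNewline a     <> Chunk p rs s    = Chunk (a `B.append` p) rs s
      Chunk p rs s    <> NoNewline b     = Chunk p rs (s `B.append` b)
      Chunk p1 rs1 s1 <> Chunk p2 rs2 s2 =
        Chunk p1 (rs1 ++ (s1 `B.append` p2) : rs2) s2

    instance Monoid Partial where
      mempty = NoNewline B.empty

    -- Scan one strict chunk for its newlines; the real version runs an
    -- iteratee-based Parsec parser over the interior instead.
    parseChunk :: B.ByteString -> Partial
    parseChunk b = case B.elemIndex '\n' b of
      Nothing -> NoNewline b
      Just _  -> let ps = B.split '\n' b        -- at least two pieces
                 in Chunk (head ps) (init (tail ps)) (last ps)

    -- Parse every chunk of a lazy bytestring in parallel, then glue.
    parRecords :: BL.ByteString -> [B.ByteString]
    parRecords lbs =
      case mconcat (parMap rdeepseq parseChunk (BL.toChunks lbs)) of
        NoNewline b  -> [b | not (B.null b)]
        Chunk p rs s -> p : rs ++ [s | not (B.null s)]

Because parseChunk never looks past its own chunk, every spark is
independent; it is the monoid's (<>) that repairs the records straddling
chunk boundaries.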
> You can get quite a long way by using bytestring-mmap and strict
> bytestrings. The first ensures your IO overhead will be low, and the
> second ensures you don't migrate work needlessly.
>
>
> I'm actually using bytestring-mmap in a toy compiler that uses these
> techniques, though at present I use System.IO.Posix.MMap.Lazy rather
> than the strict version.
>
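For reference, the strict and lazy entry points of bytestring-mmap have the
same shape; a tiny usage fragment, with a made-up file name:

    import qualified Data.ByteString           as B
    import qualified Data.ByteString.Lazy      as BL
    import System.IO.Posix.MMap               (unsafeMMapFile)
    import qualified System.IO.Posix.MMap.Lazy as LazyMMap

    main :: IO ()
    main = do
      whole   <- unsafeMMapFile "input.dat"           -- one strict, zero-copy bytestring
      chunked <- LazyMMap.unsafeMMapFile "input.dat"  -- lazy bytestring of mapped chunks
      print (B.length whole, BL.length chunked)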
> As for migrating work, maybe I'm not understanding your point, but my whole
> goal was to migrate the chunking and parsing out to other cores, so
> parMap rwhnf'ing the monoidal reduction over the chunks seems to be exactly
> what I want. I could perhaps get better results by assigning contiguous
> ranges of the input to the same thread, though, and by choosing more
> intelligently which core merges chunks, perhaps by mixing pseq and par to
> group runs onto the same core where possible.
>
Oh, I just find strict structures easier when determining in which
thread work is happening. Appropriate use of rnf and friends is also
fine.
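To spell that out, a minimal fragment in today's parallel API, where rseq
and rdeepseq are the current names for the old rwhnf and rnf strategies:

    import Control.DeepSeq             (NFData)
    import Control.Parallel.Strategies (parMap, rdeepseq, rseq)

    -- rseq (formerly rwhnf) forces each spark's result only to its
    -- outermost constructor, so for a structured result most of the
    -- work can leak back to whichever thread later consumes it.
    shallow :: (a -> [b]) -> [a] -> [[b]]
    shallow = parMap rseq

    -- rdeepseq (formerly rnf) forces the whole result inside the
    -- spark, so the parsing work stays on the core that ran it.
    deep :: NFData b => (a -> [b]) -> [a] -> [[b]]
    deep = parMap rdeepseq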
-- Don