[Haskell-cafe] Parallel parsing & multicore
Don Stewart
dons at galois.com
Wed Sep 9 12:33:15 EDT 2009
ekmett:
>
>
> On Wed, Sep 9, 2009 at 12:16 PM, Don Stewart <dons at galois.com> wrote:
>
> ekmett:
> > Hi Anakim,
> >
> > Nice to see someone else working in this space.
> >
> > I have also been working on a set of parallel parsing techniques, which
> > can use small Parsec parsers for local context sensitivity.
> >
> > See the second set of slides in
> > http://comonad.com/reader/2009/iteratees-parsec-and-monoid/ for an
> > overview of how I'm doing something similar to feed Parsec independent
> > chunks. Note that this approach bypasses the need for a separate
> > sequential scan, which otherwise floods your cache, and lets you get
> > closer to the performance limit imposed by Amdahl's law.
> >
> > The code in the second set of slides can be adapted to your case: load
> > everything into a lazy bytestring or a fingertree of strict bytestrings,
> > then, for each strict bytestring chunk in parallel, scan it for the
> > first newline and start an iteratee-based Parsec parser from that
> > point. I use the iteratee-based Parsec parsers so that when I glue the
> > partial parses together I can feed the unparsed data on the left side
> > of the first newline in each chunk to the parser I'm joining on the
> > left. I provide a monoid for gluing these partial parses together,
> > which encapsulates this behavior.
>
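For concreteness, here is a rough, untested sketch of that chunk-and-glue
scheme, with B.split '\n' standing in for the iteratee-based Parsec parsers
from the slides, today's parallel/deepseq spellings (rdeepseq for the old
rnf strategy), and made-up names throughout:

    import qualified Data.ByteString.Char8      as B
    import qualified Data.ByteString.Lazy.Char8 as BL
    import Control.DeepSeq             (NFData (..))
    import Control.Parallel.Strategies (parMap, rdeepseq)

    -- A partial parse of one chunk: either the chunk held no newline at
    -- all, or it splits into the unparsed prefix before the first
    -- newline, the complete records, and the unparsed trailing suffix.
    data Partial
      = NoNewline !B.ByteString
      | Chunk !B.ByteString [B.ByteString] !B.ByteString

    instance NFData Partial where
      rnf (NoNewline b)  = rnf b
      rnf (Chunk p rs s) = rnf p `seq` rnf rs `seq` rnf s

    -- The gluing monoid: the left side's dangling suffix and the right
    -- side's dangling prefix meet at the right side's first newline,
    -- and so form one complete record between them.
    instance Semigroup Partial where
      NoNewline a     <> NoNewline b     = NoNewline (a `B.append` b)
      NoNewline a     <> Chunk p rs s    = Chunk (a `B.append` p) rs s
      Chunk p rs s    <> NoNewline b     = Chunk p rs (s `B.append` b)
      Chunk p1 rs1 s1 <> Chunk p2 rs2 s2 =
        Chunk p1 (rs1 ++ (s1 `B.append` p2) : rs2) s2

    instance Monoid Partial where
      mempty = NoNewline B.empty

    -- Scan one strict chunk for its newlines; the real version runs an
    -- iteratee-based Parsec parser over the interior instead.
    parseChunk :: B.ByteString -> Partial
    parseChunk b = case B.elemIndex '\n' b of
      Nothing -> NoNewline b
      Just _  -> let ps = B.split '\n' b        -- at least two pieces
                 in Chunk (head ps) (init (tail ps)) (last ps)

    -- Parse every chunk of a lazy bytestring in parallel, then glue.
    parRecords :: BL.ByteString -> [B.ByteString]
    parRecords lbs =
      case mconcat (parMap rdeepseq parseChunk (BL.toChunks lbs)) of
        NoNewline b  -> [b | not (B.null b)]
        Chunk p rs s -> p : rs ++ [s | not (B.null s)]

Because parseChunk never looks past its own chunk, every spark is
independent; it is the monoid's (<>) that repairs the records straddling
chunk boundaries.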
> You can get quite a long way by using bytestring-mmap and strict
> bytestrings. The first ensures your IO overhead will be low, and the
> second ensures you don't migrate work needlessly.
>
>
> I'm actually using bytestring-mmap in a toy compiler that uses these
> techniques, though at present I use System.IO.Posix.MMap.Lazy rather
> than the strict version.
>
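For reference, the strict and lazy entry points of bytestring-mmap have the
same shape; a tiny usage fragment, with a made-up file name:

    import qualified Data.ByteString           as B
    import qualified Data.ByteString.Lazy      as BL
    import System.IO.Posix.MMap               (unsafeMMapFile)
    import qualified System.IO.Posix.MMap.Lazy as LazyMMap

    main :: IO ()
    main = do
      whole   <- unsafeMMapFile "input.dat"           -- one strict, zero-copy bytestring
      chunked <- LazyMMap.unsafeMMapFile "input.dat"  -- lazy bytestring of mapped chunks
      print (B.length whole, BL.length chunked)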
> As for migrating work, maybe I'm not understanding your point, but my whole
> goal was to migrate the chunking and parsing out to other cores, so
> parMap rwhnf'ing the monoidal reduction over the chunks seems to be exactly
> what I want. I could perhaps get better results by assigning contiguous
> ranges of the input to the same thread, though, and by choosing more
> intelligently which core merges chunks, perhaps by mixing pseq and par to
> group runs onto the same core where possible.
>
Oh, I just find strict structures easier when determining in which
thread work is happening. Appropriate use of rnf and friends is also
fine.
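To spell that out, a minimal fragment in today's parallel API, where rseq
and rdeepseq are the current names for the old rwhnf and rnf strategies:

    import Control.DeepSeq             (NFData)
    import Control.Parallel.Strategies (parMap, rdeepseq, rseq)

    -- rseq (formerly rwhnf) forces each spark's result only to its
    -- outermost constructor, so for a structured result most of the
    -- work can leak back to whichever thread later consumes it.
    shallow :: (a -> [b]) -> [a] -> [[b]]
    shallow = parMap rseq

    -- rdeepseq (formerly rnf) forces the whole result inside the
    -- spark, so the parsing work stays on the core that ran it.
    deep :: NFData b => (a -> [b]) -> [a] -> [[b]]
    deep = parMap rdeepseq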
-- Don