[Haskell-cafe] help with Haskell performance

Tue Nov 10 22:24:59 EST 2009

On Tue, Nov 10, 2009 at 8:20 PM, Gokul P. Nair <gpnair78 at yahoo.com> wrote:

> --- On Sat, 11/7/09, Don Stewart <dons at galois.com> wrote:
> > General notes:
> >
> >  * unpack is almost always wrong.
> >  * list indexing with !! is almost always wrong.
> >  * words/lines are often wrong for parsing large files (they build large
> >    list structures).
> >  * toList/fromList probably aren't the best strategy
> >  * sortBy (comparing snd)
> >  * use insertWith'
> > Specifically, avoid constructing intermediate lists when you can process
> > the entire file in a single pass. Use O(1) bytestring substring operations
> > like take and drop.
>
> Thanks all for the valuable feedback. Switching from Regex.Posix to
> Regex.PCRE alone reduced the running time to about 6 secs and a few other
> optimizations suggested on this thread brought it down to about 5 secs ;)
>
> I then set out to profile the code out of curiosity to see where the bulk
> of the time was being spent and sure enough the culprit turned out to be
> "unpack". My question therefore is, given a list L1 of type [(ByteString,
> Int)], how do I print it out so as to eliminate the "chunk, empty" markers
> associated with a bytestring? The suggestions posted here are along the
> lines of "mapM_ print L1" but that's far from desirable especially because
> the generated output is for perusal by non-technical users etc.
>
> Thanks.
>
>
Take a look at Data.ByteString.Lazy.Char8.putStrLn.  That prints a lazy
ByteString without unpacking it, and without the internal markers.
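
As a minimal sketch of that suggestion, assuming the keys are lazy ByteStrings and that a tab-separated "word, count" line is an acceptable output format (the names formatEntry and sampleCounts are hypothetical, not from the original program), something along these lines prints each pair without calling unpack on the key:

```haskell
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical helper: render one (key, count) pair as a single line.
-- Only the short decimal rendering of the Int is packed; the key itself
-- is never converted to a String.
formatEntry :: (L.ByteString, Int) -> L.ByteString
formatEntry (key, count) =
  L.concat [key, L.pack "\t", L.pack (show count)]

-- Hypothetical sample data standing in for the list L1 from the question.
sampleCounts :: [(L.ByteString, Int)]
sampleCounts = [(L.pack "haskell", 42), (L.pack "cafe", 7)]

main :: IO ()
main = mapM_ (L.putStrLn . formatEntry) sampleCounts
```

Because the output goes through putStrLn rather than print, no Show instance is involved, so neither the "Chunk ... Empty" constructors nor surrounding quotes appear in what the user sees.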

Sincerely,
Brad