[Haskell-cafe] help with Haskell performance
brad.larsen at gmail.com
Tue Nov 10 22:24:59 EST 2009
On Tue, Nov 10, 2009 at 8:20 PM, Gokul P. Nair <gpnair78 at yahoo.com> wrote:
> --- On Sat, 11/7/09, Don Stewart <dons at galois.com> wrote:
> > General notes:
> > * unpack is almost always wrong.
> > * list indexing with !! is almost always wrong.
> > * words/lines are often wrong for parsing large files (they build large
> > list structures).
> > * toList/fromList probably aren't the best strategy
> > * sortBy (comparing snd)
> > * use insertWith'
> > Specifically, avoid constructing intermediate lists when you can process the
> > entire file in a single pass. Use O(1) bytestring substring operations
> > take and drop.
> Thanks all for the valuable feedback. Switching from Regex.Posix to
> Regex.PCRE alone reduced the running time to about 6 secs and a few other
> optimizations suggested on this thread brought it down to about 5 secs ;)
> I then set out to profile the code out of curiosity to see where the bulk
> of the time was being spent and sure enough the culprit turned out to be
> "unpack". My question therefore is, given a list L1 of type [(ByteString,
> Int)], how do I print it out so as to eliminate the "chunk, empty" markers
> associated with a bytestring? The suggestions posted here are along the
> lines of "mapM_ print L1" but that's far from desirable especially because
> the generated output is for perusal by non-technical users etc.
Take a look at Data.ByteString.Lazy.Char8.putStrLn. That prints a lazy
ByteString without unpacking it, and without the internal markers.
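
As a minimal sketch of what that could look like for your list, assuming the
ByteStrings are lazy ones from Data.ByteString.Lazy.Char8: build each output
line as a ByteString and write it directly, rather than going through print
(which uses Show, where the markers come from). The names formatRow and
printRows, the tab separator, and the sample data are made up for
illustration; adjust the layout to whatever your users expect.

```haskell
module Main (main) where

import qualified Data.ByteString.Lazy.Char8 as BL

-- | Render one (word, count) pair as a single tab-separated line.
-- The ByteString itself is never unpacked; only the small Int is
-- converted via show and packed.
formatRow :: (BL.ByteString, Int) -> BL.ByteString
formatRow (word, count) =
  BL.concat [word, BL.pack "\t", BL.pack (show count)]

-- | Write every pair on its own line using the ByteString putStrLn.
printRows :: [(BL.ByteString, Int)] -> IO ()
printRows = mapM_ (BL.putStrLn . formatRow)

main :: IO ()
main = printRows [(BL.pack "haskell", 3), (BL.pack "cafe", 1)]
```

If the list is long, another option is to join all the rows with
BL.unlines (map formatRow rows) and write the result with a single BL.putStr.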