[Haskell-cafe] Data.Text performance problem
prokhorenkov at gmail.com
Mon Sep 13 06:26:14 EDT 2010
I really didn't expect mapAccumL to have quadratic complexity. Thank you a
lot for the fix!
On Mon, Sep 13, 2010 at 5:06 AM, Bryan O'Sullivan <bos at serpentine.com>wrote:
> On Sun, Sep 12, 2010 at 12:23 PM, Petr Prokhorenkov <
> prokhorenkov at gmail.com> wrote:
>> I experienced a following problem while dealing with some text processing.
> Thanks for the report and the test case. There's nothing wrong with your
> code - read on for details.
> You ran into one of the few functions in Data.Text that I copied straight
> over from the list implementation due to it not being used often.
> Unfortunately, that implementation constructs a new Text value (using cons)
> on every iteration through the loop, and as you might expect that's very
> slow even on tiny inputs, as it has quadratic behaviour.
> I've rewritten both strict and lazy mapAccumL and mapAccumR to use as much
> of the stream fusion machinery as possible. (By the way, there's an
> interesting fact behind why those functions started out life as they did:
> you can't write mapAccum functions using only stream machinery, due to their
> types, and the strict code is more work to write if you can't use the stream
> machinery. In the early days it just wasn't worth writing the more complex
> variants of them, as I had more pressing performance concerns at the time.)
> Where the old version of mapAccumL caused your test case to take 5 seconds
> to process an 11KB file (ouch!), with the rewritten version, your code can
> process an 81MB file in the same amount of time, without any changes to your
> code, and that O(n^2) behaviour is gone :-)
> text 0.8.1.0 is now up on hackage, with the fix included. Enjoy!
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Haskell-Cafe