[Haskell-cafe] How can I improve the pipes's performance with a huge file?

zhangjun.julian zhangjun.julian at gmail.com
Fri Nov 14 22:50:26 UTC 2014


Dear Tom and others

I’m sorry.
I think I made a mistake: I tested Tom’s advice in my master branch, not in the demo code.

In the master branch I have a list of files to read, so I use mapM_ to call rCount as below:

mapM_ (\(x,y) -> rCount num readhandle1 x y) handlePairList

If I change my Map to Strict and call rCount directly (without mapM_), the memory does not swell.

I can understand why a lazy Map causes the swelling, but I don’t know why mapM_ causes it.
Is mapM_ lazy too?
Is there a strict alternative I can use?
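
For reference, here is a minimal sketch of one possible strict shape for this kind of counting loop. It is not the code from my master branch: `bump` and `countPairsIO` are hypothetical names standing in for what rCount does, and the input is a plain list of key pairs rather than file handles. The assumption is that the counts are threaded through foldM as an explicit accumulator in a Data.Map.Strict map, instead of calling mapM_ and discarding each result.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Control.Monad (foldM)
import qualified Data.Map.Strict as M

type Counts = M.Map (String, String) Int

-- Hypothetical stand-in for one counting step of rCount.
-- Data.Map.Strict.insertWith evaluates the combined value to WHNF,
-- so no chain of unevaluated (+) applications builds up per key.
bump :: Counts -> (String, String) -> Counts
bump m k = M.insertWith (+) k 1 m

-- foldM threads the accumulator through IO. The bang pattern forces
-- the map from the previous step before the next step runs.
countPairsIO :: [(String, String)] -> IO Counts
countPairsIO = foldM step M.empty
  where
    step !acc k = pure (bump acc k)

main :: IO ()
main = do
  counts <- countPairsIO [("a", "b"), ("c", "d"), ("a", "b")]
  print (M.toList counts)
  -- prints [(("a","b"),2),(("c","d"),1)]
```

Whether this removes the swelling in the real program depends on what rCount returns and how its result is used, which the demo above does not model.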


> On Nov 15, 2014, at 1:31 AM, Tom Ellis <tom-lists-haskell-cafe-2013 at jaguarpaw.co.uk> wrote:
> 
> On Fri, Nov 14, 2014 at 05:47:16PM +0100, Wojtek Narczyński wrote:
>> On 14.11.2014 10:43, zhangjun.julian wrote:
>>> emptyMap = DM.empty::(DM.Map (String,String) Int)
>> 
>> Laziness makes your data swell.
>> 
>> 1) Try using ByteString or Text instead of String.
>> 2) Try the UNPACK pragma, AFAIR it requires -O2.
>>    data Key = Key {-# UNPACK #-} !ByteString   {-# UNPACK #-} !ByteString
>>    https://hackage.haskell.org/package/ghc-datasize - this package
>> will help you to determine the actual data size
> 
> This is certainly true, but there is a distinction to be drawn between
> "swollen data" that is a few times bigger than it could be, and a space leak. 
> 
> Zhangjun Julian's biggest problem is definitely the latter.  There's no
> reason that compiling a dictionary counting occurrences and printing it out
> should consume 9GB.  Once the space leak is fixed your suggestions will help
> reduce memory usage further.
> 
> Tom
