[Haskell-cafe] How can I improve the pipes's performance with a huge file?
zhangjun.julian
zhangjun.julian at gmail.com
Fri Nov 14 22:50:26 UTC 2014
Dear Tom and others
I’m sorry, I think I made a mistake: I tested Tom’s advice in my master branch, not in the demo code.
In the master branch I have a list of files to read, so I use mapM_ to call rCount as below:
mapM_ (\(x,y) -> rCount num readhandle1 x y) handlePairList
If I change my Map to the strict version and call rCount directly (without mapM_), the memory no longer swells.
I can understand why a lazy Map causes the swelling, but I don’t know why mapM_ does.
Is mapM_ lazy too?
Is there a strict alternative I can use?
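
For reference, here is a minimal sketch of what I understand a strict version could look like. It is not my real code: countKeys and countAll are hypothetical names, and I am assuming each input can be reduced to a list of keys, which is simpler than what rCount does with pipes. The idea is to use Data.Map.Strict so each count is forced on insert, and to thread the accumulated Map through foldM with a bang pattern instead of running separate actions under mapM_.

```haskell
{-# LANGUAGE BangPatterns #-}

import Control.Monad (foldM)
import Data.List (foldl')
import qualified Data.Map.Strict as M

-- Count occurrences of each key in one input.
-- insertWith from Data.Map.Strict evaluates the new count before storing it,
-- so no chain of unevaluated (+) thunks builds up inside the Map.
countKeys :: Ord k => [k] -> M.Map k Int
countKeys = foldl' (\m k -> M.insertWith (+) k 1 m) M.empty

-- Hypothetical stand-in for looping over several inputs:
-- each action yields the keys of one input, and the Map is threaded
-- through foldM and forced (to weak head normal form) after every input.
countAll :: Ord k => [IO [k]] -> IO (M.Map k Int)
countAll = foldM step M.empty
  where
    step !acc readKeys = do
      ks <- readKeys
      let !acc' = foldl' (\m k -> M.insertWith (+) k 1 m) acc ks
      return acc'
```

Loaded in GHCi, countKeys ["a", "b", "a"] should give a Map with "a" counted twice and "b" once. Whether this removes the swelling in my real program depends on what rCount does with its result, so please treat it only as an illustration of the strict Map plus strict accumulator pattern.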
> On 15 Nov 2014, at 1:31 AM, Tom Ellis <tom-lists-haskell-cafe-2013 at jaguarpaw.co.uk> wrote:
>
> On Fri, Nov 14, 2014 at 05:47:16PM +0100, Wojtek Narczyński wrote:
>> On 14.11.2014 10:43, zhangjun.julian wrote:
>>> emptyMap = DM.empty::(DM.Map (String,String) Int)
>>
>> Laziness makes your data swell.
>>
>> 1) Try using ByteString or Text instead of String.
>> 2) Try the UNPACK pragma, AFAIR it requires -O2.
>> data Key = Key {-# UNPACK #-} !ByteString {-# UNPACK #-} !ByteString
>> https://hackage.haskell.org/package/ghc-datasize - this package
>> will help you to determine the actual data size
>
> This is certainly true, but there is a distinction to be drawn between
> "swollen data" that is a few times bigger than it could be, and a space leak.
>
> Zhangjun Julian's biggest problem is definitely the latter. There's no
> reason that compiling a dictionary counting occurrences and printing it out
> should consume 9GB. Once the space leak is fixed your suggestions will help
> reduce memory usage further.
>
> Tom