[Haskell-cafe] How can I improve the pipes's performance with a huge file?
Tom Ellis
tom-lists-haskell-cafe-2013 at jaguarpaw.co.uk
Fri Nov 14 17:31:59 UTC 2014
On Fri, Nov 14, 2014 at 05:47:16PM +0100, Wojtek Narczyński wrote:
> On 14.11.2014 10:43, zhangjun.julian wrote:
> >emptyMap = DM.empty::(DM.Map (String,String) Int)
>
> Laziness makes your data swell.
>
> 1) Try using ByteString or Text instead of String.
> 2) Try the UNPACK pragma, AFAIR it requires -O2.
> data Key = Key {-# UNPACK #-} !ByteString {-# UNPACK #-} !ByteString
> https://hackage.haskell.org/package/ghc-datasize - this package
> will help you to determine the actual data size
This is certainly true, but there is a distinction to be drawn between
"swollen data" that is a few times bigger than it could be, and a space leak.
Zhangjun Julian's biggest problem is definitely the latter. There's no
reason that building a dictionary of occurrence counts and printing it out
should consume 9GB. Once the space leak is fixed, your suggestions will help
reduce memory usage further.
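To illustrate the distinction, here is a minimal sketch of one common cause
of this kind of leak, assuming the counts are accumulated lazily. That
assumption is not confirmed for the program in question, since its counting
code is not quoted here. The name countPairs and the sample input are
hypothetical; Data.Map.Strict and foldl' are from containers and base. The
key type is kept as (String,String) to match the emptyMap definition quoted
above.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as DM

-- Hypothetical helper: count how often each key pair occurs.
-- Assumption: the leak comes from lazily accumulated counts; the
-- original program may leak for a different reason.
countPairs :: [(String, String)] -> DM.Map (String, String) Int
countPairs = foldl' step DM.empty
  where
    -- The strict insertWith evaluates the combined count before storing
    -- it, so no chain of (+) thunks builds up behind each key.
    -- foldl' likewise forces the map itself at every step.
    step m k = DM.insertWith (+) k 1 m

main :: IO ()
main = mapM_ print (DM.toList (countPairs sample))
  where
    -- Hypothetical sample input standing in for the parsed file.
    sample = [("a", "b"), ("c", "d"), ("a", "b")]
```

With the lazy Data.Map and a lazy fold, each count can remain an unevaluated
chain of additions until the map is printed, which is a leak rather than
merely swollen data. Switching the keys to ByteString or Text afterwards
addresses the separate, smaller problem of per-key overhead.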
Tom