[Haskell-cafe] How on Earth Do You Reason about Space?

John Lato jwlato at gmail.com
Wed Jun 1 12:13:54 CEST 2011


>
> From: Brandon Moore <brandon_m_moore at yahoo.com>
>
>
> I was worried data sharing might mean your keys
> retain entire 64K chunks of the input. However, it
> seems enumLines depends on the StringLike ByteString
> instance, which just converts to and from String.
> That can't be efficient, but I suppose it avoids excessive sharing.


That's true for 'enumLines'; however, the OP is using 'enumLinesBS', which
operates on bytestrings directly.

Data sharing certainly could be an issue here.  I tried performing
Data.ByteString.copy before inserting the key into the map, but that used
more memory.  I don't have an explanation for this; it's not what I would
expect.
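
For anyone following along, here is a minimal sketch of what "copy the key
before inserting" can look like. It is not the OP's code: the names
'countShared' and 'countCopied' are made up for illustration, and it assumes
a containers version that provides Data.Map.Strict. The point is only that a
strict ByteString slice (for example, one produced by 'lines') keeps a
reference to the whole buffer it was cut from, while 'copy' allocates a fresh
buffer holding just the key's bytes.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M
import Data.List (foldl')

-- Hypothetical helper: count occurrences of each line.
-- Every key stored here is a slice that still references the buffer
-- of the chunk it came from, so that chunk cannot be collected.
countShared :: [B.ByteString] -> M.Map B.ByteString Int
countShared = foldl' (\m k -> M.insertWith (+) k 1 m) M.empty

-- Hypothetical helper: same counts, but a key is copied the first
-- time it is seen, so the map holds only freshly allocated keys.
-- 'M.adjust' updates the value and leaves the stored key in place,
-- so later occurrences never put an uncopied slice into the map.
countCopied :: [B.ByteString] -> M.Map B.ByteString Int
countCopied = foldl' step M.empty
  where
    step m k
      | M.member k m = M.adjust (+ 1) k m
      | otherwise    = M.insert (B.copy k) 1 m
```

Both functions return the same counts for the same input, e.g. for
'B.lines chunk' where 'chunk' is one buffer read from the file; they differ
only in which buffers the keys keep alive. The copying version does an extra
lookup and an extra allocation per new key, so whether it helps in practice
depends on how many distinct keys there are per chunk.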

The other parameter which affects sharing is the chunk size.  I got a much
better memory profile when using a chunk size of 1024 instead of 65536.

Oddly enough, when using the large chunk size I saw lower memory usage from
Data.Map, but with the small chunk size Data.HashMap has a significant
advantage.

John Lato