[Haskell-cafe] hamming distance allocation
daniel.is.fischer at web.de
Mon Apr 19 13:31:35 EDT 2010
On Monday 19 April 2010 17:17:11, Arnoldo Muller wrote:
> The strings will not be longer than 30 characters.
For 20-30 character strings, ByteStrings should be better: in my tests
they were about 40% faster, with slightly lower allocation figures, much
lower resident memory, and far fewer bytes copied during GC.
For a sample of English text (many short words, few long ones), ByteStrings
were about 25% faster, with very slightly lower allocation figures, much
lower resident memory, and far fewer bytes copied (though the relative
difference was not as large as for the longer strings).
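
For illustration, a minimal sketch of a Hamming distance over strict
ByteStrings. This is not the code used for the timings above; the names
hammingBS and hammingPacked are made up for this example, and it assumes
both inputs have the same length (any excess bytes are ignored).

```haskell
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- | Hamming distance between two strict ByteStrings.
-- Assumption: equal lengths. If they differ, only the common prefix
-- is compared and the excess bytes are ignored.
hammingBS :: B.ByteString -> B.ByteString -> Int
hammingBS xs ys = go 0 0
  where
    n = min (B.length xs) (B.length ys)
    -- Strict index and accumulator, so no chain of thunks builds up.
    go :: Int -> Int -> Int
    go !i !acc
      | i >= n                       = acc
      | B.index xs i /= B.index ys i = go (i + 1) (acc + 1)
      | otherwise                    = go (i + 1) acc

-- | Convenience wrapper for trying it out on String literals.
-- BC.pack truncates each Char to eight bits.
hammingPacked :: String -> String -> Int
hammingPacked a b = hammingBS (BC.pack a) (BC.pack b)
```

In a real program the strings would be packed once, when they are read,
and not on every comparison as hammingPacked does.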
> I am doing sets of 2000 (total of 2000^2 distance computations)
That shouldn't give memory problems either way.
> I am expecting that all the operations will be lazily performed but at
> some point I get a memory error.
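My guess is a bad consumption pattern: something holds on to the results
(or to unevaluated thunks) instead of consuming them as they are produced.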
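
Without seeing the code this is only a guess, but as a sketch of what a
good consumption pattern could look like: generate the distances lazily
and fold them with a strict accumulator, so each one can be collected as
soon as it has been used. The functions hamming and totalDistance are
hypothetical stand-ins, not the original poster's code.

```haskell
import Data.List (foldl')

-- | Plain String Hamming distance (hypothetical stand-in).
-- zipWith stops at the shorter string.
hamming :: String -> String -> Int
hamming xs ys = length (filter id (zipWith (/=) xs ys))

-- | Sum of all pairwise distances.
-- The list of distances is produced lazily and consumed immediately by
-- foldl', whose accumulator is forced at every step, so neither the
-- full list of n^2 results nor a chain of (+) thunks is kept in memory.
totalDistance :: [String] -> Int
totalDistance ws = foldl' (+) 0 [hamming a b | a <- ws, b <- ws]
```

Replacing foldl' with the lazy foldl, or keeping a second reference to
the list of distances, is the kind of change that can make resident
memory grow.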
> Most of the memory is being allocated for the hamming distance and I am
> still unable to find the source of my memory leak.
Allocation as such is not a problem; resident memory is the important
thing. Try heap profiling to see what holds on to memory (+RTS -hc would
be a good start).
> On Mon, Apr 19, 2010 at 3:47 PM, Daniel Fischer
> <daniel.is.fischer at web.de> wrote:
> > On Monday 19 April 2010 14:37:33, John Lato wrote:
> > > Is it really necessary to use Strings? I think a packed type, e.g.
> > > Vector or ByteString, would be much more efficient here.
> > Not very much if the strings are fairly short (and the list isn't too
> > long, so there's not a big difference in cache-friendliness).
> > If eight-bit characters aren't enough, packing the strings into
> > UArray Int Char gives performance quite close to ByteStrings.
> > > Of course this is only likely to be a benefit if you can move away
> > > from String entirely.
> > >
> > > I suspect that "hamming2" would perform better then.
> > >
> > > John
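
The UArray Int Char packing mentioned in the quoted message could look
roughly like the following. It is only a sketch under the same
equal-length assumption; packU and hammingU are names invented for this
example.

```haskell
import Data.Array.Unboxed (UArray, bounds, listArray, (!))

-- | Pack a String into an unboxed, zero-based array of Char.
-- Unlike ByteString, this keeps full Unicode code points.
packU :: String -> UArray Int Char
packU s = listArray (0, length s - 1) s

-- | Hamming distance between two packed strings.
-- Assumption: both arrays come from packU (lower bound 0). If the
-- lengths differ, only the common prefix is compared.
hammingU :: UArray Int Char -> UArray Int Char -> Int
hammingU xs ys = length [() | i <- [0 .. n], xs ! i /= ys ! i]
  where
    -- Last index valid in both arrays; -1 when either one is empty.
    n = min (snd (bounds xs)) (snd (bounds ys))
```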