[Haskell-cafe] hamming distance allocation

Arnoldo Muller arnoldomuller at gmail.com
Sun Apr 18 19:03:14 EDT 2010


Hello all:

I want to generate some hamming distance statistics about a set of strings.
As explained in another e-mail in this list, I used the following code to
call the
functions:
(exampl holds the list of strings of size w)
filter (\x -> x /= 0) $ map (uncurry hammingX) [(xs, ys) | xs <- exampl, ys
<- exampl]

I have two hamming functions:
-- hamming distance for variable length strings
hamming :: String -> String -> Int
hamming x y = hamming' x y 0
    where
      hamming' [] _ !c = c
      hamming' _ [] !c = c
      hamming' (x:xs) (y:ys) !c
          | x == y = hamming' xs ys c
          | otherwise = hamming' xs ys (c + 1)

-- function posted in this mailing list
hamming2 :: String -> String -> Int
hamming2 xs ys = length (filter not (zipWith (==) xs ys))

I am executing these functions millions of times and the bottleneck of my
program is in them as explained by running in profiling mode with  +RTS
-K400M -p -RTS

The costlier function is the hamming distance
COST CENTRE                    MODULE               %time %alloc

hamming                        Distances             66.6   41.9

It says that it is performing 41% of the allocations. In the case of
hamming2 the allocations go as far as 52%.  I could understand that there
are allocations in "hamming2" because we are creating pairs, but in the case
of "hamming" there should be no allocation.

How can I execute my hamming functions without allocating memory?

Best regards,

Arnoldo Muller
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20100418/c70bae09/attachment.html


More information about the Haskell-Cafe mailing list