getting a grip on memory usage

Hal Daume III hdaume@ISI.EDU
Wed, 22 May 2002 10:48:37 -0700 (PDT)


so i have a function that is eating *tons* of memory.  my application is
clustering and i put everything in UArrays or Arrays.  for clustering 5
data points with 4 features (a tiny tiny set) the program gobbles
350 MB!  most of the usage is coming from this distance function:

dist dat Nothing pq@(p,q) =
    uncurry loop (bounds xp) 0
      (\a i -> a + sqr (xp!i - xq!i))
    where Vector xp = dat !!! p
          Vector xq = dat !!! q
dist dat (Just (Vector w)) pq@(p,q) =
    uncurry loop (bounds xp) 0
      (\a i -> a + sqr (w!i) * sqr (xp!i - xq!i))
    where Vector xp = dat !!! p
          Vector xq = dat !!! q

Where the relevant definitions are:

newtype Vector = Vector (UArray Int Double)
(!!!) = (Array.!)

and loop is:

loop :: (Num i, Ord i, Ix i) => i -> i -> a -> (a -> i -> a) -> a
loop low high a f = loop' a low
    where loop' a pos | pos > high = a
                      | otherwise  = loop' (f a pos) (pos+1)

i cannot figure out why such a function would be eating so much memory.  i
even tried changing the function provided to loop with "a `seq` ..." to
make sure we're not creating a huge thunk, but that didn't help at all.
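
for comparison, here is a minimal sketch of forcing the accumulator inside
the loop itself rather than in the function handed to it.  loopStrict and
sumSqDiff are hypothetical names used only for illustration, the Ix
constraint is dropped because the sketch never needs it, and nothing here
establishes that this changes the memory behaviour described above.

```haskell
import Data.Array.Unboxed (UArray, bounds, listArray, (!))

-- Hypothetical strict variant of loop: each new accumulator value is
-- forced with seq before the recursive call, so the loop itself does not
-- build a chain of suspended (f acc pos) applications.
loopStrict :: (Num i, Ord i) => i -> i -> a -> (a -> i -> a) -> a
loopStrict low high a0 f = go a0 low
  where
    go acc pos
      | pos > high = acc
      | otherwise  = let acc' = f acc pos
                     in acc' `seq` go acc' (pos + 1)

-- Squared Euclidean distance between two unboxed vectors.
-- Assumption: both arrays share the same bounds.
sumSqDiff :: UArray Int Double -> UArray Int Double -> Double
sumSqDiff xp xq =
  uncurry loopStrict (bounds xp) 0 (\a i -> a + sqr (xp ! i - xq ! i))
  where sqr x = x * x

main :: IO ()
main = do
  let xp = listArray (0, 3) [1, 2, 3, 4] :: UArray Int Double
      xq = listArray (0, 3) [0, 0, 0, 0] :: UArray Int Double
  print (sumSqDiff xp xq)  -- prints 30.0
```

seq only evaluates to weak head normal form, which is enough for a Double
accumulator but would not fully force a lazy structure such as a tuple.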

but, according to my profiling:

                                              individual     inherited
      COST CENTRE        MODULE      entries  %time %alloc   %time %alloc
      dist               Main        380000   27.7  23.6     64.9  61.0
       sqr               Main       2280000    1.6   7.9      1.6   7.9
       !!!               Main        760000    4.7   0.0      4.7   0.0
       loop              Loops       380000   30.8  29.5     30.8  29.5

i realize dist is being entered a lot of times, and i can understand that
it would use a lot of time, but i don't understand why it's using so much
space.

any help would be appreciated...

 - Hal

--
Hal Daume III

 "Computer science is no more about computers    | hdaume@isi.edu
  than astronomy is about telescopes." -Dijkstra | www.isi.edu/~hdaume