getting a grip on memory usage

Simon Peyton-Jones simonpj@microsoft.com
Fri, 24 May 2002 03:12:53 -0700


You surely need a seq in loop' (not loop).
GHC can't see that it's strict, because it's overloaded.

Something like
    where loop' a pos | pos > high = a
                      | otherwise  = a1 `seq` p1 `seq` loop' a1 p1
            where
              a1 = f a pos
              p1 = pos + 1

(The seq on p1 simply avoids creating a thunk for pos+1. Even without
it, the thunk would be evaluated on the next iteration, assuming '>'
is strict, so it would not be a leak.)
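
A self-contained sketch of the suggestion above, keeping the overloaded
signature from the original message. The helper sumSquares is hypothetical
and is only there to exercise the loop. Note that seq forces the
accumulator to weak head normal form only, which is enough for a Double
or an Int but not for a lazy structure.

```haskell
import Data.Ix (Ix)

-- Same signature as the original loop; only the local worker changes.
loop :: (Num i, Ord i, Ix i) => i -> i -> a -> (a -> i -> a) -> a
loop low high a0 f = loop' a0 low
  where
    loop' a pos
      | pos > high = a
      | otherwise  = a1 `seq` p1 `seq` loop' a1 p1
      where
        a1 = f a pos   -- forced (to WHNF) before the recursive call
        p1 = pos + 1   -- forced so that no (pos+1) thunk is allocated

-- Hypothetical usage: sum of the squares of 1..n.
sumSquares :: Int -> Int
sumSquares n = loop 1 n 0 (\acc i -> acc + i * i)
```

In GHCi, sumSquares can be applied to a large n to check that the
accumulator no longer builds a chain of suspended additions.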

Or specialise loop to type Int (or whatever), which will
help the strictness analyser no end.
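
A minimal sketch of the specialised variant, assuming the index type is
always Int (as it is for UArray Int Double). The name loopInt is
hypothetical. With a concrete Int, '>' and '+' are known to be strict, so
the position needs no explicit seq. The accumulator still gets one,
because f is an unknown function and its strictness cannot be inferred.

```haskell
-- Monomorphic in the index type: no Num/Ord/Ix dictionaries are passed.
loopInt :: Int -> Int -> a -> (a -> Int -> a) -> a
loopInt low high a0 f = go a0 low
  where
    go a pos
      | pos > high = a
      | otherwise  =
          let a1 = f a pos           -- next accumulator value
          in a1 `seq` go a1 (pos + 1) -- force it before recursing
```

An alternative that keeps the overloaded definition is a SPECIALISE
pragma on loop at the Int type; whether that is enough here is not
something this sketch establishes.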

I'm not certain that this will do it, but if you don't do this
you are in deep trouble.

By 'eat memory' I assume you mean residency rather
than just allocation.

Simon

| -----Original Message-----
| From: Hal Daume III [mailto:hdaume@ISI.EDU]
| Sent: 22 May 2002 18:49
| To: GHC Users Mailing List
| Subject: getting a grip on memory usage
|
|
| so i have a function that is eating *tons* of memory.  my application is
| clustering and i put everything in UArrays or Arrays.  for clustering 5
| data points with 4 features (a tiny tiny set) the program gobbles
| 350mbs!  most of the usage is coming from this distance function:
|
| dist dat Nothing pq@(p,q) =
|     (uncurry loop) (bounds xp) 0
|       (\a i -> a + (sqr (xp!i - xq!i)))
|     where Vector xp = dat !!! p
|           Vector xq = dat !!! q
| dist dat (Just (Vector w)) pq@(p,q) =
|     (uncurry loop) (bounds xp) 0
|       (\a i -> a + ((sqr (w!i)) * (sqr (xp!i - xq!i))))
|     where Vector xp = dat !!! p
|           Vector xq = dat !!! q
|
| Where the relevant definitions are:
|
| type Vector = Vector (UArray Int Double)
| (!!!) = Array.(!)
|
| and loop is:
|
| loop :: (Num i, Ord i, Ix i) => i -> i -> a -> (a -> i -> a) -> a
| loop low high a f = loop' a low
|     where loop' a pos | pos > high = a
|                       | otherwise  = loop' (f a pos) (pos+1)
|
| i cannot figure out why such a function would be eating so much memory.  i
| even tried changing the function provided to loop with "a `seq` ..." to
| make sure we're not creating a huge thunk, but that didn't help at all.
|
| but, according to my profiling:
|
|       dist               Main        380000   27.7  23.6    64.9  61.0
|        sqr               Main       2280000    1.6   7.9     1.6   7.9
|        !!!               Main        760000    4.7   0.0     4.7   0.0
|        loop              Loops       380000   30.8  29.5    30.8  29.5
|
| i realize it's being entered a lot of times, and i can understand that it
| would use a bunch of time resources, but i don't understand why it's using
| so much space.
|
| any help would be appreciated...
|
|  - Hal
|
| --
| Hal Daume III
|
|  "Computer science is no more about computers    | hdaume@isi.edu
|   than astronomy is about telescopes." -Dijkstra | www.isi.edu/~hdaume