[Haskell-cafe] Measuring memory usage

Claude Heiland-Allen claude at mathr.co.uk
Fri Jun 29 18:34:01 UTC 2018


On 2018-06-29 15:14, Vlatko Basic wrote:
> Indeed, the bang solves the issue. I didn't try it because the docs say
> the value doesn't have to be forced for validateFunc (which is used for
> value), but it is obviously forced only to WHNF.

I think the issue has something to do with the two default
implementations of rnf in the NFData class.  Historically, `rnf a = seq
a ()` was the default implementation (i.e. just WHNF), but more recently
there is a Generic-based version that should automatically reduce to
normal form.  I don't know why the Generic version is either (1) not used
at all or (2) not working properly, but I suspect that the lack of an
instance Generic (HashMap k v), or possibly of an instance Generic1/2
MapData (if those are things?), may have something to do with it.  I
don't know why there is no such instance, but maybe it would allow
breaking internal data structure invariants?
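
If neither workaround appeals, a hand-written rnf is one way to sidestep
the question of which default gets picked.  The following is only a
minimal sketch, assuming the NFData instance for HashMap that
unordered-containers provides; the main function is just an illustrative
check and is not taken from the original example.

```haskell
module Main where

import Control.DeepSeq     (NFData (..), force)
import Control.Exception   (evaluate)
import Data.HashMap.Strict (HashMap, fromList, size)

-- Same shape as the type in the original example, with a lazy field.
data MapData k v = MapData (HashMap k v)

-- Hand-written instance: delegate to the NFData instance for HashMap,
-- so forcing a MapData also forces the map stored inside it.
instance (NFData k, NFData v) => NFData (MapData k v) where
  rnf (MapData m) = rnf m

-- For comparison, the historical default was
--   rnf a = a `seq` ()
-- which stops at the MapData constructor and leaves the field a thunk.

main :: IO ()
main = do
  let m = fromList (zip [1 .. 10000 :: Int] (replicate 10000 "some text"))
  -- force uses the rnf above, so the whole map is built here.
  MapData hm <- evaluate (force (MapData m))
  print (size hm)  -- prints 10000
```

With an instance like this, the result no longer depends on whether the
seq-based or the Generic-based default is selected.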


Claude

> 
> Thanks. :-)
> Been wasting whole morning on this.
> 
> 
> -------- Original Message  --------
> Subject: Re: [Haskell-cafe] Measuring memory usage
> From: Claude Heiland-Allen <claude at mathr.co.uk>
> To: haskell-cafe at haskell.org
> Date: 29/06/18 15:37
> 
>> Hi Vlatko,
>> 
>> On 29/06/18 13:31, Vlatko Basic wrote:
>>> 
>>> Hello,
>>> 
>>> I've come to some strange results using Weigh package.
>>> 
>>> It shows that HashMap inside 'data' is using much, much more memory.
>>> 
>> This seems to be a strictness issue - you may be measuring the size of 
>> a thunk instead of the resulting evaluated data.
>> 
>> To confirm that this is the case, you can replace:
>> 
>> data MapData k v = MapData (HashMap k v) deriving Generic
>> 
>> with
>> 
>> data MapData k v = MapData !(HashMap k v) deriving Generic
>> 
>> Or replace:
>> 
>>    value "MapData"       (MapData $ mkHMList full)
>> 
>> with
>> 
>>    value "MapData"       (MapData $! mkHMList full)
>> 
>> Either of these changes gave me results like this:
>> 
>> Case           Allocated  GCs
>> HashMap          262,824    0
>> HashMap half      58,536    0
>> HashMap third     17,064    0
>> MapData          263,416    0
>> 
>> The real issue seems to be NFData not doing what you expect. I'm not 
>> sure what the generic NFData instance is supposed to do, as there is 
>> no instance Generic (HashMap k v), so maybe you need to write your own 
>> rnf if you don't like either of the above workarounds.
>> 
>> Claude
>>> 
>>> The strange thing is that I'm seeing too large mem usage in my app as 
>>> well (several "MapData" like in records), and trying to figure out 
>>> with 'weigh' what's keeping the mem.
>>> 
>>> I noticed that when I change the code to use HashMap directly (not 
>>> inside 'data', that's the only change), the mem usage observed with 
>>> top drops by ~60M, from 850M to 790M.
>>> 
>>> 
>>> These are the test results for 10K, 5K and 3.3K items for "data 
>>> MapData k v = MapData (HashMap k v)" (at the end is the full runnable 
>>> example.)
>>> 
>>> Case           Allocated  GCs
>>> HashMap          262,824    0
>>> HashMap half      58,536    0
>>> HashMap third     17,064    0
>>> MapData        4,242,208    4
>>> 
>>> I tested by changing the order, disabling all but one etc., and the 
>>> results were the same. Same 'weigh' behaviour with IntMap and Map.
>>> 
>>> 
>>> So, if anyone knows and has some experience with such issues, my 
>>> questions are:
>>> 
>>> 1. Is 'weigh' package reliable/usable, at least to some extent? (the 
>>> results do show diff between full, half and third)
>>> 
>>> 2. How do you measure the mem consumption of your large data/records?
>>> 
>>> 3. If the results are even approximately valid, what could cause such 
>>> large discrepancies with 'data'?
>>> 
>>> 4. Is there a way to see if some record has been freed from memory, 
>>> GCed?
>>> 
>>> 
>>> 
>>> module Main where
>>> 
>>> import Prelude
>>> 
>>> import Control.DeepSeq     (NFData)
>>> import Data.HashMap.Strict (HashMap, fromList)
>>> import GHC.Generics        (Generic)
>>> import Weigh               (mainWith, value)
>>> 
>>> 
>>> data MapData k v = MapData (HashMap k v) deriving Generic
>>> instance (NFData k, NFData v) => NFData (MapData k v)
>>> 
>>> full, half, third :: Int
>>> full  = 10000
>>> half  =  5000
>>> third =  3333
>>> 
>>> main :: IO ()
>>> main = mainWith $ do
>>>   value "HashMap"       (          mkHMList full)
>>>   value "HashMap half"  (          mkHMList half)
>>>   value "HashMap third" (          mkHMList third)
>>>   value "MapData"       (MapData $ mkHMList full)
>>> 
>>> mkHMList :: Int -> HashMap Int String
>>> mkHMList n = fromList . zip [1..n] $ replicate n "some text"
>>> 
>>> 

-- 
https://mathr.co.uk

