[Haskell-cafe] Measuring memory usage

Vlatko Basic vlatko.basic at gmail.com
Fri Jun 29 14:14:18 UTC 2018


Indeed, the bang solves the issue. I didn't try it because the docs say the 
value doesn't have to be forced for validateFunc (which is used for value), but 
obviously it is forced only to WHNF.

Thanks. :-)
I've been wasting the whole morning on this.


-------- Original Message  --------
Subject: Re: [Haskell-cafe] Measuring memory usage
From: Claude Heiland-Allen <claude at mathr.co.uk>
To: haskell-cafe at haskell.org
Date: 29/06/18 15:37

> Hi Vlatko,
> 
> On 29/06/18 13:31, Vlatko Basic wrote:
>>
>> Hello,
>>
>> I've come to some strange results using Weigh package.
>>
>> It shows that HashMap inside 'data' is using much, much more memory.
>>
> This seems to be a strictness issue - you may be measuring the size of a thunk 
> instead of the resulting evaluated data.
> 
> To confirm that this is the case, you can replace:
> 
> data MapData k v = MapData (HashMap k v) deriving Generic
> 
> with
> 
> data MapData k v = MapData !(HashMap k v) deriving Generic
> 
> Or replace:
> 
>    value "MapData"       (MapData $ mkHMList full)
> 
> with
> 
>    value "MapData"       (MapData $! mkHMList full)
> 
> Either of these changes gave me results like this:
> 
> Case           Allocated  GCs
> HashMap          262,824    0
> HashMap half      58,536    0
> HashMap third     17,064    0
> MapData          263,416    0
> 
> The real issue seems to be NFData not doing what you expect. I'm not sure what 
> the generic NFData instance is supposed to do, as there is no instance Generic 
> (HashMap k v), so maybe you need to write your own rnf if you don't like either 
> of the above workarounds.
> 
> Claude
>>
>> The strange thing is that I'm seeing unexpectedly high memory usage in my app 
>> as well (several "MapData"-like values in records), and I'm trying to figure 
>> out with 'weigh' what's holding on to the memory.
>>
>> I noticed that when I change the code to use HashMap directly (not inside 
>> 'data', that's the only change), the memory usage observed with top drops by 
>> ~60M, from 850M to 790M.
>>
>>
>> These are the test results for 10K, 5K and 3.3K items for "data MapData k v = 
>> MapData (HashMap k v)" (at the end is the full runnable example.)
>>
>> Case           Allocated  GCs
>> HashMap          262,824    0
>> HashMap half      58,536    0
>> HashMap third     17,064    0
>> MapData        4,242,208    4
>>
>> I tested by changing the order, disabling all but one etc., and the results 
>> were the same. Same 'weigh' behaviour with IntMap and Map.
>>
>>
>> So, if anyone knows and has some experience with such issues, my questions are:
>>
>> 1. Is 'weigh' package reliable/usable, at least to some extent? (the results 
>> do show diff between full, half and third)
>>
>> 2. How do you measure the memory consumption of your large data/records?
>>
>> 3. If the results are even approximately valid, what could cause such large 
>> discrepancies with 'data'?
>>
>> 4. Is there a way to see if some record has been freed from memory, GCed?
>>
>>
>>
>> module Main where
>>
>> import Prelude
>>
>> import Control.DeepSeq     (NFData)
>> import Data.HashMap.Strict (HashMap, fromList)
>> import GHC.Generics        (Generic)
>> import Weigh               (mainWith, value)
>>
>>
>> data MapData k v = MapData (HashMap k v) deriving Generic
>> instance (NFData k, NFData v) => NFData (MapData k v)
>>
>> full, half, third :: Int
>> full  = 10000
>> half  =  5000
>> third =  3333
>>
>> main :: IO ()
>> main = mainWith $ do
>>   value "HashMap"       (          mkHMList full)
>>   value "HashMap half"  (          mkHMList half)
>>   value "HashMap third" (          mkHMList third)
>>   value "MapData"       (MapData $ mkHMList full)
>>
>> mkHMList :: Int -> HashMap Int String
>> mkHMList n = fromList . zip [1..n] $ replicate n "some text"
>>
>>
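
The hand-written rnf mentioned in the quoted reply could look roughly like the 
sketch below. It is an assumption-laden illustration, not a tested fix: it uses 
Data.Map.Strict from containers as a stand-in for HashMap so that it depends 
only on packages shipped with GHC, and mkMap is a hypothetical counterpart of 
mkHMList. Whether an explicit instance changes the numbers reported by weigh is 
not something this thread establishes.

```haskell
module Main where

import Control.DeepSeq (NFData (rnf), deepseq)
import qualified Data.Map.Strict as Map

-- Stand-in for the MapData type above, wrapping Map instead of HashMap.
data MapData k v = MapData (Map.Map k v)

-- Explicit instance: reducing a MapData to normal form means
-- reducing the wrapped map (keys and values) to normal form.
instance (NFData k, NFData v) => NFData (MapData k v) where
  rnf (MapData m) = rnf m

-- Hypothetical counterpart of mkHMList.
mkMap :: Int -> Map.Map Int String
mkMap n = Map.fromList (zip [1 .. n] (replicate n "some text"))

main :: IO ()
main = do
  let d = MapData (mkMap 10000)
  -- deepseq uses the instance above to evaluate d completely
  -- before the message is printed.
  d `deepseq` putStrLn "MapData fully evaluated"
```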

