[Haskell-cafe] Measuring memory usage
Claude Heiland-Allen
claude at mathr.co.uk
Fri Jun 29 18:34:01 UTC 2018
On 2018-06-29 15:14, Vlatko Basic wrote:
> Indeed bang solves the issue. I didn't try it because the docs say the
> value doesn't have to be forced for validateFunc (which is used for
> value), but obviously it is only forced to WHNF.
I think the issue is something to do with the two default
implementations for rnf in the NFData class. Historically,
`rnf a = seq a ()` was the default implementation (i.e. just WHNF), but
more recently there is a Generic-based version that should automatically
reduce to normal form. I don't know why the Generic version is either
1. not used at all, or 2. not working properly, but I suspect the lack of
an instance Generic (HashMap k v), or possibly instance Generic1/2 MapData
(if those exist?), may have something to do with it. I don't know why
there is no instance, but maybe it would allow breaking internal data
structure invariants?
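If you want to rule out both default implementations, a hand-written rnf could look roughly like the sketch below. It is only an illustration under some assumptions: it drops the Generic deriving and delegates to the NFData instance that unordered-containers provides for HashMap. MapData and mkHMList are taken from your example; the main function is just a hypothetical harness, and whether this changes the numbers weigh reports is not verified here.

```haskell
module Main where

import Control.DeepSeq (NFData (rnf), deepseq)
import Data.HashMap.Strict (HashMap)
import qualified Data.HashMap.Strict as HM

-- Same wrapper as in the original example, but without deriving Generic.
data MapData k v = MapData (HashMap k v)

-- Hand-written instance: force the wrapped HashMap through its own
-- NFData instance instead of relying on a default implementation of rnf.
instance (NFData k, NFData v) => NFData (MapData k v) where
  rnf (MapData m) = rnf m

-- Builder copied from the original example.
mkHMList :: Int -> HashMap Int String
mkHMList n = HM.fromList . zip [1..n] $ replicate n "some text"

-- Hypothetical harness, not part of the weigh benchmark.
main :: IO ()
main = do
  let wrapped = MapData (mkHMList 10000)
  -- deepseq evaluates the whole structure using the instance above.
  wrapped `deepseq` putStrLn "forced to normal form"
  case wrapped of
    MapData m -> print (HM.size m)  -- prints 10000
```

The same instance can be dropped into the weigh example in place of the empty `instance ... => NFData (MapData k v)` declaration.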
Claude
>
> Thanks. :-)
> Been wasting whole morning on this.
>
>
> -------- Original Message --------
> Subject: Re: [Haskell-cafe] Measuring memory usage
> From: Claude Heiland-Allen <claude at mathr.co.uk>
> To: haskell-cafe at haskell.org
> Date: 29/06/18 15:37
>
>> Hi Vlatko,
>>
>> On 29/06/18 13:31, Vlatko Basic wrote:
>>>
>>> Hello,
>>>
>>> I've come across some strange results using the Weigh package.
>>>
>>> It shows that HashMap inside 'data' is using much, much more memory.
>>>
>> This seems to be a strictness issue - you may be measuring the size of
>> a thunk instead of the resulting evaluated data.
>>
>> To confirm that this is the case, you can replace:
>>
>> data MapData k v = MapData (HashMap k v) deriving Generic
>>
>> with
>>
>> data MapData k v = MapData !(HashMap k v) deriving Generic
>>
>> Or replace:
>>
>> value "MapData" (MapData $ mkHMList full)
>>
>> with
>>
>> value "MapData" (MapData $! mkHMList full)
>>
>> Either of these changes gave me results like this:
>>
>> Case           Allocated  GCs
>> HashMap          262,824    0
>> HashMap half      58,536    0
>> HashMap third     17,064    0
>> MapData          263,416    0
>>
>> The real issue seems to be NFData not doing what you expect. I'm not
>> sure what the generic NFData instance is supposed to do, as there is
>> no instance Generic (HashMap k v), so maybe you need to write your own
>> rnf if you don't like either of the above workarounds.
>>
>> Claude
>>>
>>> The strange thing is that I'm seeing too much memory usage in my app as
>>> well (several "MapData"-like fields in records), and I'm trying to
>>> figure out with 'weigh' what's keeping the memory.
>>>
>>> I noticed that when I change the code to use HashMap directly (not
>>> inside 'data', that's the only change), the memory usage observed with
>>> top drops by ~60M, from 850M to 790M.
>>>
>>>
>>> These are the test results for 10K, 5K and 3.3K items for "data
>>> MapData k v = MapData (HashMap k v)" (at the end is the full runnable
>>> example.)
>>>
>>> Case           Allocated  GCs
>>> HashMap          262,824    0
>>> HashMap half      58,536    0
>>> HashMap third     17,064    0
>>> MapData        4,242,208    4
>>>
>>> I tested by changing the order, disabling all but one etc., and the
>>> results were the same. Same 'weigh' behaviour with IntMap and Map.
>>>
>>>
>>> So, if anyone knows and has some experience with such issues, my
>>> questions are:
>>>
>>> 1. Is 'weigh' package reliable/usable, at least to some extent? (the
>>> results do show diff between full, half and third)
>>>
>>> 2. How do you measure the memory consumption of your large data/records?
>>>
>>> 3. If the results are even approximately valid, what could cause such
>>> large discrepancies with 'data'?
>>>
>>> 4. Is there a way to see if some record has been freed from memory,
>>> GCed?
>>>
>>>
>>>
>>> {-# LANGUAGE DeriveGeneric #-}
>>> module Main where
>>>
>>> import Prelude
>>>
>>> import Control.DeepSeq (NFData)
>>> import Data.HashMap.Strict (HashMap, fromList)
>>> import GHC.Generics (Generic)
>>> import Weigh (mainWith, value)
>>>
>>>
>>> data MapData k v = MapData (HashMap k v) deriving Generic
>>> instance (NFData k, NFData v) => NFData (MapData k v)
>>>
>>> full, half, third :: Int
>>> full = 10000
>>> half = 5000
>>> third = 3333
>>>
>>> main :: IO ()
>>> main = mainWith $ do
>>>   value "HashMap"       (mkHMList full)
>>>   value "HashMap half"  (mkHMList half)
>>>   value "HashMap third" (mkHMList third)
>>>   value "MapData"       (MapData $ mkHMList full)
>>>
>>> mkHMList :: Int -> HashMap Int String
>>> mkHMList n = fromList . zip [1..n] $ replicate n "some text"
>>>
>>>
--
https://mathr.co.uk