[Haskell-cafe] Measuring memory usage

Claude Heiland-Allen claude at mathr.co.uk
Fri Jun 29 13:37:45 UTC 2018


Hi Vlatko,

On 29/06/18 13:31, Vlatko Basic wrote:
>
> Hello,
>
> I've come to some strange results using Weigh package.
>
> It shows that HashMap inside 'data' is using much, much more memory.
>
This seems to be astrictness issue - you may be measuring the size of a 
thunk instead of the resulting evaluated data.

To confirm that this is the case, you can replace:

data MapData k v = MapData (HashMap k v) deriving Generic

with

data MapData k v = MapData !(HashMap k v) deriving Generic

Or replace:

   value "MapData"       (MapData $ mkHMList full)

with

   value "MapData"       (MapData $! mkHMList full)

Either of these changes gave me results like this:

Case           Allocated  GCs
HashMap          262,824    0
HashMap half      58,536    0
HashMap third     17,064    0
MapData          263,416    0

The real issue seems to be NFData not doing what you expect. I'm not 
sure what the generic NFData instance is supposed to do, as there is no 
instance Generic (HashMap k v), so maybe you need to write your own rnf 
if you don't like either of the above workarounds.

Claude
>
> The strange thing is that I'm seeing too large mem usage in my app as 
> well (several "MapData" like in records), and trying to figure out 
> with 'weigh' what's keeping the mem.
>
> Noticed that when I change the code to use HashMap directly (not 
> inside 'data', that's the only change), the mem usage observed with 
> top drops down for ~60M, from 850M to 790M.
>
>
> These are the test results for 10K, 5K and 3.3K items for "data 
> MapData k v = MapData (HashMap k v)" (at the end is the full runnable 
> example.)
>
> Case           Allocated  GCs
> HashMap          262,824    0
> HashMap half      58,536    0
> HashMap third     17,064    0
> MapData        4,242,208    4
>
> I tested by changing the order, disabling all but one etc., and the 
> results were the same. Same 'weigh' behaviour with IntMap and Map.
>
>
> So, if anyone knows and has some experience with such issues, my 
> questions are:
>
> 1. Is 'weigh' package reliable/usable, at least to some extent? (the 
> results do show diff between full, half and third)
>
> 2. How do you measure mem consumptions of your large data/records?
>
> 3. If the results are even approximately valid, what could cause such 
> large discrepancies with 'data'?
>
> 4. Is there a way to see if some record has been freed from memory, GCed?
>
>
>
> module Main where
>
> import Prelude
>
> import Control.DeepSeq     (NFData)
> import Data.HashMap.Strict (HashMap, fromList)
> import GHC.Generics        (Generic)
> import Weigh               (mainWith, value)
>
>
> data MapData k v = MapData (HashMap k v) deriving Generic
> instance (NFData k, NFData v) => NFData (MapData k v)
>
> full, half, third :: Int
> full  = 10000
> half  =  5000
> third =  3333
>
> main :: IO ()
> main = mainWith $ do
>   value "HashMap"       (          mkHMList full)
>   value "HashMap half"  (          mkHMList half)
>   value "HashMap third" (          mkHMList third)
>   value "MapData"       (MapData $ mkHMList full)
>
> mkHMList :: Int -> HashMap Int String
> mkHMList n = fromList . zip [1..n] $ replicate n "some text"
>
>
-- 
https://mathr.co.uk



More information about the Haskell-Cafe mailing list