Vlatko Basic vlatko.basic at gmail.com
Wed Jul 4 11:12:37 UTC 2018

```Hello again,

I'm still trying to find some method to predict memory usage and came upon this

I tried to manually calculate how much memory will the record below consume
(64-bit, 1W = 8B):

newtype Id   = Id    Text deriving (Generic, Data, NFData) -- = 6W
newtype Uid  = Uid   Int  deriving (Generic, Data, NFData) -- = 2W
newtype XUid = XUid  Uid  deriving (Generic, Data, NFData) -- = 2W
newtype YUid = YUid  Uid  deriving (Generic, Data, NFData) -- = 2W
data    Dir  = Yes | No   deriving (Generic, Data)         -- = 2W

data X = X
{ a :: XUid                -- =  2W       -- Int
, b :: YUid                -- =  2W       -- Int
, c :: Id                  -- =  6W + 8B  -- Text len 4
, d :: Either Dir Dir      -- =  1W + 2W  -- Either + Dir + No/Yes
, e :: Text                -- =  6W + 8B  -- Text len 4
} deriving (Generic, Data) -- = 19W + 16B = 152 + 16 = 168B

and calculated the assumed sizes of few lists with different number of elements:

Expected list sizes    ([v] = (1 + 3N) words + N * sizeof(v))
30: 1 + 3W *   30 + (19W *   30 + 16 *   30) =   5,761 B
600: 1 + 3W *  600 + (19W *  600 + 16 *  600) = 115,201 B
5000: 1 + 3W * 5000 + (19W * 5000 + 16 * 5000) = 960,001 B

I also compared these sizes with three libs (Data.Generics.Schemes.gsize,
GHC.DataSize.recursiveSizeNF, Weigh) and the results were:

#items recursiveSizeNF gSize      Weigh    Expected  Diff/recursiveSizeNF
1:            1,416       18        696 168          -
30:            8,008      931     20,880 5,761        28%
600:          135,688   18,601    417,600 115,201        15%
5000:        1,121,288  155,001  3,480,000 960,001        14%

As you can see, the results are more than surprising (to me), with
recursiveSizeNF coming closest. They all measure the same variable.

What am I missing?

For completeness, here are relevant parts of code for creating elements (with
excessive forcing):

let mkX i = force X{ a = XUid \$ Uid i
, b = YUid \$ Uid i
, c = Id \$ tshow i
, d = if even i then (Left Yes) else (Right No)
, e = T.reverse (tshow i)
}
xs30  = force . map mkX \$ take   30 \$ randomRs (1000,1030) (mkStdGen 0)
xs600 = force . map mkX \$ take  600 \$ randomRs (1000,1600) (mkStdGen 0)
xs5K  = force . map mkX \$ take 5000 \$ randomRs (1000,5000) (mkStdGen 0)

dataSize <- recursiveSizeNF \$!! {a}
let gSize = gsize \$!! mkX 0

> -------- Original Message --------
> Subject: Measuring memory usage
> From: Vlatko Basic <vlatko.basic at gmail.com>
> Date: 29/06/18 14:31
>
>
> Hello,
>
> I've come to some strange results using Weigh package.
>
> It shows that HashMap inside 'data' is using much, much more memory.
>
> The strange thing is that I'm seeing too large mem usage in my app as well
> (several "MapData" like in records), and trying to figure out with 'weigh'
> what's keeping the mem.
>
> Noticed that when I change the code to use HashMap directly (not inside
> 'data', that's the only change), the mem usage observed with top drops down
> for ~60M, from 850M to 790M.
>
>
> These are the test results for 10K, 5K and 3.3K items for "data MapData k v =
> MapData (HashMap k v)" (at the end is the full runnable example.)
>
> Case           Allocated  GCs
> HashMap          262,824    0
> HashMap half      58,536    0
> HashMap third     17,064    0
> MapData        4,242,208    4
>
> I tested by changing the order, disabling all but one etc., and the results
> were the same. Same 'weigh' behaviour with IntMap and Map.
>
>
> So, if anyone knows and has some experience with such issues, my questions are:
>
> 1. Is 'weigh' package reliable/usable, at least to some extent? (the results
> do show diff between full, half and third)
>
> 2. How do you measure mem consumptions of your large data/records?
>
> 3. If the results are even approximately valid, what could cause such large
> discrepancies with 'data'?
>
> 4. Is there a way to see if some record has been freed from memory, GCed?
>
>
>
> module Main where
>
> import Prelude
>
> import Control.DeepSeq     (NFData)
> import Data.HashMap.Strict (HashMap, fromList)
> import GHC.Generics        (Generic)
> import Weigh               (mainWith, value)
>
>
> data MapData k v = MapData (HashMap k v) deriving Generic
> instance (NFData k, NFData v) => NFData (MapData k v)
>
> full, half, third :: Int
> full  = 10000
> half  =  5000
> third =  3333
>
> main :: IO ()
> main = mainWith \$ do
>   value "HashMap"       (          mkHMList full)
>   value "HashMap half"  (          mkHMList half)
>   value "HashMap third" (          mkHMList third)
>   value "MapData"       (MapData \$ mkHMList full)
>
> mkHMList :: Int -> HashMap Int String
> mkHMList n = fromList . zip [1..n] \$ replicate n "some text"
>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...