<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style id="EHTipGlobalStyle">.EHTipToolTip * {background: inherit;font-family: inherit;font-size: inherit;font-size-adjust: none;font-stretch: normal;line-height: inherit;font-variant: normal;border: 0px;text-transform: inherit;color: inherit;font-style: inherit;text-decoration: inherit;margin: 0px 0px 0px 0px;padding: 0px 0px 0px 0px;float: none;display: inline;cursor: default;}
.EHTipReplacer, .EHTipKey, .EHTipAudio {cursor: pointer;}
.EHTipToolTip hr {margin: 0.4em 0;display: block;border: 1px inset;}
.EHTipTranslation {font-style: normal;}
.EHTipTranslation a {color: #000099;font-style: normal;text-decoration: none;}
.EHTipTranslation a:hover {background: inherit;color: #000000;text-decoration: underline;}
</style>
</head>
<body style="background-color: rgb(255, 255, 255); color: rgb(0, 0,
0);" text="#000000" bgcolor="#FFFFFF">
Hello again,<br>
<br>
I'm still trying to find some method to predict memory usage and
came upon this page: <a class="moz-txt-link-freetext" href="https://wiki.haskell.org/GHC/Memory_Footprint">https://wiki.haskell.org/GHC/Memory_Footprint</a><br>
<br>
I tried to manually calculate how much memory will the record below
consume (64-bit, 1W = 8B):<br>
<br>
<font size="-1"><tt>newtype Id = Id Text deriving (Generic,
Data, NFData) -- = 6W</tt><tt><br>
</tt><tt>newtype Uid = Uid Int deriving (Generic, Data,
NFData) -- = 2W</tt><tt><br>
</tt><tt>newtype XUid = XUid Uid deriving (Generic, Data,
NFData) -- = 2W</tt><tt><br>
</tt><tt>newtype YUid = YUid Uid deriving (Generic, Data,
NFData) -- = 2W</tt><tt><br>
</tt><tt>data Dir = Yes | No deriving (Generic,
Data) -- = 2W</tt><tt><br>
</tt><tt><br>
</tt><tt>data X = X</tt><tt><br>
</tt><tt> { a :: XUid -- = 2W -- Int</tt><tt><br>
</tt><tt> , b :: YUid -- = 2W -- Int</tt><tt><br>
</tt><tt> , c :: Id -- = 6W + 8B -- Text len 4</tt><tt><br>
</tt><tt> , d :: Either Dir Dir -- = 1W + 2W -- Either +
Dir + No/Yes</tt><tt><br>
</tt><tt> , e :: Text -- = 6W + 8B -- Text len 4</tt><tt><br>
</tt><tt> } deriving (Generic, Data) -- = 19W + 16B = 152 + 16 =
168</tt><tt> B<br>
</tt></font><br>
<p>and calculated the assumed sizes of few lists with different
number of elements:</p>
<p>Expected list sizes ([v] = (1 + 3N) words + N * sizeof(v))<br>
<font size="-1"><tt> 30: 1 + 3W * 30 + (19W * 30 + 16 * 30)
= 5,761 B</tt><tt><br>
</tt><tt> 600: 1 + 3W * 600 + (19W * 600 + 16 * 600) =
115,201 B</tt><tt><br>
</tt><tt>5000: 1 + 3W * 5000 + (19W * 5000 + 16 * 5000) =
960,001 B</tt><tt><br>
</tt></font></p>
<p>I also compared these sizes with three libs
(Data.Generics.Schemes.gsize, GHC.DataSize.recursiveSizeNF, Weigh)
and the results were:</p>
<p><font size="-1"><tt>#items </tt></font><font size="-1"><tt>recursiveSizeNF
gSize Weigh Expected Diff/</tt></font><font size="-1"><tt>recursiveSizeNF</tt><tt><br>
</tt><tt> 1: 1,416 18 696
168 -</tt><tt><br>
</tt><tt> 30: 8,008 931 20,880
5,761 28%</tt><tt><br>
</tt><tt> 600: 135,688 18,601 417,600
115,201 15%</tt><tt><br>
</tt><tt>5000: 1,121,288 155,001 3,480,000
960,001 14%</tt><tt><br>
</tt></font></p>
<p>As you can see, the results are more than surprising (to me),
with recursiveSizeNF coming closest. They all measure the same
variable.<br>
</p>
<p><br>
</p>
<p>What am I missing? <br>
</p>
<br>
<p>For completeness, here are relevant parts of code for creating
elements (with excessive forcing):</p>
<p><font size="-1"><tt> let mkX i = force X{ a = XUid $ Uid i</tt><tt><br>
</tt><tt> , b = YUid $ Uid i</tt><tt><br>
</tt><tt> , c = Id $ tshow i</tt><tt><br>
</tt><tt> , d = if even i then (Left Yes)
else (Right No)</tt><tt><br>
</tt><tt> , e = T.reverse (tshow i)</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> xs30 = force . map mkX $ take 30 $ randomRs
(1000,1030) (mkStdGen 0)</tt><tt><br>
</tt><tt> xs600 = force . map mkX $ take 600 $ randomRs
(1000,1600) (mkStdGen 0)</tt><tt><br>
</tt><tt> xs5K = force . map mkX $ take 5000 $ randomRs
(1000,5000) (mkStdGen 0)</tt></font></p>
<p><font size="-1"><tt></tt></font> dataSize <- recursiveSizeNF
$!! {a}<br>
let gSize = gsize $!! mkX 0<br>
<br>
</p>
<p><br>
</p>
<br>
<blockquote type="cite"
cite="mid:5e82210c-6aa4-a7d0-fce6-2668df343214@gmail.com"
style="border-left: 2px solid #330033 !important; border-right:
2px solid #330033 !important; padding: 0px 15px 0px 15px; margin:
8px 2px;"><!--[if !IE]><DIV style="border-left: 2px solid #330033; border-right: 2px solid #330033; padding: 0px 15px; margin: 2px 0px;"><![endif]--><span
class="headerSpan" style="color:#000000;">
<div class="moz-cite-prefix">-------- Original Message --------<br>
Subject: Measuring memory usage<br>
From: Vlatko Basic <a class="moz-txt-link-rfc2396E" href="mailto:vlatko.basic@gmail.com"><vlatko.basic@gmail.com></a><br>
To: haskell-cafe <a class="moz-txt-link-rfc2396E" href="mailto:Haskell-cafe@Haskell.org"><Haskell-cafe@Haskell.org></a><br>
Date: 29/06/18 14:31<br>
</div>
<br>
<br>
</span>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<p>Hello,</p>
<p>I've come to some strange results using Weigh package. <br>
</p>
<p>It shows that HashMap inside 'data' is using much, much more
memory. <br>
</p>
<p>The strange thing is that I'm seeing too large mem usage in my
app as well (several "MapData" like in records), and trying to
figure out with 'weigh' what's keeping the mem. <br>
</p>
<p>Noticed that when I change the code to use HashMap directly
(not inside 'data', that's the only change), the mem usage
observed with top drops down for ~60M, from 850M to 790M.<br>
</p>
<br>
<p>These are the test results for 10K, 5K and 3.3K items for "data
MapData k v = MapData (HashMap k v)" (at the end is the full
runnable example.)<br>
</p>
<p><font size="-1"><tt>Case Allocated GCs</tt><tt><br>
</tt><tt>HashMap 262,824 0</tt><tt><br>
</tt><tt>HashMap half 58,536 0</tt><tt><br>
</tt><tt>HashMap third 17,064 0</tt><tt><br>
</tt><tt>MapData 4,242,208 4</tt></font></p>
<p>I tested by changing the order, disabling all but one etc., and
the results were the same. Same 'weigh' behaviour with IntMap
and Map.<br>
</p>
<p><br>
</p>
<p>So, if anyone knows and has some experience with such issues,
my questions are:</p>
<p>1. Is 'weigh' package reliable/usable, at least to some extent?
(the results do show diff between full, half and third)</p>
<p>2. How do you measure mem consumptions of your large
data/records?</p>
<p>3. If the results are even approximately valid, what could
cause such large discrepancies with 'data'?</p>
<p>4. Is there a way to see if some record has been freed from
memory, GCed?<br>
</p>
<br>
<p><br>
</p>
<p><font size="-1"><tt>module Main where</tt><tt><br>
</tt><tt><br>
</tt><tt>import Prelude</tt><tt><br>
</tt><tt><br>
</tt><tt>import Control.DeepSeq (NFData)</tt><tt><br>
</tt><tt>import Data.HashMap.Strict (HashMap, fromList)</tt><tt><br>
</tt><tt>import GHC.Generics (Generic)</tt><tt><br>
</tt><tt>import Weigh (mainWith, value)</tt><tt><br>
</tt><tt><br>
</tt><tt><br>
</tt><tt>data MapData k v = MapData (HashMap k v) deriving
Generic</tt><tt><br>
</tt><tt>instance (NFData k, NFData v) => NFData (MapData k
v)</tt><tt><br>
</tt><tt><br>
</tt><tt>full, half, third :: Int</tt><tt><br>
</tt><tt>full = 10000</tt><tt><br>
</tt><tt>half = 5000</tt><tt><br>
</tt><tt>third = 3333</tt><tt><br>
</tt><tt><br>
</tt><tt>main :: IO ()</tt><tt><br>
</tt><tt>main = mainWith $ do</tt><tt><br>
</tt><tt> value "HashMap" ( mkHMList full)</tt><tt><br>
</tt><tt> value "HashMap half" ( mkHMList half)</tt><tt><br>
</tt><tt> value "HashMap third" ( mkHMList third)</tt><tt><br>
</tt><tt> value "MapData" (MapData $ mkHMList full)</tt><tt><br>
</tt><tt><br>
</tt><tt>mkHMList :: Int -> HashMap Int String</tt><tt><br>
</tt><tt>mkHMList n = fromList . zip [1..n] $ replicate n
"some text"</tt><tt><br>
</tt><tt><br>
</tt><tt><br>
</tt></font><br>
</p>
<!--[if !IE]></DIV><![endif]--></blockquote>
<br>
</body>
</html>