<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">Begin forwarded message:</div><br class="Apple-interchange-newline"><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;" class=""><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif; color:rgba(0, 0, 0, 1.0);" class=""><b class="">From: </b></span><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif;" class="">Mahdi <<a href="mailto:mdibaiee@aol.com" class="">mdibaiee@aol.com</a>><br class=""></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;" class=""><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif; color:rgba(0, 0, 0, 1.0);" class=""><b class="">Subject: </b></span><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif;" class=""><b class="">Re: [Haskell-beginners] CAF taking all the memory</b><br class=""></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;" class=""><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif; color:rgba(0, 0, 0, 1.0);" class=""><b class="">Date: </b></span><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif;" class="">May 8, 2016 at 18:06:29 GMT+4:30<br class=""></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;" class=""><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif; color:rgba(0, 0, 0, 1.0);" class=""><b class="">To: </b></span><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif;" class="">Ben Gamari <<a href="mailto:ben@smart-cactus.org" class="">ben@smart-cactus.org</a>><br class=""></span></div><br class=""><div class=""><div class="">Wow, This is a great explanation! Thank you!<br class="">I see your point on using lazy lists (although I was not aware of the specific problems you mentioned).<br class="">So, I switched over to HMatrix [0], (the libraries you mentioned added a lot of complexity to the code)<br class="">which uses unboxed arrays for matrices.<br class=""><br class="">After refactoring my code to use HMatrix, I ran the program, and the same thing happens,<br class="">the speed hasn’t changed much either, then I tried to take another profile to see if anything has changed,<br class="">but I couldn’t as described here [1].<br class=""><br class="">But my main question is, why would Haskell even store all this data? There are no side-effects, so<br class="">there should be no memory leak in my code.<br class="">From what I read online, Haskell is trying to avoid re-calculating the functions by caching their<br class="">input/outputs (they’re pure, so this can be done), but I hoped there would be a way to disable this,<br class="">unfortunately there doesn’t seem to be a way (or there is?).<br class="">If that’s the case, then using libraries won’t solve the problem, and that’s frustrating.<br class=""><br class="">* I also tried compiling with -O0, but no luck.<br class=""><br class="">[0] <a href="http://dis.um.es/~alberto/hmatrix/hmatrixtut.html" class="">http://dis.um.es/~alberto/hmatrix/hmatrixtut.html</a><br class="">[1] <a href="https://github.com/albertoruiz/hmatrix/issues/190" class="">https://github.com/albertoruiz/hmatrix/issues/190</a><br class=""><br class="">Thank you Ben!<br class=""><br class="">- Mahdi<br class=""><br class=""><blockquote type="cite" class="">On May 3, 2016, at 14:36, Ben Gamari <<a href="mailto:ben@smart-cactus.org" class="">ben@smart-cactus.org</a>> wrote:<br class=""><br class="">Mahdi <<a href="mailto:mdibaiee@aol.com" class="">mdibaiee@aol.com</a>> writes:<br class=""><br class=""><blockquote type="cite" class="">Hello there,<br class=""><br class=""></blockquote>Hi!<br class=""><br class=""><blockquote type="cite" class="">I’m pretty new to Haskell and I’m trying to write a Neural Network in<br class="">Haskell for educational purposes.<br class=""><br class="">So, I have my neural network working, it can learn XOR in 500<br class="">iterations, but the thing is, if I increase the iterations to<br class="">something like 5 milion times, the process just drains my RAM until<br class="">either it’s killed or the OS drowns. Here is the code: [0]<br class=""><br class="">I searched online and tried to get information on why this is<br class="">happening, I profiled the memory usage and found that the memory is<br class="">taken by CAF, searching online,<br class=""><br class=""></blockquote>Great! The next question you should then ask is "which CAF"? A CAF is<br class="">simply a top-level "constant" in your program. Indeed it sounds like you<br class="">have not defined any cost-centers, which means that GHC will attribute<br class="">the entire cost of your program to the ambiguous cost-center "CAF"<br class="">(which in this case really just means "the whole program").<br class=""><br class="">As discussed in the users guide [1] one way to define cost-centers<br class="">within your program is to manually annotate expressions with SCC<br class="">pragmas. However, in this case we simply want to let GHC do this for us,<br class="">for which we can use the `-fprof-auto -fprof-cafs` flags (which<br class="">automatically annotate top-level definitions and CAFs with<br class="">cost-centers),<br class=""><br class=""> $ ghc -O examples/xor.hs -fprof-auto -fprof-cafs<br class=""><br class="">Now your program should give a much more useful profile (run having set<br class="">the iteration count to 50000),<br class=""><br class=""> $ time examples/xor +RTS -p <br class=""> [[0.99548584],[4.5138146e-3],[0.9954874],[4.513808e-3]]<br class=""><br class=""> real<span class="Apple-tab-span" style="white-space:pre"> </span>0m4.019s<br class=""> user<span class="Apple-tab-span" style="white-space:pre"> </span>0m0.004s<br class=""> sys<span class="Apple-tab-span" style="white-space:pre"> </span>0m4.013s<br class=""> $ cat xor.prof<br class=""> Tue May 3 11:15 2016 Time and Allocation Profiling Report (Final)<br class=""><br class=""> xor +RTS -p -RTS<br class=""><br class=""> total time = 1.11 secs (1107 ticks @ 1000 us, 1 processor)<br class=""> total alloc = 1,984,202,600 bytes (excludes profiling overheads)<br class=""><br class=""> COST CENTRE MODULE %time %alloc<br class=""><br class=""> matrixMap Utils.Math 21.4 26.8<br class=""> matadd Utils.Math 20.8 22.7<br class=""> matmul.\ Utils.Math 10.4 16.0<br class=""> dot Utils.Math 7.1 13.4<br class=""> column Utils.Math 7.0 2.9<br class=""> dot.\ Utils.Math 6.9 1.9<br class=""> rowPairs Utils.Math 5.8 6.5<br class=""> sigmoid' NN 4.7 0.8<br class=""> train.helper Main 4.0 1.3<br class=""> sigmoid NN 3.3 0.8<br class=""> matmul Utils.Math 2.2 2.0<br class=""> hadamard Utils.Math 1.8 2.1<br class=""> columns Utils.Math 1.2 1.3<br class=""><br class=""> individual inherited<br class=""> COST CENTRE MODULE no. entries %time %alloc %time %alloc<br class=""><br class=""> MAIN MAIN 79 0 0.0 0.0 100.0 100.0<br class=""> [snip]<br class=""> CAF:main3 Main 143 0 0.0 0.0 100.0 100.0<br class=""> (...) Main 162 1 0.0 0.0 100.0 100.0<br class=""> train Main 163 1 0.0 0.0 100.0 100.0<br class=""> train.helper Main 164 50001 4.0 1.3 100.0 100.0<br class=""> train.helper.hweights Main 258 50001 0.5 0.0 0.5 0.0<br class=""> train.helper.oweights Main 235 50001 0.4 0.0 0.4 0.0<br class=""> train.helper.oback Main 207 50000 0.3 0.1 19.0 20.9<br class=""> backward' NN 208 50000 0.3 0.6 18.7 20.8<br class=""> [snip]<br class=""><br class="">So, here we see that costs are actually spread throughout the program.<br class=""><br class="">Without diving any deeper into this particular program it's hard to give<br class="">more guidance however I will say that your lazy list Matrix<br class="">representation is very rarely the right choice for even toy linear<br class="">algebra problems.<br class=""><br class="">First, consider the fact that even just a list cons constructor requires<br class="">three words (an info table pointer, a pointer to payload, and a pointer<br class="">to tail) plus the size of the payload (which in the case of an evaluated<br class="">`Float` is 2 words: one info table pointer and the floating point value<br class="">itself). So, a list of n *evaluated* `Float`s (which would require only<br class="">4*n bytes if packed densely) will require 40*n bytes if represented as a<br class="">lazy list.<br class=""><br class="">Then, consider the fact that indexing into a lazy list is an O(n)<br class="">operation: this means that your `Math.column` operation on an n x m<br class="">matrix may be O(n*m). Even worse, `Math.columns`, as used by<br class="">`Math.matmul` is O(n * m!).<br class=""><br class="">Finally, consider the fact that whenever you "construct" a lazy list you<br class="">aren't actually performing any computation: you are actually<br class="">constructing a single thunk which represents the entire result; however,<br class="">if you then go to index into the middle of that list you will end up<br class="">constructing n cons cells and a thunk for the payload of each. In the<br class="">case of primitive linear algebra operations the cost of constructing<br class="">this payload thunk can be greater than simply computing the result.<br class=""><br class="">For these reasons I wouldn't recommend that lazy lists are used in this<br class="">way. If you have a dense matrix use an array (probably even unboxed;<br class="">see, for instance, the `array`, `vector`, and `repa` libraries); if<br class="">you have a sparse matrix then use an appropriate sparse representation<br class="">(although sadly good sparse linear algebra tools are hard to come by in<br class="">Haskell) Not only will the result be significantly more efficient in<br class="">space and time but the runtime behavior of the program will be<br class="">significantly easier to follow since you can more easily ensure that<br class="">evaluation occurs when you expect it to.<br class=""><br class="">Hopefully this helps. Good luck and let us know if there are further<br class="">issues.<br class=""><br class="">Cheers,<br class=""><br class="">- Ben<br class=""><br class=""><br class="">[1] <a href="http://downloads.haskell.org/~ghc/master/users-guide//profiling.html#cost-centres-and-cost-centre-stacks" class="">http://downloads.haskell.org/~ghc/master/users-guide//profiling.html#cost-centres-and-cost-centre-stacks</a><br class=""></blockquote><br class=""></div></div></blockquote></div><br class=""></body></html>