[Haskell-beginners] Fwd: CAF taking all the memory
Mahdi
mdibaiee at aol.com
Mon May 9 14:11:03 UTC 2016
> Begin forwarded message:
>
> From: Mahdi <mdibaiee at aol.com>
> Subject: Re: [Haskell-beginners] CAF taking all the memory
> Date: May 8, 2016 at 18:06:29 GMT+4:30
> To: Ben Gamari <ben at smart-cactus.org>
>
> Wow, This is a great explanation! Thank you!
> I see your point on using lazy lists (although I was not aware of the specific problems you mentioned).
> So, I switched over to HMatrix [0], (the libraries you mentioned added a lot of complexity to the code)
> which uses unboxed arrays for matrices.
>
> After refactoring my code to use HMatrix, I ran the program, and the same thing happens,
> the speed hasn’t changed much either, then I tried to take another profile to see if anything has changed,
> but I couldn’t as described here [1].
>
> But my main question is, why would Haskell even store all this data? There are no side-effects, so
> there should be no memory leak in my code.
> From what I read online, Haskell is trying to avoid re-calculating the functions by caching their
> input/outputs (they’re pure, so this can be done), but I hoped there would be a way to disable this,
> unfortunately there doesn’t seem to be a way (or there is?).
> If that’s the case, then using libraries won’t solve the problem, and that’s frustrating.
>
> * I also tried compiling with -O0, but no luck.
>
> [0] http://dis.um.es/~alberto/hmatrix/hmatrixtut.html
> [1] https://github.com/albertoruiz/hmatrix/issues/190
>
> Thank you Ben!
>
> - Mahdi
>
>> On May 3, 2016, at 14:36, Ben Gamari <ben at smart-cactus.org> wrote:
>>
>> Mahdi <mdibaiee at aol.com> writes:
>>
>>> Hello there,
>>>
>> Hi!
>>
>>> I’m pretty new to Haskell and I’m trying to write a Neural Network in
>>> Haskell for educational purposes.
>>>
>>> So, I have my neural network working, it can learn XOR in 500
>>> iterations, but the thing is, if I increase the iterations to
>>> something like 5 milion times, the process just drains my RAM until
>>> either it’s killed or the OS drowns. Here is the code: [0]
>>>
>>> I searched online and tried to get information on why this is
>>> happening, I profiled the memory usage and found that the memory is
>>> taken by CAF, searching online,
>>>
>> Great! The next question you should then ask is "which CAF"? A CAF is
>> simply a top-level "constant" in your program. Indeed it sounds like you
>> have not defined any cost-centers, which means that GHC will attribute
>> the entire cost of your program to the ambiguous cost-center "CAF"
>> (which in this case really just means "the whole program").
>>
>> As discussed in the users guide [1] one way to define cost-centers
>> within your program is to manually annotate expressions with SCC
>> pragmas. However, in this case we simply want to let GHC do this for us,
>> for which we can use the `-fprof-auto -fprof-cafs` flags (which
>> automatically annotate top-level definitions and CAFs with
>> cost-centers),
>>
>> $ ghc -O examples/xor.hs -fprof-auto -fprof-cafs
>>
>> Now your program should give a much more useful profile (run having set
>> the iteration count to 50000),
>>
>> $ time examples/xor +RTS -p
>> [[0.99548584],[4.5138146e-3],[0.9954874],[4.513808e-3]]
>>
>> real 0m4.019s
>> user 0m0.004s
>> sys 0m4.013s
>> $ cat xor.prof
>> Tue May 3 11:15 2016 Time and Allocation Profiling Report (Final)
>>
>> xor +RTS -p -RTS
>>
>> total time = 1.11 secs (1107 ticks @ 1000 us, 1 processor)
>> total alloc = 1,984,202,600 bytes (excludes profiling overheads)
>>
>> COST CENTRE MODULE %time %alloc
>>
>> matrixMap Utils.Math 21.4 26.8
>> matadd Utils.Math 20.8 22.7
>> matmul.\ Utils.Math 10.4 16.0
>> dot Utils.Math 7.1 13.4
>> column Utils.Math 7.0 2.9
>> dot.\ Utils.Math 6.9 1.9
>> rowPairs Utils.Math 5.8 6.5
>> sigmoid' NN 4.7 0.8
>> train.helper Main 4.0 1.3
>> sigmoid NN 3.3 0.8
>> matmul Utils.Math 2.2 2.0
>> hadamard Utils.Math 1.8 2.1
>> columns Utils.Math 1.2 1.3
>>
>> individual inherited
>> COST CENTRE MODULE no. entries %time %alloc %time %alloc
>>
>> MAIN MAIN 79 0 0.0 0.0 100.0 100.0
>> [snip]
>> CAF:main3 Main 143 0 0.0 0.0 100.0 100.0
>> (...) Main 162 1 0.0 0.0 100.0 100.0
>> train Main 163 1 0.0 0.0 100.0 100.0
>> train.helper Main 164 50001 4.0 1.3 100.0 100.0
>> train.helper.hweights Main 258 50001 0.5 0.0 0.5 0.0
>> train.helper.oweights Main 235 50001 0.4 0.0 0.4 0.0
>> train.helper.oback Main 207 50000 0.3 0.1 19.0 20.9
>> backward' NN 208 50000 0.3 0.6 18.7 20.8
>> [snip]
>>
>> So, here we see that costs are actually spread throughout the program.
>>
>> Without diving any deeper into this particular program it's hard to give
>> more guidance however I will say that your lazy list Matrix
>> representation is very rarely the right choice for even toy linear
>> algebra problems.
>>
>> First, consider the fact that even just a list cons constructor requires
>> three words (an info table pointer, a pointer to payload, and a pointer
>> to tail) plus the size of the payload (which in the case of an evaluated
>> `Float` is 2 words: one info table pointer and the floating point value
>> itself). So, a list of n *evaluated* `Float`s (which would require only
>> 4*n bytes if packed densely) will require 40*n bytes if represented as a
>> lazy list.
>>
>> Then, consider the fact that indexing into a lazy list is an O(n)
>> operation: this means that your `Math.column` operation on an n x m
>> matrix may be O(n*m). Even worse, `Math.columns`, as used by
>> `Math.matmul` is O(n * m!).
>>
>> Finally, consider the fact that whenever you "construct" a lazy list you
>> aren't actually performing any computation: you are actually
>> constructing a single thunk which represents the entire result; however,
>> if you then go to index into the middle of that list you will end up
>> constructing n cons cells and a thunk for the payload of each. In the
>> case of primitive linear algebra operations the cost of constructing
>> this payload thunk can be greater than simply computing the result.
>>
>> For these reasons I wouldn't recommend that lazy lists are used in this
>> way. If you have a dense matrix use an array (probably even unboxed;
>> see, for instance, the `array`, `vector`, and `repa` libraries); if
>> you have a sparse matrix then use an appropriate sparse representation
>> (although sadly good sparse linear algebra tools are hard to come by in
>> Haskell) Not only will the result be significantly more efficient in
>> space and time but the runtime behavior of the program will be
>> significantly easier to follow since you can more easily ensure that
>> evaluation occurs when you expect it to.
>>
>> Hopefully this helps. Good luck and let us know if there are further
>> issues.
>>
>> Cheers,
>>
>> - Ben
>>
>>
>> [1] http://downloads.haskell.org/~ghc/master/users-guide//profiling.html#cost-centres-and-cost-centre-stacks
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/beginners/attachments/20160509/b7ecd57c/attachment.html>
More information about the Beginners
mailing list