[Haskell-cafe] Help with Stack Space Overflow / Memory Issues
Travis B. Hartwell
nafai at travishartwell.net
Sun Feb 15 17:04:35 EST 2009
Hello,
I'm writing a small program to process Delicious [1] RSS feeds. I like to
look at the recent feeds to see what others have bookmarked recently.
But, there are a lot of duplicates in the recent feeds as an entry is
shown for each person who bookmarks an individual URL. I decided to
write a small program that would trim out those that I've seen before.
I wrote a small program that read a feed (initially just an on-disk copy
of an RSS feed) and removed the duplicate items just within that feed.
It worked great. Then, I wanted to add persistence, so this would
maintain state from one run to the next. I decided to use Data.Binary
to serialize the Data.Map I was using and re-load it each time.
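For concreteness, the persistence step described above might look roughly like the sketch below. It is not the code from the repository: the SeenMap type, the loadSeen and saveSeen names, and the choice of String keys with Int values are all assumptions made for illustration. Only encodeFile and decodeFile (from Data.Binary) and doesFileExist (from System.Directory) are real library functions.

```haskell
import qualified Data.Map as Map
import Data.Binary (decodeFile, encodeFile)
import System.Directory (doesFileExist)

-- Hypothetical shape of the persisted state: URL -> times seen.
type SeenMap = Map.Map String Int

-- Load the map saved by a previous run, or start empty on the first run.
loadSeen :: FilePath -> IO SeenMap
loadSeen path = do
  exists <- doesFileExist path
  if exists
    then decodeFile path
    else return Map.empty

-- Write the map back out so the next run can pick it up.
saveSeen :: FilePath -> SeenMap -> IO ()
saveSeen path seen = encodeFile path seen
```

Both functions live in IO because they touch the disk; the map itself remains an ordinary pure value between the load and the save.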
Unfortunately, making this change caused a "Stack Space Overflow" error
and I couldn't track down what was wrong. This was with GHC 6.8.2. I
recently upgraded to GHC 6.10.1 and the memory just grows unbounded,
until it actually locks up my machine.
This happens even when I comment out the code for the serialization /
de-serialization of the map, so essentially the only difference from my
prior version is the function where the map is initialized returns IO
[Item] instead of [Item].
The latest version of my code is up on github [2], and the sample RSS
feed I was processing is included in the repo. I'd appreciate some help
in how to attack this problem. I've even tried profiling this (back
when I was using 6.8.2) and there was nothing enlightening there, at
least with my limited Haskell experience. I am unsure of how to get
this to work, or if the problem is even my code.
Additionally, I am unsure if my serialization code would work anyway.
Because Haskell is not pass-by-reference, would the changes to the
seenMap propagate back to my deDupWithSerializedMap function where it
is serialized? If not, how would I go about doing this?
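To make the question concrete: a Data.Map value is immutable, so an insert performed inside a function produces a new map and leaves the caller's map untouched. One common way around this is to have the function return the updated map along with its result. The following is a minimal sketch under stated assumptions, not the code from the repository: the Item type, its fields, the SeenMap type, and the deDup name are hypothetical stand-ins.

```haskell
import qualified Data.Map as Map
import Data.List (foldl')

-- Hypothetical stand-in for a feed item; the real Item type will differ.
data Item = Item { itemUrl :: String, itemTitle :: String }
  deriving (Eq, Show)

-- Hypothetical shape of the state: the set of URLs already seen.
type SeenMap = Map.Map String ()

-- Keep only items whose URL is not yet in the map. The updated map is
-- returned alongside the kept items, because the caller's map is never
-- modified in place.
deDup :: SeenMap -> [Item] -> ([Item], SeenMap)
deDup seen items = (reverse kept, seen')
  where
    (kept, seen') = foldl' step ([], seen) items
    step (acc, m) item
      | Map.member (itemUrl item) m = (acc, m)
      | otherwise = (item : acc, Map.insert (itemUrl item) () m)
```

The caller would then serialize the second component of the result rather than the map it originally passed in. This only addresses the question of how changes get back to the caller; it makes no claim about the cause of the memory growth.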
I think part of my problem might be the difference between pure and
impure code and how to separate it.
Thanks for the help!
[1] http://www.delicious.com/
[2] http://github.com/Nafai77/recent-feeds/tree/master
---------------
Travis B. Hartwell
Software Toolsmith
Blog:
http://www.travishartwell.net/blog
Where to find me:
http://www.travishartwell.net/blog/static/where_to_find_me