[Haskell-cafe] Re: Mining Twitter data in Haskell and Clojure

Simon Marlow marlowsd at gmail.com
Tue Jun 15 06:27:29 EDT 2010


On 15/06/2010 06:09, braver wrote:
> In fact, the tag cafe2, when run on the full dataset, gets stuck at 11
> days, with RAM slowly getting into 50 GB; a previous version caused
> ghc 6.12.1 to segfault around day 12 -- -debug showing an assert
> failure in Storage.c.  ghc 6.10 got stuck at 30 days for good, and
> when profiling crashed twice with  a "strange closure" or a stack
> overflow.  So allocation is a problem still.

I'd be happy to help you track this down, but I don't have a machine big 
enough.  Do you have any runs that display a problem with a smaller heap 
(< 16GB)?

If the program is apparently hung, try connecting to it with 'gdb 
--pid=<pid>' and doing 'info thread' and 'where'.  That might give me 
enough clues to find out where the problem is.
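
For a process that cannot be inspected interactively, the same information can be captured in one shot. This is only a sketch, assuming gdb is installed and you are allowed to attach to the process. The helper name dump_stacks and the GDB override variable are made up for illustration. It uses 'thread apply all where' rather than a bare 'where', so that every thread's stack is printed and not just the current one.

```shell
#!/bin/sh
# Hypothetical helper: print the thread list and every thread's backtrace
# for a running process, then detach. Not part of the original mail.
dump_stacks() {
    pid="$1"
    # GDB is an assumed override variable; it defaults to plain "gdb".
    # --batch makes gdb exit (and detach) once the -ex commands have run.
    "${GDB:-gdb}" --batch --pid="$pid" \
        -ex 'info threads' \
        -ex 'thread apply all where'
}
```

Something like 'dump_stacks <pid> > stacks.txt 2>&1' then leaves a file that can be attached to a reply. Building the program with -debug, as in the quoted run, should make the RTS frames in the backtrace easier to read.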

Is this with -threaded, BTW?  With residency on that scale, I'd expect 
the parallel GC to help quite a lot.  But obviously getting it to not 
crash/hang is the first priority :)

Cheers,
	Simon

