[Haskell-cafe] Re: Mining Twitter data in Haskell and Clojure

braver deliverable at gmail.com
Tue Jun 15 15:43:21 EDT 2010


On Jun 15, 6:27 am, Simon Marlow <marlo... at gmail.com> wrote:
> On 15/06/2010 06:09, braver wrote:
>
> > In fact, the tag cafe2, when run on the full dataset, gets stuck at 11
> > days, with RAM slowly getting into 50 GB; a previous version caused
> > ghc 6.12.1 to segfault around day 12 -- -debug showing an assert
> > failure in Storage.c.  ghc 6.10 got stuck at 30 days for good, and
> > when profiling crashed twice with  a "strange closure" or a stack
> > overflow.  So allocation is a problem still.
>
> I'd be happy to help you track this down, but I don't have a machine big
> enough.  Do you have any runs that display a problem with a smaller heap
> (< 16GB)?
>
> If the program is apparently hung, try connecting to it with 'gdb
> --pid=<pid>' and doing 'info thread' and 'where'.  That might give me
> enough clues to find out where the problem is.
>
> Is this with -threaded, BTW?  With residency on that scale, I'd expect
> the parallel GC to help quite a lot.  But obviously getting it to not
> crash/hang is the first priority :)

Simon, thanks for the tips. This is what gdb shows when the run is stuck
at 45 GB, with the RTS limited by -A5G -M40G:

...
0x00000000004c3c21 in free_mega_group ()
(gdb) info thread
* 1 Thread 0x2b21c1da4dc0 (LWP 10210)  0x00000000004c3c21 in free_mega_group ()
(gdb) where
#0  0x00000000004c3c21 in free_mega_group ()
#1  0x00000000004c3ff9 in freeChain ()
#2  0x00000000004c5ab0 in GarbageCollect ()
#3  0x00000000004bff96 in scheduleDoGC ()
#4  0x00000000004c0b25 in scheduleWaitThread ()
#5  0x00000000004bea09 in real_main ()
#6  0x00000000004beb17 in hs_main ()
#7  0x00000037d5a1d974 in __libc_start_main () from /lib64/libc.so.6
#8  0x0000000000402ca9 in _start ()

I'll also supply heap profiles for small runs shortly.

-- Alexy

