[Gtk2hs-devel] behaviour of ghci on .c modules that are part of a library

Simon Marlow marlowsd at gmail.com
Fri Jul 16 08:29:42 EDT 2010


On 16/07/2010 12:36, Axel Simon wrote:
> Dear Haskell maintainers,
>
> I've progressed a little and found that the problem is down to
> accessing global variables that are declared in dynamic libraries. In
> a nutshell, this doesn't work, as the addresses of these global variables
> are all wrong when ghci is executing the code. So, I think I hit:
>
> http://hackage.haskell.org/trac/ghc/ticket/781
>
> I was able to work around this problem by compiling the C modules with
> -fPIC. This bug is pretty bad, I'd say. I've added myself to its CC
> list.

Urgh.  It's a nasty bug, but not one that we can fix, because it's an 
artifact of the small memory model used on x86_64.  The only fix is to 
use -fPIC.
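
To make the affected pattern concrete, here is a minimal sketch, not taken
from Gtk2Hs.  It uses optind, a plain data object exported by the C library,
as a stand-in for any global variable that lives in a shared library; the
two function names are made up for illustration.

```c
#include <unistd.h>

/* optind is a data object defined inside the C library (a shared
   library on most Linux systems).  It is declared explicitly here so
   the sketch also builds in strict ISO C modes. */
extern int optind;

/* Hypothetical helper: read a global that is defined in a shared
   library.  POSIX specifies that optind starts out as 1. */
int read_library_global(void) {
    return optind;
}

/* Hypothetical helper: take the address of the same global.  This is
   the address that has to be resolved correctly at load time. */
const void *address_of_library_global(void) {
    return (const void *)&optind;
}
```

As far as I understand it, building this with plain "gcc -c" on x86_64 may
produce a reference that assumes the variable lies within 2GB of the code,
while "gcc -fPIC -c" makes the access go through the global offset table,
so it no longer matters where the library is mapped.  In a Cabal package
the equivalent would presumably be "cc-options: -fPIC".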

It might be possible to use -fPIC either by default, or perhaps just for 
.c files and when compiling data references from FFI declarations in 
Haskell code; that's something we could look into.  We might want -fPIC 
on by default anyway if we switch to using dynamic linking by default 
(but we're not yet sure what ramifications that will have).

Cheers,
	Simon



> Cheers,
> Axel
>
> On 14.07.2010, at 16:51, Axel Simon wrote:
>
>> Hi all,
>>
>> I'm trying to debug a segfault relating to the memory management in
>> Gtk2Hs. Rather than make you read the ticket
>> http://hackage.haskell.org/trac/gtk2hs/ticket/1183, I'll describe the
>> problem:
>>
>> - compiler 6.12.1 or 6.12.3
>> - darcs head of Gtk2Hs with #define DEBUG instead of #undef DEBUG in
>> gtk/Graphics/UI/Gtk/General/hsthread.c
>> - platform Ubuntu Linux, x86-64
>> - to reproduce: cd gtk2hs/gtk/demo/hello and run ghci World.hs and
>> type 'main'
>>
>> A window with the "Hello World" button appears. After a few seconds,
>> the GC runs and the finaliser of the GtkButton is run since the
>> Haskell program no longer holds a reference to that object (only the
>> GtkWindow in C land has).
>>
>> Thus, the GC calls a C function gtk2hs_g_object_unref_from_mainloop
>> which is supposed to enqueue the object into a global data structure
>> from which objects are later taken and g_object_unref is called on
>> them.
>>
>> This global data structure is protected by a mutex, which is
>> acquired using g_static_mutex_lock:
>>
>> void gtk2hs_g_object_unref_from_mainloop(gpointer object) {
>>
>>   int mutex_locked = 0;
>>   if (threads_initialised) {
>> #ifdef DEBUG
>>       printf("acquiring lock to add a %s object at %lx\n",
>>              g_type_name(G_OBJECT_TYPE(object)), (unsigned long) object);
>>       printf("value of lock function is %lx\n",
>>              (unsigned long) g_thread_functions_for_glib_use.mutex_lock);
>> #endif
>>     g_rand_new();
>> #if defined( WIN32 )
>>     EnterCriticalSection(&gtk2hs_finalizer_mutex);
>> #else
>>     g_static_mutex_lock(&gtk2hs_finalizer_mutex);
>> #endif
>>     mutex_locked = 1;
>>   }
>> [..]
>>
>> The program prints:
>>
>> acquiring lock to add a GtkButton object at 22d8020
>> value of lock function is 0
>> zsh: segmentation fault  ghci World
>>
>> Now the debugging weirdness starts. Whatever I do, I cannot get gdb
>> to find the symbol gtk2hs_g_object_unref_from_mainloop.
>>
>> Since the function above is contained in a C file that comes with
>> our Haskell library, I tried to add "cc-options: -g" and
>> "cc-options: -ggdb -O0", but maybe somewhere symbols are stripped. So I
>> added the bogus function call to "g_rand_new()" which is not called
>> anywhere else and gdb stops as follows:
>>
>> acquiring lock to add a GtkButton object at 2105020
>> value of lock function is 0
>> [Switching to Thread 0x7ffff41ff710 (LWP 15735)]
>>
>> Breakpoint 12, 0x00007ffff115bfa0 in g_rand_new () from /usr/lib/libglib-2.0.so
>>
>> This all seems reasonable, but:
>>
>> (gdb) bt
>> #0  0x00007ffff115bfa0 in g_rand_new () from /usr/lib/libglib-2.0.so
>> #1  0x00000000419b3792 in ?? ()
>> #2  0x00007ffff678f078 in ?? ()
>>
>> i.e. the calling context is broken. I'm very, very sure that the
>> caller is indeed the above-mentioned function, since g_rand_new
>> isn't called anywhere in my Haskell program (and if it were, the
>> calling context would be sane).
>> I'm also passing the address of gtk2hs_g_object_unref_from_mainloop
>> as FinalizerPtr to all my ForeignPtrs, so there is no inlining going
>> on.
>>
>> Back to the culprit, the call to g_static_mutex_lock. This is a
>> macro that expands to
>>
>> *g_thread_functions_for_glib_use.mutex_lock
>>
>> where g_thread_functions_for_glib_use is a global variable that contains
>> a lot of function pointers. At the breakpoint, it contains this:
>>
>> (gdb) print g_thread_functions_for_glib_use
>> $33 = {mutex_new = 0x7ffff0cd9820<g_mutex_new_posix_impl>,
>>   mutex_lock = 0x7ffff6c8b3c0<__pthread_mutex_lock>,
>>   mutex_trylock = 0x7ffff0cd97b0<g_mutex_trylock_posix_impl>,
>>   mutex_unlock = 0x7ffff6c8ca00<__pthread_mutex_unlock>,
>>   mutex_free = 0x7ffff0cd9740<g_mutex_free_posix_impl>,
>> [..]
>>
>> So the call to g_mutex_lock should call the function
>> __pthread_mutex_lock but it calls NULL.
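
The mismatch described here (gdb sees a filled-in table, the running code
sees 0) is what one would expect if the code reads the table through a
wrong address.  A minimal sketch of the call-through-a-table pattern, with
a NULL check added, might look like this.  The type and function names are
hypothetical stand-ins, not GLib's real definitions.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for a table of thread functions. */
typedef struct {
    void (*mutex_lock)(void *mutex);
    void (*mutex_unlock)(void *mutex);
} thread_functions;

/* Counts how often the demo lock function below has been called. */
int lock_calls = 0;

/* Demo lock function that only records that it was invoked. */
void counting_lock(void *mutex) {
    (void)mutex;
    lock_calls++;
}

/* Call the lock function through the table, as the macro does, but
   refuse to jump through a NULL slot.  Returns 1 if the lock function
   was called and 0 if the slot was empty. */
int lock_via_table(const thread_functions *fns, void *mutex) {
    if (fns->mutex_lock == NULL) {
        return 0;  /* what the ghci run effectively saw: a zero slot */
    }
    fns->mutex_lock(mutex);
    return 1;
}
```

With a zeroed table, lock_via_table returns 0 instead of crashing; with
counting_lock in the mutex_lock slot it returns 1 and bumps lock_calls.
The unchecked version would jump to address 0, which matches the segfault.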
>>
>> I hoped that writing this email would give me a bit more insight
>> into the problem, but for now I suspect that something overwrites
>> either the stack or the code of the function.
>>
>> On the same platform, the compiled version prints:
>>
>> acquiring lock to add a GtkButton object at 1b05820
>> value of lock function is 7f7adcabd3c0
>> within mutex: adding finalizer to a GtkButton object!
>>
>> On Mac OS or i386, using ghci or ghc, version 6.10.4, it works as
>> well.
>> Now for the fun bit: on i386 using ghci version 6.12.1 it works too.
>>
>> So it's an x86-64 and ghc 6.12.1 bug. According to Christian Maeder
>> who submitted the ticket, the problem persists in 6.12.3.
>>
>> Any hints and help appreciated,
>> Cheers,
>> Axel