[GHC] #8935: Obscure linker bug leads to crash in GHCi
GHC
ghc-devs at haskell.org
Wed May 7 07:34:26 UTC 2014
#8935: Obscure linker bug leads to crash in GHCi
-------------------------------------+------------------------------------
Reporter: simonmar | Owner: simonmar
Type: bug | Status: patch
Priority: high | Milestone: 7.8.3
Component: Runtime System | Version: 7.8.1-rc2
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: GHCi crash | Difficulty: Rocket Science
Test Case: | Blocked By:
Blocking: | Related Tickets:
-------------------------------------+------------------------------------
Comment (by simonmar):
> The point though, is that dlsym is doing exactly what it should: It
> returns the first global definition it finds.
That's right, but let me add that weak symbols are a red herring. Let me
demonstrate this without using weak symbols or `environ`.
foo.c:
{{{
int bar = 1;
int getbar(void)
{
    return bar;
}
}}}
Compile it like this:
{{{
$ gcc -fPIC foo.c -shared -o libfoo.so
}}}
Now the test program, `test.c`:
{{{
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
extern int bar;
extern int getbar(void);
int main(int argc, char *argv[])
{
    void *deflt, *hdl;
    int *pbar;
    int (*pgetbar)(void);

    bar = 2;
    printf("&bar = %p, bar = %d, getbar() = %d\n", &bar, bar, getbar());

    /* Handle for the main program and its global scope. */
    deflt = dlopen(NULL, RTLD_LAZY | RTLD_GLOBAL);
    if (deflt == NULL) {
        printf("%s\n", dlerror());
        exit(1);
    }
    pbar = dlsym(deflt, "bar");
    printf("dlsym(deflt, \"bar\") = %p, *pbar = %d\n", pbar, *pbar);
    pgetbar = dlsym(deflt, "getbar");
    printf("dlsym(deflt, \"getbar\") = %p, pgetbar() = %d\n", pgetbar,
           (*pgetbar)());

    /* Handle for libfoo.so itself. */
    hdl = dlopen("libfoo.so", RTLD_LAZY);
    if (hdl == NULL) {
        printf("%s\n", dlerror());
        exit(1);
    }
    pbar = dlsym(hdl, "bar");
    printf("dlsym(\"./libfoo.so\", \"bar\") = %p, *pbar = %d\n", pbar,
           *pbar);
    /* Looked up via hdl; the printed label still says "deflt". */
    pgetbar = dlsym(hdl, "getbar");
    printf("dlsym(deflt, \"getbar\") = %p, pgetbar() = %d\n", pgetbar,
           (*pgetbar)());
    return 0;
}
}}}
Note that we have `extern` references for both the data variable `bar` and
the function `getbar`. Compile and run it like this:
{{{
$ gcc test.c -ldl libfoo.so
$ LD_LIBRARY_PATH=. ./a.out
}}}
{{{
$ LD_LIBRARY_PATH=. ./a.out
&bar = 0x601060, bar = 2, getbar() = 2
dlsym(deflt, "bar") = 0x601060, *pbar = 2
dlsym(deflt, "getbar") = 0x2b9809ab259c, pgetbar() = 2
dlsym("./libfoo.so", "bar") = 0x2b9809cb3010, *pbar = 1
dlsym(deflt, "getbar") = 0x2b9809ab259c, pgetbar() = 2
}}}
Note that:
* the address of the function `getbar` is the same, regardless of whether
we look it up in the main program or `libfoo.so`
* the data variable `bar` has one address in the main program, and
another one in `libfoo.so`.
* the version in `libfoo.so` has the initial value 1; it wasn't updated
by the assignment `bar = 2` in the main program
* calling `getbar()` returns the correct value 2, because the reference to
`bar` in `libfoo.so` has been relocated to point to the version of `bar`
in the main program, not the one in `libfoo.so`.
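So I'm pretty sure that this is all due to the copy semantics (copy
relocations) that move data variables into the main program. It means that
if you use `dlsym` on a library handle to find the address of a data
variable, you might get the library's own unused copy rather than the one
the program actually reads and writes.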
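The discrepancy can also be checked programmatically. The following is a minimal sketch, not code from GHC: the helper name `symbol_views_differ` is hypothetical, and it only compares the address `dlsym` returns for the main program's handle with the address it returns for one specific library handle.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the main program and `lib` resolve
 * `name` to different addresses, 0 if they agree, and -1 if either
 * lookup fails. */
int symbol_views_differ(void *lib, const char *name)
{
    void *main_prog, *in_main, *in_lib;

    assert(name != NULL);

    /* dlopen(NULL, ...) returns a handle for the main program. */
    main_prog = dlopen(NULL, RTLD_LAZY);
    if (main_prog == NULL || lib == NULL)
        return -1;

    in_main = dlsym(main_prog, name);
    in_lib = dlsym(lib, name);
    if (in_main == NULL || in_lib == NULL)
        return -1;

    return in_main != in_lib;
}
```

Going by the output above, calling this with the `libfoo.so` handle would be expected to report a difference for `bar` and agreement for `getbar`.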
In any case, the correct fix for GHC ought to be to use `RTLD_LOCAL` and
look up in the main program first.
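As a rough illustration of that lookup order, here is a minimal sketch. It is not the actual GHC patch; the function name `lookup_main_first` is hypothetical, and it assumes the library handle was obtained with `dlopen(..., RTLD_LOCAL)`. It also treats a `NULL` result from `dlsym` as "not found", which is a simplification.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: resolve `name` in the main program's global scope
 * first, and only fall back to the given library handle if that fails. */
void *lookup_main_first(void *lib, const char *name)
{
    static void *main_prog = NULL;
    void *addr;

    assert(name != NULL);

    if (main_prog == NULL) {
        /* dlopen(NULL, ...) returns a handle for the main program. */
        main_prog = dlopen(NULL, RTLD_LAZY);
    }
    if (main_prog != NULL) {
        addr = dlsym(main_prog, name);
        if (addr != NULL)
            return addr;
    }
    /* Not visible from the main program: try the library itself. */
    return lib != NULL ? dlsym(lib, name) : NULL;
}
```

With this order, a data variable that was copied into the main program resolves to the main program's copy, while symbols that exist only in the library are still found through its handle.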
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8935#comment:37>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler