strace breaks cabal - how to find the problem?
Simon Marlow
marlowsd at gmail.com
Wed Mar 25 09:12:46 EDT 2009
Robin Green wrote:
> On my obscure configuration (GHC 6.10.1, with pkgenv activated, on
> Fedora Linux rawhide running on a VirtualBox x86 VM with hardware
> virtualisation enabled), running strace on cabal causes it to
> misbehave, as described below.
>
> I don't know whether this is due to a bug in cabal, the GHC
> runtime, strace, the Linux kernel, VirtualBox, or my processor's
> virtualization support. Not sure how to proceed to debug this. Perhaps
> someone else could try this, on their configuration, and report what
> they find?
>
> This command was run in an untarred copy of hs-bibutils 0.1 (not that it
> matters):
>
> strace -f -e trace=file cabal install
> --extra-include-dirs=../bibutils_4.1/lib/ 2>&1|cat >cabal.strace
>
> Cabal fails to determine the version of ghc-pkg (or sometimes ghc), and
> stops, complaining that it can't verify that the version of ghc(-pkg)
> is the required one. ghc-pkg is execve'd, but something goes wrong -
> not sure what.

This sounds slightly familiar. The System.Process library uses vfork() on
Unix systems, which, as it turns out, helps to avoid some race conditions.
However, while debugging something in this area recently (using strace), I
remember seeing different behaviour when running under strace. I suspect
that strace is doing something to vfork().
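As a rough illustration of the vfork() pattern mentioned above, here is a
minimal C sketch of spawning a child and collecting its exit status. It is
not the actual System.Process code; the function name run_with_vfork is made
up for this example, and the exit code 127 for a failed exec is just the
usual shell convention.

```c
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (looked up on PATH) via vfork() and
   execvp(). Returns the child's exit status, or -1 on failure. */
int run_with_vfork(char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;              /* vfork() itself failed */

    if (pid == 0) {
        /* Child: it borrows the parent's address space until it execs
           or exits, so it must do nothing here except exec or _exit. */
        execvp(argv[0], argv);
        _exit(127);             /* only reached if exec failed */
    }

    /* Parent: resumes once the child has exec'd or exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Running a program under strace and outside it with a helper like this is one
way to check whether the difference in behaviour shows up at this level.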

Perhaps we should bite the bullet and use fork(), and fix the race
conditions properly. As I recall, what was happening was that the fork()
took so long that it got interrupted by the timer signal (every 1/50 sec)
and restarted, ad infinitum. So in order to use fork() we have to disable
timer signals (for the whole process? what if multiple threads are doing
fork()? sigh.).

Cheers,
Simon
More information about the Glasgow-haskell-users
mailing list