strace breaks cabal - how to find the problem?

Simon Marlow marlowsd at gmail.com
Wed Mar 25 09:12:46 EDT 2009


Robin Green wrote:
> On my obscure configuration (GHC 6.10.1, with pkgenv activated, on
> Fedora Linux rawhide running on a VirtualBox x86 VM with hardware
> virtualisation enabled), running strace on cabal causes it to
> misbehave, as described below.
> 
> I don't know whether this is due to a bug in cabal, the GHC
> runtime, strace, the Linux kernel, VirtualBox, or my processor's
> virtualization support. Not sure how to proceed to debug this. Perhaps
> someone else could try this, on their configuration, and report what
> they find?
> 
> This command was run in an untarred copy of hs-bibutils 0.1 (not that it
> matters):
> 
> strace -f -e trace=file cabal install
> --extra-include-dirs=../bibutils_4.1/lib/ 2>&1|cat >cabal.strace
> 
> Cabal fails to determine the version of ghc-pkg (or sometimes ghc), and
> stops, complaining that it can't verify that the version of ghc(-pkg)
> is the required one. ghc-pkg is execve'd, but something goes wrong -
> not sure what.

This sounds slightly familiar.  The System.Process library uses vfork() on 
Unix systems, which as it turns out helps to avoid some race conditions. 
However, while debugging something in this area recently (using strace) I 
remember seeing different behaviour when running under strace.  I suspect 
that strace is doing something to vfork().
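
One way to narrow this down might be to take GHC out of the picture and 
run a bare vfork()+exec under strace.  The following is only a sketch of 
that idea, not the code System.Process actually uses; run_with_vfork is a 
hypothetical name.

```c
#define _DEFAULT_SOURCE   /* make sure vfork() is declared by glibc */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] (looked up on PATH) using vfork() followed by execvp().
 * Returns the child's exit status, 127 if the exec failed, or -1 if
 * the vfork/wait failed or the child was killed by a signal.
 * Hypothetical helper for experimenting, not taken from any library. */
int run_with_vfork(char *const argv[])
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* In the vfork() child: only exec or _exit are safe here. */
        execvp(argv[0], argv);
        _exit(127);           /* exec failed */
    }

    /* Parent resumes once the child has exec'd or exited. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling this from a small main() with something like {"ghc-pkg", 
"--version", NULL}, once normally and once under "strace -f", would show 
whether the difference appears without the GHC runtime involved.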

Perhaps we should bite the bullet and use fork(), and fix the race 
conditions properly.  As I recall, what was happening was that the fork() 
took so long that it was interrupted by the timer signal (which fires 
every 1/50 of a second) and restarted, ad infinitum.  So in order to use 
fork() we have to disable timer signals (for the whole process? what if 
multiple threads are doing fork()? sigh.).

Cheers,
	Simon


More information about the Glasgow-haskell-users mailing list