Haskell threads & pipes & UNIX processes

Simon Marlow simonmar@microsoft.com
Thu, 15 Feb 2001 09:31:36 -0800

[ moved to glasgow-haskell-users@haskell.org... ]

> I need to compress the output of my Haskell program. To avoid the
> creation of huge files, I want to compress before writing by means of
> gzip or bzip2. However, this seems to be quite involved.
> I decided to run the compressor with 
> runProcess :: FilePath                    -- Command
>            -> [String]                    -- Arguments
>            -> Maybe [(String, String)]    -- Environment (Nothing -> inherited)
>            -> Maybe FilePath              -- Working directory (Nothing -> inherited)
>            -> Maybe Handle                -- stdin (Nothing -> inherited)
>            -> Maybe Handle                -- stdout (Nothing -> inherited)
>            -> Maybe Handle                -- stderr (Nothing -> inherited)
>            -> IO ()
> First I wanted to use Posix.createPipe to connect to the compressor
> but Posix.createPipe returns an Fd. So this does not fit.

Aha, you want the (unadvertised) function
	PosixIO.fdToHandle :: Fd -> IO Handle
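
As a rough sketch of how the pieces could fit together: create the pipe, turn both ends into Handles, and give the read end to the compressor as its stdin. This assumes the present-day `unix` and `process` packages (System.Posix.IO, System.Process) rather than the Posix/PosixIO modules discussed above, and a `gzip` binary on the PATH. The name `gzipTo` and the output file name are made up for illustration.

```haskell
import System.Exit (ExitCode (..))
import System.IO (IOMode (WriteMode), hClose, hPutStr, openFile)
import System.Posix.IO
  (FdOption (CloseOnExec), createPipe, fdToHandle, setFdOption)
import System.Process (runProcess, waitForProcess)

-- | Hypothetical helper: feed 'payload' through "gzip -c" into 'outPath'.
gzipTo :: FilePath -> String -> IO ExitCode
gzipTo outPath payload = do
  (readFd, writeFd) <- createPipe
  -- Keep the write end out of the child. If gzip inherited it, the pipe
  -- would never reach end-of-file and gzip would wait forever.
  setFdOption writeFd CloseOnExec True
  readH  <- fdToHandle readFd
  writeH <- fdToHandle writeFd
  outH   <- openFile outPath WriteMode
  -- The child reads from the pipe and writes the compressed data to outH.
  -- runProcess closes the Handles it is given on the parent side.
  ph <- runProcess "gzip" ["-c"] Nothing Nothing
                   (Just readH) (Just outH) Nothing
  hPutStr writeH payload
  hClose writeH            -- signals end-of-input to gzip
  waitForProcess ph

main :: IO ()
main = do
  code <- gzipTo "output.gz" (unlines (map show [1 .. 100000 :: Int]))
  print code
```

Because the compressor runs as a separate process, the producer can simply write into the pipe; no named pipe and no extra Haskell thread is needed for this particular case.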

> I ended up using Posix.createNamedPipe. This creates a named pipe
> that can be opened with IO.openFile (which again yields the Handle
> needed for runProcess).
> At this point, it turned out that the pipe has to be opened for
> reading before it can be opened for writing (ghc-4.08.1). This seems
> to be a bug. (In a shell, the order does not matter; the processes
> are suspended as expected.)

We've had some problems with named pipes - check the archives of
glasgow-haskell-bugs.  It seems certain operating systems disagree on
the semantics of a non-blocking open of a named pipe.  I don't recommend
using named pipes if you want portable code.

> Then, as expected, the problem occurred that writing to the pipe and
> compressing cannot be sequenced. If the producer is started first, the
> producer blocks because there is no consumer. If the consumer is
> started first, the consumer blocks because there is no input.
> So I played around with threads. If both consumer and producer are
> executed as threads, the program terminates immediately without any
> output. There seems to be no need to start executing any of the
> threads.

In GHC, the main program terminates as soon as the main thread
terminates.  It doesn't wait for any child threads to terminate - if you
want this behaviour, you can program it using MVars.
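
For illustration, one way to program this with MVars is to give each child thread an MVar that it fills when it finishes, and have the main thread take from each of them before returning. This is only a minimal sketch; `forkWithDone` is a made-up helper name, and the two `putStrLn` actions stand in for the real producer and consumer.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (finally)

-- | Hypothetical helper: fork an action and return an MVar that is
-- filled once the action has finished, whether normally or by exception.
forkWithDone :: IO () -> IO (MVar ())
forkWithDone action = do
  done <- newEmptyMVar
  _ <- forkIO (action `finally` putMVar done ())
  return done

main :: IO ()
main = do
  producerDone <- forkWithDone (putStrLn "producer finished")
  consumerDone <- forkWithDone (putStrLn "consumer finished")
  -- takeMVar blocks until the corresponding child has filled its MVar,
  -- so main cannot terminate before both children are done.
  takeMVar producerDone
  takeMVar consumerDone
```

Using `finally` means the MVar is filled even if the child dies with an exception, so the main thread does not block forever on a failed child.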

We'll be happy to look at any of the other problems you mentioned:
please send some code demonstrating the problem to