Bug in IO libraries when sending data through a pipe?
Jens Petersen
petersen@redhat.com
21 Mar 2002 23:20:17 +0900
Hi Volker,
Volker Wysk <post@volker-wysk.de> writes:
> On Mit, 2002-03-20 at 07:00, Jens Petersen wrote:
> > Jens Petersen <petersen@redhat.com> writes:
> >
> > > > The problem is that the child process doesn't
> > > > receive all the data which the parent sends. It's as
> > > > if "hPutStr vonh txt" sends the data lazily somehow,
> > > > and "hClose vonh" closes the pipe prematurely.
> > > >
> > > > It varies from run to run exactly which data gets
> > > > through. If I cause the child process to read all
> > > > its input immediately, the problem doesn't seem to
> > > > occur. Normally, it does so gradually, which takes a
> > > > few seconds.
> > > >
> > > > I'm using GHC 5.02.2
> > >
> > > Quite possibly could be a bug. Lazy IO is rather subtle I
> > > think, specially when done across pipes. I faced some
> > > similar problem with in POpen recently. You can see how I
> > > solved it (worked round it?) by comparing the latest release
> > > 1.00 with the previous one 0.00.1:
> > >
> > > http://www.01.246.ne.jp/~juhp/haskell/popenhs/
>
> POpen-1.0.0 contains the same bug which I made. It doesn't ensure that
> the values which are needed after the call of forkProcess, before that
> of executeFile, are fully evaluated. So, if they are read lazily from a
> stream, the newly spawned child process reads data from a stream which
> it shares with its parent, making it disappear from the parent's input.
> In this situation, this sure isn't intended.
Perhaps you could give an explicit example?
> Inserting the following lines just before the line "pid <- forkProcess",
> in POpen.hs, would force the corresponding values to be evaluated, so no
> data will be lost.
>
> seq (length path) $ seq (sum (map length args)) $ return ()
> when (isJust env) $ seq (sum (map (\(a,b) -> length a + length b)
> (fromJust env))) $ return ()
> when (isJust dir) $ seq (length (fromJust dir)) $ return ()
Hmmm, I don't really see why this is necessary. Don't the
lazy values of "path", "env" and "dir" just get evaluated
when they're needed here as normal? (If what you say is
true though it would be simpler just to use "$!" or "!"s for
strict evaluation I guess.)
I would be more worried about the input stream string not
being complete when the input handle is closed.
> I'm also not sure what this part is supposed to do:
>
> inr <- if (isJust inpt) then
> do
> (inr', inw) <- createPipe
> hin <- fdToHandle inw
> hPutStr hin $ fromJust inpt
> hClose hin
> return $ Just inr'
> else
> return Nothing
It returns the output end of a pipe containing the input
string if one is given.
> Doesn't it write the input data to a pipe which no process reads
> from..??
Nope, "doTheBusiness" dup2's it to the stdin of the subprocess:
(outr, outw) <- createPipe
(errr, errw) <- createPipe
pid <- forkProcess
case pid of
Nothing -> doTheBusiness inr outw errw -- ***
Just p ->
do
-- close other end of pipes in here
when (isJust inr) $
fdClose $ fromJust inr
fdClose outw
fdClose errw
hout <- fdToHandle outr
outstrm <- hGetContents hout
herr <- fdToHandle errr
errstrm <- hGetContents herr
return (outstrm, errstrm , p)
where
doTheBusiness ::
Maybe Fd -- stdin
-> Fd -- stdout
-> Fd -- stderr
-> IO (String, String, ProcessID) -- (stdout, stderr)
doTheBusiness inr outw errw =
do
maybeChangeWorkingDirectory dir
when (isJust inr) $
dupTo (fromJust inr) stdInput -- ***
dupTo outw stdOutput
dupTo errw stdError
executeFile path True args env
-- for typing, should never actually run
error "executeFile failed!"
Jens