popen lazy args and input pipe

Jens Petersen petersen@redhat.com
22 Mar 2002 14:37:00 +0900


Volker Wysk <post@volker-wysk.de> writes:

> On 21 Mar 2002, Jens Petersen wrote:
> > Volker Wysk <post@volker-wysk.de> writes:
> > > POpen-1.0.0 contains the same bug which I made. It doesn't ensure that
> > > the values which are needed after the call of forkProcess, before that
> > > of executeFile, are fully evaluated. So, if they are read lazily from a
> > > stream, the newly spawned child process reads data from a stream which
> > > it shares with its parent, making it disappear from the parent's input.
> > > In this situation, this sure isn't intended.

Ok, I agree this is a potential if unlikely problem.  I
can't really think of any useful examples though.

I guess using "$!"s in the call to popen would solve this
part.
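
One caveat with "$!" is that it only evaluates its argument to weak head normal form, which for a String means the first (:) cell. The rest of a lazily read string would still be unread at fork time. A minimal sketch of the difference, using only base; the names forceString, forceCommand and whnfOnly are hypothetical and are not part of POpen:

```haskell
import Control.Exception (evaluate)

-- Hypothetical helper: walk the whole list, forcing the spine and
-- every character along the way.
forceString :: String -> ()
forceString = foldr seq ()

-- Hypothetical helper: fully evaluate a command line in the parent,
-- so nothing is left to be read lazily after a fork.
forceCommand :: FilePath -> [String] -> IO ()
forceCommand path args = mapM_ (evaluate . forceString) (path : args)

-- '$!' stops at the first (:) cell, so the undefined tail is never
-- touched and this evaluates to () without an error.
whnfOnly :: ()
whnfOnly = const () $! ('a' : undefined)
```

By contrast, forceString ('a' : undefined) does hit the undefined tail, which is the behaviour wanted here: every character is demanded in the parent before the child exists.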

> > Perhaps you could give an explicit example?
> 
> I haven't tried it, but it's exactly the same thing.

Well, an explicit example using your "pipeto" or popen would
be helpful.

> > > Inserting the following lines just before the line "pid <- forkProcess",
> > > in POpen.hs, would force the corresponding values to be evaluated, so no
> > > data will be lost.
> > >
> > >     seq (length path) $ seq (sum (map length args)) $ return ()
> > >     when (isJust env) $ seq (sum (map (\(a,b) -> length a + length b)
> > >                                       (fromJust env))) $ return ()
> > >     when (isJust dir) $ seq (length (fromJust dir)) $ return ()

I would prefer not to add strict evaluation to POpen unless
it's absolutely necessary.  I guess I really need a testcase
for this problem.  If you have one please send it to me.  I
should really add some unit tests to popenhs.

> > Hmmm, I don't really see why this is necessary.  Don't the
> > lazy values of "path", "env" and "dir" just get evaluated
> > when they're needed here as normal?  (If what you say is
> > true though it would be simpler just to use "$!" or "!"s for
> > strict evaluation I guess.)
> 
> Yes, and that's *after* forkProcess. So when they are computed from
> the lazily read contents of a stream, the newly spawned child will read
> data from a stream which it shares with its parent.

Btw I guess one can say that popen inherits this problem
from "Posix.runProcess".

But usually "path", "env" and "dir" are not streams, just
strings, right?  Even for args I am hard pressed to think of
a real example where it could be a problem.  Perhaps
something like "xargs", but with the long stream that it
would read from stdin passed as arguments instead?  (I think
most shells restrict the size of argv, though.)

Are you referring to these comments from Posix.lhs, or have
you rediscovered them?

-- ***NOTE***: make sure you completely force the evaluation of the path
-- and arguments to the child before calling runProcess. If you don't do
-- this *and* the arguments from runProcess are read in from a file lazily,
-- be prepared for some rather weird parent-child file I/O behaviour.
--
-- [If you don't force the args, consider the case where the
--  arguments emanate from a file that is read lazily, using
--  hGetContents or some such. Since a child of a fork()
--  inherits the opened files of the parent, the child can
--  force the evaluation of the arguments and read them off the
--  file without any problems.  The problem is that while the
--  child shares a file table with the parent, it has separate
--  buffers, so a child may fill up its (copy of) the buffer,
--  but only read it partially. When the *parent* tries to read
--  from the shared file again, the (shared) file offset will
--  have been stepped on by whatever number of chars that was
--  copied into the file buffer of the child. i.e., the unused
--  parts of the buffer will *not* be seen, resulting in
--  random/unpredictable results.
--
--  Based on a true (, debugged :-) story.
-- ]

Reading this again I start to understand the problem you're
describing a little better.  Perhaps something does need to
be done about it, as you say. ;-)
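
The effect described in the Posix.lhs note can be reproduced without forking at all. The sketch below is an illustration under assumptions, not what POpen or runProcess do: it uses hDuplicate from GHC.IO.Handle as a stand-in for fork(), since the duplicate handle shares the file offset with the original but has its own buffer. The function name sharedOffsetDemo and the scratch file argument are hypothetical.

```haskell
import GHC.IO.Handle (hDuplicate)
import System.IO

-- Hypothetical demo: returns the line read through the duplicate
-- handle, and whether the original handle then reports end of file.
sharedOffsetDemo :: FilePath -> IO (String, Bool)
sharedOffsetDemo scratch = do
  writeFile scratch "one\ntwo\nthree\n"
  parent <- openFile scratch ReadMode
  -- Plays the role of the child: same file offset, separate buffer.
  child <- hDuplicate parent
  -- Reading one line fills the child's buffer from the shared offset.
  line <- hGetLine child
  -- The parent now asks the operating system for data at the moved offset.
  parentAtEOF <- hIsEOF parent
  hClose child
  hClose parent
  return (line, parentAtEOF)
```

With default buffering, one would expect the "child" to get the first line while the "parent" finds nothing left to read, because the remaining lines sit unused in the child's private buffer.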

> But hPutStr, followed by hClose, won't complete until all
> the input string has been written, while no process is
> listening.

Oops, you're right!  Indeed popen hangs on input greater
than 4kB on my system.  Thank you very much for reporting
this.
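
One common way to avoid this kind of hang is to write the input from a separate thread while the caller drains the output. A minimal sketch, assuming the System.Process API from the present-day "process" package rather than POpen's own code; the function name pipeThrough is hypothetical:

```haskell
import Control.Concurrent (forkIO)
import Control.Exception (evaluate)
import System.IO (hClose, hGetContents, hPutStr)
import System.Process

-- Hypothetical popen-like helper: feed 'input' to the command's stdin
-- and return everything it writes to stdout.
pipeThrough :: FilePath -> [String] -> String -> IO String
pipeThrough cmd args input = do
  (Just hin, Just hout, _, ph) <-
    createProcess (proc cmd args) { std_in = CreatePipe, std_out = CreatePipe }
  -- Write in the background, so a full pipe cannot block the reader below.
  _ <- forkIO (hPutStr hin input >> hClose hin)
  output <- hGetContents hout
  -- Force the whole output before waiting for the process to exit.
  _ <- evaluate (length output)
  _ <- waitForProcess ph
  return output
```

For example, pipeThrough "cat" [] (replicate 100000 'x') sends far more than one pipe buffer of input, which is the case that blocks when the write has to finish before anything is read.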

Jens