[Haskell-cafe] Discussion: The CLOEXEC problem

Niklas Hambüchen mail at nh2.me
Wed Jul 22 12:33:19 UTC 2015

Hello Donn,

Python has a detailed discussion of this suggestion:


It highlights some problems with this approach, most notably Windows
problems, not solving the problem when you exec() without fork(), and
looping up to MAXFD being slow. That loop is what the current Haskell
`runInteractiveProcess` code seems to be doing. Python improved upon
this by not looping up to MAXFD, but instead looking up the open FDs in
/proc/<PID>/fd/, after people complained about this loop of close()
syscalls being very slow when many FDs were open.
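
For illustration, here is a rough sketch of the /proc lookup in Haskell.
It is not what `process` or Python actually ship; the names
`listOpenFds` and `fdsToClose` are made up for this mail, and it assumes
Linux (where /proc/self/fd exists) plus the `directory` package.

```haskell
import Data.Maybe (mapMaybe)
import System.Directory (listDirectory)
import Text.Read (readMaybe)

-- | List the FDs currently open in this process by reading
-- /proc/self/fd (Linux-specific assumption). Entries that do not
-- parse as numbers are skipped. The result may include the FD that
-- was used temporarily to read the directory itself.
listOpenFds :: IO [Int]
listOpenFds = mapMaybe readMaybe <$> listDirectory "/proc/self/fd"

-- | Given the FDs the child should keep and the FDs that are open,
-- return the ones that would have to be closed before exec().
fdsToClose :: [Int] -> [Int] -> [Int]
fdsToClose keep = filter (`notElem` keep)
```

With this, the work is proportional to the number of open FDs rather
than to MAXFD, e.g. `fdsToClose [0, 1, 2] <$> listOpenFds` gives
everything except the standard streams.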

> ... any time you
> do that, you set up the conditions for breaking something that
> works in C, which I hate to see happen with Haskell.

While I understand your opinion here, I'm not sure that "breaking
something that works in C" is the right description. O_CLOEXEC changes a
default setting, but does not irrevocably disable any feature that is
available in C. The difference is that you'd have to say which FDs you
want to keep in the child - which to my knowledge is OK, since it is a
much more common thing to work with *some* designated FDs in the child
process than with all of them.

To elaborate a bit, if you wanted to write a program where a child
process would access the parent's FDs, you would in most cases already
have those FDs in some Haskell variables you're working with. In that
case, it is easy to `setFdOption fd CloseOnExec False` on those if
CLOEXEC is the default, and everybody is happy.
If CLOEXEC is not the default, then you'd get a problem with all those
FDs on which you do *not* have a grip in your program, and it's much
harder to fix problems with resources that sit invisibly in the
background than with those that you have in variables that you use.
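
To make the `setFdOption` step concrete, here is a minimal sketch using
the `unix` package. The helper names (`keepInChild`, `closeInChild`,
`cloexecRoundTrip`) are invented for this example, and the pipe only
stands in for whatever FD your program really wants to hand to a child.

```haskell
import System.Posix.IO
  ( FdOption (CloseOnExec), closeFd, createPipe
  , queryFdOption, setFdOption )
import System.Posix.Types (Fd)

-- | Clear the close-on-exec flag, so an exec()ed child inherits the FD.
keepInChild :: Fd -> IO ()
keepInChild fd = setFdOption fd CloseOnExec False

-- | Set the close-on-exec flag, so the FD is closed on exec().
closeInChild :: Fd -> IO ()
closeInChild fd = setFdOption fd CloseOnExec True

-- | Simulate "CLOEXEC by default" on a fresh pipe, then undo it
-- locally for the read end. Returns the flag before and after.
cloexecRoundTrip :: IO (Bool, Bool)
cloexecRoundTrip = do
  (readEnd, writeEnd) <- createPipe
  closeInChild readEnd              -- pretend this was the default
  before <- queryFdOption readEnd CloseOnExec
  keepInChild readEnd               -- opt back in for this one FD
  after <- queryFdOption readEnd CloseOnExec
  closeFd readEnd
  closeFd writeEnd
  return (before, after)
```

Undoing the default is a one-liner per FD, which only works because the
FD is held in a variable.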

In other words, CLOEXEC is something that is easy to *undo* locally when
you don't want it, but hard to *do* globally when you need it.

Let me know what you think about this.


On 22/07/15 04:47, Donn Cave wrote:
> quoth Niklas Hambüchen,
> ...
>> The scope of this program is quite general unfortunately: It will happen
>> for any program that uses parallel threads, and that runs two or more
>> external processes at some time. It cannot be fixed by the part that
>> starts the external process (e.g. you can't write a reliable
>> `readProcess` function that doesn't have this problem, since the problem
>> is rooted in the Fds, and there is no version of `exec()` that doesn't
>> inherit parent Fds).
>> This problem is a general problem in C on Unix, and was discovered quite
>> late.
> I believe it has actually been a familiar issue for decades.  I don't
> have any code handy to check, but I'm pretty sure the UNIX system(3)
> and popen(3) functions closed extraneous file descriptors back in the
> early '90s, and probably had been doing it for some time by then.
> I believe this approach to the problem is supported in System.Process,
> via close_fds.  Implementation is a walk through open FDs, in the child
> fork, closing anything not called for by the procedure's parameters
> prior to the exec.
> That approach has the advantage that it applies to all file descriptors,
> whether created by open(2) or by other means - socket, dup(2), etc.
> I like this already implemented solution much better than adding a
> new flag to "all" opens (really only those opens that occur within
> the Haskell runtime, while of course not for external library FDs.)
> The O_CLOEXEC proposal wouldn't be the worst or most gratuitous
> way Haskell tampers with normal UNIX parameters, but any time you
> do that, you set up the conditions for breaking something that
> works in C, which I hate to see happen with Haskell.
> 	Donn
