[Haskell-cafe] Re: I/O interface

Marcin 'Qrczak' Kowalczyk qrczak at knm.org.pl
Wed Jan 12 21:21:19 EST 2005


Ben Rudiak-Gould <Benjamin.Rudiak-Gould at cl.cam.ac.uk> writes:

> is there *any* way to get, without an exploitable race condition,
> two filehandles to the same file which don't share a file pointer?

AFAIK it's not possible if the only thing you know is one of the
descriptors. Of course independent open() calls which refer to the
same file have separate file pointers (I mean the true filename,
not /proc/*/fd/*).
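
To illustrate the difference, here is a minimal sketch in Python,
assuming a POSIX system (the helper name position_sharing_demo is made
up for this example). It compares a descriptor obtained with dup(),
which shares the file pointer, with one obtained from a second open()
of the same name, which does not. Note that the second open() goes
through the filename, so it does nothing to avoid the race asked about.

```python
import os
import tempfile


def position_sharing_demo():
    """Return (offset seen via dup, offset seen via a second open)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, 0, os.SEEK_SET)

        dup_fd = os.dup(fd)                    # same open file, shared pointer
        other_fd = os.open(path, os.O_RDONLY)  # independent open, own pointer
        try:
            os.read(fd, 4)                     # moves the shared pointer to 4
            dup_pos = os.lseek(dup_fd, 0, os.SEEK_CUR)
            other_pos = os.lseek(other_fd, 0, os.SEEK_CUR)
        finally:
            os.close(dup_fd)
            os.close(other_fd)
        return dup_pos, other_pos
    finally:
        os.close(fd)
        os.unlink(path)
```

On a POSIX system this should return (4, 0): the duplicate follows the
read, the independently opened descriptor stays at the start.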

On Linux the current file position is stored in struct file in the
kernel. struct file includes "void *private_data" whose internals
depend on the nature of the file, in particular they can be reference
counted. Among polymorphic operations on files in struct file_operations
there is nothing which clones the struct file. This means that a
device driver would have no means to specify how private_data of
its files should be duplicated (e.g. by bumping the reference count).
If my understanding is correct, it implies that the kernel has no way
to clone an arbitrary struct file.

Just don't use the current position of seekable files if you don't
like it: use pread/pwrite.
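
A minimal sketch of what that looks like, assuming a POSIX system where
Python exposes os.pread and os.pwrite (the helper name
positional_io_demo is hypothetical). It checks that positional reads
and writes leave the current position alone.

```python
import os
import tempfile


def positional_io_demo():
    """Return (position before, position after, bytes read, file contents)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello world")         # current position is now 11
        before = os.lseek(fd, 0, os.SEEK_CUR)

        os.pwrite(fd, b"HELLO", 0)           # write at explicit offset 0
        data = os.pread(fd, 5, 6)            # read 5 bytes at explicit offset 6

        after = os.lseek(fd, 0, os.SEEK_CUR) # unchanged by pread/pwrite
        whole = os.pread(fd, 11, 0)
        return before, after, data, whole
    finally:
        os.close(fd)
        os.unlink(path)
```

The current position is 11 both before and after, so another process
sharing the descriptor would not notice these operations through the
file pointer.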

> Is there any way to pass a filehandle as stdin to an untrusted/
> uncooperative child process in such a way that the child can't
> interfere with your attempts to (say) append to the same file?

You can set the O_APPEND flag to force each write to happen at the end
of the file. It doesn't prevent the process from clearing the flag, though.
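
Both halves of that can be sketched as follows, assuming a POSIX system
with fcntl and a file without any append-only attribute (the helper
name append_flag_demo is made up). Here the same process clears the
flag; a child holding the inherited descriptor could make the same
F_SETFL call, since the status flags belong to the shared open file.

```python
import fcntl
import os
import tempfile


def append_flag_demo():
    """Return the file contents after writing with and without O_APPEND."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    try:
        os.write(fd, b"abc")
        os.lseek(fd, 0, os.SEEK_SET)
        os.write(fd, b"def")       # O_APPEND ignores the seek: "abcdef"

        # Clear the flag, as any holder of the descriptor can.
        flags = fcntl.fcntl(fd, fcntl.F_GETFL)
        fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_APPEND)

        os.lseek(fd, 0, os.SEEK_SET)
        os.write(fd, b"XY")        # now the seek is honoured: "XYcdef"

        with open(path, "rb") as f:
            return f.read()
    finally:
        os.close(fd)
        os.unlink(path)
```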

If it's untrusted, how do you know that it won't truncate the file
or just write garbage to it where you would have written something?

If the file is seekable, you can use pread/pwrite. If it's not
seekable, the concept of concurrent but non-interfering reads or
writes is meaningless.

> I think we just need more kinds of streams. With regard to file-backed
> streams, there are three cases:
>
>   1. We open a file and use it in-process.
>   2. We open a file and share it with child processes.
>   3. We get a handle at process startup which happens to be a file.

I disagree. IMHO the only distinction is whether we want to perform
I/O at the current position (shared between processes) or at an
explicitly specified position (possible only for seekable files).
Neither can be emulated in terms of the other.

> In case 2 we could avoid OS problems by creating a pipe and managing
> our end in-process.

It's not transparent: it translates only read and write, but not
sendto/recvfrom, setsockopt, ioctl, lseek etc., and obviously it will
stop working when our process finishes but the other does not.

A pipe can be created when the program really wants this, but it should
not be created automatically whenever we redirect stdin/stdout/stderr
of another program to a file we have opened.

> Case 3 is the most interesting. In an ideal world I would argue for
> treating stdin/out/err simply as streams, but that's not practical.
> Failing that, if we have pread and pwrite, we should provide two
> versions of stdin/out/err, one of type InputStream/OutputStream and
> the other of type Maybe File. We can safely layer other streams on top
> of these files (if they exist) without interfering with the stream
> operation.

I'm not sure what you mean. Haskell should not use pread/pwrite for
functions like putStr, even if stdout is seekable. The current file
position *should* be shared between processes by default, otherwise
redirection of stdout to a file will break if the program delegates
some work with corresponding output to other programs it runs.
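
A minimal sketch of that delegation scenario, assuming a POSIX system
and using a child Python interpreter as a stand-in for "another
program" (the helper name delegated_output_demo is hypothetical). The
parent writes, hands the same descriptor to a child as its stdout, then
writes again.

```python
import os
import subprocess
import sys
import tempfile


def delegated_output_demo():
    """Return the file contents after parent, child, parent each write."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"parent-1\n")

        # The child inherits fd as its stdout and shares the file position.
        child_code = "import os; os.write(1, b'child\\n')"
        subprocess.run([sys.executable, "-c", child_code],
                       stdout=fd, check=True)

        os.write(fd, b"parent-2\n")  # continues after the child's output
        return os.pread(fd, 100, 0)
    finally:
        os.close(fd)
        os.unlink(path)
```

Because the position is shared, the three pieces of output land one
after another. If the parent wrote at its own private offset, its
second write would overwrite what the child produced.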

> Indeed, file positions are exactly as evil as indices into shared
> memory arrays, which is to say not evil at all. But suppose each
> shared memory array came with a shared "current index", and there
> was no way to create additional ones.

Bad analogy: if you open() the file independently, the position is not
shared. The position is not tied to a file with its shared contents
but to the given *open* file structure.

And there is pread/pwrite (on some OSes at least). It's not suitable
as the basic API of all reads and writes though.

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak at knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/

