Proposal for a new I/O library design

Tue, 29 Jul 2003 10:19:21 +0200

Ben Rudiak-Gould wrote:
> On Mon, 28 Jul 2003, Simon Marlow wrote:
>>[...] lookupFileByPathname must open the file, without knowing whether the
>>file will be used for reading or writing in the future.
> 
> 
> I know; I'm hoping against hope that this isn't an insurmountable problem.

Well, I fear it is, at least on POSIX...

> If the OS provides a "reopen" function which is like open except that it
> takes a file handle instead of a pathname,

On POSIX, I'm not aware of anything like that, only dup/dup2, but you
can't change the access mode after duplicating the fd (at least fcntl
on Linux is not capable of doing it).

> [...] a File contains a handle with minimal access permissions
> and maximal sharing permissions,

The next problem: how should one get a file descriptor on POSIX without
knowing the access mode in advance? If the file is not readable, O_RDONLY
will fail; if it is read-only, O_WRONLY will fail; and O_RDWR is even
worse, because it needs both permissions. OK, we could stat the file
first, but there is no guarantee that the file permissions are still the
same when we later want to "reopen" it.

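To make the access-mode problem concrete, here is a minimal sketch of the
"just try every mode" workaround. It uses System.IO from base rather than
raw POSIX open(2), and openWithAnyMode is a name I made up for
illustration, not part of any proposed API.

```haskell
import Control.Exception (IOException, try)
import System.IO

-- Hypothetical helper: try each access mode in turn and return the first
-- one that succeeds, together with the handle.
-- Caveat: ReadWriteMode and AppendMode create a missing file, so this
-- probe is only meant for paths that already exist.
openWithAnyMode :: FilePath -> IO (Maybe (IOMode, Handle))
openWithAnyMode path = go [ReadWriteMode, ReadMode, AppendMode]
  where
    go [] = return Nothing
    go (mode : rest) = do
      result <- try (openFile path mode) :: IO (Either IOException Handle)
      case result of
        Right h -> return (Just (mode, h))
        Left _  -> go rest
```

Even when this succeeds, the handle only reflects the permissions at the
moment of the call, so it does not remove the race described above.
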
> [...] If there's a way to open files by unique ID instead of pathname, that
> would also work.

I'm not aware of this on POSIX (open a file by inode/fs?).

> [...] All we need here is a way to change the access and sharing rights on an
> already-open handle. I find it hard to believe that after decades of use
> by millions of people, the UNIX file API provides no way to do this
> safely.

Personally, I think this is a sign that one is heading in the wrong
direction... :-)

> [...] What are the practical problems with relying on finalizers? As far as I
> can see, the "no more filehandles available" problem is completely solved
> by forcing a major GC and trying again when it occurs.

But quite a few systems also have an upper limit on the *global* number of
open files, so relying on finalizers would make you a "bad citizen" on
such a system.
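
For reference, the "force a major GC and try again" idea could be sketched
roughly as follows. openWithGcRetry is a hypothetical name, and I am
assuming that GHC reports a full descriptor table as an IOError for which
isFullError holds.

```haskell
import Control.Exception (tryJust)
import System.IO
import System.IO.Error (isFullError)
import System.Mem (performMajorGC)

-- Hypothetical helper: open a file, and if the failure looks like
-- resource exhaustion, force a major GC and retry exactly once.
-- Note: finalizers run asynchronously, so the GC does not guarantee that
-- any descriptor has actually been released by the time we retry.
openWithGcRetry :: FilePath -> IOMode -> IO Handle
openWithGcRetry path mode = do
  result <- tryJust fullOnly (openFile path mode)
  case result of
    Right h -> return h
    Left _  -> do
      performMajorGC
      openFile path mode
  where
    -- Only intercept "resource exhausted" errors; rethrow everything else.
    fullOnly e = if isFullError e then Just e else Nothing
```

This can only free descriptors held by our own process. Until the retry
kicks in, the unreachable handles still count against the system-wide
limit, which is exactly the "bad citizen" problem.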

>>How did you intend text encodings to work?  I see several possibilities:
>>
>>   textDecode :: TextEncoding -> [Octet] -> [Char]
>>
>>or
>>  
>>   decodeInputStream :: TextEncoding -> InputStream -> TextInputStream
>>   getChar :: TextInputStream -> IO Char
>>   etc.
>>
>>or
>>  
>>   setInputStreamCoding :: InputStream -> TextEncoding -> IO ()
>>   getChar :: InputStream -> IO Char
> 
> 
> I was thinking of the second. It could easily be implemented as the third
> under the hood. But I was hoping someone else would worry about it. :-)

In the non-IO versions you have a problem if the encoder/decoder encounters
an error because of a malformed InputStream: a pure function has no clean
way to report it. In the IO case one can simply raise an IO exception. And
using "Maybe TextInputStream" won't help, because the decoder would have to
check the whole input before it could choose between Nothing and Just,
which would essentially make it strict in its InputStream argument.
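
A toy illustration of the strictness point, using plain lists of octets
and a 7-bit ASCII "encoding" instead of the proposed InputStream types.
Both function names are made up for this sketch.

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- Lazy, non-IO decoder: characters are produced incrementally, but a
-- malformed octet can only surface as an exception hidden in the list.
decodeAsciiLazy :: [Word8] -> [Char]
decodeAsciiLazy = map step
  where
    step w
      | w < 128   = chr (fromIntegral w)
      | otherwise = error ("malformed octet: " ++ show w)

-- Maybe-returning decoder: it has to inspect every octet before it can
-- answer Just, so it is strict in the whole input when the input is valid.
decodeAsciiMaybe :: [Word8] -> Maybe [Char]
decodeAsciiMaybe = traverse step
  where
    step w
      | w < 128   = Just (chr (fromIntegral w))
      | otherwise = Nothing
```

With these definitions, take 3 (decodeAsciiLazy (cycle [104,105])) yields
a result on an infinite input, whereas decodeAsciiMaybe (cycle [104,105])
never returns.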

Cheers,
    S.