Re: Adding new constructors to IOMode to support "don't overwrite if already exists" behavior
Victoria Mitchell
victoria at quietmisdreavus.net
Mon May 11 15:48:02 UTC 2020
Hi all!
I took another look at this over the past weekend and came up with a different solution that might fit Sven's idea of a more-general mechanism to expose the `open()` flags.
I have a new branch open on GitLab: https://gitlab.haskell.org/QuietMisdreavus/ghc/-/commit/ed8f142f9e18f8ef7389b8e8e4d722dcaf740971?w=1
I'm curious at what point i should turn this branch into an MR.
The rough idea is that i introduced a new `IOFlags` type, which wraps the idea of `open()` flags, and added some new "openFileWithFlags" functions that take this new type in addition to the existing `IOMode`. These flags are then combined with the base ones from `IOMode` to create the final set of flags given to `open()`. I decided to keep all the existing "openFile" (et al) behavior w.r.t. always creating the file, clearing the file if `WriteMode`/`ReadWriteMode` were given instead of `AppendMode`, etc.
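To make that concrete, here is a minimal sketch of the shape, not the code from the branch. The names `exclusive`, `baseFlags`, and `finalFlags` are made up for illustration, the `Monoid` instance is just one possible way to combine flags, and `baseFlags` only approximates what GHC.IO.FD passes today. The `o_*` constants come from `System.Posix.Internals` in base.

```haskell
import Data.Bits ((.|.))
import Foreign.C.Types (CInt)
import System.IO (IOMode (..))
import System.Posix.Internals
  (o_APPEND, o_CREAT, o_EXCL, o_NOCTTY, o_RDONLY, o_RDWR, o_WRONLY)

-- Hypothetical stand-in for the IOFlags type: a thin wrapper around
-- the extra bits that get OR'ed into the flags argument of open().
newtype IOFlags = IOFlags CInt
  deriving (Eq, Show)

-- Combining two sets of flags is a bitwise OR.
instance Semigroup IOFlags where
  IOFlags a <> IOFlags b = IOFlags (a .|. b)

-- No extra flags at all.
instance Monoid IOFlags where
  mempty = IOFlags 0

-- The one flag added in the initial commit (name is an assumption).
exclusive :: IOFlags
exclusive = IOFlags o_EXCL

-- Approximate base flags implied by each IOMode. The non-blocking
-- handling in GHC.IO.FD is deliberately left out of this sketch.
baseFlags :: IOMode -> CInt
baseFlags ReadMode      = o_RDONLY .|. o_NOCTTY
baseFlags WriteMode     = o_WRONLY .|. o_CREAT .|. o_NOCTTY
baseFlags ReadWriteMode = o_RDWR .|. o_CREAT .|. o_NOCTTY
baseFlags AppendMode    = o_WRONLY .|. o_CREAT .|. o_APPEND .|. o_NOCTTY

-- The final value that would be handed to open().
finalFlags :: IOMode -> IOFlags -> CInt
finalFlags mode (IOFlags extra) = baseFlags mode .|. extra
```

With this, `finalFlags WriteMode exclusive` is the usual write flags plus O_EXCL, and `finalFlags mode mempty` leaves the existing behavior alone.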
In the initial commit, i only added O_EXCL, since all the other existing flags are exposed either by existing IOMode behavior or in other functions, but more flags could be added later.
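For anyone who wants to see what O_EXCL changes in isolation, this sketch calls `open()` directly through the helpers in `System.Posix.Internals`, outside of any new API. `openFileExcl` and `createNew` are hypothetical names, the 0o666 creation mode is an assumption on my part, and i have only thought about the POSIX side here.

```haskell
import Control.Exception (tryJust)
import Control.Monad (guard)
import Data.Bits ((.|.))
import Foreign.C.Error (throwErrnoPathIfMinus1)
import GHC.IO.Handle.FD (fdToHandle)
import System.IO (Handle)
import System.IO.Error (isAlreadyExistsError)
import System.Posix.Internals
  (c_open, o_CREAT, o_EXCL, o_NOCTTY, o_WRONLY, withFilePath)

-- Open a file for writing, failing with EEXIST if it already exists.
-- O_CREAT together with O_EXCL makes "check and create" one atomic step.
openFileExcl :: FilePath -> IO Handle
openFileExcl path = do
  fd <- withFilePath path $ \cpath ->
    throwErrnoPathIfMinus1 "openFileExcl" path $
      c_open cpath (o_WRONLY .|. o_CREAT .|. o_EXCL .|. o_NOCTTY) 0o666
  -- Wrap the raw descriptor in an ordinary Handle.
  fdToHandle fd

-- Returns Nothing if the file was already there; other errors propagate.
createNew :: FilePath -> IO (Maybe Handle)
createNew path =
  either (const Nothing) Just
    <$> tryJust (guard . isAlreadyExistsError) (openFileExcl path)
```

The EEXIST failure surfaces as an ordinary `IOError` that satisfies `isAlreadyExistsError`, so callers can tell "someone else got there first" apart from other failures.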
-----
Speaking of additional flags to support, i did some additional research into the different sets of flags exposed on different platforms. I'm choosing to ignore the MSVC version of `open()` for the moment, since MSYS is used on Windows currently, and they expose the Linux version of the syscall.
For example, here's an excerpt from the man page for `open(2)` on my macOS system, describing the `oflag` argument:
```
The flags specified for the oflag argument must include exactly one of the following
file access modes:

      O_RDONLY        open for reading only
      O_WRONLY        open for writing only
      O_RDWR          open for reading and writing

In addition any combination of the following values can be or'ed in oflag:

      O_NONBLOCK      do not block on open or for data to become available
      O_APPEND        append on each write
      O_CREAT         create file if it does not exist
      O_TRUNC         truncate size to 0
      O_EXCL          error if O_CREAT and the file exists
      O_SHLOCK        atomically obtain a shared lock
      O_EXLOCK        atomically obtain an exclusive lock
      O_NOFOLLOW      do not follow symlinks
      O_SYMLINK       allow open of symlinks
      O_EVTONLY       descriptor requested for event notifications only
      O_CLOEXEC       mark as close-on-exec
```
(I'm not sure if there are up-to-date man pages for macOS online any more? The only pages i could find on the Apple Developer site were out-of-date pages labeled for iOS, most recently updated in 2016.)
Just from this list, it's worthwhile to winnow out the flags that are already exposed or made irrelevant by existing behavior. For example, O_CREAT is always passed for IOModes that open a file for writing, and O_APPEND is already exposed as AppendMode. The truncation that O_TRUNC would do is performed manually in GHC.IO.FD.openFile' when WriteMode is passed in. The locking that O_SHLOCK and O_EXLOCK would provide is likewise handled inside the runtime, in C code in `rts/FileLock.c`. (Would it be worthwhile to expose these flags anyway to allow for system-level locking?) It looks like O_SYMLINK is macOS-exclusive, since it doesn't appear in the pages for Linux or FreeBSD (linked below). O_CLOEXEC sounds like behavior that's otherwise not exposed, but is it valid to `exec()` here? I'm admittedly not super familiar with all the internals that would need to be accounted for in a fork/exec combo.
To my eyes, that leaves O_EXCL (which i've exposed in my commit), O_NONBLOCK (which is currently exposed in separate functions which i did not wrap in my commit), O_NOFOLLOW, and maybe O_SHLOCK/O_EXLOCK. This list also leaves out the Linux/BSD-exclusive flags that are not available on macOS.
For reference, here is the Linux page for `open(3p)`:
http://man7.org/linux/man-pages/man3/open.3p.html
and the FreeBSD page for `open(2)`:
https://www.freebsd.org/cgi/man.cgi?query=open&sektion=2&manpath=FreeBSD+12.1-RELEASE+and+Ports
These mention even more flags, including the additional modes of O_EXEC and (on Linux) O_SEARCH. An interesting flag not mentioned in the macOS page is O_SYNC (and friends, O_FSYNC/O_DIRECT on BSD and O_DSYNC/O_RSYNC on Linux), which provide extra filesystem synchronization outside of the runtime. They both also mention O_DIRECTORY, which seems like a good guard to expose similarly to O_EXCL or O_NOFOLLOW.
If necessary, we could expose the IOFlags constructor and allow libraries like `unix` to expose system-specific flags themselves.
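As a rough sketch of what that could look like on the library side, assuming the constructor is exported: the `IOFlags` newtype below is a local stand-in, `noFollow` is a hypothetical name, and the example is POSIX-only since it reads O_NOFOLLOW from `fcntl.h` through the `capi` calling convention.

```haskell
{-# LANGUAGE CApiFFI #-}

import Data.Bits ((.|.))
import Foreign.C.Types (CInt (..))

-- Local stand-in for an IOFlags type whose constructor base would export.
newtype IOFlags = IOFlags CInt
  deriving (Eq, Show)

-- Combining flags is a bitwise OR.
instance Semigroup IOFlags where
  IOFlags a <> IOFlags b = IOFlags (a .|. b)

-- Read the platform's value of O_NOFOLLOW straight from the C header,
-- so the library never hard-codes a number.
foreign import capi "fcntl.h value O_NOFOLLOW" c_O_NOFOLLOW :: CInt

-- A system-specific flag defined entirely outside of base.
noFollow :: IOFlags
noFollow = IOFlags c_O_NOFOLLOW
```

A package like `unix` could define as many of these as its platforms support, and users would combine them with `<>` before passing them to the "openFileWithFlags" functions.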
-----
Regardless, i'm curious about everyone's impression of this approach. This should allow a more extensible method to expose these additional flags, which should sidestep the previous concerns about naming the new IOModes or creating "new ad-hoc combinations of flags".
Thanks,
Victoria Mitchell (formerly Grey Mitchell) (@QuietMisdreavus)
https://quietmisdreavus.net/
On Wed, Apr 1, 2020, at 12:45 AM, Sven Panne wrote:
> Let's have a look at the POSIX spec for open(): https://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html It very clearly distinguishes between *2* disjoint sets of flags:
>
> * You have to use exactly one of O_EXEC, O_RDONLY, O_RDWR, O_SEARCH, and O_WRONLY.
>
> * In addition, you can specify any combination of O_APPEND, O_CLOEXEC, O_CREAT, O_DIRECTORY, O_DSYNC, O_EXCL, O_NOCTTY, O_NOFOLLOW, O_NONBLOCK, O_RSYNC, O_SYNC, O_TRUNC, and O_TTY_INIT.
>
> Alas, GHC.IO.FD and GHC.IO.IOMode completely ignore this distinction already and add tons of special cases:
>
> * They add an ad hoc combination of O_WRONLY and O_APPEND (AppendMode).
>
> * They leave out O_EXEC and O_SEARCH.
>
> * They add an ad hoc boolean flag for non-blocking I/O to openFile, i.e. pick 1 of the 13 possible additional flags.
>
> * For some obscure reason, O_NOCTTY is always added.
>
> * HandleType doesn't reflect the first set of flags.
>
> My proposal is to not make this mess even worse by adding yet another ad hoc combination, but to think about how to expose these two sets in a sane way.
>
> I didn't have a look at the Windows counterpart or into all the *nix variants yet, but I guess that the overlap in the API is quite big. But even the intersection of all the relevant platforms will very probably have the distinction between the 2 sets of flags, so we should reflect that in the Haskell API, too.