New Windows I/O manager in GHC 8.12

Phyx lonetiger at gmail.com
Fri Jul 17 15:03:43 UTC 2020


Hi All,

In case you've missed it, about 150 or so commits were committed to master
yesterday.  These commits add WinIO (Windows I/O) to GHC.  This is a new I/O
manager that is designed for the native Windows I/O subsystem instead of
relying on the broken posix-ish compatibility layer that MIO used.

This is one of 3 big patches I have been working on for years now.

So before I continue on why WinIO was made, here is a TL;DR:

WinIO adds an internal API break compared to previous GHC releases.  That
is, the internal code was modified to support a completely asynchronous I/O
system.

What this means is that we have to keep track of the file pointer offset,
which was previously done by the C runtime.  This is because in async I/O
you cannot assume the offset to be at any given location.
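
To make the offset point concrete, here is a minimal sketch of what "the
runtime tracks the offset itself" can look like.  It is only a model: the
names TrackedFile, openTracked, readAt and readNext are hypothetical and are
not GHC's internal Buffer/RawIO types, and an in-memory String stands in for
the file.  Every read request names its offset explicitly, as an overlapped
read would, and the caller advances its own counter afterwards.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical model, not GHC's internal Buffer/RawIO types.
data TrackedFile = TrackedFile
  { tfContents :: String      -- stand-in for the bytes on disk
  , tfOffset   :: IORef Int   -- offset owned by the runtime, not the OS
  }

openTracked :: String -> IO TrackedFile
openTracked contents = TrackedFile contents <$> newIORef 0

-- Stand-in for an overlapped read: the caller must say where to read from.
readAt :: String -> Int -> Int -> String
readAt contents off len = take len (drop off contents)

-- Sequential read built on top of readAt: look up the tracked offset,
-- issue the request at that offset, then advance by what was actually read.
readNext :: TrackedFile -> Int -> IO String
readNext tf len = do
  off <- readIORef (tfOffset tf)
  let chunk = readAt (tfContents tf) off len
  writeIORef (tfOffset tf) (off + length chunk)
  pure chunk

main :: IO ()
main = do
  tf <- openTracked "hello world"
  a <- readNext tf 5
  b <- readNext tf 6
  print (a, b)
```

The design point is that readAt never consults any hidden state, so requests
can complete in any order without corrupting the position.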

What does this mean for you? Very little, if you did not use internal GHC
I/O code, in particular Buffer, BufferIO and RawIO. If you have, you will
have to explicitly add support for GHC 8.12+.

Because FDs are a Unix concept and don't behave as you would expect on
Windows, the new I/O manager also uses HANDLE instead of FD. This means that
any library that has used the internal GHC Fd type won't work with WinIO.
Luckily the number of libraries that have seems quite low. If you can,
please stick to the external Handle interface for I/O functions.
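
As an illustration of what sticking to the external Handle interface means,
here is a small sketch that uses only System.IO from base.  The helper name
roundTrip and the scratch file name are made up for this example.  Since no
Fd is involved, code in this style should not need changes for WinIO.

```haskell
import System.IO

-- Write one line and read it back using only the portable Handle API.
-- No Fd is touched anywhere in this code.
roundTrip :: FilePath -> String -> IO String
roundTrip path msg = do
  withFile path WriteMode $ \h -> hPutStrLn h msg
  withFile path ReadMode hGetLine

main :: IO ()
main = do
  -- "winio-demo.txt" is a hypothetical scratch file name.
  line <- roundTrip "winio-demo.txt" "hello from Handle"
  putStrLn line
```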

The boot libraries have been updated, and in particular process *requires*
the version that is shipped with GHC.  Please respect the version bounds
here!  I will be writing a migration guide for those that need to migrate
code.  The amount of work is usually trivial as base provides shims to do
most of the common things you would have used Fd for.

Also, if I may make a plea to GHC developers: do not add non-trivial
implementations in the externally exposed modules (e.g. System.xxx,
Data.xxx) but rather add them to internal modules (GHC.xxx) and re-export
them from the external modules.  This allows us to avoid import cycles
inside the internal modules :)

--

So why WinIO? Over the years a number of hard to fix issues popped up on
Windows, including proper Unicode console I/O, cooked inputs, and the
ability to cancel I/O requests.  WinIO also allows libraries like Brick to
work on Windows without re-inventing the wheel or having to hide their I/O
from the I/O manager.

In order to attempt some of these with MIO, layer upon layer of hacks was
added.  This means that things sometimes worked, but when they didn't, the
failure was rather unpredictable.  Some of the issues were simply unfixable
with MIO.  I will be making some posts about how WinIO works (and also
archiving them on the wiki, don't worry :)) but for now, some highlights
follow below.

WinIO is 3 years of work, first started by Joey Hess, then picked up by
Mikhail Glushenkov before landing at my feet.  While the majority has been
rewritten, their work did provide a great jumping-off point, so thanks!
Also thanks to Ben and AndreasK for helping me get it over the line.  As you
can imagine I was exhausted by this point :).

Some stats: ~8000 new lines and ~1100 removed ones spread over 130+ commits
(sorry this was the smallest we could get it while not losing some
historical context) and with over 153 files changed not counting the
changes to boot libraries.

It fixes #18307, #17035, #16917, #15366, #14530, #13516, #13396, #13359,
#12873, #12869, #11394, #10542, #10484, #10477, #9940, #7593, #7353, #5797,
#5305, #4471, #3937, #3081, #12117, #2408, #10956, #2189
(but only on native windows consoles, so no msys shells) and #806 which is
14 years old!

WinIO is a dynamic choice, so you can switch between I/O managers using the
RTS flag --io-manager=[native|posix].

On non-Windows platforms, native is the same as posix.

The asynchronous interface chosen for this implementation is I/O Completion
Ports.

The I/O manager uses an interface added in Windows Vista called
GetQueuedCompletionStatusEx, which allows us to service multiple
request interrupts in one go.

Some highlights:

* Drops Windows Vista support.
  Vista is out of extended support as of 2017. The new minimum is Windows 7.
  This allows us to use much more efficient OS-provided abstractions.

* Replace Events and Monitor locks with much faster and more efficient
  Condition Variables and Slim Reader/Writer (SRW) locks.
* Change GHC's Buffer and I/O structs to support asynchronous operation by
not relying on the OS managing File Offset.
* Implement a new command line flag +RTS --io-manager=[native|posix] to
control which I/O manager is used.
* Implement a new Console I/O interface supporting much faster reads/writes
  and correct Unicode output.  It also supports things like cooked input.
* In the new I/O manager, if the user still has their code page set to OEM,
  then we use UTF-8 by default. This allows Unicode to work correctly out of
  the box.
* Add Atomic Exchange PrimOp and implement Atomic Ptr exchanges.
* Flush event logs eagerly so as not to rely on finalizers running.
* A lot of refactoring and more use of hsc2hs to share constants.
* Control aborts (Ctrl+C) should be a bit more reliable.
* Add a new IOPort primitive that should only be used for these I/O
  operations. Essentially an IOPort is based on an MVar with the following
  major differences:
  - Does not allow multiple pending writes. If the port is full, a second
    write is just discarded.
  - There is no deadlock avoidance guarantee. If you block on an IOPort and
    your Haskell application does not have any work left to do, the whole
    application is stalled.  In the threaded RTS we just continue idling; in
    the non-threaded RTS the scheduler is blocked.

* Support various optimizations in the Windows I/O manager, such as
  skipping I/O Completion if the request finished synchronously.
* The I/O manager is now agnostic to the handle type, i.e. there is no
  socket-specific code in the manager.  This is now all pushed to the
  network library, completely de-coupling the two.
* Unified threaded and non-threaded I/O code. The only major differences are
  where the event loop is driven from and that the non-threaded RTS will
  always use a single OS thread to service requests. We cannot use more as
  there are no RTS locks to make concurrent modifications safe.
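
The "single pending write" rule of the IOPort item above can be sketched
with a plain MVar.  This is only a model under stated assumptions: Port,
newPort, writePort and readPort are hypothetical names for this example and
not the primitive itself.  Because the model sits on an MVar, it also does
not reproduce the second difference (the lack of deadlock avoidance).

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar, takeMVar, tryPutMVar)

-- Hypothetical model of a port that holds at most one pending value.
newtype Port a = Port (MVar a)

newPort :: IO (Port a)
newPort = Port <$> newEmptyMVar

-- Non-blocking write.  Returns False when the port was already full,
-- in which case the new value is simply discarded.
writePort :: Port a -> a -> IO Bool
writePort (Port m) x = tryPutMVar m x

-- Blocking read of the single pending value.
readPort :: Port a -> IO a
readPort (Port m) = takeMVar m

main :: IO ()
main = do
  p      <- newPort
  first  <- writePort p "result A"
  second <- writePort p "result B"   -- port is full, so this one is dropped
  v      <- readPort p
  print (first, second, v)
```

Discarding instead of blocking means the writer side (the I/O manager in the
real design) never has to wait on the reader.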

Cheers,
Tamar