I/O manager: relying solely upon kqueue is not a safe way to go

PHO pho at cielonegro.org
Fri Mar 15 20:54:23 CET 2013


I found that HEAD has stopped working on Mac OS X 10.5.8 since the
parallel I/O manager was merged. The stage-2 compiler builds
successfully (including Language.Haskell.TH.Syntax, contrary to the
report by Kazu Yamamoto), but the resulting binary is very unstable,
especially for ghci:

  % inplace/bin/ghc-stage2  --interactive
  GHCi, version 7.7.20130313: http://www.haskell.org/ghc/  :? for help
  Loading package ghc-prim ... linking ... done.
  Loading package integer-gmp ... linking ... done.
  Loading package base ... linking ... done.
  Prelude>
  <stdin>: hGetChar: failed (Operation not supported)

So I took a dtruss log and found that it was kevent(2) that returned
ENOTSUP. GHC.Event.KQueue was just registering stdin for EVFILT_READ;
stdin was of course a tty, and kevent(2) said "tty is not supported".
Didn't the old I/O manager do the same thing? Why was it working then?

After a hard investigation, I concluded that the old I/O manager was
not really working. It just looked fine but in fact wasn't. Here's the
explanation: if an fd to be registered is unsupported by kqueue,
kevent(2) returns -1 only if no incoming event buffer is passed along
with the registration. Otherwise it returns successfully with an
incoming kevent whose "flags" has EV_ERROR set and whose "data"
contains the errno. The I/O manager had always been passing a
non-empty event buffer until commit e5f5cfcd, while it wasn't (and
still isn't) checking whether a received event in fact represents an
error. That is, the KQueue backend asks the kernel to monitor stdin's
readability. The kernel then immediately delivers an event saying
ENOTSUP. The KQueue backend thinks "Hey, stdin is now readable!" and
invokes the callback associated with the fd. The thread which called
"threadWaitRead" is then woken up and performs a supposedly
non-blocking read on the fd, which in fact blocks but works anyway.
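
To make the kevent(2) behaviour described above concrete, here is a
minimal C sketch (this is not the GHC code). It registers an fd for
EVFILT_READ while passing a one-slot event buffer, and then actually
checks the returned event for EV_ERROR. The name probe_kqueue_read is
made up for illustration, and the ENOSYS stub for systems without
<sys/event.h> is my own assumption, there only so that the file
compiles everywhere.

```c
#include <errno.h>
#include <unistd.h>

#if defined(__APPLE__) || defined(__FreeBSD__) || defined(__NetBSD__) || \
    defined(__OpenBSD__) || defined(__DragonFly__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

/* Hypothetical helper: returns 0 if kqueue accepts fd for EVFILT_READ,
 * otherwise an errno value such as ENOTSUP. */
int probe_kqueue_read(int fd)
{
    struct kevent change, result;
    struct timespec zero = { 0, 0 };   /* poll, do not block */
    int kq, n, err = 0;

    kq = kqueue();
    if (kq == -1)
        return errno;

    EV_SET(&change, fd, EVFILT_READ, EV_ADD, 0, 0, 0);

    /* A one-slot event buffer is passed, so a failed registration comes
     * back as an event with EV_ERROR set rather than as a -1 return. */
    n = kevent(kq, &change, 1, &result, 1, &zero);
    if (n == -1)
        err = errno;                   /* kevent(2) itself failed */
    else if (n == 1 && (result.flags & EV_ERROR))
        err = (int)result.data;        /* the errno is carried in "data" */

    close(kq);
    return err;
}
#else
/* Assumption: no kqueue on this platform (e.g. Linux). */
int probe_kqueue_read(int fd)
{
    (void)fd;
    return ENOSYS;
}
#endif
```

Skipping the EV_ERROR test in this sketch would reproduce the old
behaviour: the error event would be taken for a readiness
notification.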

However, the situation has changed since commit e5f5cfcd. The I/O
manager now registers fds without passing an incoming event buffer, so
kevent(2) no longer delivers an error event. Instead it directly
returns -1 with errno set to ENOTSUP, hence the "Operation not
supported" exception.

Darwin's kqueue has supported ttys since Mac OS X 10.7 (Lion), but I
heard it still doesn't support some important devices like
/dev/random. FreeBSD's kqueue has some difficulties too. There's no
doubt that kqueue is a great mechanism for sockets, but IMHO it's not
something to use for all kinds of file I/O. Should we then try
kqueue(2) first and fall back to poll(2) if it fails? Sadly no.
Darwin's poll(2) is broken too, and select(2) is the only reliable
method:
http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#OS_X_AND_DARWIN_BUGS
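
For reference, a readiness check using select(2) alone could look
roughly like the following C sketch. The helper name
select_readable_now is hypothetical. It only asks whether an fd is
readable right now (zero timeout) and is not meant as a complete
backend.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd is readable right now, 0 if it
 * is not, and -1 if the fd cannot be handed to select(2) (errno set). */
int select_readable_now(int fd)
{
    fd_set readfds;
    struct timeval zero = { 0, 0 };    /* poll, do not block */
    int n;

    /* select(2) can only describe fds below FD_SETSIZE. */
    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EBADF;
        return -1;
    }

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    n = select(fd + 1, &readfds, NULL, NULL, &zero);
    if (n == -1)
        return -1;
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}
```

The FD_SETSIZE limit is the obvious cost of relying on select(2) for
everything.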

I wrote a small program to test if a given stdin is supported by
kqueue, poll and select:
https://gist.github.com/phonohawk/5169980#file-kqueue-poll-select-cpp

MacOS X 10.5.8 is hopelessly broken. We can't use anything other than
select(2) for tty and other devices:
https://gist.github.com/phonohawk/5169980#file-powerpc-apple-darwin9-8-0-txt

FreeBSD 8.0 does support tty but not /dev/random. I don't know about
the latest FreeBSD 9.1:
https://gist.github.com/phonohawk/5169980#file-i386-unknown-freebsd8-0-txt

NetBSD 6.99.17 works perfectly here:
https://gist.github.com/phonohawk/5169980#file-x86_64-unknown-netbsd6-99-17-txt

Just for reference, Linux 2.6.16 of course doesn't have kqueue(2), but
it supports poll(2)ing on devices:
https://gist.github.com/phonohawk/5169980#file-x86_64-unknown-linux2-6-16-txt

I accordingly suggest that we should have some means to run two
independent I/O managers on BSD-like systems: KQueue for sockets,
pipes and regular file reads, and Select for any other devices and
regular file writes. Note also that no implementation of kqueue
supports monitoring the writability of regular files, which also
endangers the current use of the kqueue-based I/O manager, namely
"threadWaitWrite".

Any ideas?

Thanks,
PHO
_______________________________________________________
 - PHO -                         http://cielonegro.org/
OpenPGP public key: 1024D/1A86EF72
Fpr: 5F3E 5B5F 535C CE27 8254  4D1A 14E7 9CA7 1A86 EF72