[GHC] #13497: GHC does not use select()/poll() correctly on non-Linux platforms

GHC ghc-devs at haskell.org
Thu Sep 28 13:24:43 UTC 2017


#13497: GHC does not use select()/poll() correctly on non-Linux platforms
-------------------------------------+-------------------------------------
        Reporter:  nh2               |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Runtime System    |              Version:  8.0.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:  8684
 Related Tickets:  #8684, #12912     |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Description changed by nh2:

New description:

 From my discovery at https://phabricator.haskell.org/D42#30542:

 {{{
 Why does the existing code work on platforms that are not Linux? In my
 select man page it says:

 On Linux, select() modifies timeout to reflect the amount of  time  not
 slept;  most  other implementations do not do this.  (POSIX.1-2001 per‐
 mits either behavior.)  This causes problems both when Linux code which
 reads  timeout  is  ported to other operating systems, and when code is
 ported to Linux that reuses a struct timeval for multiple select()s  in
 a  loop  without  reinitializing  it.  Consider timeout to be undefined
 after select() returns.

 The existing select loop seems to rely on the fact that &tv is updated as
 described here.
 }}}

 Same for `man 2 poll`.
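
 One portable way to avoid depending on whether `select()` updates the
 timeout is to track an absolute deadline on a monotonic clock and rebuild
 the `struct timeval` before every call. The following is a minimal sketch
 under that assumption, not the code in GHC; `monotonic_ms`,
 `remaining_timeval` and `wait_readable` are hypothetical names introduced
 here for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>

/* Current time on the monotonic clock, in milliseconds. */
int64_t monotonic_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Hypothetical helper: time left until deadline_ms, clamped at zero,
 * expressed as a fresh struct timeval. */
struct timeval remaining_timeval(int64_t deadline_ms, int64_t now_ms)
{
    int64_t left = deadline_ms - now_ms;
    if (left < 0)
        left = 0;
    struct timeval tv;
    tv.tv_sec = (time_t)(left / 1000);
    tv.tv_usec = (suseconds_t)((left % 1000) * 1000);
    return tv;
}

/* Hypothetical wrapper: wait until fd is readable or timeout_ms elapsed.
 * Returns 1 if readable, 0 on timeout, -1 on error (errno is set). */
int wait_readable(int fd, int timeout_ms)
{
    assert(fd >= 0 && fd < FD_SETSIZE && timeout_ms >= 0);
    const int64_t deadline = monotonic_ms() + timeout_ms;

    for (;;) {
        /* Rebuild the fd set and the timeout on every iteration; their
         * contents must be treated as undefined after select() returns. */
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        struct timeval tv = remaining_timeval(deadline, monotonic_ms());

        int r = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (r > 0)
            return 1;
        if (r == 0)
            return 0;
        if (errno != EINTR)
            return -1;
        /* EINTR: retry, waiting only for the time that is actually left. */
    }
}
```

 Because the remaining time is derived from the clock rather than from the
 `timeval` that `select()` was handed, the loop behaves the same whether or
 not the platform modifies the timeout. The same deadline arithmetic would
 apply to a `poll()` loop, whose millisecond timeout is passed by value.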

 E.g. `man 2 select` on FreeBSD 11 says explicitly:

 {{{
 BUGS
      Version 2 of the Single UNIX Specification (``SUSv2'') allows systems to
      modify the original timeout in place.  Thus, it is unwise to assume that
      the timeout value will be unmodified by the select() system call.
      FreeBSD does not modify the return value, which can cause problems for
      applications ported from other systems.
 }}}

 I have tested this now on FreeBSD, and indeed it doesn't work as expected.

 With GHC 7.10.2:

 {{{
 import System.IO
 -- Wait up to 1 second (1000 ms) for input to become available on stdin.
 main = hWaitForInput stdin (1 * 1000)
 }}}

 `ghc --make test.hs -rtsopts`

 {{{
 [root@ ~]# time ./test

 real    0m1.386s
 user    0m0.004s
 sys     0m0.000s
 [root@ ~]# time ./test +RTS -V0.01

 real    0m1.386s
 user    0m0.001s
 sys     0m0.000s
 [root@ ~]# time ./test +RTS -V0.001

 real    0m1.678s
 user    0m0.003s
 sys     0m0.002s
 [root@ ~]# time ./test +RTS -V0.0001

 real    0m11.311s
 user    0m0.032s
 sys     0m0.139s
 }}}

 See how, when we increase the frequency of the timer signal (a smaller
 `-V` tick interval), the sleep suddenly takes about 10x longer than it
 should.

 That's because it triggers the case where `select()` fails with EINTR in
 https://github.com/ghc/ghc/blob/f46369b8a1bf90a3bdc30f2b566c3a7e03672518%5E/libraries/base/cbits/inputReady.c#L48,
 so the loop retries with the same unmodified 1-second
 `struct timeval *timeout` again and again.
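
 The effect of restarting the full timeout after each interruption can be
 illustrated with a toy model. This is only a sketch: it is not GHC code,
 the interruption times are made up, and it does not try to reproduce the
 timings measured above. `simulated_wait_ms` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: total time (ms) spent waiting for a timeout of timeout_ms
 * when the wait is interrupted at the given times (ascending, in ms since
 * the start) and no input ever arrives.
 *
 * restart_full != 0: every interruption restarts the full timeout, as
 *                    happens when the timeval is reused unmodified.
 * restart_full == 0: the wait resumes with only the time left until the
 *                    original deadline.
 */
int64_t simulated_wait_ms(int64_t timeout_ms, const int64_t *interrupts,
                          size_t n, int restart_full)
{
    assert(timeout_ms >= 0);
    int64_t deadline = timeout_ms;  /* when the current wait would expire */

    for (size_t i = 0; i < n; i++) {
        if (interrupts[i] >= deadline)
            break;                  /* wait expired before this interrupt */
        if (restart_full)
            deadline = interrupts[i] + timeout_ms;
        /* otherwise the original deadline is kept */
    }
    return deadline;
}
```

 With interruptions at 900, 1800 and 2700 ms and a 1000 ms timeout, the
 restarting variant returns 3700 ms, while the deadline-based variant
 returns 1000 ms.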

 This demo reproduces the bug with GHC 7.10 and 8.0.1; in 8.0.2
 `hWaitForInput` is itself broken
 (https://ghc.haskell.org/trac/ghc/ticket/12912#comment:4), so the demo
 doesn't work there.

 ---

 For convenience, here is the call chain of
 [https://gist.github.com/nh2/6f571ce00667bc49d845ab4c8fdf9769
 hWaitForInput].

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13497#comment:28>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

