[GHC] #13497: GHC does not use select()/poll() correctly on non-Linux platforms
GHC
ghc-devs at haskell.org
Thu Sep 28 13:24:43 UTC 2017
-------------------------------------+-------------------------------------
Reporter: nh2 | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.0.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking: 8684
Related Tickets: #8684, #12912 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Description changed by nh2:
Old description:
> From my discovery at https://phabricator.haskell.org/D42#30542:
>
> {{{
> Why does the existing code work on platforms that are not Linux? In my
> select man page it says:
>
> On Linux, select() modifies timeout to reflect the amount of time not
> slept; most other implementations do not do this. (POSIX.1-2001
> permits either behavior.) This causes problems both when Linux code which
> reads timeout is ported to other operating systems, and when code is
> ported to Linux that reuses a struct timeval for multiple select()s in
> a loop without reinitializing it. Consider timeout to be undefined
> after select() returns.
>
> The existing select loop seems to rely on the fact that &tv is updated as
> described here.
> }}}
>
> Same for `man 2 poll`.
>
> E.g. `man 2 select` on FreeBSD 11 says explicitly:
>
> {{{
> BUGS
> Version 2 of the Single UNIX Specification (``SUSv2'') allows systems to
> modify the original timeout in place. Thus, it is unwise to assume that
> the timeout value will be unmodified by the select() system call.
> FreeBSD does not modify the return value, which can cause problems for
> applications ported from other systems.
> }}}
>
> I have tested this now on FreeBSD, and indeed it doesn't work as
> expected.
>
> With GHC 7.10.2:
>
> {{{
> import System.IO
> main = hWaitForInput stdin (1 * 1000)
> }}}
>
> `ghc --make test.hs -rtsopts`
>
> {{{
> [root@ ~]# time ./test
>
> real 0m1.386s
> user 0m0.004s
> sys 0m0.000s
> [root@ ~]# time ./test +RTS -V0.01
>
> real 0m1.386s
> user 0m0.001s
> sys 0m0.000s
> [root@ ~]# time ./test +RTS -V0.001
>
> real 0m1.678s
> user 0m0.003s
> sys 0m0.002s
> [root@ ~]# time ./test +RTS -V0.0001
>
> real 0m11.311s
> user 0m0.032s
> sys 0m0.139s
> }}}
>
> See how, when we increase the timer signal frequency (by lowering the `-V`
> tick interval to 0.0001 s), the sleep suddenly takes about 10x longer than
> it should.
>
> That's because it triggers the case where EINTR is received in
> https://github.com/ghc/ghc/blob/f46369b8a1bf90a3bdc30f2b566c3a7e03672518%5E/libraries/base/cbits/inputReady.c#L48,
> so each retry passes the same unmodified 1-second `struct timeval *timeout`
> to select() again, restarting the full wait.
>
> This demo of the bug works for GHC 7.10 and 8.0.1; in 8.0.2
> `hWaitForInput` is broken
> (https://ghc.haskell.org/trac/ghc/ticket/12912#comment:4) so the demo
> doesn't work there.
New description:
From my discovery at https://phabricator.haskell.org/D42#30542:
{{{
Why does the existing code work on platforms that are not Linux? In my
select man page it says:
On Linux, select() modifies timeout to reflect the amount of time not
slept; most other implementations do not do this. (POSIX.1-2001
permits either behavior.) This causes problems both when Linux code which
reads timeout is ported to other operating systems, and when code is
ported to Linux that reuses a struct timeval for multiple select()s in
a loop without reinitializing it. Consider timeout to be undefined
after select() returns.
The existing select loop seems to rely on the fact that &tv is updated as
described here.
}}}
Same for `man 2 poll`.
E.g. `man 2 select` on FreeBSD 11 says explicitly:
{{{
BUGS
Version 2 of the Single UNIX Specification (``SUSv2'') allows systems to
modify the original timeout in place. Thus, it is unwise to assume that
the timeout value will be unmodified by the select() system call.
FreeBSD does not modify the return value, which can cause problems for
applications ported from other systems.
}}}
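Since the timeout has to be treated as undefined after the call returns, a portable retry loop needs to track the deadline itself. Here is a minimal sketch of that idea for the poll() side, where the timeout is a plain `int` of milliseconds passed by value, so there is never an updated value to read back. `now_nsec` and `poll_readable` are hypothetical names made up for this illustration; they are not functions from GHC's inputReady.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock reading in nanoseconds (hypothetical helper). */
static int64_t now_nsec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
}

/* Hypothetical helper: wait until fd is ready for reading or timeout_ms
 * has elapsed. Returns 1 if poll() reported an event on fd (readable,
 * hang-up or error), 0 on timeout, -1 on an error other than EINTR. */
static int poll_readable(int fd, int timeout_ms)
{
    assert(timeout_ms >= 0);

    /* Fix the deadline once, before the first call. */
    const int64_t deadline = now_nsec() + (int64_t)timeout_ms * 1000000;

    for (;;) {
        /* Recompute the remaining time on every iteration instead of
         * reusing the original timeout. */
        int64_t left_ns = deadline - now_nsec();
        if (left_ns < 0)
            left_ns = 0;
        /* Round up to whole milliseconds so we never wake up early. */
        int left_ms = (int)((left_ns + 999999) / 1000000);

        struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
        int res = poll(&pfd, 1, left_ms);
        if (res > 0)
            return 1;
        if (res == 0)
            return 0;
        if (errno != EINTR)
            return -1;
        /* EINTR: loop again with a shorter remaining timeout. */
    }
}
```

poll() ignores entries with a negative fd, so `poll_readable(-1, 50)` simply sleeps for about 50 ms and returns 0, which is a cheap way to check the timeout path.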
I have tested this now on FreeBSD, and indeed it doesn't work as expected.
With GHC 7.10.2:
{{{
import System.IO
main = hWaitForInput stdin (1 * 1000)
}}}
`ghc --make test.hs -rtsopts`
{{{
[root@ ~]# time ./test
real 0m1.386s
user 0m0.004s
sys 0m0.000s
[root@ ~]# time ./test +RTS -V0.01
real 0m1.386s
user 0m0.001s
sys 0m0.000s
[root@ ~]# time ./test +RTS -V0.001
real 0m1.678s
user 0m0.003s
sys 0m0.002s
[root@ ~]# time ./test +RTS -V0.0001
real 0m11.311s
user 0m0.032s
sys 0m0.139s
}}}
See how, when we increase the timer signal frequency (by lowering the `-V`
tick interval to 0.0001 s), the sleep suddenly takes about 10x longer than
it should.
That's because it triggers the case where EINTR is received in
https://github.com/ghc/ghc/blob/f46369b8a1bf90a3bdc30f2b566c3a7e03672518%5E/libraries/base/cbits/inputReady.c#L48,
so each retry passes the same unmodified 1-second `struct timeval *timeout`
to select() again, restarting the full wait.
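One way to make the EINTR retry independent of whether the OS updates the timeout is to compute a deadline once and rebuild the `struct timeval` from it before every select() call. The following is only a sketch under that assumption, not the actual fix applied to inputReady.c; `now_usec`, `timeval_until`, `wait_readable` and `wait_stdin_1s` are hypothetical names introduced here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in microseconds (hypothetical helper). */
static int64_t now_usec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Time left until deadline_usec as a fresh struct timeval, clamped at 0. */
static struct timeval timeval_until(int64_t deadline_usec, int64_t now)
{
    int64_t left = deadline_usec > now ? deadline_usec - now : 0;
    struct timeval tv;
    tv.tv_sec = (time_t)(left / 1000000);
    tv.tv_usec = (suseconds_t)(left % 1000000);
    return tv;
}

/* Hypothetical helper: wait until fd is readable or timeout_ms has elapsed.
 * Returns 1 if readable, 0 on timeout, -1 on an error other than EINTR. */
static int wait_readable(int fd, int timeout_ms)
{
    assert(timeout_ms >= 0);
    assert(fd >= 0 && fd < FD_SETSIZE);

    const int64_t deadline = now_usec() + (int64_t)timeout_ms * 1000;

    for (;;) {
        /* Treat both the fd_set and the timeval as undefined after
         * select() returns, and rebuild them on every iteration. */
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        struct timeval tv = timeval_until(deadline, now_usec());

        int res = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (res > 0)
            return 1;
        if (res == 0)
            return 0;
        if (errno != EINTR)
            return -1;
        /* EINTR (e.g. from the RTS timer signal): retry with the time
         * that is actually left, not the original 1-second value. */
    }
}

/* Usage mirroring the test program above: wait up to 1 s on stdin. */
int wait_stdin_1s(void)
{
    return wait_readable(STDIN_FILENO, 1000);
}
```

With this shape, a high timer signal frequency only causes more loop iterations; the total wait stays bounded by the deadline on Linux and on platforms that leave the timeout untouched.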
This demo of the bug works for GHC 7.10 and 8.0.1; in 8.0.2
`hWaitForInput` is broken
(https://ghc.haskell.org/trac/ghc/ticket/12912#comment:4) so the demo
doesn't work there.
---
Convenience: Here is the call chain of
[https://gist.github.com/nh2/6f571ce00667bc49d845ab4c8fdf9769 hWaitForInput]
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13497#comment:28>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler