[GHC] #13497: GHC does not use select()/poll() correctly on non-Linux platforms
GHC
ghc-devs at haskell.org
Fri Mar 31 19:38:08 UTC 2017
-------------------------------------+-------------------------------------
Reporter: nh2 | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.0.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: #8684, #12912 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by nh2):
Replying to [comment:11 nh2]:
> The `select()` occurrence in `awaitEvent()` waits for
> `sleeping_queue->block_info.target - now`
> ([https://github.com/ghc/ghc/blob/380b25ea4754c2aea683538ffdb179f8946219a0/rts/posix/Select.c#L316 code])
> so that also needs to be vetted on whether it has the `&tv`
> updating problem on non-Linux.
OK, I've looked into that in detail now and added some printfs, and I
think this `select()` needs some update as well if we want it to wake up
precisely at `sleeping_queue->block_info.target`.
The reasoning is the following:
Inside the `while ((numFound = select(maxfd+1, &rfd, &wfd, NULL, ptv)) <
0) { ... }` there's this code:
{{{
/* check for threads that need waking up
 */
wakeUpSleepingThreads(getLowResTimeOfDay());

/* If new runnable threads have arrived, stop waiting for
 * I/O and run them.
 */
if (!emptyRunQueue(&MainCapability)) {
    return; /* still hold the lock */
}
}}}
After an EINTR has interrupted `select()`,
`wakeUpSleepingThreads(getLowResTimeOfDay())` checks whether there is a
Haskell thread that wants to run (whether we are past its
`sleeping_queue->block_info.target` time) -- and that includes the thread
for which we're currently `select`ing -- and if so, we `return` out of the
C code. So we're looking at the current time in each loop iteration.
Consequently, we don't have the same problem as in `fdReady`, as this
scheme does not rely on `select()` updating the passed in `struct timeval
*ptv` pointer.
However, there is still a problem: The `EINTR`s interrupting the
`select()` come at fixed intervals (the timer signal). That can result in
us waiting slightly too long.
For example, assume `*ptv` is set to 15 ms
(`sleeping_queue->block_info.target` is 15 ms from `now`), and assume the
timer signal is every 10 ms.
Then we would enter `select(15ms)`, get interrupted with EINTR after 10ms,
and, since the target is still 5ms away, call `select(15ms)` again in the
same while loop. With the next timer EINTR 10ms later (20ms after we
started), `wakeUpSleepingThreads()` finds the thread due and we `return`
out of the `while` loop, so we're not at risk of running forever due to
EINTR. But we have now waited 20ms in total instead of the desired 15ms.
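
The arithmetic above can be checked with a small model. This is only a sketch and not RTS code: the function name `waited_with_fixed_timeout` and its parameters are hypothetical, and it assumes that the first timer signal arrives exactly one timer interval after the loop is entered and that nothing else wakes up `select()`.

```c
#include <assert.h>

/* Hypothetical model, not RTS code: total time (in ms) spent waiting when
 * select() is re-entered with the SAME timeout after every EINTR.
 *
 * target_ms: distance from loop entry to the thread's wake-up target
 * timer_ms:  interval of the timer signal that causes EINTR
 *
 * Assumption: the first timer signal arrives timer_ms after loop entry. */
static long waited_with_fixed_timeout(long target_ms, long timer_ms)
{
    assert(target_ms >= 0 && timer_ms > 0);

    long elapsed = 0;
    for (;;) {
        /* select(target_ms) ends on whichever comes first:
         * its own timeout or the next timer signal. */
        if (target_ms < timer_ms) {
            return elapsed + target_ms;  /* timeout fired first */
        }
        elapsed += timer_ms;             /* interrupted by EINTR */

        /* Models wakeUpSleepingThreads(): leave the loop once the
         * target time has been reached. */
        if (elapsed >= target_ms) {
            return elapsed;
        }
        /* Otherwise select() is called again with the full target_ms. */
    }
}
```

Under these assumptions, `waited_with_fixed_timeout(15, 10)` gives 20, matching the example: the wake-up is rounded up to the next timer tick.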
That's why I currently believe that the timeout argument to this
`select()` should be recalculated based on the current time in every
iteration, similar to how my fix for `fdReady()` does it.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13497#comment:15>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler