[Haskell-cafe] When is a bug GHC's fault/strange STM behaviour
Daniel Fischer
daniel.is.fischer at web.de
Sat Mar 13 16:53:38 EST 2010
On Saturday 13 March 2010 17:36:49, Michael Lesniak wrote:
> Hello,
>
> In one of my example programs I have a strange behaviour: it is a very
> simple taskpool using STM; in pseudocode it's
>
> 1. generate data structures
> 2. initialize data structures
> 3. fork threads
> 4. wait (using STM) until the pool is empty and all threads are finished
> 5. print a final message
>
> In very few cases, which depend on the number of threads spawned, the
> program hangs *after* the final message of step 5 has been printed.
> "Few cases" means, for example, 50,000 good, terminating runs before
> it hangs. If you increment the number of spawned threads (to a few
> hundred or thousands), it hangs much faster. Since forked threads
> terminate after the main thread terminates (which it should after
> printing the message), this behaviour is quite unexpected.
I won't pretend I really understand what's going on, but it seems that
occasionally a couple of threads are caught in a retry-loop. When I have
each thread print its ThreadId once it's done, only one thread reports
being done in the runs that hang.
I don't see how that could happen, but that's what I found.
For the attached programme, in the task-getting,

          else if Set.null work
            then return Nothing
            else retry

doesn't really make sense: when the channel is empty, we could return
Nothing right away. I suppose, in the real programme, some threads might
write further tasks to the channel, so while not all threads have finished,
the channel might not be permanently empty?
If not, "return Nothing" whenever the channel is empty ought to reliably
end all threads and prevent hanging. If yes, writing strict values to
working:

    get chan working = do
      tid <- myThreadId

      -- atomically commit that this thread is not working anymore (since we
      -- try to get a task we must be quasi-idle!)
      atomically $ do
        work <- Set.delete tid `fmap` readTVar working
        writeTVar working $! work

      -- waits for a new task. if all threads are idle and the pool is empty,
      -- return.
      atomically $ do
        empty <- isEmptyTChan chan
        work  <- readTVar working
        if (not empty)
          then do
            task <- readTChan chan
            writeTVar working $! (Set.insert tid work)
            return (Just task)
          else if Set.null work
            then return Nothing
            else retry

seems to prevent hanging on my box (running fine with "100 64 1 +RTS -N",
nearing task 60000; without the strict writes it typically hangs after a
few dozen or a few hundred runs).
I think the strict write in "writeTVar working $! (Set.insert tid work)"
isn't necessary, but I haven't yet tested it.
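
For anyone who wants to experiment without the attachment, here is a minimal, self-contained sketch built around the `get` function above. Only `get` comes from this thread; `worker`, `runPool` and the `total`/`finished` counters are hypothetical scaffolding invented for illustration, and the tasks are plain Ints that get summed. It assumes an stm version that provides `modifyTVar'`, and it is not claimed to reproduce or fix the reported hang.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.STM
import Control.Monad (replicateM_)
import qualified Data.Set as Set

-- Fetch the next task, or Nothing once the channel is empty and no
-- thread is registered as working (as in the post, with strict writes).
get :: TChan a -> TVar (Set.Set ThreadId) -> IO (Maybe a)
get chan working = do
  tid <- myThreadId
  -- Mark this thread as idle; ($!) forces the new set before the write.
  atomically $ do
    work <- Set.delete tid `fmap` readTVar working
    writeTVar working $! work
  -- Take a task if there is one; otherwise finish or block via retry.
  atomically $ do
    empty <- isEmptyTChan chan
    work  <- readTVar working
    if not empty
      then do
        task <- readTChan chan
        writeTVar working $! Set.insert tid work
        return (Just task)
      else if Set.null work
        then return Nothing
        else retry

-- Hypothetical worker: adds each task to 'total', then reports completion.
worker :: TChan Int -> TVar (Set.Set ThreadId) -> TVar Int -> TVar Int -> IO ()
worker chan working total finished = loop
  where
    loop = do
      next <- get chan working
      case next of
        Just n  -> atomically (modifyTVar' total (+ n)) >> loop
        Nothing -> atomically (modifyTVar' finished (+ 1))

-- Hypothetical driver: fill the pool, fork the workers, wait in STM
-- until every worker has reported completion, then return the sum.
runPool :: Int -> [Int] -> IO Int
runPool nThreads tasks = do
  chan     <- newTChanIO
  working  <- newTVarIO Set.empty
  total    <- newTVarIO 0
  finished <- newTVarIO 0
  atomically $ mapM_ (writeTChan chan) tasks
  replicateM_ nThreads (forkIO (worker chan working total finished))
  atomically $ readTVar finished >>= \n -> check (n == nThreads)
  readTVarIO total
```

In GHCi, `runPool 8 [1 .. 100]` should return the sum of the tasks. Since no worker ever writes new tasks in this sketch, the `retry` branch only matters while another thread is still registered in `working`.
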
Why writing a thunk in

      atomically $ do
        work <- Set.delete tid `fmap` readTVar working
        writeTVar working work

should cause it to hang sometimes, I've no idea. Nor whether that really
fixes it or it's just a fluke.
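
To make the difference between the two writes concrete, the following sketch shows only *when* the written value is evaluated. It does not explain the hang. All names here (`lazyWrite`, `strictWrite`, `txFails`, `holdsBadThunk`, `demo`) are hypothetical, and `error "boom"` stands in for an arbitrary unevaluated thunk so that the moment of evaluation becomes observable; in the programme above the thunk is just the harmless `Set.delete tid ...`.

```haskell
import Control.Concurrent.STM
import Control.Exception (ErrorCall, evaluate, try)

-- Lazy write: the TVar ends up holding whatever thunk it is given.
lazyWrite :: TVar Int -> Int -> STM ()
lazyWrite var x = writeTVar var x

-- Strict write: x is forced to weak head normal form by the writing
-- thread, inside the transaction, before the write happens.
strictWrite :: TVar Int -> Int -> STM ()
strictWrite var x = writeTVar var $! x

-- True if running the transaction raises an ErrorCall.
txFails :: STM () -> IO Bool
txFails tx = do
  r <- try (atomically tx) :: IO (Either ErrorCall ())
  return (either (const True) (const False) r)

-- True if forcing the TVar's current contents raises an ErrorCall.
holdsBadThunk :: TVar Int -> IO Bool
holdsBadThunk var = do
  v <- readTVarIO var
  r <- try (evaluate v) :: IO (Either ErrorCall Int)
  return (either (const True) (const False) r)

-- Returns (lazy tx failed, lazy TVar holds bad thunk,
--          strict tx failed, strict TVar holds bad thunk).
demo :: IO (Bool, Bool, Bool, Bool)
demo = do
  lazyVar      <- newTVarIO 0
  strictVar    <- newTVarIO 0
  lazyFailed   <- txFails (lazyWrite lazyVar (error "boom"))
  lazyBad      <- holdsBadThunk lazyVar
  strictFailed <- txFails (strictWrite strictVar (error "boom"))
  strictBad    <- holdsBadThunk strictVar
  return (lazyFailed, lazyBad, strictFailed, strictBad)
```

Running `demo` yields `(False, True, True, False)`: the lazy transaction commits and leaves the thunk for whichever thread reads the TVar later, while the strict one evaluates it in the writing thread and aborts, leaving the old value in place.
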
>
> Since I've experienced strange behaviour in the past which was the
> fault of my system configuration[1], I am a bit cautious before
> reporting a bug on GHC's bugtracker, especially since its reproduction
> is so difficult and random.
>
> So my question is how much circumspection is expected/needed before
> one should enter a bug in the bug tracker? I've tested the attached
> code on three different systems (with different Linux systems, but
> always GHC 6.12.1 (since it's a bit costly to install the older
> versions)) and observed the mentioned behaviour. Is this enough to
> justify a bug report? Or, on the other hand, could someone spot the
I'd ask such things on glasgow-haskell-users: it has less traffic and is
GHC-specific, so it's more likely that one of the GHC experts notices it
there and can tell you whether it's a bug, a feature or an error in your
code.
> error in the attached code. Given my history with strange parallel
> behaviour, I am much more sure that it's the fault of my code, but I
> can't spot the error and the described behaviour (halting *after* the
> final message) is really strange.
>
>
> Cheers,
> Michael
>
> [1] http://www.haskell.org/pipermail/haskell-cafe/2010-March/073938.html