GHC 7.8.3 thread hang

Carter Schonwald carter.schonwald at gmail.com
Tue Nov 11 15:01:02 UTC 2014


What OS are you on?
What build options did you use?

On Tue, Nov 11, 2014 at 2:11 AM, Michael Jones <mike at proclivis.com> wrote:

> I am trying to debug a lockup problem (and need help with debugging
> technique), where hang means a thread stops at a known place during
> evaluation, and other threads continue.
>
> The code near the problem is like:
>
>        ec <- return $ encode command
>        l <- return $ BSL.length ec
>        ss <- return $ BSC.unpack ec
>
> It does not matter if I use let or return, or if the length is taken after
> unpack. I used return so I could use this code for tracing, with strictness
> to try to find the exact statement that is the problem:
>
>        traceEventIO "sendCommand"
>        ec <- return $ encode command
>        traceEventIO $ "sendCommand: encoded"
>        l <- ec `seq` return $ BSL.length ec
>        traceEventIO $ "sendCommand: size " ++ (show l)
>        ss <- ec `seq` return $ BSC.unpack ec
>
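A note on the tracing above, offered as a sketch rather than a diagnosis: `return $ encode command` does not run the encoder, and `seq` on a lazy ByteString only evaluates it as far as the first chunk. The "encoded" event can therefore be logged before any encoding work has happened, and that work (including evaluation of `command` itself) may only run when the length is demanded. The helper below, whose name `forceEncoded` is made up for illustration, uses `evaluate` together with `BSL.length` so that the whole value is forced between two trace events.

```haskell
import Control.Exception (evaluate)
import qualified Data.ByteString.Lazy as BSL
import Data.Int (Int64)
import Debug.Trace (traceEventIO)

-- | Force a lazy ByteString completely and return its length.
-- 'evaluate' runs the computation at this point in the IO sequence, and
-- 'BSL.length' has to walk every chunk, so all of the work hidden in the
-- thunk happens between the two trace events.
-- (Hypothetical helper, not part of the original program.)
forceEncoded :: String -> BSL.ByteString -> IO Int64
forceEncoded label bs = do
  traceEventIO (label ++ ": forcing")
  n <- evaluate (BSL.length bs)
  traceEventIO (label ++ ": forced, size " ++ show n)
  return n
```

In the code above this would be called as `l <- forceEncoded "sendCommand" (encode command)`. If the second event never appears, the time is being spent producing the bytes, not in `BSL.length` as such.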
> When this runs, the program executes this many times, but always hangs
> under a certain condition.
>
> For good evaluations:
>
> 7f04173ff700: cap 0: sendCommand
> 7f04173ff700: cap 0: sendCommand: encoded
> 7f04173ff700: cap 0: sendCommand: size 4
> 7f04173ff700: cap 0: sendCommand: unpacked
> 7f04173ff700: cap 0: Sending command of size 4
> 7f04173ff700: cap 0: Sending command of size "\NUL\EOT"
> 7f04173ff700: cap 0: sendCommand: sent
> 7f04173ff700: cap 0: sendCommand: flushed
>
> for bad evaluation:
>
> 7f04173ff700: cap 0: sendCommand
> 7f04173ff700: cap 0: sendCommand: encoded
>
> The lockup occurs when length is taken.
>
> The difference between the working and non-working case is as follows:
>
> A wxHaskell callback stuffs some data in a TChan. A thread started at
> application startup is reading the TChan and calling the code that hangs.
> If it did not hang, it would send it by TCP to another process/computer.
>
> In the working case the callback pops a dialog, and passes data from one
> TChan to another TChan.
>
> In the failing case, the data is used to generate strings in a wxHaskell
> grid, then the strings are parsed and new data is built. The new data is a
> combination of old and new pieces of sub data. The shape of the data is
> identical, because I am not making any edits to the rows.
>
> So when data that the callback sends to TChan is unmodified, no hang. But
> when the data is used to make text, put it in the gui, process it, and
> generate new data, it hangs.
>
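One way to test whether the rebuilt data reaches the reader thread as a large unevaluated thunk is to force it in the callback before writing it to the TChan. This is a minimal sketch under that assumption; `sendForced`, `worker` and the `forceIt` argument are hypothetical names, not part of the program described above.

```haskell
import Control.Concurrent.STM (TChan, atomically, newTChanIO, readTChan, writeTChan)
import Control.Exception (evaluate)
import Control.Monad (forever)

-- | Evaluate the payload in the calling thread, then enqueue it.
-- 'forceIt' is a caller-supplied function that inspects the whole value,
-- for example @length . show@, so the work is done by the producer.
sendForced :: (a -> Int) -> TChan a -> a -> IO ()
sendForced forceIt chan x = do
  _ <- evaluate (forceIt x)
  atomically (writeTChan chan x)

-- | Consumer loop: read items one at a time and pass each to 'process'.
worker :: TChan a -> (a -> IO ()) -> IO ()
worker chan process = forever (atomically (readTChan chan) >>= process)
```

With a channel from `newTChanIO`, the callback would call `sendForced (length . show) chan newData`. If the hang then moves from the reader thread into the callback, the cost lies in building the new data and not in the encoding or the send.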
> As a test I modified the code so that the text is not put into the gui.
> The results are the same. This indicates it has something to do with
> creating strings and then data from strings and mixing old and new subdata.
> Strings are created with show. Data is created by pattern matching and
> generating numbers from strings. I should also point out that in the
> working case, the size of the resulting string is small, say 3. In the hang
> case, the resulting string would be long, say 5000-10000.
>
> I assume there are no limits to the size of ByteStrings or fundamental
> issues with the RTS stack/heap that require special settings.
>
> I am using the following revisions:
>
> GHC 7.8.3
>     base ==4.7.*,
>     mtl ==2.2.1,
>     containers == 0.5.5.1,
>     transformers ==0.4.1.0,
>     random == 1.0.1.1,
>     wx == 0.91.0.0,
>     wxcore == 0.91.0.0,
>     wxdirect == 0.91.0.0,
>     colour == 2.3.3,
>     stm == 2.4.2,
>     monad-loops == 0.4.2.1,
>     time == 1.4.2,
>     old-locale == 1.0.0.6,
>     fast-logger == 2.2.3,
>     network == 2.6.0.2,
>     bytestring == 0.10.4.0,
>     control-monad-loop == 0.1,
>     binary == 0.7.2.2,
>
> I know that nobody can have an answer based on this. But what I am hoping
> is either there is some known bug, or someone can guide me in narrowing it
> down. The event log does not have anything unusual in it. Other threads
> keep running, and I can exit the application normally. The thread does not
> throw an exception. It just hangs.
>
> When I run the app, I just use +RTS -v
>
> Perhaps there are some other options that might give more info?
>
> — SNIPPET of log —
>
> 7fe544cfea40: cap 0: running thread 5 (ThreadRunGHC)
> 7fe544cfea40: cap 0: thread 5 stopped (yielding)
> 7fe544cfea40: cap 0: running thread 5 (ThreadRunGHC)
> 7fe544cfea40: cap 0: thread 5 stopped (suspended while making a foreign call)
> 7fe544cfea40: cap 0: running thread 5 (ThreadRunGHC)
> 7fe544cfea40: cap 0: thread 5 stopped (blocked on an MVar)
> 7fe537eff700: cap 0: running thread 2 (ThreadRunGHC)
> 7fe537eff700: cap 0: waking up thread 5 on cap 0
> 7fe537eff700: cap 0: thread 2 stopped (yielding)
> 7fe544cfea40: cap 0: running thread 5 (ThreadRunGHC)
> 7fe544cfea40: cap 0: sendCommand
> 7fe544cfea40: cap 0: sendCommand: encoded
> 7fe544cfea40: cap 0: sendCommand: size 3                        WORKS HERE
> 7fe544cfea40: cap 0: sendCommand: unpacked
> 7fe544cfea40: cap 0: Sending command of size 3
> 7fe544cfea40: cap 0: Sending command of size "\NUL\ETX"
> 7fe544cfea40: cap 0: sendCommand: sent
> 7fe544cfea40: cap 0: sendCommand: flushed
> 7fe544cfea40: cap 0: thread 5 stopped (blocked on an MVar)
> 7fe537eff700: cap 0: running thread 2 (ThreadRunGHC)
> 7fe537eff700: cap 0: thread 2 stopped (yielding)
> 7fe537eff700: cap 0: running thread 2 (ThreadRunGHC)
> 7fe537eff700: cap 0: thread 2 stopped (suspended while making a foreign call)
> 7fe537eff700: cap 0: running thread 2 (ThreadRunGHC)
> 7fe537eff700: cap 0: waking up thread 5 on cap 0
>
>>
> 7fe537eff700: cap 0: fetchTelemetryServer: got lock
> 7fe537eff700: cap 0: thread 45 stopped (yielding)
> 7fe537eff700: cap 0: running thread 7 (ThreadRunGHC)
> 7fe537eff700: cap 0: thread 7 stopped (heap overflow)
> 7fe537eff700: cap 0: requesting parallel GC
> 7fe537eff700: cap 0: starting GC
> 7fe537eff700: cap 0: GC working
> 7fe537eff700: cap 0: GC idle
> 7fe537eff700: cap 0: GC done
> 7fe537eff700: cap 0: GC idle
> 7fe537eff700: cap 0: GC done
> 7fe537eff700: cap 0: all caps stopped for GC
> 7fe537eff700: cap 0: finished GC
> 7fe537eff700: cap 0: running thread 7 (ThreadRunGHC)
> 7fe537eff700: cap 0: sendCommand
> 7fe537eff700: cap 0: sendCommand: encoded                       PROBLEM HERE
> 7fe537eff700: cap 0: thread 7 stopped (heap overflow)
> 7fe537eff700: cap 0: requesting parallel GC
> 7fe537eff700: cap 0: starting GC
> 7fe537eff700: cap 0: GC working
> 7fe537eff700: cap 0: GC idle
> 7fe537eff700: cap 0: GC done
> 7fe537eff700: cap 0: GC idle
> 7fe537eff700: cap 0: GC done
> 7fe537eff700: cap 0: all caps stopped for GC
> 7fe537eff700: cap 0: finished GC
> 7fe537eff700: cap 0: running thread 7 (ThreadRunGHC)
> 7fe537eff700: cap 0: thread 7 stopped (yielding)
> 7fe537eff700: cap 0: running thread 2 (ThreadRunGHC)
> 7fe537eff700: cap 0: thread 2 stopped (yielding)
> 7fe544cfea40: cap 0: running thread 442 (ThreadRunGHC)
> 7fe544cfea40: cap 0: thread 442 stopped (suspended while making a foreign call)
> 7fe544cfea40: cap 0: running thread 442 (ThreadRunGHC)
> 7fe544cfea40: cap 0: thread 442 stopped (suspended while making a foreign call)
> 7fe537eff700: cap 0: running thread 45 (ThreadRunGHC)
> 7fe537eff700: cap 0: fetchTelemetryServer: unlock
> 7fe537eff700: cap 0: fetchTelemetryServer
> 7fe537eff700: cap 0: fetchTelemetryServer: got lock
> 7fe537eff700: cap 0: fetchTelemetryServer: pump seq
>
>
>
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users at haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>