Thread behavior in 7.8.3

Simon Peyton Jones simonpj at microsoft.com
Thu Oct 30 08:19:35 UTC 2014


I wonder if the knowledge embodied in this thread might usefully be summarised in the user manual?  Or on the GHC section of the Haskell wiki https://www.haskell.org/haskellwiki/GHC?

Simon

|  -----Original Message-----
|  From: Glasgow-haskell-users [mailto:glasgow-haskell-users-
|  bounces at haskell.org] On Behalf Of Edward Z. Yang
|  Sent: 30 October 2014 00:41
|  To: John Lato
|  Cc: GHC Users List
|  Subject: Re: Thread behavior in 7.8.3
|  
|  Yes, that's right.
|  
|  I brought it up because you mentioned that there might still be
|  occasional delays, and those might be caused by a thread not being
|  preemptible for a while.
|  
|  Edward
|  
|  Excerpts from John Lato's message of 2014-10-29 17:31:45 -0700:
|  > My understanding is that -fno-omit-yields is subtly different.  I
|  > think that's for the case when a function loops without performing
|  > any heap allocations, and thus would never yield even after the
|  > context switch timeout.  In my case the looping function does
|  > perform heap allocations and does eventually yield, just not until
|  > after the timeout.
|  >
|  > Is that understanding correct?
|  >
|  > (technically, doesn't it change to yielding after stack checks or
|  > something like that?)
|  >
|  > On Thu, Oct 30, 2014 at 8:24 AM, Edward Z. Yang <ezyang at mit.edu> wrote:
|  >
|  > > I don't think this is directly related to the problem, but if you
|  > > have a thread that isn't yielding, you can force it to yield by
|  > > using -fno-omit-yields on your code.  It won't help if the
|  > > non-yielding code is in a library, and it won't help if the
|  > > problem was that you just weren't setting timeouts finely enough
|  > > (which sounds like what was happening). FYI.
|  > >
|  > > Edward
|  > >
|  > > Excerpts from John Lato's message of 2014-10-29 17:19:46 -0700:
|  > > > I guess I should explain what that flag does...
|  > > >
|  > > > The GHC RTS maintains capabilities, the number of capabilities
|  > > > is specified by the `+RTS -N` option.  Each capability is a
|  > > > virtual machine that executes Haskell code, and maintains its
|  > > > own runqueue of threads to process.
|  > > >
|  > > > A capability will perform a context switch at the next heap
|  > > > block allocation (every 4k of allocation) after the timer
|  > > > expires.  The timer defaults to 20ms, and can be set by the -C
|  > > > flag.  Capabilities perform context switches in other
|  > > > circumstances as well, such as when a thread yields or blocks.
|  > > >
|  > > > My guess is that either the context switching logic changed in
|  > > > ghc-7.8, or possibly your code used to trigger a switch via some
|  > > > other mechanism (stack overflow or something maybe?), but is
|  > > > optimized differently now so instead it needs to wait for the
|  > > > timer to expire.
|  > > >
|  > > > The problem we had was that a time-sensitive thread was getting
|  > > > scheduled on the same capability as a long-running non-yielding
|  > > > thread, so the time-sensitive thread had to wait for a context
|  > > > switch timeout (even though there were free cores available!).
|  > > > I expect even with -N4 you'll still see occasional delays
|  > > > (perhaps <5% of calls).
|  > > >
|  > > > We've solved our problem with judicious use of `forkOn`, but
|  > > > that won't help at N1.
|  > > >
|  > > > We did see this behavior in 7.6, but it's definitely worse in
|  > > > 7.8.
|  > > >
|  > > > Incidentally, has there been any interest in a work-stealing
|  > > > scheduler?  There was a discussion from about 2 years ago, in
|  > > > which Simon Marlow noted it might be tricky, but it would
|  > > > definitely help in situations like this.
|  > > >
|  > > > John L.
|  > > >
|  > > > On Thu, Oct 30, 2014 at 8:02 AM, Michael Jones <mike at proclivis.com> wrote:
|  > > >
|  > > > > John,
|  > > > >
|  > > > > Adding -C0.005 makes it much better. Using -C0.001 makes it
|  > > > > behave more like -N4.
|  > > > >
|  > > > > Thanks. This saves my project, as I need to deploy on a single
|  > > > > core Atom and was stuck.
|  > > > >
|  > > > > Mike
|  > > > >
|  > > > > On Oct 29, 2014, at 5:12 PM, John Lato <jwlato at gmail.com> wrote:
|  > > > >
|  > > > > By any chance do the delays get shorter if you run your
|  > > > > program with `+RTS -C0.005` ?  If so, I suspect you're having a
|  > > > > problem very similar to one that we had with ghc-7.8 (7.6 too,
|  > > > > but it's worse on ghc-7.8 for some reason), involving possible
|  > > > > misbehavior of the thread scheduler.
|  > > > >
|  > > > > On Wed, Oct 29, 2014 at 2:18 PM, Michael Jones <mike at proclivis.com> wrote:
|  > > > >
|  > > > >> I have a general question about thread behavior in 7.8.3 vs
|  > > > >> 7.6.X
|  > > > >>
|  > > > >> I moved from 7.6 to 7.8 and my application behaves very
|  > > > >> differently. I have three threads, an application thread that
|  > > > >> plots data with wxhaskell or sends it over a network (depends
|  > > > >> on settings), a thread doing usb bulk writes, and a thread
|  > > > >> doing usb bulk reads. Data is moved around with TChan, and TVar
|  > > > >> is used for coordination.
|  > > > >>
|  > > > >> When the application was compiled with 7.6, my stream of usb
|  > > > >> traffic was smooth. With 7.8, there are lots of delays where
|  > > > >> nothing seems to be running. These delays are up to 40ms,
|  > > > >> whereas with 7.6 delays were 1ms or so.
|  > > > >>
|  > > > >> When I add -N2 or -N4, the 7.8 program runs fine. But on 7.6
|  > > > >> it runs fine without -N2/4.
|  > > > >>
|  > > > >> The program is compiled -O2 with profiling. The -N2/4 version
|  > > > >> uses more memory, but in both cases with 7.8 and with 7.6 there
|  > > > >> is no space leak.
|  > > > >>
|  > > > >> I tried to compile and use -ls so I could take a look with
|  > > > >> threadscope, but the application hangs and writes no data to
|  > > > >> the file. The CPU fans run wild like it is in an infinite loop.
|  > > > >> It at least pops an unpainted wxhaskell window, so it got
|  > > > >> partially running.
|  > > > >>
|  > > > >> One of my libraries uses option -fsimpl-tick-factor=200 to get
|  > > > >> around the compiler.
|  > > > >>
|  > > > >> What do I need to know about changes to threading and event
|  > > > >> logging between 7.6 and 7.8? Is there some general
|  > > > >> documentation somewhere that might help?
|  > > > >>
|  > > > >> I am on Ubuntu 14.04 LTS. I downloaded the 7.8 tool chain tar
|  > > > >> ball and installed myself, after removing 7.6 with apt-get.
|  > > > >>
|  > > > >> Any hints appreciated.
|  > > > >>
|  > > > >> Mike
|  > > > >>
|  > > > >>
|  > > > >> _______________________________________________
|  > > > >> Glasgow-haskell-users mailing list
|  > > > >> Glasgow-haskell-users at haskell.org
|  > > > >> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
|  > > > >>
|  > > > >
|  > > > >
|  > > > >
|  > >
|  _______________________________________________
|  Glasgow-haskell-users mailing list
|  Glasgow-haskell-users at haskell.org
|  http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
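
As a possible starting point for the summary suggested at the top of this message, here is a minimal sketch of the `forkOn` workaround John describes: the long-running thread is pinned to one capability and the time-sensitive thread to another, so the latter does not have to wait for a context switch on the former. The names `heavyWork` and `demo` are hypothetical stand-ins, not anything from the thread, and the workload is purely illustrative.

```haskell
import Control.Concurrent (forkOn, myThreadId, threadCapability)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)

-- Hypothetical stand-in for a long-running computation that
-- allocates but never blocks or yields explicitly.
heavyWork :: Int -> Int
heavyWork n = length (filter even [1 .. n])

-- Fork the heavy thread on capability 0 and a (hypothetical)
-- time-sensitive thread on capability 1, then wait for both.
-- Returns the heavy result and the (capability, pinned) pair
-- reported for the time-sensitive thread.
demo :: IO (Int, (Int, Bool))
demo = do
  heavyDone <- newEmptyMVar
  fastDone  <- newEmptyMVar
  _ <- forkOn 0 $ do
    r <- evaluate (heavyWork 2000000)   -- force the work in this thread
    putMVar heavyDone r
  _ <- forkOn 1 $ do
    info <- threadCapability =<< myThreadId
    putMVar fastDone info
  info <- takeMVar fastDone
  r    <- takeMVar heavyDone
  return (r, info)

main :: IO ()
main = do
  (r, (cap, pinned)) <- demo
  putStrLn ("heavy result: " ++ show r)
  putStrLn ("time-sensitive thread ran on capability " ++ show cap
            ++ ", pinned: " ++ show pinned)
```

To get more than one capability, the program would be built with `-threaded -rtsopts` and run with something like `+RTS -N2`. As noted in the thread, this does not help at N1, where both threads share the only capability; there the options discussed are a shorter timer (for example `+RTS -C0.005`) or, for loops that do not allocate, compiling the offending code with -fno-omit-yields.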

