Question about indirectees of BLACKHOLE closures

Fri Mar 23 15:51:18 UTC 2018

Hi Omer,

As far as I understand, a BLACKHOLE can point to a THUNK when an exception
is thrown. As the exception walks up the stack, it updates the closure
referenced by each update frame it passes so that the closure points to an
stg_raise closure. That way, if a concurrent thread later evaluates one of
those closures, it enters the stg_raise closure, which re-throws the
exception on that thread too.
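
To make the stack walk concrete, here is a toy model in plain C. It is only
a sketch of the idea, not the RTS code: every name in it (ToyClosure,
ToyFrame, toy_raise, toy_enter) is made up for illustration, a NULL
indirectee stands in for "still owned by the evaluating thread", and the
real raiseExceptionHelper handles more frame types than the two shown.

```c
#include <assert.h>
#include <stddef.h>

/* Toy closure kinds. The names echo the thread, but this is a model,
 * not the RTS heap layout. */
typedef enum { TOY_BLACKHOLE, TOY_RAISE_THUNK } ToyKind;

typedef struct ToyClosure {
    ToyKind kind;
    struct ToyClosure *indirectee; /* used only by TOY_BLACKHOLE; NULL here
                                      stands for "still owned by a thread" */
    int exception;                 /* used only by TOY_RAISE_THUNK */
} ToyClosure;

typedef enum { TOY_UPDATE_FRAME, TOY_CATCH_FRAME } ToyFrameKind;

typedef struct {
    ToyFrameKind kind;
    ToyClosure *updatee;           /* used only by TOY_UPDATE_FRAME */
} ToyFrame;

/* Walk the stack from the top (index 0) towards the bottom. Each update
 * frame's updatee is pointed at raise_closure. The walk stops at the first
 * catch frame and returns its index, or depth if there is none. */
size_t toy_raise(ToyFrame *stack, size_t depth, ToyClosure *raise_closure)
{
    size_t i;
    for (i = 0; i < depth; i++) {
        if (stack[i].kind == TOY_CATCH_FRAME) {
            return i;
        }
        stack[i].updatee->kind = TOY_BLACKHOLE;
        stack[i].updatee->indirectee = raise_closure;
    }
    return depth;
}

/* What another thread would find on entering c: follow the indirection and
 * report the exception if it lands on a raise thunk. Returns 0 otherwise
 * (the real RTS would block on an owned BLACKHOLE instead). */
int toy_enter(const ToyClosure *c)
{
    while (c != NULL && c->kind == TOY_BLACKHOLE) {
        c = c->indirectee;
    }
    return (c != NULL && c->kind == TOY_RAISE_THUNK) ? c->exception : 0;
}
```

With a stack of two update frames above a catch frame, toy_raise stops at
the catch frame, and entering either of the two updatees afterwards reports
the exception stored in the raise thunk. Closures below the catch frame are
left untouched.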

Definition of stg_raise:
https://github.com/ghc/ghc/blob/ba5797937e575ce6119de6c07703e90dda2557e8/rts/Exception.cmm#L424-L427

raiseExceptionHelper dealing with update frames:
https://github.com/ghc/ghc/blob/d9d463289fe20316cff12a8f0dbf414db678fa72/rts/Schedule.c#L2864-L2875

So in general, yes: as long as no exception was thrown, you can assume that
a BLACKHOLE points to a non-THUNK object.
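
Putting the cases from this thread side by side, the possible indirectees
can be summarised with a small classifier. Again this is a hedged sketch in
plain C with invented names (BhTarget, BhState, bh_classify); the pointer
tag is passed separately instead of being packed into the low bits of the
pointer, and the TSO versus BLOCKING_QUEUE reading is the one proposed in
the quoted message below, which this sketch assumes is right.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the info pointers compared in evacuate. */
typedef enum {
    BH_TSO,
    BH_BLOCKING_QUEUE,
    BH_RAISE_THUNK,
    BH_VALUE
} BhTargetKind;

typedef struct { BhTargetKind kind; } BhTarget;

typedef enum {
    BH_OWNED_NO_WAITERS,   /* indirectee is the owning TSO */
    BH_OWNED_WITH_WAITERS, /* indirectee is a BLOCKING_QUEUE */
    BH_EVALUATED,          /* indirectee is a value */
    BH_WILL_RERAISE        /* indirectee is a thunk that re-throws */
} BhState;

/* Classify a BLACKHOLE by its indirectee. A non-zero tag means the
 * pointer refers to an evaluated closure, so the target is not inspected
 * in that case. */
BhState bh_classify(const BhTarget *indirectee, unsigned tag)
{
    if (tag != 0) {
        return BH_EVALUATED;
    }
    switch (indirectee->kind) {
    case BH_TSO:            return BH_OWNED_NO_WAITERS;
    case BH_BLOCKING_QUEUE: return BH_OWNED_WITH_WAITERS;
    case BH_RAISE_THUNK:    return BH_WILL_RERAISE;
    default:                return BH_EVALUATED;
    }
}
```

Only the BH_WILL_RERAISE case corresponds to a BLACKHOLE pointing to a
THUNK; the other three are the "everything went right" cases.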

Hope that helps,
Rahul

On Fri, Mar 23, 2018 at 5:48 PM, Ömer Sinan Ağacan <omeragacan at gmail.com>
wrote:

> Thanks Simon, that's really helpful.
>
> A few more questions:
>
> As far as I understand the difference between
>
> - BLACKHOLE pointing to a TSO
> - BLACKHOLE pointing to a BLOCKING_QUEUE
>
> is that in the former we don't yet have any threads blocked by the
> BLACKHOLE whereas in the latter we have and the blocking queue holds all
> those blocked threads. Did I get this right?
>
> Secondly, can a BLACKHOLE point to a THUNK? I'd expect no, because we
> BLACKHOLE a closure when we're done evaluating it (assuming no eager
> blackholing), and evaluation usually happens up to WHNF.
>
> Thanks,
>
> Ömer
>
> 2018-03-20 18:27 GMT+03:00 Simon Marlow <marlowsd at gmail.com>:
> > Added comments: https://phabricator.haskell.org/D4517
> >
> > On 20 March 2018 at 14:58, Simon Marlow <marlowsd at gmail.com> wrote:
> >>
> >> Hi Omer,
> >>
> >> On 20 March 2018 at 13:05, Ömer Sinan Ağacan <omeragacan at gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I've been looking at BLACKHOLE closures and how the indirectee field
> >>> is used and I have a few questions:
> >>>
> >>> Looking at evacuate for BLACKHOLE closures:
> >>>
> >>>     case BLACKHOLE:
> >>>     {
> >>>         StgClosure *r;
> >>>         const StgInfoTable *i;
> >>>         r = ((StgInd*)q)->indirectee;
> >>>         if (GET_CLOSURE_TAG(r) == 0) {
> >>>             i = r->header.info;
> >>>             if (IS_FORWARDING_PTR(i)) {
> >>>                 r = (StgClosure *)UN_FORWARDING_PTR(i);
> >>>                 i = r->header.info;
> >>>             }
> >>>             if (i == &stg_TSO_info
> >>>                 || i == &stg_WHITEHOLE_info
> >>>                 || i == &stg_BLOCKING_QUEUE_CLEAN_info
> >>>                 || i == &stg_BLOCKING_QUEUE_DIRTY_info) {
> >>>                 copy(p,info,q,sizeofW(StgInd),gen_no);
> >>>                 return;
> >>>             }
> >>>             ASSERT(i != &stg_IND_info);
> >>>         }
> >>>         q = r;
> >>>         *p = r;
> >>>         goto loop;
> >>>     }
> >>>
> >>> It seems like indirectee can be a TSO, WHITEHOLE, BLOCKING_QUEUE_CLEAN,
> >>> or BLOCKING_QUEUE_DIRTY, and it can't be IND. I'm wondering what it
> >>> means for a BLACKHOLE to point to a
> >>>
> >>> - TSO
> >>> - WHITEHOLE
> >>> - BLOCKING_QUEUE_CLEAN
> >>> - BLOCKING_QUEUE_DIRTY
> >>
> >>
> >> That sounds right to me.
> >>
> >>>
> >>> Is this documented somewhere or otherwise could someone give a few
> >>> pointers on where to look in the code?
> >>
> >>
> >> Unfortunately I don't think we have good documentation for this, but you
> >> should look at the comments around messageBlackHole in Messages.c.
> >>
> >>>
> >>> Secondly, I also looked at the BLACKHOLE entry code, and it seems like
> >>> it has a different assumption about what the indirectee field can
> >>> point to:
> >>>
> >>>     INFO_TABLE(stg_BLACKHOLE,1,0,BLACKHOLE,"BLACKHOLE","BLACKHOLE")
> >>>         (P_ node)
> >>>     {
> >>>         W_ r, info, owner, bd;
> >>>         P_ p, bq, msg;
> >>>
> >>>         TICK_ENT_DYN_IND(); /* tick */
> >>>
> >>>     retry:
> >>>         p = StgInd_indirectee(node);
> >>>         if (GETTAG(p) != 0) {
> >>>             return (p);
> >>>         }
> >>>
> >>>         info = StgHeader_info(p);
> >>>         if (info == stg_IND_info) {
> >>>             // This could happen, if e.g. we got a BLOCKING_QUEUE that has
> >>>             // just been replaced with an IND by another thread in
> >>>             // wakeBlockingQueue().
> >>>             goto retry;
> >>>         }
> >>>
> >>>         if (info == stg_TSO_info ||
> >>>             info == stg_BLOCKING_QUEUE_CLEAN_info ||
> >>>             info == stg_BLOCKING_QUEUE_DIRTY_info)
> >>>         {
> >>>             ("ptr" msg) = ccall allocate(MyCapability() "ptr",
> >>>                                          BYTES_TO_WDS(SIZEOF_MessageBlackHole));
> >>>
> >>>             SET_HDR(msg, stg_MSG_BLACKHOLE_info, CCS_SYSTEM);
> >>>             MessageBlackHole_tso(msg) = CurrentTSO;
> >>>             MessageBlackHole_bh(msg) = node;
> >>>
> >>>             (r) = ccall messageBlackHole(MyCapability() "ptr", msg "ptr");
> >>>
> >>>             if (r == 0) {
> >>>                 goto retry;
> >>>             } else {
> >>>                 StgTSO_why_blocked(CurrentTSO) = BlockedOnBlackHole::I16;
> >>>                 StgTSO_block_info(CurrentTSO) = msg;
> >>>                 jump stg_block_blackhole(node);
> >>>             }
> >>>         }
> >>>         else
> >>>         {
> >>>             ENTER(p);
> >>>         }
> >>>     }
> >>>
> >>> The difference is that, when the tag of indirectee is 0, evacuate
> >>> assumes that indirectee can't point to an IND, but the BLACKHOLE entry
> >>> code thinks it's possible, and there's even a comment about why. (I
> >>> don't understand the comment yet.) I'm wondering if this code is
> >>> correct, and why. Again, any pointers would be appreciated.
> >>
> >>
> >> Taking a quick look at the code, my guess is that:
> >> - a BLOCKING_QUEUE gets overwritten by an IND in wakeBlockingQueue()
> >> - but when this happens, the indirectee of the BLACKHOLE will also be
> >> overwritten to point to the value
> >>
> >> At runtime a thread might see an intermediate state because these
> >> mutations are happening in another thread, so we might follow the
> >> indirectee and see the IND. But this state can't be observed by the GC,
> >> because all mutator threads have stopped at a safe point.
> >>
> >> Cheers
> >> Simon
> >>
> >>
> >>>
> >>> Thanks,
> >>>
> >>> Ömer
> >>> _______________________________________________
> >>> ghc-devs mailing list
> >>> ghc-devs at haskell.org
> >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >>
> >>
> >
>

-- 
Rahul Muttineni