Confused about PAP object layout

Simon Peyton Jones simonpj at microsoft.com
Mon Feb 24 10:45:10 UTC 2020


I’m not following this in detail, but do please make sure that the results of this discussion end up in a suitable Note.  Obviously it’s not transparently clear as-is, and I can see clarity emerging

Thanks!

Simon

From: ghc-devs <ghc-devs-bounces at haskell.org> On Behalf Of Simon Marlow
Sent: 24 February 2020 08:22
To: Ömer Sinan Ağacan <omeragacan at gmail.com>
Cc: ghc-devs <ghc-devs at haskell.org>
Subject: Re: Confused about PAP object layout

On Thu, 20 Feb 2020 at 09:21, Ömer Sinan Ağacan <omeragacan at gmail.com<mailto:omeragacan at gmail.com>> wrote:
> I'm not sure what you mean by "garbage". The bitmap merely determines whether
> a field is a pointer,

I think the bitmap is for liveness, not for whether a field is pointer or not.
Relevant code for building an info table for a function:

    mk_pieces (Fun arity (ArgGen arg_bits)) srt_label
      = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags arg_bits
           ; let fun_type | null liveness_data = aRG_GEN
                          | otherwise          = aRG_GEN_BIG
                 extra_bits = [ packIntsCLit dflags fun_type arity ]
                           ++ (if inlineSRT dflags then [] else [ srt_lit ])
                           ++ [ liveness_lit, slow_entry ]
           ; return (Nothing, Nothing, extra_bits, liveness_data) }

This uses the word "liveness" rather than "pointers".

However I just realized that the word "garbage" is still not the best way to
describe what I'm trying to say. In the example

    [pap_info, x, y, z]

If the function's bitmap is [1, 0, 1], then `y` may be a dead (an unused
argument, or "garbage" as I describe in my previous email) OR it may be a
non-pointer, but used (i.e. not a garbage).

I don't think we ever put a zero in the bitmap for a pointer-but-not-used argument. We don't do liveness analysis for function arguments, as far as I'm aware. So a 0 in the bitmap always means "non-pointer".

The only reaosn the code uses the terminology "liveness" here is that it's sharing code with the code that handles bitmaps for stack frames, which do deal with liveness.

So maybe "liveness" is also not the best way to describe this bitmap, as 0 does
not mean dead but rather "don't follow in GC".

On my quest to understand and document this code better I have one more
question. When generating info tables for functions with know argument patterns
(ArgSpec) we initialize the bitmap as 0. Relevant code:

    mk_pieces (Fun arity (ArgSpec fun_type)) srt_label
      = do { let extra_bits = packIntsCLit dflags fun_type arity : srt_label
           ; return (Nothing, Nothing,  extra_bits, []) }

Here the last return value is for the liveness data. I don't understand how can
this be correct, because when we use this function in a PAP this will cause NOT
scavenging the PAP payload. Relevant code (simplified):

    STATIC_INLINE GNUC_ATTR_HOT StgPtr
    scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord size)
    {
        const StgFunInfoTable *fun_info =
            get_fun_itbl(UNTAG_CONST_CLOSURE(fun));

        StgPtr p = (StgPtr)payload;

        StgWord bitmap;
        switch (fun_info->f.fun_type) {
        ...
        default:
            bitmap = BITMAP_BITS(stg_arg_bitmaps[fun_info->f.fun_type]);
        small_bitmap:
            p = scavenge_small_bitmap(p, size, bitmap);
            break;
        }
        return p;
    }

Here if I have a function with three pointer args (ARG_PPP) the shown branch
that will be taken, but because the bitmap is 0 (as shown in the mk_pieces code
above) nothing in the PAPs payload will be scavenged.

It gets the bitmap from stg_arg_bitmaps[fun_info->f.fun_type], not from the info table.  Hope this helps.

Cheers
Simon



Here's an example from a debugging session:

    >>> print pap
    $10 = (StgPAP *) 0x42001fe030

    >>> print *pap
    $11 = {
      header = {
        info = 0x7fbdd1f06640 <stg_PAP_info>
      },
      arity = 2,
      n_args = 1,
      fun = 0x7fbdd2d23ffb,
      payload = 0x42001fe048
    }

So this PAP is applied one argument, which is a boxed object (a FUN_2_0):

    >>> print *get_itbl(UNTAG_CLOSURE(pap->payload[0]))
    $20 = {
      layout = {
        payload = {
          ptrs = 2,
          nptrs = 0
        },
        bitmap = 2,
        large_bitmap_offset = 2,
        __pad_large_bitmap_offset = 2,
        selector_offset = 2
      },
      type = 11,
      srt = 1914488,
      code = 0x7fbdd2b509c0 "H\215E\370L9\370r[I\203\304 M;\245X\003"
    }

However if I look at the function of this PAP:

    >>> print *get_fun_itbl(UNTAG_CLOSURE(pap->fun))
    $21 = {
      f = {
        slow_apply_offset = 16,
        __pad_slow_apply_offset = 3135120895,
        b = {
          bitmap = 74900193017889,
          bitmap_offset = 258342945,
          __pad_bitmap_offset = 258342945
        },
        fun_type = 23,
        arity = 3
      },
      i = {
        layout = {
          payload = {
            ptrs = 0,
            nptrs = 0
          },
          bitmap = 0,
          large_bitmap_offset = 0,
          __pad_large_bitmap_offset = 0,
          selector_offset = 0
        },
        type = 14,
        srt = 1916288,
        code = 0x7fbdd2b50260 <base_GHCziRead_list3_info>
"I\203\304(M;\245X\003"
      }
    }

It has arity 3. Since the first argument is a boxed object and this function has
arity 3, if the argument is actually live in the function (i.e. not an unused
argument), then the bitmap should have a 1 for this. But because the argument
pattern is known (ARG_PPP) we initialized the bitmap as 0! Not sure how this
can work.

What am I missing?

Thanks,

Ömer

Ben Gamari <ben at smart-cactus.org<mailto:ben at smart-cactus.org>>, 14 Şub 2020 Cum, 20:25 tarihinde şunu yazdı:
>
> Ömer Sinan Ağacan <omeragacan at gmail.com<mailto:omeragacan at gmail.com>> writes:
>
> > I think that makes sense, with the invariant that n_args <= bitmap_size. We
> > evacuate the arguments used by the function but not others. Thanks.
> >
> > It's somewhat weird to see an object with useful stuff, then garbage, then
> > useful stuff again in the heap, but that's not an issue by itself. For example
> > if I have something like
> >
> >     [pap_info, x, y, z]
> >
> > and according to the function `y` is dead, then after evacuating I get
> >
> >     [pap_info, x, <garbage>, z]
> >
> > This "garbage" is evacuated again and again every time we evacuate this PAP.
> >
> I'm not sure what you mean by "garbage". The bitmap merely determines
> whether a field is a pointer, not whether it is copied during
> evacuation. A field's bitmap bit not being set merely means that we won't
> evacuate the value of that field during scavenging.
>
> Nevertheless, this all deserves a comment in scavenge_PAP.
>
> Cheers,
>
> - Ben
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20200224/03127cbe/attachment.html>


More information about the ghc-devs mailing list