Confused about PAP object layout

Ömer Sinan Ağacan omeragacan at gmail.com
Thu Feb 20 09:21:00 UTC 2020


> I'm not sure what you mean by "garbage". The bitmap merely determines whether
> a field is a pointer,

I think the bitmap is for liveness, not for whether a field is pointer or not.
Relevant code for building an info table for a function:

    mk_pieces (Fun arity (ArgGen arg_bits)) srt_label
      = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags arg_bits
           ; let fun_type | null liveness_data = aRG_GEN
                          | otherwise          = aRG_GEN_BIG
                 extra_bits = [ packIntsCLit dflags fun_type arity ]
                           ++ (if inlineSRT dflags then [] else [ srt_lit ])
                           ++ [ liveness_lit, slow_entry ]
           ; return (Nothing, Nothing, extra_bits, liveness_data) }

This uses the word "liveness" rather than "pointers".

However I just realized that the word "garbage" is still not the best way to
describe what I'm trying to say. In the example

    [pap_info, x, y, z]

If the function's bitmap is [1, 0, 1], then `y` may be a dead (an unused
argument, or "garbage" as I describe in my previous email) OR it may be a
non-pointer, but used (i.e. not a garbage).

So maybe "liveness" is also not the best way to describe this bitmap, as 0 does
not mean dead but rather "don't follow in GC".

On my quest to understand and document this code better I have one more
question. When generating info tables for functions with know argument patterns
(ArgSpec) we initialize the bitmap as 0. Relevant code:

    mk_pieces (Fun arity (ArgSpec fun_type)) srt_label
      = do { let extra_bits = packIntsCLit dflags fun_type arity : srt_label
           ; return (Nothing, Nothing,  extra_bits, []) }

Here the last return value is for the liveness data. I don't understand how can
this be correct, because when we use this function in a PAP this will cause NOT
scavenging the PAP payload. Relevant code (simplified):

    STATIC_INLINE GNUC_ATTR_HOT StgPtr
    scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord size)
    {
        const StgFunInfoTable *fun_info =
            get_fun_itbl(UNTAG_CONST_CLOSURE(fun));

        StgPtr p = (StgPtr)payload;

        StgWord bitmap;
        switch (fun_info->f.fun_type) {
        ...
        default:
            bitmap = BITMAP_BITS(stg_arg_bitmaps[fun_info->f.fun_type]);
        small_bitmap:
            p = scavenge_small_bitmap(p, size, bitmap);
            break;
        }
        return p;
    }

Here if I have a function with three pointer args (ARG_PPP) the shown branch
that will be taken, but because the bitmap is 0 (as shown in the mk_pieces code
above) nothing in the PAPs payload will be scavenged.

Here's an example from a debugging session:

    >>> print pap
    $10 = (StgPAP *) 0x42001fe030

    >>> print *pap
    $11 = {
      header = {
        info = 0x7fbdd1f06640 <stg_PAP_info>
      },
      arity = 2,
      n_args = 1,
      fun = 0x7fbdd2d23ffb,
      payload = 0x42001fe048
    }

So this PAP is applied one argument, which is a boxed object (a FUN_2_0):

    >>> print *get_itbl(UNTAG_CLOSURE(pap->payload[0]))
    $20 = {
      layout = {
        payload = {
          ptrs = 2,
          nptrs = 0
        },
        bitmap = 2,
        large_bitmap_offset = 2,
        __pad_large_bitmap_offset = 2,
        selector_offset = 2
      },
      type = 11,
      srt = 1914488,
      code = 0x7fbdd2b509c0 "H\215E\370L9\370r[I\203\304 M;\245X\003"
    }

However if I look at the function of this PAP:

    >>> print *get_fun_itbl(UNTAG_CLOSURE(pap->fun))
    $21 = {
      f = {
        slow_apply_offset = 16,
        __pad_slow_apply_offset = 3135120895,
        b = {
          bitmap = 74900193017889,
          bitmap_offset = 258342945,
          __pad_bitmap_offset = 258342945
        },
        fun_type = 23,
        arity = 3
      },
      i = {
        layout = {
          payload = {
            ptrs = 0,
            nptrs = 0
          },
          bitmap = 0,
          large_bitmap_offset = 0,
          __pad_large_bitmap_offset = 0,
          selector_offset = 0
        },
        type = 14,
        srt = 1916288,
        code = 0x7fbdd2b50260 <base_GHCziRead_list3_info>
"I\203\304(M;\245X\003"
      }
    }

It has arity 3. Since the first argument is a boxed object and this function has
arity 3, if the argument is actually live in the function (i.e. not an unused
argument), then the bitmap should have a 1 for this. But because the argument
pattern is known (ARG_PPP) we initialized the bitmap as 0! Not sure how this
can work.

What am I missing?

Thanks,

Ömer

Ben Gamari <ben at smart-cactus.org>, 14 Şub 2020 Cum, 20:25 tarihinde şunu yazdı:
>
> Ömer Sinan Ağacan <omeragacan at gmail.com> writes:
>
> > I think that makes sense, with the invariant that n_args <= bitmap_size. We
> > evacuate the arguments used by the function but not others. Thanks.
> >
> > It's somewhat weird to see an object with useful stuff, then garbage, then
> > useful stuff again in the heap, but that's not an issue by itself. For example
> > if I have something like
> >
> >     [pap_info, x, y, z]
> >
> > and according to the function `y` is dead, then after evacuating I get
> >
> >     [pap_info, x, <garbage>, z]
> >
> > This "garbage" is evacuated again and again every time we evacuate this PAP.
> >
> I'm not sure what you mean by "garbage". The bitmap merely determines
> whether a field is a pointer, not whether it is copied during
> evacuation. A field's bitmap bit not being set merely means that we won't
> evacuate the value of that field during scavenging.
>
> Nevertheless, this all deserves a comment in scavenge_PAP.
>
> Cheers,
>
> - Ben
>


More information about the ghc-devs mailing list