Confused about PAP object layout

Ömer Sinan Ağacan omeragacan at gmail.com
Wed Feb 26 18:52:48 UTC 2020


> I still don't understand why this assertion
>
>     ASSERT(BITMAP_SIZE(bitmap) >= size);
>
> I added to scavenge_small_bitmap in !2727 is failing though.

Ahh, this is because at the call sites we shift the size bits out and pass only
the contents of the bitmap, without the size:

    bitmap = BITMAP_BITS(stg_arg_bitmaps[fun_info->f.fun_type]);
    p = scavenge_small_bitmap(p, size, bitmap);

BITMAP_BITS is the macro that does this.
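
To make the failure concrete, here is a minimal sketch of the small-bitmap encoding described in this thread. It assumes a 64-bit word whose low 6 bits hold the size. The `SK_`-prefixed macros and `sk_count_pointers` are hypothetical stand-ins written for illustration, not copies of the RTS definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the RTS small-bitmap macros.
   Assumption: 64-bit words, low 6 bits = size, remaining bits = contents. */
#define SK_BITS_SHIFT 6
#define SK_SIZE_MASK  0x3fu

#define SK_MK_SMALL_BITMAP(size, bits) \
    (((uint64_t)(bits) << SK_BITS_SHIFT) | (uint64_t)(size))
#define SK_BITMAP_SIZE(bm) ((uint64_t)(bm) & SK_SIZE_MASK)
#define SK_BITMAP_BITS(bm) ((uint64_t)(bm) >> SK_BITS_SHIFT)

/* Count the payload words a scavenger would follow, given bitmap contents
   that have already had the size stripped. A 0 bit means "pointer", a 1 bit
   means "non-pointer"; the least significant bit describes the first word. */
static inline size_t sk_count_pointers(uint64_t bits, size_t size)
{
    size_t pointers = 0;
    while (size > 0) {
        if ((bits & 1) == 0) {
            pointers++;
        }
        bits >>= 1;
        size--;
    }
    return pointers;
}
```

With this encoding, the ARG_PPP entry is `SK_MK_SMALL_BITMAP(3, 0x0)`, i.e. 0b11. After the size is stripped the value is 0, so reading a "size" back out of it also gives 0. That is smaller than any non-empty PAP payload, which is consistent with the assertion failing.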

Ömer

On Wed, 26 Feb 2020 at 21:48, Ömer Sinan Ağacan <omeragacan at gmail.com>
wrote:
>
> So the key points from this thread are:
>
> - PAP payloads are scavenged using the function's bitmap. Because a PAP's
>   payload holds fewer arguments than the function's arity, the bitmap will
>   always have enough bits.
>
> - A bit in a function bitmap is NOT for liveness (i.e. it does not indicate
>   whether an argument is used or not), but for pointers vs. non-pointers.
>   Function bitmaps are called "liveness bits" in the code generator, which is
>   misleading.
>
> - In a function bitmap (small or large), 0 means pointer, 1 means non-pointer.
>
>   This is really what confused me in my last email above. For some reason I
>   intuitively expected 1 to mean pointer, not 0. Simon M also got this wrong
>   ("So a 0 in the bitmap always means non-pointer.") so maybe this is confusing
>   to others too.
>
> - For functions with known argument patterns we don't use the function's bitmap.
>   These functions' types are greater than ARG_BCO (2), and for those we use the
>   stg_arg_bitmaps array to get the bitmap.
>
>   For example, the bitmap for ARG_PPP (function with 3 pointer arguments) is at
>   index 23 in this array, and its value is 0b11 (size 3, all bitmap bits 0).
>   For ARG_PNN it's 0b110000011. The least significant 6 bits are for the size
>   (3), and the remaining 0b110 means the first argument is a pointer and the
>   other two are non-pointers.
>
> I still don't understand why this assertion
>
>     ASSERT(BITMAP_SIZE(bitmap) >= size);
>
> I added to scavenge_small_bitmap in !2727 is failing though.
>
> Ömer
>
> On Mon, 24 Feb 2020 at 13:45, Simon Peyton Jones <simonpj at microsoft.com>
> wrote:
> >
> > I’m not following this in detail, but do please make sure that the results of this discussion end up in a suitable Note. Obviously it’s not transparently clear as-is, and I can see clarity emerging.
> >
> >
> >
> > Thanks!
> >
> >
> > Simon
> >
> >
> >
> > From: ghc-devs <ghc-devs-bounces at haskell.org> On Behalf Of Simon Marlow
> > Sent: 24 February 2020 08:22
> > To: Ömer Sinan Ağacan <omeragacan at gmail.com>
> > Cc: ghc-devs <ghc-devs at haskell.org>
> > Subject: Re: Confused about PAP object layout
> >
> >
> >
> > On Thu, 20 Feb 2020 at 09:21, Ömer Sinan Ağacan <omeragacan at gmail.com> wrote:
> >
> > > I'm not sure what you mean by "garbage". The bitmap merely determines whether
> > > a field is a pointer,
> >
> > I think the bitmap is for liveness, not for whether a field is pointer or not.
> > Relevant code for building an info table for a function:
> >
> >     mk_pieces (Fun arity (ArgGen arg_bits)) srt_label
> >       = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags arg_bits
> >            ; let fun_type | null liveness_data = aRG_GEN
> >                           | otherwise          = aRG_GEN_BIG
> >                  extra_bits = [ packIntsCLit dflags fun_type arity ]
> >                            ++ (if inlineSRT dflags then [] else [ srt_lit ])
> >                            ++ [ liveness_lit, slow_entry ]
> >            ; return (Nothing, Nothing, extra_bits, liveness_data) }
> >
> > This uses the word "liveness" rather than "pointers".
> >
> > However I just realized that the word "garbage" is still not the best way to
> > describe what I'm trying to say. In the example
> >
> >     [pap_info, x, y, z]
> >
> > If the function's bitmap is [1, 0, 1], then `y` may be dead (an unused
> > argument, or "garbage" as I described in my previous email) OR it may be a
> > non-pointer that is used (i.e. not garbage).
> >
> >
> >
> > I don't think we ever put a zero in the bitmap for a pointer-but-not-used argument. We don't do liveness analysis for function arguments, as far as I'm aware. So a 0 in the bitmap always means "non-pointer".
> >
> >
> >
> > The only reason the code uses the terminology "liveness" here is that it's sharing code with the code that handles bitmaps for stack frames, which do deal with liveness.
> >
> >
> >
> > So maybe "liveness" is also not the best way to describe this bitmap, as 0 does
> > not mean dead but rather "don't follow in GC".
> >
> >
> > On my quest to understand and document this code better I have one more
> > question. When generating info tables for functions with known argument patterns
> > (ArgSpec) we initialize the bitmap as 0. Relevant code:
> >
> >     mk_pieces (Fun arity (ArgSpec fun_type)) srt_label
> >       = do { let extra_bits = packIntsCLit dflags fun_type arity : srt_label
> >            ; return (Nothing, Nothing,  extra_bits, []) }
> >
> > Here the last return value is for the liveness data. I don't understand how
> > this can be correct, because when we use this function in a PAP, the PAP
> > payload will NOT be scavenged. Relevant code (simplified):
> >
> >     STATIC_INLINE GNUC_ATTR_HOT StgPtr
> >     scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord size)
> >     {
> >         const StgFunInfoTable *fun_info =
> >             get_fun_itbl(UNTAG_CONST_CLOSURE(fun));
> >
> >         StgPtr p = (StgPtr)payload;
> >
> >         StgWord bitmap;
> >         switch (fun_info->f.fun_type) {
> >         ...
> >
> >         default:
> >             bitmap = BITMAP_BITS(stg_arg_bitmaps[fun_info->f.fun_type]);
> >         small_bitmap:
> >             p = scavenge_small_bitmap(p, size, bitmap);
> >             break;
> >         }
> >         return p;
> >     }
> >
> >
> > Here if I have a function with three pointer args (ARG_PPP) the branch shown
> > will be taken, but because the bitmap is 0 (as shown in the mk_pieces code
> > above) nothing in the PAP's payload will be scavenged.
> >
> >
> >
> > It gets the bitmap from stg_arg_bitmaps[fun_info->f.fun_type], not from the info table.  Hope this helps.
> >
> >
> >
> > Cheers
> >
> > Simon
> >
> >
> >
> >
> >
> >
> > Here's an example from a debugging session:
> >
> >     >>> print pap
> >     $10 = (StgPAP *) 0x42001fe030
> >
> >     >>> print *pap
> >     $11 = {
> >       header = {
> >         info = 0x7fbdd1f06640 <stg_PAP_info>
> >       },
> >       arity = 2,
> >       n_args = 1,
> >       fun = 0x7fbdd2d23ffb,
> >       payload = 0x42001fe048
> >     }
> >
> > So this PAP is applied to one argument, which is a boxed object (a FUN_2_0):
> >
> >     >>> print *get_itbl(UNTAG_CLOSURE(pap->payload[0]))
> >     $20 = {
> >       layout = {
> >         payload = {
> >           ptrs = 2,
> >           nptrs = 0
> >         },
> >         bitmap = 2,
> >         large_bitmap_offset = 2,
> >         __pad_large_bitmap_offset = 2,
> >         selector_offset = 2
> >       },
> >       type = 11,
> >       srt = 1914488,
> >       code = 0x7fbdd2b509c0 "H\215E\370L9\370r[I\203\304 M;\245X\003"
> >     }
> >
> > However if I look at the function of this PAP:
> >
> >     >>> print *get_fun_itbl(UNTAG_CLOSURE(pap->fun))
> >     $21 = {
> >       f = {
> >         slow_apply_offset = 16,
> >         __pad_slow_apply_offset = 3135120895,
> >         b = {
> >           bitmap = 74900193017889,
> >           bitmap_offset = 258342945,
> >           __pad_bitmap_offset = 258342945
> >         },
> >         fun_type = 23,
> >         arity = 3
> >       },
> >       i = {
> >         layout = {
> >           payload = {
> >             ptrs = 0,
> >             nptrs = 0
> >           },
> >           bitmap = 0,
> >           large_bitmap_offset = 0,
> >           __pad_large_bitmap_offset = 0,
> >           selector_offset = 0
> >         },
> >         type = 14,
> >         srt = 1916288,
> >         code = 0x7fbdd2b50260 <base_GHCziRead_list3_info> "I\203\304(M;\245X\003"
> >       }
> >     }
> >
> > It has arity 3. Since the first argument is a boxed object and this function has
> > arity 3, if the argument is actually live in the function (i.e. not an unused
> > argument), then the bitmap should have a 1 for this. But because the argument
> > pattern is known (ARG_PPP) we initialized the bitmap as 0! Not sure how this
> > can work.
> >
> > What am I missing?
> >
> > Thanks,
> >
> > Ömer
> >
> > On Fri, 14 Feb 2020 at 20:25, Ben Gamari <ben at smart-cactus.org> wrote:
> > >
> > > Ömer Sinan Ağacan <omeragacan at gmail.com> writes:
> > >
> > > > I think that makes sense, with the invariant that n_args <= bitmap_size. We
> > > > evacuate the arguments used by the function but not others. Thanks.
> > > >
> > > > It's somewhat weird to see an object with useful stuff, then garbage, then
> > > > useful stuff again in the heap, but that's not an issue by itself. For example
> > > > if I have something like
> > > >
> > > >     [pap_info, x, y, z]
> > > >
> > > > and according to the function `y` is dead, then after evacuating I get
> > > >
> > > >     [pap_info, x, <garbage>, z]
> > > >
> > > > This "garbage" is evacuated again and again every time we evacuate this PAP.
> > > >
> > > I'm not sure what you mean by "garbage". The bitmap merely determines
> > > whether a field is a pointer, not whether it is copied during
> > > evacuation. A field's bitmap bit not being set merely means that we won't
> > > evacuate the value of that field during scavenging.
> > >
> > > Nevertheless, this all deserves a comment in scavenge_PAP.
> > >
> > > Cheers,
> > >
> > > - Ben
> > >

