Confused about PAP object layout

Simon Marlow marlowsd at gmail.com
Mon Feb 24 08:21:54 UTC 2020


On Thu, 20 Feb 2020 at 09:21, Ömer Sinan Ağacan <omeragacan at gmail.com>
wrote:

> > I'm not sure what you mean by "garbage". The bitmap merely determines
> whether
> > a field is a pointer,
>
> I think the bitmap is for liveness, not for whether a field is pointer or
> not.
> Relevant code for building an info table for a function:
>
>     mk_pieces (Fun arity (ArgGen arg_bits)) srt_label
>       = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags
> arg_bits
>            ; let fun_type | null liveness_data = aRG_GEN
>                           | otherwise          = aRG_GEN_BIG
>                  extra_bits = [ packIntsCLit dflags fun_type arity ]
>                            ++ (if inlineSRT dflags then [] else [ srt_lit
> ])
>                            ++ [ liveness_lit, slow_entry ]
>            ; return (Nothing, Nothing, extra_bits, liveness_data) }
>
> This uses the word "liveness" rather than "pointers".
>
> However I just realized that the word "garbage" is still not the best way
> to
> describe what I'm trying to say. In the example
>
>     [pap_info, x, y, z]
>
> If the function's bitmap is [1, 0, 1], then `y` may be a dead (an unused
> argument, or "garbage" as I describe in my previous email) OR it may be a
> non-pointer, but used (i.e. not a garbage).
>

I don't think we ever put a zero in the bitmap for a pointer-but-not-used
argument. We don't do liveness analysis for function arguments, as far as
I'm aware. So a 0 in the bitmap always means "non-pointer".

The only reaosn the code uses the terminology "liveness" here is that it's
sharing code with the code that handles bitmaps for stack frames, which do
deal with liveness.


> So maybe "liveness" is also not the best way to describe this bitmap, as 0
> does
> not mean dead but rather "don't follow in GC".
>

> On my quest to understand and document this code better I have one more
> question. When generating info tables for functions with know argument
> patterns
> (ArgSpec) we initialize the bitmap as 0. Relevant code:
>
>     mk_pieces (Fun arity (ArgSpec fun_type)) srt_label
>       = do { let extra_bits = packIntsCLit dflags fun_type arity :
> srt_label
>            ; return (Nothing, Nothing,  extra_bits, []) }
>
> Here the last return value is for the liveness data. I don't understand
> how can
> this be correct, because when we use this function in a PAP this will
> cause NOT
> scavenging the PAP payload. Relevant code (simplified):
>
>     STATIC_INLINE GNUC_ATTR_HOT StgPtr
>     scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord
> size)
>     {
>         const StgFunInfoTable *fun_info =
>             get_fun_itbl(UNTAG_CONST_CLOSURE(fun));
>
>         StgPtr p = (StgPtr)payload;
>
>         StgWord bitmap;
>         switch (fun_info->f.fun_type) {
>         ...
>
        default:
>             bitmap = BITMAP_BITS(stg_arg_bitmaps[fun_info->f.fun_type]);
>         small_bitmap:
>             p = scavenge_small_bitmap(p, size, bitmap);
>             break;
>         }
>         return p;
>     }
>

> Here if I have a function with three pointer args (ARG_PPP) the shown
> branch
> that will be taken, but because the bitmap is 0 (as shown in the mk_pieces
> code
> above) nothing in the PAPs payload will be scavenged.
>

It gets the bitmap from stg_arg_bitmaps[fun_info->f.fun_type], not from the
info table.  Hope this helps.

Cheers
Simon



>
> Here's an example from a debugging session:
>
>     >>> print pap
>     $10 = (StgPAP *) 0x42001fe030
>
>     >>> print *pap
>     $11 = {
>       header = {
>         info = 0x7fbdd1f06640 <stg_PAP_info>
>       },
>       arity = 2,
>       n_args = 1,
>       fun = 0x7fbdd2d23ffb,
>       payload = 0x42001fe048
>     }
>
> So this PAP is applied one argument, which is a boxed object (a FUN_2_0):
>
>     >>> print *get_itbl(UNTAG_CLOSURE(pap->payload[0]))
>     $20 = {
>       layout = {
>         payload = {
>           ptrs = 2,
>           nptrs = 0
>         },
>         bitmap = 2,
>         large_bitmap_offset = 2,
>         __pad_large_bitmap_offset = 2,
>         selector_offset = 2
>       },
>       type = 11,
>       srt = 1914488,
>       code = 0x7fbdd2b509c0 "H\215E\370L9\370r[I\203\304 M;\245X\003"
>     }
>
> However if I look at the function of this PAP:
>
>     >>> print *get_fun_itbl(UNTAG_CLOSURE(pap->fun))
>     $21 = {
>       f = {
>         slow_apply_offset = 16,
>         __pad_slow_apply_offset = 3135120895,
>         b = {
>           bitmap = 74900193017889,
>           bitmap_offset = 258342945,
>           __pad_bitmap_offset = 258342945
>         },
>         fun_type = 23,
>         arity = 3
>       },
>       i = {
>         layout = {
>           payload = {
>             ptrs = 0,
>             nptrs = 0
>           },
>           bitmap = 0,
>           large_bitmap_offset = 0,
>           __pad_large_bitmap_offset = 0,
>           selector_offset = 0
>         },
>         type = 14,
>         srt = 1916288,
>         code = 0x7fbdd2b50260 <base_GHCziRead_list3_info>
> "I\203\304(M;\245X\003"
>       }
>     }
>
> It has arity 3. Since the first argument is a boxed object and this
> function has
> arity 3, if the argument is actually live in the function (i.e. not an
> unused
> argument), then the bitmap should have a 1 for this. But because the
> argument
> pattern is known (ARG_PPP) we initialized the bitmap as 0! Not sure how
> this
> can work.
>
> What am I missing?
>
> Thanks,
>
> Ömer
>
> Ben Gamari <ben at smart-cactus.org>, 14 Şub 2020 Cum, 20:25 tarihinde şunu
> yazdı:
> >
> > Ömer Sinan Ağacan <omeragacan at gmail.com> writes:
> >
> > > I think that makes sense, with the invariant that n_args <=
> bitmap_size. We
> > > evacuate the arguments used by the function but not others. Thanks.
> > >
> > > It's somewhat weird to see an object with useful stuff, then garbage,
> then
> > > useful stuff again in the heap, but that's not an issue by itself. For
> example
> > > if I have something like
> > >
> > >     [pap_info, x, y, z]
> > >
> > > and according to the function `y` is dead, then after evacuating I get
> > >
> > >     [pap_info, x, <garbage>, z]
> > >
> > > This "garbage" is evacuated again and again every time we evacuate
> this PAP.
> > >
> > I'm not sure what you mean by "garbage". The bitmap merely determines
> > whether a field is a pointer, not whether it is copied during
> > evacuation. A field's bitmap bit not being set merely means that we won't
> > evacuate the value of that field during scavenging.
> >
> > Nevertheless, this all deserves a comment in scavenge_PAP.
> >
> > Cheers,
> >
> > - Ben
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20200224/c2edc477/attachment.html>


More information about the ghc-devs mailing list