Run-time absence info

David Feuer david.feuer at
Wed Apr 13 20:57:49 UTC 2022

> Date: Wed, 13 Apr 2022 08:59:06 +0200
> From: Andreas Klebinger <klebinger.andreas at>
> To: ghc-devs at
> Subject: Re: Absence info at run-time
> Message-ID: <a577aa9c-6c96-945d-85f6-eb207f88b5b3 at>
> Content-Type: text/plain; charset=UTF-8; format=flowed
> W/W should transform such a function into one who takes one less
> argument removing any runtime overhead at least for fully applied
> functions.

Fully applied functions are definitely not what I'm talking about.

I suppose your suggestion is then if we an expression`f x` where bar
> takes multiple arguments, but doesn't use the current argument then GHC
> should:
> * Inspect f, check if the first argument to f is used
> * If we can determine it isn't used instead of creating a PAP capturing
> `f` and `x` instead only capture `f` and record this in the PAP closure
> somehow.
> * Once the PAP is fully applied pass a dummy argument instead of `x` to f.

Yes, that's the idea. The "somehow" is described below.

If f is a known call that seems doable, although adding a bitmap to paps
> might require us to increase the size of all PAP closures, making this
> optimization less useful.

 Every PAP has to carry its arity. I just looked it up:

typedef struct {
    StgHeader header;
    StgHalfWord arity; // number of arguments left to apply, if zero this
is an AP closure
    StgHalfWord n_args; // number of applied arguments
    StgClosure *fun; // guaranteed to point to a FUN closure
    StgClosure *payload[];
} StgPAP;

On a 64-bit machine, we can combine arity and absence info for functions of
up to 31 arguments into half a word. Specifically, set the lowest bit to
indicate whether the first argument is used, etc. Then bitwise-or that with
1 `shiftL` arity. Each partial application performed shifts right by the
number of arguments applied; if the result is 1, we know it's fully applied.

(Any idea why we need the number of arguments that have been applied in the
PAP? If that's actually only need for AP, then we might be able to make
things a little more compact.)

> If `f` is a unknown function there is currently no way to get
> absent/used info for it's arguments at runtime. And changing that would
> be a major change which seems unlikely to pay off.

I described a mechanism for encoding this concisely above. The same
encoding could be used for functions that haven't been partially applied. I
don't know where their arities are stashed. In their info tables?

> So I think this would be theoretically possible, but it would rarely pay
> off.

I'm not sure it's so rare. We end up having to work around this issue in
libraries, with varying levels of effort and success.

> Also do you have an example where `(const a) b` leads to stupid thunks?
> It seems to me const should always be inlined in such a case, avoiding a
> PAP allocation.

There won't be a PAP allocation if const inlines. The stupid thunk is a
thunk to apply (\ _ -> a) to b. Suppose we define `fmap` for plain arrays
(I know; probably not the best example), and let <$ take its default

    (<$) a = fmap (const a)

If we calculate x <$ arr, that will fill an array with thunks, each of them
retaining an element of the original array which will never be used.

> Am 12/04/2022 um 23:02 schrieb David Feuer:
> > Suppose `f` doesn't use its first argument. When forming the thunk (or
> > partial application) `f a`, we don't need to record `a`. What if
> > instead of arity, we store a bitmap used/absent arguments, terminated
> > by a 1 bit? Could we then get rid of "stupid thunks" like `(const a) b`?
> >
> > _______________________________________________
> > ghc-devs mailing list
> > ghc-devs at
> >
> ------------------------------
> Message: 3
> Date: Wed, 13 Apr 2022 08:21:25 +0000
> From: "Sebastian Graf" <sgraf1337 at>
> To: "GHC Devs" <ghc-devs at>
> Subject: Markup language/convention for Notes?
> Message-ID: <eme97afb2c-f674-4ed3-9d50-52c6482169d0 at sebastian-pc>
> Content-Type: text/plain; charset="utf-8"; Format="flowed"
> Hi Devs,
> When writing Notes, I find myself using markdown-inspired or
> haddock-inspired features. The reason is that I keep telling myself
>  > In 5 years time, we'll surely have an automated tool that renders
> Notes referenced under the cursor in a popup in our IDE
> And I might not be completely wrong about that, after all the strong
> conventions about Note declaration syntax allow me to do
> jump-to-definition on Note links in my IDE already (thanks to a shell
> script written by Zubin!).
> Still, over the years I kept drifting between markdown and haddock
> syntax, sometimes used `backticked inline code` or haddock 'ticks' to
> refer to functions in the compiler (sometimes even
> 'GHC.Fully.Qualified.ticks') and for code blocks I used all of the
> following forms:
> Haddock "code quote"
>  > id :: a -> a
>  > id x = x
> Markdown triple backticks
> ```hs
> id :: a -> a
> id x = x
> ```
> Indentation by spaces
>    id :: a -> a
>    id x = x
> And so on.
> I know that at least Simon was thrown off in the past about my use of
> "tool-aware markup", perhaps also because I kept switching the targetted
> tool. I don't like that either. So I wonder
> Do you think it is worth optimising Notes for post-processing by an
> external tool?I think it's only reasonable if we decide for a target
> syntax. Which syntax should it be?
> Cheers,
> Sebastian
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <
> >
> ------------------------------
> Subject: Digest Footer
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at
> ------------------------------
> End of ghc-devs Digest, Vol 224, Issue 10
> *****************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

More information about the ghc-devs mailing list