High-level Cmm code and stack allocation

Fri Jan 10 16:35:47 UTC 2014

Oh, ok.  Alas, a good chunk of my model of Cmm has just gone out of the window.  I thought that areas were such a lovely, well-behaved abstraction.  I was thrilled when we came up with them, and I'm very sorry to see them go.

There are now many things that I no longer understand.  I now have no idea how we save live variables over a call, or how multiple returned values from one call (returned on the stack) stay right where they are if they are live across the next call.

What was the actual problem?  That functions used too much stack, so the stack was getting too big?  But a one slot area corresponds exactly to a live variable, so I don't see how the area abstraction could possibly increase stack size.  And is stack size a crucial issue anyway?

Apart from anything else, areas would have given a lovely solution to the problem this thread started with!

I guess we can talk about this when you next visit?  But some documentation would be welcome.

Simon
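
A small model may help pin down the claim in the quoted reply below: with non-overlapping areas, code motion can raise a function's stack requirement by making the live ranges of areas overlap.  The sketch is in C purely for illustration.  The names AreaRange and peak_words_disjoint, the notion of numbered program points, and the sizes used are all hypothetical; none of this is taken from GHC's stack allocator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: an area occupies `size` words of stack while it is
   live over the half-open interval of program points [start, end). */
typedef struct {
    int    start;
    int    end;
    size_t size;
} AreaRange;

/* Peak stack use, in words, if areas that are live at the same program
   point must never share memory: at each point, sum the sizes of the
   live areas, then take the maximum over all points. */
size_t peak_words_disjoint(const AreaRange *areas, size_t n, int n_points)
{
    size_t peak = 0;

    assert(areas != NULL || n == 0);

    for (int p = 0; p < n_points; p++) {
        size_t live = 0;
        for (size_t i = 0; i < n; i++) {
            if (areas[i].start <= p && p < areas[i].end) {
                live += areas[i].size;
            }
        }
        if (live > peak) {
            peak = live;
        }
    }
    return peak;
}
```

In this model, two areas of 3 and 2 words that are live over [0,2) and [2,4) need a peak of 3 words.  If code motion stretches the second range to [1,4), the two are live together at point 1 and the peak becomes 5 words.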

| -----Original Message-----
| From: Simon Marlow [mailto:marlowsd at gmail.com]
| Sent: 10 January 2014 16:24
| To: Simon Peyton Jones; Herbert Valerio Riedel
| Cc: ghc-devs at haskell.org
| Subject: Re: High-level Cmm code and stack allocation
| 
| There are no one-slot areas any more; I ditched those when I rewrote the
| stack allocator.  There is only ever one live area: either the old area
| or the young area for a call we are about to make or have just made.
| (See the data type: I removed the one-slot areas.)
| 
| I struggled for a long time with this.  The problem is that with the
| semantics of non-overlapping areas, code motion optimisations would tend
| to increase the stack requirements of the function by overlapping the
| live ranges of the areas.  I concluded that actually what we wanted was
| areas that really do overlap, and optimisations that respect that, so
| that we get more efficient stack usage.
| 
| Cheers,
| 	Simon
| 
| On 10/01/2014 15:22, Simon Peyton Jones wrote:
| > That documentation would be good, yes!  I don't know what it means to
| > say "we don't really have a general concept of areas any more".  We did
| > before, and I didn't know that it had gone away.  Urk!  We can have lots
| > of live areas, notably the old area (for the current call/return
| > parameters), the call area for a call we are preparing, and the one-slot
| > areas for variables we are saving on the stack.
| >
| > Here's the current story:
| > https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/StackAreas
| >
| > I agree that we have no concrete syntax for talking about areas, but
| > that is something we could fix.  But I'm worried that they may not mean
| > what they used to mean.
| >
| > Simon
| >
| > | -----Original Message-----
| > | From: Simon Marlow [mailto:marlowsd at gmail.com]
| > | Sent: 09 January 2014 08:39
| > | To: Simon Peyton Jones; Herbert Valerio Riedel
| > | Cc: ghc-devs at haskell.org
| > | Subject: Re: High-level Cmm code and stack allocation
| > |
| > | On 08/01/2014 10:07, Simon Peyton Jones wrote:
| > | > | > Can't we just allocate a Cmm "area"? The address of an area is
| > | > | > a perfectly well-defined Cmm value.
| > | >
| > | > What about this idea?
| > |
| > | We don't really have a general concept of areas (any more), and
| > | areas aren't exposed in the concrete Cmm syntax at all.  The current
| > | semantics is that areas may overlap with each other, so there should
| > | only be one active area at any point.  I found that this was
| > | important to ensure that we could generate good code from the stack
| > | layout algorithm, otherwise it had to make pessimistic assumptions
| > | and use too much stack.
| > |
| > | You're going to ask me where this is documented, and I think I have
| > | to admit to slacking off, sorry :-)  We did discuss it at the time,
| > | and I made copious notes, but I didn't transfer those to the code.
| > | I'll add a Note.
| > |
| > | Cheers,
| > | Simon
| > |
| > |
| > | > Simon
| > | >
| > | > | -----Original Message-----
| > | > | From: Simon Marlow [mailto:marlowsd at gmail.com]
| > | > | Sent: 08 January 2014 09:26
| > | > | To: Simon Peyton Jones; Herbert Valerio Riedel
| > | > | Cc: ghc-devs at haskell.org
| > | > | Subject: Re: High-level Cmm code and stack allocation
| > | > |
| > | > | On 07/01/14 22:53, Simon Peyton Jones wrote:
| > | > | > | Yes, this is technically wrong but luckily works.  I'd very
| > | > | > | much like to have a better solution, preferably one that
| > | > | > | doesn't add any extra overhead.
| > | > | >
| > | > | > | __decodeFloat_Int is a C function, so it will not touch the
| > | > | > | Haskell stack.
| > | > | >
| > | > | > This all seems terribly fragile to me.  At least it ought to
| > | > | > be surrounded with massive comments pointing out how terribly
| > | > | > fragile it is, breaking all the rules that we carefully document
| > | > | > elsewhere.
| > | > | >
| > | > | > Can't we just allocate a Cmm "area"? The address of an area is
| > | > | > a perfectly well-defined Cmm value.
| > | > |
| > | > | It is fragile, yes.  We can't use static memory because it needs
| > | > | to be thread-local.  This particular hack has gone through
| > | > | several iterations over the years: first we had static memory,
| > | > | which broke when we did the parallel runtime, then we had
| > | > | special storage in the Capability, which we gave up when GMP was
| > | > | split out into a separate library, because it didn't seem right
| > | > | to have magic fields in the Capability for one library.
| > | > |
| > | > | I'm looking into whether we can do temporary allocation on the
| > | > | heap for this instead.
| > | > |
| > | > | Cheers,
| > | > | Simon
| > | > |
| > | > |
| > | > | > Simon
| > | > | >
| > | > | > | -----Original Message-----
| > | > | > | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On
| > | > | > | Behalf Of Simon Marlow
| > | > | > | Sent: 07 January 2014 16:05
| > | > | > | To: Herbert Valerio Riedel; ghc-devs at haskell.org
| > | > | > | Subject: Re: High-level Cmm code and stack allocation
| > | > | > |
| > | > | > | On 04/01/2014 23:26, Herbert Valerio Riedel wrote:
| > | > | > | > Hello,
| > | > | > | >
| > | > | > | > According to Note [Syntax of .cmm files],
| > | > | > | >
| > | > | > | > | There are two ways to write .cmm code:
| > | > | > | > |
| > | > | > | > |  (1) High-level Cmm code delegates the stack handling to
| > | > | > | > |      GHC, and never explicitly mentions Sp or registers.
| > | > | > | > |
| > | > | > | > |  (2) Low-level Cmm manages the stack itself, and must
| > | > | > | > |      know about calling conventions.
| > | > | > | > |
| > | > | > | > | Whether you want high-level or low-level Cmm is
| > | > | > | > | indicated by the presence of an argument list on a
| > | > | > | > | procedure.
| > | > | > | >
| > | > | > | > However, while working on integer-gmp I've been noticing
| > | > | > | > in integer-gmp/cbits/gmp-wrappers.cmm that even though all
| > | > | > | > Cmm procedures have been converted to high-level Cmm, they
| > | > | > | > still reference the 'Sp' register, e.g.
| > | > | > | >
| > | > | > | >
| > | > | > | >      #define GMP_TAKE1_RET1(name,mp_fun)       \
| > | > | > | >      name (W_ ws1, P_ d1)                      \
| > | > | > | >      {                                         \
| > | > | > | >        W_ mp_tmp1;                             \
| > | > | > | >        W_ mp_result1;                          \
| > | > | > | >                                                \
| > | > | > | >      again:                                    \
| > | > | > | >        STK_CHK_GEN_N (2 * SIZEOF_MP_INT);      \
| > | > | > | >        MAYBE_GC(again);                        \
| > | > | > | >                                                \
| > | > | > | >        mp_tmp1    = Sp - 1 * SIZEOF_MP_INT;    \
| > | > | > | >        mp_result1 = Sp - 2 * SIZEOF_MP_INT;    \
| > | > | > | >        ...                                     \
| > | > | > | >
| > | > | > | >
| > | > | > | > So is this valid high-level Cmm code? What's the proper
| > | > | > | > way to allocate Stack (and/or Heap) memory from high-level
| > | > | > | > Cmm code?
| > | > | > |
| > | > | > | Yes, this is technically wrong but luckily works.  I'd very
| > | > | > | much like to have a better solution, preferably one that
| > | > | > | doesn't add any extra overhead.
| > | > | > |
| > | > | > | The problem here is that we need to allocate a couple of
| > | > | > | temporary words and take their address; that's an unusual
| > | > | > | thing to do in Cmm, so it only occurs in a few places
| > | > | > | (mainly interacting with gmp).
| > | > | > | Usually if you want some temporary storage you can use local
| > | > | > | variables or some heap-allocated memory.
| > | > | > |
| > | > | > | Cheers,
| > | > | > | Simon
| > | > | > | _______________________________________________
| > | > | > | ghc-devs mailing list
| > | > | > | ghc-devs at haskell.org
| > | > | > | http://www.haskell.org/mailman/listinfo/ghc-devs
| > | > | >
| > | >
| >
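
The address arithmetic in the quoted GMP_TAKE1_RET1 macro can be modelled in plain C to show what "a couple of temporary words" below Sp looks like.  This is only a sketch under stated assumptions: the value 16 for SIZEOF_MP_INT is a stand-in (the real value depends on the platform and on GMP), the stack is assumed to grow downwards, and the stack check is assumed to reserve the bytes immediately below Sp.  The names Scratch, scratch_below_sp and scratch_fits are hypothetical and do not appear in GHC or integer-gmp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for SIZEOF_MP_INT, in bytes. */
#define SIZEOF_MP_INT ((uintptr_t)16)

/* Addresses of the two temporaries carved out below the stack pointer. */
typedef struct {
    uintptr_t mp_tmp1;
    uintptr_t mp_result1;
} Scratch;

/* Mirrors the macro's arithmetic: the stack is assumed to grow downwards,
   so the free space lies at addresses below sp. */
Scratch scratch_below_sp(uintptr_t sp)
{
    Scratch s;

    assert(sp >= 2 * SIZEOF_MP_INT);   /* guard against wrap-around */

    s.mp_tmp1    = sp - 1 * SIZEOF_MP_INT;
    s.mp_result1 = sp - 2 * SIZEOF_MP_INT;
    return s;
}

/* Returns 1 if both temporaries lie inside [sp - reserved, sp) and do not
   overlap each other, 0 otherwise.  `reserved` is the number of bytes the
   stack check is assumed to have made available below sp. */
int scratch_fits(Scratch s, uintptr_t sp, size_t reserved)
{
    uintptr_t low = sp - reserved;

    return s.mp_result1 >= low
        && s.mp_result1 + SIZEOF_MP_INT <= s.mp_tmp1
        && s.mp_tmp1 + SIZEOF_MP_INT <= sp;
}
```

The model only checks that the two temporaries fit in the reserved space.  It does not capture why the trick is fragile in high-level Cmm, where GHC rather than the programmer decides how the stack around Sp is laid out.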