[Haskell-cafe] Re: ANNOUNCE: Harpy -- run-time code generation library

apfelmus apfelmus at quantentunnel.de
Mon May 14 13:38:56 EDT 2007


[Relocated to haskell-cafe]

Dirk Kleeblatt wrote:
> apfelmus wrote:
>> Note that even currently, your operations cannot be strict in the
>> address a label refers to because this may be determined later than the
>> first use of the label. In other words, your example code
>>
>>   fac = do
> [...]
>> (1)   jmp  loopTest
> [...]
>> (2)   loopTest @@ cmp ecx (0 :: Word32)
> [...]
>> already shows that the function jmp that generates a jmp-instruction may
>> not be strict in the position it jumps to as the address behind loopTest
>> is only known later at line (2).
> 
> When generating code for (1), the label loopTest is used as an index
> into a map, to see whether there's a code position associated with it.
> If it is, the code position is used to compute the jump offset for the
> jmp instruction, if not (as in this example), a dummy value is placed in
> the code buffer, and Harpy remembers this position to be patched later
> on. At (2), the label is defined, and this leads to patching all
> instructions that have been emitted before this definition.
> 
> So, yes, the code position is only used after the definition of the
> label. But the "look up in a map"-part makes the jmp operation strict in
> the label parameter.
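
If I understand correctly, the bookkeeping amounts to something like
the following (my reconstruction, not Harpy's actual code; all names
are invented):

```haskell
import qualified Data.Map as Map

type Pos = Int  -- position in the code buffer

-- A label is either still pending (carrying the buffer positions that
-- must be patched once it is defined) or already defined at a position.
type LabelMap l = Map.Map l (Either [Pos] Pos)

-- Emitting a jump to label l at buffer position p: either the target
-- is already known, or p joins the positions waiting for a patch.
useLabel :: Ord l => l -> Pos -> LabelMap l -> (Maybe Pos, LabelMap l)
useLabel l p m = case Map.lookup l m of
  Just (Right target) -> (Just target, m)
  Just (Left pending) -> (Nothing, Map.insert l (Left (p : pending)) m)
  Nothing             -> (Nothing, Map.insert l (Left [p]) m)

-- Defining label l at position target: all pending uses get patched.
defineLabel :: Ord l => l -> Pos -> LabelMap l -> ([Pos], LabelMap l)
defineLabel l target m = case Map.lookup l m of
  Just (Left pending) -> (pending, Map.insert l (Right target) m)
  _                   -> ([], Map.insert l (Right target) m)
```

So the strictness comes from the Map.lookup at every use of a label.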

Ah, I slowly get what you mean. In short, the point is that you're
reinventing lazy evaluation: because you're writing directly to a
memory buffer, writing a raw _|_ is impossible, so you invent a scheme
that delays evaluating the _|_s until the labels are defined.

With raw writing, we indeed have the strictness

   writeByte buf _|_
 = _|_

which implies

   jmp _|_
 = _|_

I thought that you'd simply store the instructions in a list/monoid like
the Builder-Monoid from Data.Binary with the effect that

   jmp _|_
 /= _|_

as a monadic action although it stores _|_ as an instruction. Lazy
evaluation then takes care of filling the _|_-instructions with content
in times of need.

Here's an illustration with a fictional robodog toy assembly language

  import Control.Monad.RWS

  type Address = Int
  data Dogcode = Bark | Bite | Jump Address

  type CodeGen a = RWS () [Dogcode] Address a

  emitDogcode :: Dogcode -> CodeGen Address
  emitDogcode c = do
      x <- get
      tell [c]
      put  (x+1)
      return x

  jump :: Address -> CodeGen Address
  jump = emitDogcode . Jump

  bark, bite :: CodeGen Address
  bark = emitDogcode Bark
  bite = emitDogcode Bite

  generateDogCode :: CodeGen a -> [Dogcode]
  generateDogCode c =
    let (_, instructionCount, code) = runRWS c () 0 in
       code -- ... whatever to do with a list of instructions ...

Now, you magically get recursive do-notation without lifting a finger
because RWS is already an instance of MonadFix. For example, the
following works out of the box (given mdo-notation, i.e. -fglasgow-exts)

  pluto :: CodeGen Address
  pluto = mdo
     bark
     jump loop
     bite
     loop <- bark
     bark
     jump loop

(The return value of type Address encapsulated in the monad is the line
number of the last instruction of pluto.)

If you really want to write directly to a memory buffer, I think that
you can still use lazy evaluation to fill in the _|_s: emit a "jump
PATCHME" dummy instruction and build a stack (e.g. of type [Address])
of target addresses. The stack takes the role of the map you currently
use; I think a stack is enough. The addresses therein may well be _|_.
After all instructions have been written into the buffer, all the _|_s
are resolved and you can patch the "PATCHME"s in a second pass over the
buffer.
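
As a pure toy model of that second pass (PatchMe and the Instr type are
my invention; the real thing would walk a raw buffer, not a list):

```haskell
-- Toy instruction set: PatchMe is the dummy written in the first pass.
data Instr = Bark | Bite | Jump Int | PatchMe
  deriving (Eq, Show)

-- Second pass: walk the buffer once, replacing each PatchMe by a jump
-- to the next address popped from the stack.  By the time this runs,
-- all the (formerly _|_) addresses have been determined.
patch :: [Instr] -> [Int] -> [Instr]
patch (PatchMe : is) (a : as) = Jump a : patch is as
patch (i : is) as             = i : patch is as
patch []       _              = []
```

For example, patch [Bark, PatchMe, Bite] [2] yields [Bark, Jump 2, Bite].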

However, because of omnipresent lazy evaluation, it is unclear whether
"directly" writing to a buffer instead of first building a monoid does
give a speed/space advantage. So, your current approach may well be
slower than a naïve "first (difference) list, then raw memory" approach.

> We could omit the map, and just remember where to patch the code, but
> then we'd need to call explicitly some function after code generation
> that does the patching. We had implemented this, but found the current
> solution easier to use, since backpatching is completely automatic and
> hidden from the user.

Huh, I thought that runCodeGen could do an additional backpatch pass
just as transparently?

>> I also think that having liftIO in the CodeGen-monad is plain wrong. I
>> mean, CodeGen is a monad that generates code without any execution
>> taking place. The execution part is already handled by runCodeGen.
>> Having liftIO means that arbitrary Haskell programs can be intertwined
>> with assembly generation and I doubt that you want that.
> 
> Feel free to doubt, but this is exactly what we want. :-)
> 
> Also, note that runCodeGen runs the code _generation_, executing the
> generated code is done _within_ the CodeGen monad via the functions
> generated by callDecl (or the predefined functions in the Harpy.Call
> module). This is even more intertwined, but intentional.

Huh? That means that code gets executed during its own generation? But
why do you mix separate concerns? I don't see what use this is besides
being an opportunity to mess up.

> Of course, again a different design is possible, making runCodeGen
> return a binary code object, that can be called from the IO monad. But
> then, the user has to care about releasing code buffers, and not to have
> unevaluated closures having code pointers to already released run-time
> generated code.

Huh, why does the user have to care? Shouldn't wrapping the raw memory
block into a Foreign.ForeignPtr do the job?
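
Something along these lines (a sketch, not Harpy's actual API; making
the memory executable, e.g. via mprotect, is glossed over):

```haskell
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Ptr (Ptr)

-- Allocate a code buffer whose lifetime is managed by the garbage
-- collector: once no closure reaches the ForeignPtr any more, the
-- memory is freed automatically, so the user never frees it by hand.
newCodeBuffer :: Int -> IO (ForeignPtr Word8)
newCodeBuffer = mallocForeignPtrBytes

-- Call the generated code (or otherwise touch the raw pointer) while
-- keeping the buffer alive for the duration of the action.
withCodeBuffer :: ForeignPtr Word8 -> (Ptr Word8 -> IO a) -> IO a
withCodeBuffer = withForeignPtr
```

As long as every closure that captures a code pointer also captures the
ForeignPtr, the "already released" problem cannot arise.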

Regards,
apfelmus


