adding isWHNF primop to 5.00.2 native code generator
Julian Seward (Intl Vendor)
v-julsew@microsoft.com
Wed, 1 Aug 2001 03:09:49 -0700
| Sigbjorn gave me this suggestion, which is based on DataToTagOp.
|
| \begin{code}
| primCode [res] IsHNF [arg]
|    = let res'        = amodeToStix res
|          arg'        = amodeToStix arg
|          arg_info    = StInd PtrRep arg'
|          word_32     = StInd WordRep (StIndex PtrRep arg_info (StInt (-1)))
|          masked_le32 = StPrim SrlOp [word_32, StInt 16]
|          masked_be32 = StPrim AndOp [word_32, StInt 65535]
| #ifdef WORDS_BIGENDIAN
|          ty_info     = masked_le32
| #else
|          ty_info     = masked_be32
| #endif
|          not_a_thunk = StPrim IntEqOp [ StPrim AndOp [ty_info, StInt 0x10]
|                                       , StInt 0x0
|                                       ]
|          -- ToDo: don't hardwire the value of _THUNK from InfoTables.h
|          assign      = StAssign IntRep res' not_a_thunk
|      in
|      returnUs (\ xs -> assign : xs)
| \end{code}
|
|
| I get different results with the version using the C macro
| and the native code version. In particular I get the wrong
| (unexpected) result from the native code version.
|
| The C macro version works like this (uses these macros):
|
| #define closureFlags(c) (closure_flags[get_itbl(c)->type])
| -- in ghc/includes/InfoTables.h
|
| #define closure_THUNK(c) ( closureFlags(c) & _THU)
| -- in ghc/includes/InfoTables.h
|
| #define get_itbl(c) (INFO_PTR_TO_STRUCT((c)->header.info))
| -- in ghc/includes/ClosureMacros.h
|
| #define INFO_PTR_TO_STRUCT(info) ((StgInfoTable *)(info) - 1)
| -- in ghc/includes/ClosureMacros.
|
| The (-1) scares me a little bit, in INFO_PTR_TO_STRUCT.
|
| The array closure_flags[] is just a look up table, indexed by
| the closure type. So the key part of the code above is:
|
| get_itbl(c)->type
|
| In the native code version above, the lines below act like
| get_itbl(c):
|
| arg_info = StInd PtrRep arg'
| word_32 = StInd WordRep (StIndex PtrRep arg_info (StInt (-1)))
|
| After this I am lost. It seems to be grabbing the top or
| bottom 16 bits of word_32 (depending on endianness) and then
| ANDing those bits with 0x10 (which is the bit mask for _THU),
| and checking for 0.
|
| My feeling is that I should be finding out the "type" field
| from the closure and then switching on its value.
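
To make the 16-bit extraction described above concrete, here is a
minimal C sketch (not taken from the GHC sources). It assumes that the
32-bit word just before the info pointer packs two 16-bit fields, and
shows why picking out a given field needs a mask on one endianness and
a shift on the other. The names TwoHalves, load_word, first_half,
second_half and check_halves are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for two adjacent 16-bit info-table fields.
   The field names and their order are assumptions, not the real layout. */
typedef struct {
    uint16_t first;   /* halfword at the lower address  */
    uint16_t second;  /* halfword at the higher address */
} TwoHalves;

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void) {
    uint16_t probe = 1;
    uint8_t lowest_byte;
    memcpy(&lowest_byte, &probe, 1);
    return lowest_byte == 1;
}

/* Load both halfwords as one 32-bit word, as word_32 does in the Stix. */
static uint32_t load_word(const TwoHalves *h) {
    uint32_t w;
    memcpy(&w, h, sizeof w);
    return w;
}

/* Halfword at the lower address: low bits on little-endian,
   high bits on big-endian. */
static uint16_t first_half(uint32_t w) {
    return host_is_little_endian() ? (uint16_t)(w & 0xFFFF)
                                   : (uint16_t)(w >> 16);
}

/* Halfword at the higher address: the opposite choice. */
static uint16_t second_half(uint32_t w) {
    return host_is_little_endian() ? (uint16_t)(w >> 16)
                                   : (uint16_t)(w & 0xFFFF);
}

/* Self-check: both fields come back unchanged on either endianness. */
static void check_halves(void) {
    TwoHalves h = { 0x0012, 0x0034 };
    uint32_t w = load_word(&h);
    assert(first_half(w) == 0x0012);
    assert(second_half(w) == 0x0034);
}
```

Which of the two halves actually holds the closure type depends on the
info table layout of the build in question, so the choice between
masked_le32 and masked_be32 is worth checking against the real struct
rather than against this sketch.
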
What you have started with looks plausible. If you can extract
enough info from the NCG, you can usually find the problem
without too much difficulty.

I suggest you make friends with -ddump-stix, so that you can
see directly the result of your extension to the primCode fn
above.

Then (if you are mad enough) the usual way to go about this is:

1. Write a plausible primCode case -- as you've already done.

2. Compile with -ddump-stix and see if it looks plausible
   in context.

3. Read the assembly code (-ddump-asm) and relate it to the
   Stix.

4. Try and relate (3) to the assembly code for the same
   when compiled with -fvia-C. This usually helps.

(3) and (4) are a lot easier if you do them on x86 than on sparc,
because x86 code is a lot less verbose. You do need to be
clear about x86 addressing modes, tho.

| I don't understand the native code generator well enough. I
| am assuming that closures are laid out the same way as via-C,

Yes. Code from the NCG and -fvia-C is 100% interoperable.

| in particular that there is going to be some stuff in the
| info table before the "type" field (this looks variable
| depending on how things are built). Basically I'm looking for
| a way to do the equivalent of "->type". Maybe there is a
| better way to do it.

Suggestion: find some other primop which uses the ->type field
and which is implemented in the NCG (not all are). That might
help.
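
For reference while hunting for such a primop, the macro chain quoted
earlier can be modelled in plain C. This is only a toy sketch under
stated assumptions: ToyInfoTable, ToyClosure, the TOY_* closure type
numbers and the contents of toy_closure_flags are all invented here,
and the real definitions live in the RTS headers. It shows that the
(-1) in INFO_PTR_TO_STRUCT steps back one whole info table from the
info pointer, and that the flags test indexes a table by the type
rather than masking the type itself.

```c
#include <assert.h>
#include <stdint.h>

/* Toy info table; the field layout is an assumption for illustration. */
typedef struct {
    uint32_t layout;   /* placeholder for whatever precedes the type  */
    uint16_t type;     /* closure type, used as an index into a table */
    uint16_t srt_len;  /* placeholder for the neighbouring halfword   */
} ToyInfoTable;

/* Toy closure: its info pointer points just past the info table. */
typedef struct {
    const void *info;
} ToyClosure;

#define TOY_THU 0x10   /* the bit mask quoted for _THU in the message */

/* Hypothetical closure type numbers, not the real ones. */
enum { TOY_THUNK = 2, TOY_OTHER = 16, TOY_N_TYPES = 32 };

/* Hypothetical flags table, indexed by closure type. */
static const uint8_t toy_closure_flags[TOY_N_TYPES] = {
    [TOY_THUNK] = TOY_THU,  /* marked as a thunk             */
    [TOY_OTHER] = 0x01,     /* some non-thunk kind of closure */
};

/* Same shape as the quoted macros, applied to the toy types. */
#define TOY_INFO_PTR_TO_STRUCT(p) ((const ToyInfoTable *)(p) - 1)
#define toy_get_itbl(c)           (TOY_INFO_PTR_TO_STRUCT((c)->info))

/* Build a closure whose info pointer sits one table past tbl. */
static ToyClosure make_closure(const ToyInfoTable *tbl) {
    ToyClosure c = { tbl + 1 };
    return c;
}

/* Equivalent of closure_THUNK(c): look the type up, then mask the flags. */
static int is_thunk_via_flags(const ToyClosure *c) {
    return toy_closure_flags[toy_get_itbl(c)->type] & TOY_THU;
}

/* What masking the raw type field with 0x10 computes instead. */
static int is_thunk_via_raw_type(const ToyClosure *c) {
    return toy_get_itbl(c)->type & TOY_THU;
}

/* Self-check: with these made-up numbers the two tests disagree. */
static void check_toy_model(void) {
    ToyInfoTable thunk_tbl = { 0, TOY_THUNK, 0 };
    ToyInfoTable other_tbl = { 0, TOY_OTHER, 0 };
    ToyClosure thunk = make_closure(&thunk_tbl);
    ToyClosure other = make_closure(&other_tbl);
    assert(is_thunk_via_flags(&thunk) != 0);
    assert(is_thunk_via_raw_type(&thunk) == 0);
    assert(is_thunk_via_flags(&other) == 0);
    assert(is_thunk_via_raw_type(&other) != 0);
}
```

With these invented type numbers the two tests give different answers.
Whether the same happens with the real closure type numbering is
something to check against the RTS headers; the sketch does not
establish it.
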
Finally let me point you at ghc/utils/debugNCG, a small but
very useful program for debugging the NCG. I would never have
got it working as well as it does without it. This should
work -- but it might have slight bitrot -- it's very fragile
and makes lots of assumptions about gcc's output. I haven't
used it for a good few months.
J