Storage layout of integral types

Tue Jan 19 16:34:03 UTC 2021

Hi all,

I'm wondering what the intended storage layout of integral types is, in
particular for integral types that are smaller than a word.  For example, on a
64-bit machine, is a 32-bit integer in the payload of a closure supposed to be
written as a whole word, and therefore as 64 bits, or just as 32 bits?

I'm asking because, since commit be5d74ca, I see differently aligned integers
in the payload of a closure on a 64-bit big-endian machine.  For example, the
following code creates an Int32 object that contains the actual integer in the
high part of the payload (the snippet comes from the add operator
GHC.Int.$fNumInt32_$c+_entry):

    Hp = Hp + 16;
    ...
    I64[Hp - 8] = GHC.Int.I32#_con_info;
    I32[Hp] = _scz7::I32;

whereas, for example, the function rts_getInt32 assumes the opposite and
expects the actual integer in the low part of the payload:

    HsInt32
    rts_getInt32 (HaskellObj p)
    {
        // See comment above:
        // ASSERT(p->header.info == I32zh_con_info ||
        //        p->header.info == I32zh_static_info);
        return (HsInt32)(HsInt)(UNTAG_CLOSURE(p)->payload[0]);
    }

The same seems to be the case for the interpreter and foreign calls (case
bci_CCALL) where integral arguments are passed in the low part of a whole word.

Currently, my intuition is that the payload of a closure for an integral type
smaller than WordSize is written as a whole word, where the subword is aligned
according to the machine's endianness.  Can someone confirm this?  If that is
indeed true, then rts_getInt32 seems to be correct and the generated code
above is not.  Otherwise the converse seems to be the case.
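
To make the mismatch concrete, here is a small standalone C sketch.  It is not
RTS code: the function names (store_subword, store_whole_word,
read_word_truncate, layout_mismatch) are made up for illustration, and it
assumes a 64-bit payload slot.  store_subword mimics the 32-bit store
`I32[Hp] = x`, and read_word_truncate mimics what rts_getInt32 does with
payload[0].

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: mimics `I32[Hp] = x` by writing only 4 bytes at the
 * lowest address of an otherwise zeroed 64-bit payload slot. */
uint64_t store_subword(int32_t x)
{
    uint64_t slot = 0;
    memcpy(&slot, &x, sizeof x);
    return slot;
}

/* Hypothetical helper: mimics a whole-word store, i.e. the value is
 * sign-extended to 64 bits and the full slot is written. */
uint64_t store_whole_word(int32_t x)
{
    return (uint64_t)(int64_t)x;
}

/* Mimics rts_getInt32: read the whole word, then truncate to 32 bits. */
int32_t read_word_truncate(uint64_t slot)
{
    return (int32_t)(int64_t)slot;
}

/* Returns 1 if the first byte in memory is the least significant one. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Returns 1 if a subword store followed by a whole-word read loses x. */
int layout_mismatch(int32_t x)
{
    return read_word_truncate(store_subword(x)) != x;
}
```

On a little-endian host the 4 bytes written by store_subword are also the low
half of the word, so both conventions agree.  On a big-endian host they are
the high half, so read_word_truncate would return the (zero) low half instead
of the stored value, which matches what I observe.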

Cheers,
Stefan