RFC: unsafeShrinkMutableByteArray#

Tue Jul 22 09:06:30 UTC 2014

On 13/07/14 14:15, Herbert Valerio Riedel wrote:
> On 2014-07-12 at 17:40:07 +0200, Simon Marlow wrote:
>> Yes, this will cause problems in some modes, namely -debug and -prof
>> that need to be able to scan the heap linearly.
>
> ...and I assume we don't want to fall back to a non-zerocopy mode for
> -debug & -prof in order to avoid distorting the profiling measurements
> either?

I suppose that would be doable.  Not ideal, but doable.  In profiling 
mode you could arrange for the extra allocation to be assigned to 
CCS_OVERHEAD, so that it gets counted as profiling overhead.  You'd 
still have the time overhead of the copy though.

>> Usually we invoke the
>> OVERWRITING_CLOSURE() macro which overwrites the original closure with
>> zero words, but this won't work in your case because you want to keep
>> the original contents.  So you'll need a version of
>> OVERWRITING_CLOSURE() that takes the size that you want to retain, and
>> doesn't overwrite that part of the closure.  This is probably a good
>> idea anyway, because it might save some work in other places where we
>> use OVERWRITING_CLOSURE().
>
> I'm not sure I follow. What's the purpose of overwriting the original
> closure payload with zeros while in debug/profile mode? (And on which
> occasions does it happen in a way that would be problematic for a
> MutableByteArray?)

Certain features of the RTS need to be able to scan the contents of the 
heap by linearly traversing the memory.  When there are gaps between 
heap objects, there needs to be a way to find the start of the next heap 
object, so currently when we overwrite an object with a smaller one we 
clear the payload with zeroes.  There are more efficient ways, such as 
overwriting with a special "gap" object, but since the times we need to 
do this are not performance critical, we haven't optimised it. 
Currently we need to do this

  * in debug mode, for heap sanity checking
  * in profiling mode, for biographical profiling

The macro that does this, OVERWRITING_CLOSURE(), currently overwrites the 
whole payload of the closure with zeroes, whereas you want to retain 
part of the closure, so you would need a different version of this macro.
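
As a rough illustration of the idea (a toy model only, not the RTS code: the
names toy_alloc, toy_shrink and toy_count_objects are hypothetical, and the
header here is just a non-zero size word rather than an info pointer), a
shrink that keeps a prefix of the payload and zeroes only the remainder
still leaves the heap linearly scannable:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy heap: every object is one header word followed by payload words.
 * ASSUMPTION (illustration only): the header stores payload_words + 1,
 * so a header is never zero and a zero word can mean "slop, skip it". */
typedef uintptr_t word;

/* Write an object with `payload_words` words at index `at`;
 * returns the index just past the object. */
static size_t toy_alloc(word *heap, size_t at, size_t payload_words, word fill)
{
    heap[at] = (word)payload_words + 1;
    for (size_t i = 0; i < payload_words; i++)
        heap[at + 1 + i] = fill;
    return at + 1 + payload_words;
}

/* Shrink in place: retain the first `keep` payload words and zero only
 * the words that are no longer part of the object. */
static void toy_shrink(word *heap, size_t at, size_t keep)
{
    size_t old = (size_t)heap[at] - 1;
    assert(keep <= old);
    for (size_t i = keep; i < old; i++)
        heap[at + 1 + i] = 0;          /* slop: must not look like a header */
    heap[at] = (word)keep + 1;
}

/* Linear scan over heap[0..end): jump over each object using its header,
 * and step over zero words one at a time to find the next header. */
static size_t toy_count_objects(const word *heap, size_t end)
{
    size_t p = 0, n = 0;
    while (p < end) {
        if (heap[p] == 0) { p++; continue; }
        n++;
        p += (size_t)heap[p];
    }
    return n;
}
```

If toy_shrink only updated the header and left the old payload words in
place, the scan would land on stale payload and treat it as a header, which
is the situation the zeroing is there to prevent.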

>> I am worried about sizeofMutableByteArray# though.  It wouldn't be
>> safe to call sizeofMutableByteArray# on the original array, just in
>> case it was evaluated after the shrink.  You could make things
>> slightly safer by having unsafeShrinkMutableByteArray# return the new
>> array, so that you have a safe way to call sizeofMutableByteArray#
>> after the shrink.  This still doesn't seem very satisfactory to me
>> though.
>
> ...as a somewhat drastic obvious measure, one could change the type-sig
> of sizeofMutableByteArray# to
>
>    ::  MutableByteArray# s -> State# s -> (# State# s, Int# #)
>
> and fwiw, I could find only one use-site of sizeofMutableByteArray#
> inside ghc.git, so I'm wondering if that primitive is used much anyway.

I think that would definitely be better, if it is possible without too 
much breakage.  Once we have operations that change the size of an 
array, the operation that reads the size should be stateful.

> btw, is it currently safe to call/evaluate sizeofMutableByteArray# on
> the original MBA after an unsafeFreezeByteArray# was performed?

Probably safe, but better to avoid doing it if you can.

> Otoh, if we are to thread a MutableByteArray# through the call anyway,
> can't we just combine shrinking and freezing in one primop (as suggested
> below)?

I don't think this makes anything easier.  You still need to overwrite 
the unused part of the array, and sizeofMutableByteArray# is still 
dangerous.

Cheers,
Simon

> [...]
>
>>> PS: maybe unsafeShrinkMutableByteArray# could unsafe-freeze the
>>>       ByteArray# while at it (thus be called something like
>>>       unsafeShrinkAndFreezeMutableByteArray#), as once I know the final
>>>       smaller size I would freeze it anyway right after shrinking.