Wired-in data-constructors with UNPACKed fields

Mon Aug 18 22:01:17 UTC 2014

I see three alternatives.

1. Flatten out the BigNat thing.  You give good reasons why this would be bad.

2. Take care to build a DCR that really does match the one you get when you compile the source module that declares the data type.  In principle, the representation does indeed depend on dynflags, so you need to know the flags with which the source module will be compiled. And that's reasonable: if we generate code for an unpacked constructor, GHC's wired-in knowledge must reflect that, and vice versa.  But you can probably write the code in such a way as to be mostly independent of the flags (e.g. use an explicit UNPACK rather than rely on -funbox-strict-fields), or assume that some things won't happen (e.g. the source module will not be compiled with -fomit-interface-pragmas).  See MkId.mkDataConRep.

3. Stop having Integer as a wired-in type.  For the most part it doesn't need to be; you won't see any mentions of 'integerTy' or 'integerTyCon' scattered about the compiler.  I believe that the sole use is in CorePrep.cvtLitInteger, which wants to use the data constructor for small integers.

What is odd here is that for non-small integers we are careful to look up mkInteger in the environment (precisely so that it is not wired in).  Then we stash it in the CorePrepEnv, and pass it to cvtLitInteger.  What I don't understand is why we don't do exactly the same thing for S#, the data constructor for small integers.   (Add a new field to CorePrepEnv for the S# data constructor.)

If we did that, then the Integer type and its data constructor would become "known-key" things, rather than "wired-in" things; and the former are MUCH easier to handle.
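
To make the idea concrete, here is a rough sketch as a self-contained toy model.  It is NOT GHC code: Name, Id, DataCon, TypeEnv, Expr and the field names cpeMkIntegerId and cpeIntegerSDataCon are all stand-ins invented for illustration, the "fits in an Int" test uses the host's Int rather than the target's, and the real argument encoding for mkInteger is elided.  The only point is that both S# and mkInteger are found by name in the environment and stashed in the CorePrepEnv, so neither needs to be wired in.

```haskell
-- Toy model only: these types mimic the shape of GHC's, they are not GHC's own.
newtype Name    = Name String  deriving (Eq, Show)
newtype Id      = Id Name      deriving (Eq, Show)
newtype DataCon = DataCon Name deriving (Eq, Show)

data TyThing = AnId Id | ADataCon DataCon deriving (Eq, Show)

-- Stand-in for the environment populated from interface files.
type TypeEnv = [(Name, TyThing)]

-- "Known-key" names: the compiler knows the name, not the definition.
mkIntegerName, integerSDataConName :: Name
mkIntegerName       = Name "mkInteger"
integerSDataConName = Name "S#"

data CorePrepEnv = CPE
  { cpeMkIntegerId     :: Id       -- looked up in the environment
  , cpeIntegerSDataCon :: DataCon  -- the proposed extra field (hypothetical name)
  } deriving (Eq, Show)

lookupId :: TypeEnv -> Name -> Either String Id
lookupId env n = case lookup n env of
  Just (AnId i) -> Right i
  _             -> Left ("no Id named " ++ show n)

lookupDataCon :: TypeEnv -> Name -> Either String DataCon
lookupDataCon env n = case lookup n env of
  Just (ADataCon dc) -> Right dc
  _                  -> Left ("no data constructor named " ++ show n)

-- Both entities are found the same way; nothing is hard-coded.
mkCorePrepEnv :: TypeEnv -> Either String CorePrepEnv
mkCorePrepEnv env =
  CPE <$> lookupId env mkIntegerName
      <*> lookupDataCon env integerSDataConName

data Expr
  = Var Id
  | Lit Integer
  | ConApp DataCon [Expr]
  | App Expr [Expr]
  deriving (Eq, Show)

-- Simplification: uses the host Int range.
fitsInInt :: Integer -> Bool
fitsInInt i = i >= toInteger (minBound :: Int)
           && i <= toInteger (maxBound :: Int)

-- Small literals use the constructor from the environment;
-- large ones call mkInteger (argument encoding elided here).
cvtLitInteger :: CorePrepEnv -> Integer -> Expr
cvtLitInteger env i
  | fitsInInt i = ConApp (cpeIntegerSDataCon env) [Lit i]
  | otherwise   = App (Var (cpeMkIntegerId env)) [Lit i]
```

In this model, a missing S# shows up as a lookup failure when the environment is built, rather than as a mismatch between wired-in knowledge and the compiled source module.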

My recommendation would be to try (3) first.  Ian Lynagh (cc'd) may be able to comment about why the inconsistency above arose in the first place, and why we can't simply fix it.

Simon

| -----Original Message-----
| From: Herbert Valerio Riedel [mailto:hvriedel at gmail.com]
| Sent: 18 August 2014 08:56
| To: Simon Peyton Jones
| Subject: Re: Wired-in data-constructors with UNPACKed fields
| 
| Hello Simon,
| 
| On 2014-08-17 at 23:56:32 +0200, Simon Peyton Jones wrote:
| > You'll see that 'pcDataCon' in TysWiredIn ultimately calls
| > pcDataConWithFixity'.  And that builds a data constructor with a
| > NoDataConRep field, comment "Wired-in types are too simple to need
| > wrappers".
| >
| > But your wired-in type is NOT too simple to need a wrapper!  You'll
| > need to build a suitable DCR record (see DataCon.lhs), which will be
| > something of a nuisance for you, although you can doubtless re-use
| > utility functions that are currently used to build a DCR record.
| 
| Wouldn't I need access to the current dynamic flags in order to be
| able to construct the effective DCR record? If so, I'm not sure I can
| access the dynflags while constructing a CAF (which I seem to need to do).
| 
| > Alternatively, just put a ByteArray# as the argument of JP# and JN#.
| > After all, you have Int# as the argument of SI#!
| 
| Well, there's a big difference between the Int# use and the BigNat use:
| 
| Int and Int# are isomorphic to each other. However, BigNat is a subset
| of what can be represented in a ByteArray#.
| 
| Also, BigNat is meant to be available as an abstract data type in its
| own right, for users to use as a building block for other data types
| (e.g. a more efficient multi-constructor rational type in the style
| of 'Integer' or also an optimized 'Either Word BigNat'-isomorphic
| 'Natural' type I've got queued up for when integer-gmp2 is done).  For
| instance, there's a function for creating a BigNat out of a ByteArray#
| which makes sure all internal invariants are satisfied.
| 
| 
| However, should the task of wiring in BigNat turn out to be more pain than
| bearable: Since we now have explicitly bidirectional pattern synonyms, I
| have been considering expressing the user-facing low-level interface to
| the 'Integer' type via such pattern synonyms (and hide the "real"
| 'data Integer = SI# Int# | Jp# ByteArray# | ..' type deeper, or maybe
| not even export it at all).
| 
| From a practical point of view, I'd like to get to a situation where code
| that needs access to the "medium-level" Integer representation (like some
| of Edward Kmett's packages, or some of the crypto packages using
| 'Integer's to perform RSA calculations) doesn't need to know whether it's
| using integer-simple, integer-gmp2, or integer-xyz, as they'd all provide
| the same abstracted API.
| 
| 
| [...]
| 
| > |   data Integer  = SI#                Int#
| > |                 | Jp# {-# UNPACK #-} !BigNat
| > |                 | Jn# {-# UNPACK #-} !BigNat
| > |
| > |   data BigNat = BN# ByteArray#
| 
| Cheers,
|   hvr