Unique as special boxing type & hidden constructors

p.k.f.holzenspies at utwente.nl p.k.f.holzenspies at utwente.nl
Wed Aug 20 11:47:55 UTC 2014


Methinks a lot of the former performance considerations in Unique are out-dated (as per earlier discussion; direct use of unboxed ints etc.).


An upside of using an ADT for the types of uniques is that we don't actually need to reserve 8 bits for a Char (which is committing to neither the actual number of classes, nor the "nature" of real Chars in Haskell). Instead, we can make a bitmask dependent on the number of classes that we actually use and stick the tag on the least-significant side of the Unique, as opposed to the most-significant (as we do now).


We want to keep things working on 32-bits, but maybe a future of parallel builds is only for 64-bits. In this case, I would suggest that the 64-bit-case looks like this:


<thread_id_bits:8> <unique_id_bits:56-X> <tag_bits:X>


whereas the 32-bit case simply has


<unique_id_bits:32-X> <tag_bits:X>


Where X is dependent on the size of the UniqueClass-sum-type (to be introduced). This would be CPP-magic'd using ​WORD_SIZE_IN_BITS.


Ph.






________________________________
From: Simon Peyton Jones <simonpj at microsoft.com>
Sent: 20 August 2014 13:01
To: Holzenspies, P.K.F. (EWI); ghc-devs at haskell.org
Subject: RE: Unique as special boxing type & hidden constructors

Sounds like a good idea to me.   Would need to think about making sure that it all still worked, somehow, on 32 bit.

S

From: p.k.f.holzenspies at utwente.nl [mailto:p.k.f.holzenspies at utwente.nl]
Sent: 20 August 2014 11:31
To: Simon Peyton Jones; ghc-devs at haskell.org
Subject: RE: Unique as special boxing type & hidden constructors


Dear Simon, et al,



I seem to recall that the Unique(Supply) was an issue in parallelising GHC itself. There's a comment in the code (signed JSM) that there aren't any 64-bit bugs, if we have at least 32-bits for Ints and Chars fit in 8 characters. Then, there's bitmasks like 0x00FFFFFF to separate the "Int-part" from the "Char-part".



I was wondering; if we move Uniques to 64 bits, but use the top 16 (instead of the current 8) for *both* the tag (currently a Char, soon an sum-type) and the threadId of the supplying thread of a Unique, would that help?



Regards,

Philip









________________________________
From: Simon Peyton Jones <simonpj at microsoft.com<mailto:simonpj at microsoft.com>>
Sent: 18 August 2014 23:29
To: Holzenspies, P.K.F. (EWI); ghc-devs at haskell.org<mailto:ghc-devs at haskell.org>
Subject: RE: Unique as special boxing type & hidden constructors


1) There is a #ifdef define(__GLASGOW_HASKELL__), which confused me somewhat. Similar things occur elsewhere in the code. Isn't the assumption that GHC is being used? Is this old portability stuff that may be removed?

I think so, unless others yell to the contrary.


2) Uniques are produced from a Char and an Int. The function to build Uniques (mkUnique) is not exported, according to the comments, so as to see all characters used. Access to these different "classes" of Uniques is given through specialised mkXXXUnique functions. Does anyone have a problem with something like:



> data UniqueClass

>   = UniqDesugarer

>   | UniqAbsCFlattener

>   | UniqSimplStg

>   | UniqNativeCodeGen

>   ...

OK by me


3) Is there a reason for having functions implementing class-methods to be exported? In the case of Unique, there is pprUnique and:

> instance Outputable Unique where

>   ppr = pprUnique

Please don’t change this.  If you want to change how pretty-printing of uniques works, and want to find all the call sites of pprUnique, it’s FAR easier to grep for pprUnique than to search for  all calls of ppr, and work out which are at type Unique!

(In my view) it’s usually much better not to use type classes unless you actually need overloading.

Simon

From: p.k.f.holzenspies at utwente.nl<mailto:p.k.f.holzenspies at utwente.nl> [mailto:p.k.f.holzenspies at utwente.nl]
Sent: 18 August 2014 14:50
To: Simon Peyton Jones; ghc-devs at haskell.org<mailto:ghc-devs at haskell.org>
Subject: RE: Unique as special boxing type & hidden constructors


Dear Simon, et al,



Looking at Unique, there are a few more design choices that may be outdated, and since I'm polishing things now, anyway, I figured I could update it on more fronts.



1) There is a #ifdef define(__GLASGOW_HASKELL__), which confused me somewhat. Similar things occur elsewhere in the code. Isn't the assumption that GHC is being used? Is this old portability stuff that may be removed?



2) Uniques are produced from a Char and an Int. The function to build Uniques (mkUnique) is not exported, according to the comments, so as to see all characters used. Access to these different "classes" of Uniques is given through specialised mkXXXUnique functions. Does anyone have a problem with something like:



> data UniqueClass

>   = UniqDesugarer

>   | UniqAbsCFlattener

>   | UniqSimplStg

>   | UniqNativeCodeGen

>   ...



and a public (i.e. exported) function:



> mkUnique :: UniqueClass -> Int -> Unique



? The benefit of this would be to have more (to my taste) self-documenting code and a greater chance that documentation is updated (the list of "unique supply characters" in the comments is currently outdated).



3) Is there a reason for having functions implementing class-methods to be exported? In the case of Unique, there is pprUnique and:



> instance Outputable Unique where

>   ppr = pprUnique



Here pprUnique is exported and it is used in quite a few places where it's argument is unambiguously a Unique (so it's not to force the type) *and* "ppr" is used for all kinds of other types. I'm assuming this is an old choice making things marginally faster, but I would say cleaning up the API / namespace would now outweigh this margin.

​

I will also be adding Haddock-comments, so when this is done, a review would be most welcome (I'll also be doing some similar transformations to other long-since-untouched-code).



Regards,

Philip

















________________________________
Van: Simon Peyton Jones <simonpj at microsoft.com<mailto:simonpj at microsoft.com>>
Verzonden: maandag 18 augustus 2014 00:11
Aan: Holzenspies, P.K.F. (EWI); ghc-devs at haskell.org<mailto:ghc-devs at haskell.org>
Onderwerp: RE: Unique as special boxing type & hidden constructors

Re (1) I think this is a historical.  A newtype wrapping an Int should be fine.  I’d be ok with that change.

Re (2), I think your question is: why does module Unique export the data type Unique abstractly, rather than exporting both the data type and its constructor.  No deep reason here, but it guarantees that you can only *make* a unique from an Int by calling ‘mkUniqueGrimily’, which signals clearly that something fishy is going on.  And rightly so!

Simon

From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of p.k.f.holzenspies at utwente.nl<mailto:p.k.f.holzenspies at utwente.nl>
Sent: 15 August 2014 11:53
To: ghc-devs at haskell.org<mailto:ghc-devs at haskell.org>
Subject: Unique as special boxing type & hidden constructors


Dear all,



I'm working with Alan to instantiate everything for Data.Data, so that we can do better SYB-traversals (which should also help newcomers significantly to get into the GHC code base). Alan's looking at the AST types, I'm looking at the basic types in the compiler.



Right now, I'm looking at Unique and two questions come up:



> data Unique = MkUnique FastInt



1) As someone already commented: Is there a specific reason (other than history) that this isn't simply a newtype around an Int? If we're boxing anyway, we may as well use the default Int boxing and newtype-coerce to the specific purpose of Unique, no?



2) As a general question for GHC hacking style; what is the reason for hiding the constructors in the first place?



I understand about abstraction and there are reasons for hiding, but there's a "public GHC API" and then there are all these modules that people can import at their own peril. Nothing is guaranteed about their consistency from version to version of GHC. I don't really see the point about hiding constructors (getting in the way of automatically deriving things) and then giving extra functions like (in the case of Unique):



> getKeyFastInt (MkUnique x) = x

> mkUniqueGrimily x = MkUnique (iUnbox x)



I would propose to just make Unique a newtype for an Int and making the constructor visible.

Regards,

Philip


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20140820/6780e71d/attachment-0001.html>


More information about the ghc-devs mailing list