[Haskell-cafe] unique identity and name shadowing during type inference

Sat Jun 20 13:57:11 EDT 2009

Hello,

I am designing a type inference algorithm for a language with
arbitrary function overloading.  For various reasons (beyond the scope
of this email), it's impossible to know the full type of an overloaded
function, so each function is assigned a unique primitive type and the
inference algorithm gradually learns more information about the
primitive.  For example, if we declare an identity function

    f x = x

the algorithm will create a primitive type F, and record f :: F.  If
we use the function a few times,

    f 1
    f "blah"

the algorithm will infer

    F Int = Int
    F String = String

My question is: what's the best way to represent these unique
primitive types in Haskell?  A new type primitive needs to be created
whenever we process a function declaration.  Nested function
declarations produce a different primitive each time the parent is
invoked with different argument types.  These separate primitives can
escape if local functions are returned, so the inference algorithm
must be able to keep them separate and learn more about them after
their parent function is forgotten.

Here are a few ways I know of:

1. Thread a uniqueness generator monad through the whole algorithm.
I'd prefer to avoid this extra plumbing if possible.
2. Label primitives with the full context of how they were created.
If function f declares a nested function g, and f is called with Int
and Char, the primitives for g would be labeled with "f Int" and "f
Char" to keep them separate.  This is similar to lambda lifting.
3. Scary hacks involving makeStableName and unsafePerformIO.  Some
sort of context would have to be thrown around here to make sure GHC
doesn't merge the different makeStableName calls.

Unfortunately, method (2) is complicated by the fact that variable
names are not unique even in the internal representation (I'm using
the trick from [1]), so I'm not sure what the minimal unique "context"
would be.

Does anyone know other methods outside of (1), (2), or (3), or clean
ways of structuring (2) or (3)?

Thanks!
Geoffrey

[1]: http://www.haskell.org/~simonmar/bib/ghcinliner02_abstract.html