[jhc] Re: Ids in Jhc.
Lemmih
lemmih at gmail.com
Wed Feb 13 19:55:43 EST 2008
On Feb 14, 2008 1:49 AM, Lemmih <lemmih at gmail.com> wrote:
>
> On Feb 14, 2008 1:39 AM, Lemmih <lemmih at gmail.com> wrote:
> >
> > On Feb 14, 2008 1:23 AM, Lemmih <lemmih at gmail.com> wrote:
> > >
> > > On Feb 14, 2008 12:50 AM, John Meacham <john at repetae.net> wrote:
> > > > (I'm gonna forward this response to the jhc list too, if that's okay)
> > > >
> > > >
> > > > On Thu, Feb 14, 2008 at 12:26:49AM +0100, Lemmih wrote:
> > > > > I've been wondering about your use of Ids in Jhc.
> > > > > As far as I can see, the rules for Ids goes as following:
> > > > > * named ids are odd
> > > > > * unnamed ids are even
> > > > > * id '0' has special significance.
> > > > >
> > > > > Can you tell me a bit about how unnamed ids are used, why id '0' has
> > > > > special meaning, and how you generate new, unique unnamed ids.
> > > >
> > > > Yes. that is correct.
> > > >
> > > > named and unnamed ids behave the same way, except named ids get a pretty
> > > > name when printed out as opposed to just a numebr and are generally
> > > > derived from a user supplied name. Also, all 'top level' names are
> > > > always 'named' to ensure they remain globally unique.
> > > >
> > > > named Ids are odd, these are looked up in a table that is implemented
> > > > via a hash table written in c in the StringTable/ directory. due to an
> > > > intentional quirk of its implementation, it only generates odd, positive
> > > > values. Conversion between a number and its string form is constant
> > > > time in both directions.
> > > >
> > > > zero is a 'special' id in that it can be treated normally except it can
> > > > never appear in an expression, only in a binder. so \ v0 -> ... is okay,
> > > > but \ x -> .. v0 ... is not. This means you can always use 0 as a binder
> > > > when you know a value is not used without worrying about shadowing a
> > > > real definition.
> > > >
> > > >
> > > > unnamed ids are arbitrary positive even numbers, there is no particular
> > > > need to allocate them in a special way, they are _not_ globally unique
> > > > so you just need to ensure you don't shadow any existing variables when
> > > > assigning new ones. in general, this consists of having a set of 'in
> > > > scope' names (which is often something you need to keep track of anyway)
> > > > and you choose some value that is not in the 'in scope' set and add it
> > > > to the set. this ensures ids don't shadow each other, but not that they
> > > > are unique.
> > > >
> > > >
> > > > negative ids are used inside of a few specialized routines internally
> > > > and should never make it out into the world. for instance, in the
> > > > unification algorithm they are used to represent alpha-renamed values so
> > > > we don't need to scan for used names before hand, however the terms
> > > > returned have their original ids restored.
> > > >
> > > > hope this helps...
> > > >
> > > > It is quite bothersome to me that 'Id' is just an Int and not a
> > > > newtype... that is something I have been slowing getting the code ready
> > > > to fix for a while now..
> > >
> > > I'd like to fix it in another way, if you have no objects.
> > >
> > > By my count there are 120 unique atoms in base-1.0.hl. Those 120 atoms
> > > are saved ~2344 times each (there's a total of 281338 atoms).
> >
> > Oh, I should mention that I'm counting TVr's. Those a apparently
> > mostly module names.
>
> A count of atoms that are forced shows a different result: 258155
> saved atoms, 7390 unique atoms. Not as bad as the TVr count but still
> not good.
The worst case scenario is: 990666 saved atoms, 9980 unique atoms.
--
Cheers,
Lemmih
More information about the jhc
mailing list