[jhc] Atoms, Infos and unique ids.

Thu Feb 21 19:34:32 EST 2008

On Fri, Feb 22, 2008 at 12:53 AM, John Meacham <john at repetae.net> wrote:
> On Fri, Feb 22, 2008 at 12:13:10AM +0100, Lemmih wrote:
>  > On Thu, Feb 21, 2008 at 11:23 PM, John Meacham <john at repetae.net> wrote:
>  > >  If you want crazy obfuscated code in pursuit of performance, look at
>  > >  ghc. at least I try to stay mostly haskell 98 (well, haskell-prime beta
>  > >  is actually what I target). avoiding things like explicit unboxed types
>  > >  and try to keep all strangeness encapsulated behind abstract types with
>  > >  well defined interfaces. ghc lobs around Int#'s like candy :)
>  >
>  > I wanna make a quick response to this segment because I actually feel
>  > slightly insulted. I'm trying to get rid of a global mutable state and
>  > some C code that segfaults if used incorrectly, and you're accusing me
>  > of obfuscating the code?
>
>  Oh, no, I didn't mean to imply that at all. I _really_ appreciate the
>  work you are putting in to making jhc faster. what I meant to say is
>  that I know parts of jhc are _already_ obfuscated, but that as far as
>  haskell compilers go, it's not as bad as it could be. :)

Ah, okay.

>  As in, I endeavour to make sure the APIs are clean and well founded. that
>  is what really matters when it comes to maintainability of code. The Id
>  implementation can be swapped out willy nilly once it is fully
>  abstracted, so worrying about how it actually is implemented seems
>  premature. In the end, once the APIs are stable, it is easy enough to
>  plug in a variety of implementations and just try em all out. That
>  should be true of any component of jhc if I did my design right.
>
>
>  > I'm proposing to associate names directly
>  > with variables instead of using a magic pointer. It would be the
>  > natural thing to do, completely valid Haskell98 code, and several
>  > times faster than the current approach.
>
>  Hmm... are you sure it would be faster? perhaps I don't fully understand
>  what you want to do, but Atoms were darn fast when I was benchmarking, I
>  could have broken them though. I mean, perhaps the speed benefit isn't
>  that useful for jhc... I use the same atom implementation in C projects
>  but I enjoy that in haskell-land I can hide the implementation behind a
>  newtype to make them fully safe. I heart haskell.

The problem is not atoms per se; it's generating ids from a global
store. When saving TVr's with associated names, the atom has to be
saved instead of just the id. Duplicating a few strings may not sound
too serious, but it does take its toll. The cost is relatively minor
(but still significant) when compiling base-1.0.hl (7% CPU, 8% memory
usage). However, it dominates when compiling smaller pieces of code
such as HelloWorld.
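
To illustrate why a bare id is not enough on disk, here is a minimal
sketch. It assumes a toy, purely functional atom table; AtomTable,
intern and internAll are hypothetical names for illustration and are
not jhc's actual Atom implementation. An id handed out by such a store
depends on the order in which strings were interned, so it only has
meaning relative to that one table.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for a global atom store (not jhc's real code).
-- Each distinct string is assigned the next free integer id.
data AtomTable = AtomTable
  { nextId :: Int
  , ids    :: Map.Map String Int
  }

emptyTable :: AtomTable
emptyTable = AtomTable 0 Map.empty

-- Intern a string: reuse its id if already known, otherwise allocate one.
intern :: String -> AtomTable -> (Int, AtomTable)
intern s t =
  case Map.lookup s (ids t) of
    Just i  -> (i, t)
    Nothing ->
      let i = nextId t
      in (i, AtomTable (i + 1) (Map.insert s i (ids t)))

-- Intern several names in order, returning their ids in the same order.
internAll :: [String] -> AtomTable -> ([Int], AtomTable)
internAll names t0 = foldl step ([], t0) names
  where
    step (acc, t) s =
      let (i, t') = intern s t
      in (acc ++ [i], t')
```

Interning ["x", "y"] gives "y" the id 1, while interning ["y", "x"]
in a fresh table gives "y" the id 0. A file written by one run
therefore cannot store just the integer; it has to carry the string
itself, which is where the duplication comes from.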

-- 
Cheers,
 Lemmih