[Haskell-cafe] Where is Data.Atom ?

Sun Jul 2 20:17:02 EDT 2006

G'day all.

Quoting Brian Hulley <brianh at metamilk.com>:

> So perhaps my original spec is impossible to implement, though it is an open
> question whether some very clever encoding (with corresponding
> implementation of <) could be found which would lead to a better average
> performance (whatever that means).

For what it's worth:

    - It's not such a dumb idea to separate equality testing from
      less-than testing.  A (Unique,ByteString) pair makes the former
      fast and the latter pretty fast when it's needed.

    - Arithmetic coding preserves lexicographic ordering.  If you have
      a good statistical model for what sort of strings might be used
      as atoms, it may be possible to construct a representation of atoms
      which fit in machine 32- or 64-bit word most of the time.  (This is
      similar to the way that GHC represents Integers; use a machine word
      when it fits.)

    - Hash table implementations rely on using a cheap test first
      (comparing hash codes), going to a full comparison only if the
      cheap test doesn't rule out what you're looking for.

> As an aside, if the monad was removed then the result of atom "a" < atom "b"
> (atom :: String -> Atom) could not be determined by analysis of the program.
> It would depend on the evaluation order chosen by the compiler, but in a
> sense this doesn't matter because whatever the result is, it would be the
> same at any future time during the same run of the program so the use of
> Atoms as keys would still be safe. But is this still "functional"?

Questions of this nature produce much disagreement.  Once upon a time,
program arguments and environment variables were passed to a Haskell
program using a similar mechanism.  I think we got rid of this because
it's not "functional" if you consider programs that fork.

Cheers,
Andrew Bromage