clojure's data structures (was: Re: [Haskell-cafe] implementing python-style dictionary in Haskell)

Evan Laforge qdunkan at gmail.com
Tue Nov 18 16:34:41 EST 2008


This actually brings up something I was wondering about lately.  I
recently stumbled across a language called clojure, which is a
lisp-like that runs on the JVM.  The interesting thing is that
mutations go through a transactional mutable reference system, and the
other datastructures are all immutable.  The author implements an
immutable hash map with a trie on each 5 bits of the hash, so each
node potentially has 32 children.  This means that lookups are
effectively O(1) and alterations have to copy a maximum of 7 chunks of
32 pointers.  Extendable "vectors" are implemented similarly.

The hash tables sound basically like an IntMap on the hash code,
except as far as I know, IntMap's patricia tree splits up the bit
space adaptively instead of being hardcoded on 5 bit segments.  I'm
not sure what the performance difference would be.  I haven't run any
tests, but the clojure author claims his vector and hash map are quite
fast.  I suppose the extendable vector corresponds to the current crop
of ByteString-like variants, so we sort of already have that, though
in a kind of chaotic and inchoate way.

I'm also curious about how far the generalized trie SoC project got...
 it should be more accessible now that we have a mainline release with
ATs, right?


More information about the Haskell-Cafe mailing list