[Haskell-cafe] ANN: TernaryTrees-0.1.1.1 - An efficient ternary
tree implementation of Sets and Maps
wren ng thornton
wren at freegeek.org
Thu Jul 2 19:38:18 EDT 2009
Alex Mason wrote:
> TernaryTrees is a package that extends Data.Set ad Data.Map with some
> ternary tree structures, based on the article
> [http://www.pcplus.co.uk/node/3074/] .
For the string (or rather ByteString) version:
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/bytestring-trie
Which has a number of other significant performance improvements (e.g.
node fusion, ByteString instead of String) and a highly expressive
interface. Because it uses ByteStrings it can trie any type which can be
serialized into a vector of bits[1], albeit indirectly.
The real trick with tries is not in just having them[2], it's in having
the right interface to make use of what they're good at. For example, if
I have multiple tries, I'd like to merge them without doing it element
by element[3]. Or if I know I'm going to be making a number of similar
queries, it'd be nice if I could cache my position in the trie[4] to
avoid repeating the work for the prefixes of all my queries[5]. Using
tricks like these leads to significant improvements over using them like
hashtables; tries aren't hashtables just like lists aren't arrays.
[1] Where the ByteString encodings are equal iff the values are equal.
Ideally, the encoding should ensure that for any given set of queries,
the discriminative bits are closest to the front. You can also use
lossless compression algorithms (e.g. Golomb codes) to shrink the size
of the keys if you happen to know their distribution; and if your
queries are already compressed, you don't need to decompress them to do
lookup!
[2] And they seem to be all the rage for people reinventing them these
days...
[3] Supported by Data.Trie.mergeBy for the general case, with unionL and
unionR for common use cases.
[4] Supported by Data.Trie.lookupBy
[5] In the general case, this is trie intersection. You can hack this
together with the bytestring-trie-0.1.4 interface, but an efficient
version is on my todo list. A special case of intersection is provided
by Data.Trie.submap.
--
Live well,
~wren
More information about the Haskell-Cafe
mailing list