[Haskell-cafe] Re: Paths to tree

apfelmus at quantentunnel.de apfelmus at quantentunnel.de
Tue Jan 30 12:39:09 EST 2007


John Ky wrote:

> I can't know, but it doesn't seem unreasonable that you intend to use
>> the ArcForest as a trie, i.e. an efficient implementation of a set of
>> paths which allows to look up quickly whether a given path (here of type
>> [String]) is in the set or not. So, we have
> 
> For a while, I was thinking what on Earth are you talking about, even while
> I continued reading the rest of the email, but it eventually clicked what
> you where trying to show me - which was something I didn't dare try until I
> got more familiar with Haskell.
> 
> Your examples got me started on dealing with these sorts of complex tree
> structures (or tries as you call them).  They made more sense as I spent
> more time reading and rereading them.

:) I think that the important point is that one can think of the trees
you had as things where one can insert and lookup (path,value)-pairs.
This suggests a lot of useful functions like 'insert', 'union' and
'singleton' together with corresponding laws like

  insert k v m == union (singleton k v) m -- left biased union

that are very handy for implementation.

> Now what about 'MapString v', how do we get this? Well, your
>> implementation corresponds to the choice
>>
>>   type MapString v = [(String,v)]
>>
>> But in our case, we can apply the same trick again!
>
> [...]
>
> That's quite beautiful, but I don't actually need to go that far.
> Question though, does taking the approach to this conclusion
> actually have real applications?

Well, besides providing an actual implementation of finite maps, it is
also one of the fastest available. So while 'MapString v' and 'Data.Map
String v' have the same purpose, 'MapString v' will be faster.

But in your case, I wouldn't bother about this now, because if it turns
out that you need to change the trie data structure again, the effort
spend in optimization would be wasted. Moreover, changing from
'Data.Map' to 'MapString' or similar is very transparent and therefore
can be done later because you only rely on the functions like
'unionWith' that are provided by both.

Also, the trick that currently reduces the problem of a finite map for
the list [k] to the problem of a finite map for k can be extended to
decompose arbitrary types. To get a finite map for either one of the
keys k1 or k2, you can take a pair of finite maps for the keys

   Either k1 k2 -> v  ^=  (k1 -> v, k2 -> v)

Similarly, a finite map for pair of keys (k1,k2) can be encoded as a
composition of finite maps

   (k1,k2) -> v       ^=  k1 -> (k2 -> v)

The paper has more on this.


> Now, we can build up our finite map for paths:
>>
>>    data MapPath v = TriePath (Maybe v) (MapString (MapPath v))
>>
>> because it (maybe) contains a value for the key '[] :: Path' and it
>> (maybe) contains a map of paths that is organized by their first String
>> element.
> 
> In my own code I had to diverge from your definition because for my needs,
> every node needed to contain a value (even if it was a default value).  I
> plan to later add other numerical values to every node so that I can
> traverse them and do calculations that feed up and trickle down the tree.

> type Path k = [k]
>
> data Trie k v = Trie v (Map k (Trie k v)) deriving Show

That's fine, adapt them recklessly to your task :)

> I did try to write my own insertWithInit called by fromPath (below),
> which I couldn't get working.  Branches went missing from the result.
> I had so much rouble figuring where in the function I forgot to do
> something.

I don't know an easy way to implement 'insertWithInit' that works with
default elements. The problem is that one has to create the default
nodes when inserting

                                            u
  insertWithInit v0 f ["a","b","c"] x $   /   \
                                        "a"   "b"
                                         v     w
 ==>
       u
     /   \
    "a"  "b"
     v    w
    /
   "b"
    v0
   /
  "c"
   x

while still guaranteeing that f only acts on the inserted value x. This
somehow breaks the intuition of inserting a single (key,value)-pair. If
you dispense with the 'With' part, you can outsource the creation of the
default nodes to 'fromPath' and employ 'union' to implement 'insertInit':

    insertInit :: (Ord k) =>
                   v -> Path k -> v -> Trie k v -> Trie k v
    insertInit vInit path v m =
        union m (fromPath vInit v path)

In fact, that's what you did for fromList'.


If there is a globally known default element, you also have the option
to actually stick with (Maybe v). For example, if you do calculations
with 'Int', you can do

   vdefault = 5
   withDefault :: Maybe Int -> Int
   withDefault Nothing   = vdefault
   withDefault (Maybe x) = x

   instance Num (Maybe Int) where
       x + y = Just $ withDefault x + withDefault y
       ...

One could also do with 'Trie k v = Trie (Either v) ...' but i don't
think that it's really worth it.


> At this point my head was about to explode, so I took a different approach
> using union called by fromList' (also below), which from my limited testing
> appears to work.  I also find the union function incredibly easy to
> understand. I only hope I got it right.

> union :: (Ord k) => Trie k v -> Trie k v -> Trie k v
> union (Trie k0 v0) (Trie k1 v1) = Trie k0 v
>  where
>    v = Map.unionWith union v0 v1

Well, once you found such a really concise function, it can only be
correct :)

While it is not relevant in your case, note that 'union' can be extended
to be applicable with the recursive trick. But the extension suggest
itself by noting that you used 'Map.unionWith' instead of 'Map.union'.
So, you could do

    unionWith :: (Ord k) => (v -> v -> v)
              -> Trie k v -> Trie k v -> Trie k v
    unionWith f (Trie k0 v0) (Trie k1 v1) = Trie (f k0 k1) v
        where
        v = Map.unionWith (unionWith f) v0 v1

    union = unionWith (\_ y -> y)



Regards,
apfelmus



More information about the Haskell-Cafe mailing list