[Haskell-cafe] Re: I just don't get it (data structures and OO)

apfelmus apfelmus at quantentunnel.de
Sun Jun 3 05:01:22 EDT 2007


Phlex wrote:
> Donald Bruce Stewart wrote:
>>
>> Imagine updating a node in a tree by just detaching and reattaching a
>> pointer.
>>
>>     [1]                         [1]
>>     / \                         / \
>>   [2] [3]     update node 5   [2] [3]
>>       / \     with value  7       / \
>>     [4] [5]                     [4]  *
>>
>> and share the rest of the structure. Since the rest isn't mutable
>> anyway, you can share all over.
> 
> That's precisely the thing i don't understand.
> In order to update node 3 with a new pointer, i need to mutate it, so i
> need to recreate it, and so on up to node 1.

Yes, that's correct, I think Dons shared a bit too much here :)

    [1]                               [1']
    / \                               /  \
  [2] [3]      update node 5        [2]  [3']
      / \                                /  \
    [4] [5]                            [4]  [5']

You have to recreate all nodes that point directly or indirectly to 5,
but you can share all the other nodes like 2 and 4 that have no forward
pointers to 5.

Note that creating new nodes is dead simple, there's no effort involved
on the programmer's part. Here's an example that rotates the top of a
binary tree:

  data Tree a = Leaf a | Node (Tree a) (Tree a)

  rotateRight :: Tree a -> Tree a
  rotateRight (Node (Node a b) c) = Node a (Node b c)

The top two nodes are recreated with the constructor Node but a,b and c
are shared.

> Now in this example, it's ok since that's a regular tree and the process
> can be automated, but when each node has a different type, it can become
> quite hairy.

Yes and no. The point is that if you can't automate it, you have to code
it by hand anyway which constitutes most of the hairiness. But I know
what you mean and there's a nice way to do that with multi-parameter
type classes.

Let's assume a data structure

  data Universe = Universe [Galaxy]
  data Galaxy   = Galaxy   [Planet]
  data Planet   = Planet   { name :: String, size :: Double }

The insight is that in order to reference a planet inside the universe
structure, you have to know the path from the top. In other words, you
have to know that's it's the 2nd planet from the 4th galaxy before you
look up its name. If you don't throw that information away, you can use
it to update it as well. In effect, the Universe behaves like a finite
map with composite keys.

  {-# OPTIONS_GHC -fglasgow-exts -#}
  import Prelude hiding (lookup)

  class Map map key a | map key -> a where
      lookup ::             key -> map -> Maybe a
      adjust :: (a -> a) -> key -> map -> map

  instance Map [a] Int a where
      lookup 0 [x]    = Just x
      lookup 0 _      = Nothing
      lookup k (x:xs) = lookup (k-1) xs
      lookup _ _      = Nothing

      adjust f 0 [x]    = [f x]
      adjust _ 0 xs     = error "Index out of range"
      adjust f k (x:xs) = x : adjust f (k-1) xs
      adjust f _ xs     = error "Index out of range"

  instance Map Universe Int Galaxy where
      lookup k (Universe gs)   = lookup k gs
      adjust f k (Universe gs) = Universe (adjust f k gs)

  instance Map Galaxy Int Planet where
      lookup k (Galaxy ps)   = lookup k ps
      adjust f k (Galaxy ps) = Galaxy (adjust f k ps)

  instance (Map m k m', Map m' k' a) => Map m (k,k') a where
      lookup (k,k') m   = lookup k m >>= lookup k'
      adjust f (k,k') m = adjust (adjust f k') k m


You can lookup the 2nd planet in the 4th galaxy with

  lookup (4,2) universe :: Maybe Planet

and you can update it via

  adjust (\planet -> planet {name = "Earth"}) (4,2) universe

Thanks to type-classes and overloading, you can still access single
galaxies with

  lookup 4 universe :: Maybe Galaxy



Regards,
apfelmus



More information about the Haskell-Cafe mailing list