[Haskell-cafe] Looking for advice on modelling a 3-way merge (with varying strategies)

Thu Nov 1 02:51:26 CET 2012

Hello,

I'm currently trying to implement a way 3 way merge, that is dispatched
on type. Essentially, I want to be able to say "I have this large data
type which is a product of various other types - here's how to merge
it".

By 3 way merge, I mean that I have a 'new' value, a 'current' value and
a 'common ancestor' value. So far I have:

  import Control.Applicative
  import Data.Monoid

  newtype Merge a = Merge { runMerge :: Either [String] a }

  instance Functor Merge where
    fmap f a = Merge $ fmap f $ runMerge a

  instance Applicative Merge where
    pure = Merge . Right
    (Merge (Right f)) <*> (Merge (Right a)) = Merge $ Right $ f a
    (Merge (Right _)) <*> (Merge (Left c))  = Merge $ Left $ c
    (Merge (Left c)) <*> (Merge (Left c'))  = Merge $ Left $ c <> c'
    (Merge (Left c)) <*> _                  = Merge $ Left $ c

  ok :: a -> Merge a
  ok = pure

  failMerge :: String -> Merge a
  failMerge x = Merge $ Left [x]

So far so good, so I can add my first merge strategy:

  mergeEq :: Eq a => String -> a -> a -> a -> Merge a
  mergeEq lbl new current ancestor | current == new      = ok new
                                   | current == ancestor = ok new
                                   | new == ancestor     = ok current
                                   | otherwise           = failMerge lbl

And we can make use of this to build more complicated merge strategies:

  data Person = Person { name :: String, surname :: String }
    deriving (Show)

  mergeEqOn :: Eq a => (b -> a) -> String -> b -> b -> b -> Merge a
  mergeEqOn l lbl n c a = mergeEq lbl (l n) (l c) (l a)

  mergePerson :: Person -> Person -> Person -> Merge Person
  mergePerson new current ancestor =
    Person <$> mergeEqOn name "name" new current ancestor
           <*> mergeEqOn surname "surname" new current ancestor

  >>> runMerge $ mergePerson Person { name="Steve", surname="Bobman" }
                             Person { name="Joe",   surname="Obman" }
                             Person { name="oe",    surname="Obman" }
  Left ["name"]

  >>> runMerge $ mergePerson Person { name="Steve", surname="Bobman" }
                             Person { name="Joe", surname="Obman" }
                             Person { name="Joe", surname="Obman" }
  Right (Person {name = "Steve", surname = "Bobman"})

Everything does exactly what I want, but it doesn't smell as good as I
expect from Haskell. Firstly, all my primitives or operations take 3
arguments that have to be threaded in a very specific pattern. If I
accidently call mergeEq ancestor current new, I've ran things in the
wrong order and I'm Gonna Have a Bad Time. So that's no good.

Secondly, is there a better way of labelled parts of a merge? I suspect
not, as at some point I need a human readable display, but I did wonder
about using a lens as a label, and then mapping lenses to human readable
names later -- but that will require a bit of newtyping in order to
provide the Eq instance, or maybe i make a HumanLens type which is a
lens and a name?).

Finally, I'm currently a little obsessed with the idea of building
applicatives from the composition or product of other applicatives - can
I use this trick here? I initially used 'Compose ((,) [String]) Maybe a'
but this means it's possible to have
(["Conflicts"], Just "But success!"), which is nonsense.

I'd love to hear your thoughts!

- ocharles