[Haskell-cafe] Painful Data Abstraction

Tue Apr 7 21:11:23 EDT 2009

I asked this earlier on #haskell, but it was quiet hour.  Here goes.

Suppose I have an abstract data type for prefix dictionaries of 
bytestrings.  I define a typeclass in a module PrefixDict.Class:

    module PrefixDict.Class where

    import qualified Data.ByteString.Char8 as B

    type Word = B.ByteString

    class PrefixDict d where
        empty :: d
        isEmpty :: d -> Bool
        insert :: Word -> d -> d
        delete :: d -> Word -> d
        hasWord :: d -> Word -> Bool
        hasPrefix :: d -> Word -> Bool

I then write several implementations, including a naive list 
implementation, and a trie implementation, in the modules 
PrefixDict.ListDict and PrefixDict.TrieDict, both looking something like 
this:

    module PrefixDict.ListDict where

    import PrefixDict.Class

    data ListDict = ListDict [Word]

    instance PrefixDict ListDict where
        ...

Now, there are many properties that should be satisfied by any 
implementation of PrefixDict, that are not easily captured in the type 
system.  Trying to be a good coder, I write up some generic QuickCheck 
properties, such as

    prop_insert_idempotent :: (Eq d, PrefixDict d) => d -> Word -> Bool
    prop_insert_idempotent d w = insert w d' == d'
      where d' = insert w d

The problem is this:  I can't think of a non-kludged way of checking 
that TrieDict and ListDict satisfy this abstract property without having 
to add explicit type signatures, e.g.

    quickCheck (prop_insert_idempotent :: TrieDict -> Word -> Bool)

I have many abstract properties, and I *really* don't like having to put 
an explicit signature on each one.

I came up with a hackishway that minimizes the number of signatures 
required, by checking all abstract properties in one, generic function, 
reflecting a type as a value, and using scoped type variables:

    testAbstractProperties :: forall d. (PrefixDict d) => d -> IO ()
    testAbstractProperties _ = do
      quickCheck (prop_insert_idempotent :: d -> Word -> Bool)
      quickCheck (prop_delete_works :: d -> Word -> Property)
      ...

I then call this function from a "test suite" function written for each 
module, e.g.

    testTrieDict :: IO ()
    testTrieDict = do
      testAbstractProperties (undefined :: TrieDict)
      testTrieDictSpecificProperties

This means I only have to write the type signatures on the properties in 
one place, but it is still less than ideal.

Is there a better way of doing this, or is Haskell simply not good at 
this kind of data abstraction?

Best,
Brad Larsen