idiom for different implementations of same idea
Hal Daume III
hdaume@ISI.EDU
Thu, 1 Aug 2002 14:34:00 -0700 (PDT)
Hi all,
I'm looking for some advice on what's the cleanest way to implement
something. The basic idea is that I have a task to solve, T. There are
many steps to solving this task, but they can be broken down into a small
list of elementary steps:
- prepareData
- initialize
- doThingOne
- doThingTwo
- getResults
where the main driver does something like:
  prepareData
  initialize
  iterate until converged
    doThingOne
    doThingTwo
  getResults
As is standard in my field (statistical natural language processing), I
have several models defined to perform task T (though only the first, most
basic one is implemented). Call these Model0, Model1, Model2, and so on.
All of these models, since they're solving the same basic task, have the
same basic types on their functions, something like (a bit simplified,
but):
  prepareData :: Data () -> Data markup
  initialize  :: Data markup -> ST s table
  doThingOne  :: Data markup -> table -> ST s alignments
  doThingTwo  :: Data markup -> alignments -> ST s table
  getResults  :: Data markup -> table -> alignments -> String
Simple enough. Now, say I have three models (0-2) which implement these
functions with varying complexity, both in the code itself and in the
types of 'markup', 'table' and 'alignments'. Each model has a different
idea of what these three types are.
Now, in my executable I want the user to be able to say "-model=0" and so
on on the command line, and for it to use the appropriate model. Each of
these models will go in a separate module.
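
A minimal sketch of the command-line selection, for illustration only. The
names Runner, runners, parseModel and selectRunner are hypothetical, and the
sketch assumes each model can be wrapped as a function from input to a
result String, so that the model-specific types stay hidden from the
dispatcher. The Model0 entry is a placeholder, not a real model.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (listToMaybe, mapMaybe)
import System.Environment (getArgs)

-- Hypothetical: every model is exposed as a function of this one type.
type Runner = [String] -> String

-- Placeholder table of available models; only model 0 is listed.
runners :: [(Int, Runner)]
runners = [(0, \ws -> "Model0 on " ++ show (length ws) ++ " tokens")]

-- Find the first argument of the form "-model=N" and parse N.
parseModel :: [String] -> Maybe Int
parseModel args = listToMaybe (mapMaybe parse args)
  where
    parse arg = do
      rest <- stripPrefix "-model=" arg
      case reads rest of
        [(n, "")] -> Just n
        _         -> Nothing

-- Look up the requested model, reporting an error if it is missing.
selectRunner :: [String] -> Either String Runner
selectRunner args = case parseModel args of
  Nothing -> Left "usage: -model=N"
  Just n  -> maybe (Left ("unknown model " ++ show n)) Right (lookup n runners)

main :: IO ()
main = do
  args <- getArgs
  case selectRunner args of
    Left err  -> putStrLn err
    Right run -> putStrLn (run ["some", "input"])
```

The dispatcher only ever sees the Runner type; whatever 'markup', 'table'
and 'alignments' a model uses would be fixed inside its own runner.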
One way to do this would be to import all of the models qualified and then
if they choose Model0, pass to the "go" function Model0.prepareData,
Model0.initialize, etc. This is fine, simple, good. But it doesn't
enforce the types of the functions at all.
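
A rough sketch of this first option, under several assumptions: the Data
type below is a stand-in, a fixed iteration count replaces the unspecified
convergence test, and the functions with a "0" suffix are toy placeholders
for what would be Model0.prepareData and so on in a separate module.

```haskell
import Control.Monad.ST (ST, runST)

-- Hypothetical stand-in for Data: some tokens plus model-specific markup.
data Data markup = Data [String] markup

-- The driver receives the five steps as ordinary arguments.
-- 'maxIters' is an assumed substitute for "iterate until converged".
go :: Int
   -> (Data () -> Data markup)
   -> (Data markup -> ST s table)
   -> (Data markup -> table -> ST s alignments)
   -> (Data markup -> alignments -> ST s table)
   -> (Data markup -> table -> alignments -> String)
   -> Data ()
   -> ST s String
go maxIters prepare initial thingOne thingTwo results raw = do
  let d = prepare raw
      -- One round is doThingOne followed by doThingTwo.
      loop n t = do
        a  <- thingOne d t
        t' <- thingTwo d a
        if n <= 1 then return (t', a) else loop (n - 1) t'
  t0 <- initial d
  (t, a) <- loop maxIters t0
  return (results d t a)

-- Toy placeholder steps for "Model0": markup is a token count,
-- table and alignments are both lists of Int.
prepareData0 :: Data () -> Data Int
prepareData0 (Data ws ()) = Data ws (length ws)

initialize0 :: Data Int -> ST s [Int]
initialize0 (Data ws _) = return (map length ws)

doThingOne0 :: Data Int -> [Int] -> ST s [Int]
doThingOne0 _ table = return (map (+ 1) table)

doThingTwo0 :: Data Int -> [Int] -> ST s [Int]
doThingTwo0 _ als = return (map (* 2) als)

getResults0 :: Data Int -> [Int] -> [Int] -> String
getResults0 (Data _ n) table als = show (n, table, als)

main :: IO ()
main = putStrLn (runST (go 2 prepareData0 initialize0 doThingOne0
                           doThingTwo0 getResults0 (Data ["a", "bc"] ())))
```

Here the five functions are only checked against each other at the call to
"go"; nothing in a model module itself says which functions it must export.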
Another way to go would be to make a class, something like:
  class Model model markup table alignments
        | model -> markup table alignments where
    prepareData :: model -> Data () -> Data markup
    initialize  :: model -> Data markup -> ST s table
    doThingOne  :: model -> Data markup -> table -> ST s alignments
    doThingTwo  :: model -> Data markup -> alignments -> ST s table
    getResults  :: model -> Data markup -> table -> alignments -> String
where the model type/parameter is essentially a dummy to tie everything
together. This could be implemented a bit more cleanly with the
definition:
data T a
and then the first parameter for all of those class functions changing
from "model" to "T model". Each model module would then have its own
datatype, something like:
> (in module Model0:)
> data Model0
>
> instance Model Model0 Int Table Al where
> ...
This is another option; however I don't think it's very clean for two
reasons:
  - there are a lot of fundeps; in the actual application there
    are a few more type variables than I presented here, and it
    gets very long :)
  - the model parameter is used only to determine which model
    we're using and doesn't actually do anything other than
    satisfy the typechecker
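
For concreteness, the class-based option above might look roughly like the
following. This is a sketch under assumptions: Data is a stand-in type, T is
given a single constructor so that a value can be written down, Table and Al
are toy newtypes, the method bodies are placeholders, and "run" and "iter"
are hypothetical driver names with a fixed iteration count in place of the
convergence test.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

import Control.Monad.ST (ST, runST)

-- Hypothetical stand-in for Data: some tokens plus model-specific markup.
data Data markup = Data [String] markup

-- Dummy carrier for the model type; it holds no information at run time.
data T a = T

class Model model markup table alignments
      | model -> markup table alignments where
  prepareData :: T model -> Data () -> Data markup
  initialize  :: T model -> Data markup -> ST s table
  doThingOne  :: T model -> Data markup -> table -> ST s alignments
  doThingTwo  :: T model -> Data markup -> alignments -> ST s table
  getResults  :: T model -> Data markup -> table -> alignments -> String

-- What would live in module Model0; the bodies are toy placeholders.
data Model0
newtype Table = Table [Int]
newtype Al    = Al [Int]

instance Model Model0 Int Table Al where
  prepareData _ (Data ws ()) = Data ws (length ws)
  initialize  _ (Data ws _)  = return (Table (map length ws))
  doThingOne  _ _ (Table t)  = return (Al (map (+ 1) t))
  doThingTwo  _ _ (Al a)     = return (Table (map (* 2) a))
  getResults  _ (Data _ n) (Table t) (Al a) = show (n, t, a)

-- Run doThingOne then doThingTwo for a fixed number of rounds.
iter :: Model model markup table alignments
     => T model -> Data markup -> Int -> table -> ST s (table, alignments)
iter m d n t = do
  a  <- doThingOne m d t
  t' <- doThingTwo m d a
  if n <= 1 then return (t', a) else iter m d (n - 1) t'

-- Generic driver: the model is chosen purely by the type of the first
-- argument; the fundep fixes markup, table and alignments from it.
run :: Model model markup table alignments
    => T model -> Int -> Data () -> String
run m n raw = runST (do
  let d = prepareData m raw
  t0 <- initialize m d
  (t, a) <- iter m d n t0
  return (getResults m d t a))

main :: IO ()
main = putStrLn (run (T :: T Model0) 2 (Data ["a", "bc"] ()))
```

The value "T :: T Model0" exists only to select the instance, which is the
second drawback listed above.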
There are probably a plethora of alternatives I haven't considered, but
I'm sure people have done something similar to this before and I'm curious
how they handled it...
Thanks for reading this far :)
- Hal
--
Hal Daume III
"Computer science is no more about computers | hdaume@isi.edu
than astronomy is about telescopes." -Dijkstra | www.isi.edu/~hdaume