[Haskell-beginners] Re: Memoization
apfelmus at quantentunnel.de
Mon Dec 1 05:02:08 EST 2008
abdullah abdul Khadir wrote:
> I am a student and we had an assignment in Haskell. The question, was
> given a string of the form "1-2+3*5-7+4-6+3" i.e., any sequence of integers
> as well as some operators between them we had to find a maximum possible
> value for the expression as well as the expression itself . So for maxval
> "1-2+3*5-7+4-6+3" it is (76,"(1-((2+3)*((5-(7+4))-(6+3))))"). The function
> we had to write was maxval :: String -> (Int,String). For further details on
> the question, have a look at our sir's web page
> I solved the question, we had to use memoization, and submitted the
> solution. It is given below. Now the problem is I am just wondering if it
> can be solved in a better manner. Translation : Is there some way in Haskell
> to do it in a more simpler way and as well as to reduce the number of lines
> of the program.
Yes, your solution can be made more beautiful. Let me show how.
First of all, we should separate *parsing* the input into a list of
numbers and operations from *computing* the result.
maxval :: String -> (Int, String)
maxval = compute . parse
In other words, parse extracts the numbers and arithmetic operations
from the input. One example implementation is
type Op = Char
parse :: String -> ([Int], [Op])
parse  = (,)
parse s = let (n,s2) = parseInt s in
case s2 of
 -> ([n],)
op:s3 -> let (ns,ops) = parse s3 in (n:ns,op:ops)
parseInt :: String -> (Int, String)
parseInt s = (n, s')
(digits, s') = span isDigit s
n = foldl (\x c -> 10*x + fromDigit c) 0 digits
fromDigit c = ord c - ord '0'
but it's more complicated then necessary. We should at least use the
reads functions from the Prelude . And in any case, *parser
combinators* are the best way to parse something. But for now, the
straightforward way above shall suffice.
Second, we can considerably clarify things by defining new types. Our
first abstraction is the *expression*
type Expr = (Int,String)
value :: Expr -> Int
value = fst
which consists of a value and its textual representation. For instance,
are expressions. We can combine two expression by applying one of our
arithmetic operations both to the value and the textual representation
applyExpr :: Op -> Expr -> Expr -> Expr
applyExpr op (x,ex) (y,ey) =
(f op x y, "(" ++ ex ++ [op] ++ ey ++ ")")
f '+' = (+)
f '-' = (-)
f '*' = (*)
Our main algorithm will choose maximal and minimal values from a set of
possible expressions. Therefore, we introduce the following type
data MinMax = M Expr Expr deriving (Show)
which represents a range of values by recording expressions of minimal
and maximal value.
maxexpr :: MinMax -> Expr
maxexpr (M _ e) = e
We can merge two such ranges ("union") by choosing the lower first and
the higher second part:
merge :: MinMax -> MinMax -> MinMax
merge (M x y) (M x2 y2) = M emin emax
emin = if value x < value x2 then x else x2
emax = if value y > value y2 then y else y2
merges :: [MinMax] -> MinMax
merges = foldr1 merge
fromExpr :: Expr -> MinMax
fromExpr e = M e e
Now, we also want to apply arithmetic expressions to these ranges. For
'+','-' and '*', the following function does the right thing:
applyMinMax :: Op -> MinMax -> MinMax -> MinMax
applyMinMax op (M x y) (M x2 y2) =
merges [fromExpr (applyExpr op z z2) | z<-[x,y], z2<-[x2,y2]]
With these preliminaries, we can now express the algorithm. The main
ingredient is a function (f :: Int -> Int -> MinMax) that calculates
the range of possible values for expression that only utilize numbers
between the positions i and j in the list. And as common in dynamic
programming, we employ a memo table to store the intermediate results.
compute :: ([Int], [Op]) -> Expr
compute (xs,ops) = maxexpr (f 1 n)
n = length xs
f = memoize n f'
f' i j
| i == j = let x = xs !! (i-1)
e = (x,show x)
in fromExpr e
| otherwise = merges [applyMinMax (ops !! (k-1))
(f i k) (f (k+1) j)
| k <- [i..(j-1)]]
Where exactly is the memo table? It's hidden in
memoize :: Int -> (Int -> Int -> a) -> (Int -> Int -> a)
memoize n f = \i j -> table ! (i,j)
table = array ((1,1),(n,n))
[((i,j), f i j) | i<-[1..n], j<-[1..n]]
which takes a function of two arguments from 1 to n and tabulates its
values in an array. (You need to import Data.Array for the arrays.) In
other words, f tabulates the results of f' which in turn uses the
tabulated values returned by f to compute its results. Thanks to lazy
evaluation, this "tabulate the result before it's available" works.
To summarize, the key points of the new solution are
* Parse input.
* Memoization is a simple higher order function.
But there is more. Namely, there are many different ways to implement
the memoization. I used an array, you were asked to use a linked list.
The former is O(1) the latter O(n). There is a way to do it with plain
trees but still O(1), see also section 3 of
Richard Bird and Ralf Hinze.
"Trouble Shared is Trouble Halved"
And there is even more! Namely, we knew that the problem is an instance
of dynamic programming, we knew the algorithm before implementing it.
But how to find the algorithm in the first place? Well, the usual answer
is "by thinking hard". However, there are very systemic ways to derive
dynamic programming algorithms from just the problem specification! In a
sense, much of the work of R. Bird centers this topic. The book "Algebra
is one of the cornerstones.
The systematic derivation of dynamic programming algorithms has been
rediscovered in a more direct but less general fashion
More information about the Beginners