[Haskell-cafe] Haskell Quiz Solution - Haskell Newbie Requesting Review

Brandon Moore brandonm at yahoo-inc.com
Fri Nov 10 00:06:50 EST 2006


Justin Bailey wrote:
> As part of the Ruby Quiz in Haskell solutions appearing on the wiki
> recently, I just a solution to Ruby Quiz #100 - create a bytecode
> interpreter for a simple expression language.
>
> Like I said, the code below parses simple integer arithmetic
> expressions and generates byte codes for a hypothetical stack-based
> intepreter that would evaluate the expressions. To run it, save it as
> a literate haskell file and run "interpret_tests". That just shows
> correctness, though. Other output can be obtained by running
> "compile_tests" (shows bytes for all tests), "generate_tests"
> (symbolic bytecodes for all tests), and "eval_tests" (evaluate ASTs
> for all tests).
>
> To see the AST generated for a example expression, try something like
> 'parse "2-2-2"'.
>
> I'm just learning Haskell (about a month in) and if anyone has time
> and desire to critique the code below, I'd love to hear it. I come
> from an OOP (primarily C# & Ruby) background, so I'm really interested
> in getting a handle on the functional/Haskell "way" of coding. Thanks
> for any feedback!

Looks nice, especially if you're just getting started.

The overall structure looks good, I've just made a bunch of
little changes to the details. Mostly I found repeated patterns
to replace with library functions or extract as helper functions.

In particular, you often wrote
fun (x:xs) = <stuff with x> : fun xs
fun [] = []
instead of fun xs = map <stuff> xs

Getting a little fancier, defining the fold over your expression type
captures the recursion pattern in eval and generate. It's fairly
handy for defining constant folding too, if you want that.

Haskell is very easy to refactor. You pull out some functions, the
code shrinks, you see more subtle patterns, you pull out more functions.
Eventually you start noticing and reifying patterns between those
functions, like a star off the main sequence burning the products
of the last round of fusion, until finally the "degeneracy pressure"
or fixed boilerplate of introducing and calling a new abstraction
stops the collapse. And Haskell has very little syntactic overhead,
no classes, keywords on function definitions, or even block delimiters
 - it's like neutronium instead of that bloated electron-degenerate 
matter :)
I wonder, what's the programming equivalent of a black hole?

Happy Hacking

Brandon

\begin{code}
import Text.ParserCombinators.Parsec hiding (parse)
import qualified Text.ParserCombinators.Parsec as P (parse)
import Text.ParserCombinators.Parsec.Expr
import Data.Bits
import Data.Int

-- Represents various operations that can be applied
-- to expressions.
data Op = Plus | Minus | Mult | Div | Pow | Mod | Neg
 deriving (Show, Eq)

-- Represents expression we can build - either numbers or expressions
-- connected by operators. This structure is the basis of the AST built
-- when parsing
data Expression = Val Integer
  | Statement Op Expression Expression
 deriving (Show)
foldExpression val stmt = f
  where f (Val n) = val n
        f (Statement op l r) = stmt op (f l) (f r)

-- Define the byte codes that can be generated.
data Bytecode = NOOP
  | CONST Integer | LCONST Integer
  | ADD | SUB | MUL | POW | DIV | MOD
  | SWAP
 deriving (Show)

-- Using imported Parsec.Expr library, build a parser for expressions.
expr :: Parser Expression
expr = buildExpressionParser table factor <?> "expression"
 where
 -- Recognizes a factor in an expression
   factor  = between (char '(') (char ')') expr
         <|> number
         <?> "simple expression"
 -- Recognizes a number
   number  = fmap (Val . read) (many1 digit) <?> "number"
 -- Specifies operator, associativity, precendence, and constructor to 
execute
 -- and built AST with.
   table =
       [[prefix "-" (Statement Mult (Val (-1)))],
        [binary "^" Pow AssocRight],
        [binary "*" Mult AssocLeft,
         binary "/" Div AssocLeft,
         binary "%" Mod AssocLeft],
        [binary "+" Plus AssocLeft,
         binary "-" Minus AssocLeft]]
       where
         binary s op assoc
             = Infix (do{ string s; return (Statement op)}) assoc
         prefix s f
             = Prefix (do{ string s; return f})

-- Parses a string into an AST, using the parser defined above
parse s = case P.parse expr "" s of
 Right ast -> ast
 Left e -> error $ show e

-- Take AST and evaluate (mostly for testing)
eval = foldExpression id evalOp
evalOp op = case op of
   Mult -> (*)
   Minus -> (-)
   Plus -> (+)
   Div -> div
   Pow -> (^)
   Mod -> mod

-- Takes an AST and turns it into a byte code list
generate = foldExpression generateVal (\op l r -> l ++ r ++ generateOp op)
    where generateVal n = if abs n > 2^(2*8)-1
                             then [CONST n]
                             else [LCONST n]
          generateOp op = case op of
              Plus -> [ADD]
              Minus -> [SUB]
              Mult -> [MUL]
              Div -> [DIV]
              Mod -> [MOD]
              Pow -> [POW]

-- Takes a statement and converts it into a list of actual bytes to
-- be interpreted
compile s = toBytes (generate $ parse s)

-- Convert a list of byte codes to a list of integer codes. If LCONST or 
CONST
-- instruction are seen, correct byte representantion is produced
toBytes xs = concatMap instToBytes xs
instToBytes instr = case instr of
   NOOP -> [0]
   (CONST n) -> 1 : toByteList 2 (fromInteger n)
   (LCONST n) -> 2 : toByteList 4 (fromInteger n)
   ADD -> [0x0a]
   SUB -> [0x0b]
   MUL -> [0x0c]
   POW -> [0x0d]
   DIV -> [0x0e]
   MOD -> [0x0f]
   SWAP -> [0x0a]

-- Convert a number into a list of 8-bit bytes (big-endian/network byte 
order).
-- Make sure final list is size elements long
toByteList :: Int -> Int -> [Int]
toByteList size n = reverse $ take size (toByteList' n)
   where toByteList' a = (a .&. 255) : toByteList' (a `shiftR` 8)

fromBytes xs = foldl (\accum byte -> (accum `shiftL` 8) .|. byte) 0 xs

-- The stack machine for binary bytecodes
interpret [] [] = error "no result produced"
interpret (s1:s) [] = s1
interpret s (o:xs) | o < 10 = interpret (fromBytes (take (o*2) xs):s) 
(drop (o*2) xs)
interpret (s1:s2:s) (o:xs)
 | o == 16 = interpret (s2:s1:s) xs
 | otherwise = interpret (evalOpCode o s2 s1:s) xs

evalOpCode o = case o of
                 10 -> (+)
                 11 -> (-)
                 12 -> (*)
                 13 -> (^)
                 14 -> div
                 15 -> mod

-- All tests defined by the quiz, with the associated values they should 
evaluate to.
test1 = [(2+2, "2+2"),
         (2-2, "2-2"),
         (2*2, "2*2"),
         (22, "22"),
         (2 `div` 2, "2/2"),
         (2 `mod` 2, "2%2"),
         (3 `mod` 2, "3%2")]

test2 = [(2+2+2, "2+2+2"),
         (2-2-2, "2-2-2"),
         (2*2*2, "2*2*2"),
         (22^2, "22^2"),
         (4 `div` 2 `div` 2, "4/2/2"),
         (7`mod`2`mod`1, "7%2%1")]

test3 = [(2+2-2, "2+2-2"),
         (2-2+2, "2-2+2"),
         (2*2+2, "2*2+2"),
         (22+2, "22+2"),
         (4 `div` 2+2, "4/2+2"),
         (7`mod`2+1, "7%2+1")]

test4 = [(2+(2-2), "2+(2-2)"),
         (2-(2+2), "2-(2+2)"),
         (2+(2*2), "2+(2*2)"),
         (2*(2+2), "2*(2+2)"),
         (2^(2+2), "2^(2+2)"),
         (4 `div` (2+2), "4/(2+2)"),
         (7`mod`(2+1), "7%(2+1)")]

test5 = [(-2+(2-2), "-2+(2-2)"),
         (2-(-2+2), "2-(-2+2)"),
         (2+(2*(-2)), "2+(2*-2)")]

test6 = [((3 `div` 3)+(8-2), "(3/3)+(8-2)"),
         ((1+3) `div` (2 `div` 2)*(10-8), "(1+3)/(2/2)*(10-8)"),
         ((1*3)*4*(5*6), "(1*3)*4*(5*6)"),
         ((10`mod`3)*(2+2), "(10%3)*(2+2)"),
         -- (2^(2+(3 `div` 2)2), "2^(2+(3/2)2)"), -- maybe *2 at the end?
         ((10 `div` (2+3)*4), "(10/(2+3)*4)"),
         (5+((5*4)`mod`(2+1)), "5+((5*4)%(2+1))")]

suite = [test1, test2, test3, test4, test5, test6]

suiteResults doCase = [doCase val stmt | batch <- suite, (val, stmt) <- 
batch]

checkResult fun expected arg | result == expected = "Passed: " ++ show arg
                             | otherwise = "Failed: " ++ show arg ++ "(" 
++ show result ++ ")"
  where result = fun arg
showResult fun _ arg = (arg, fun arg)

-- Evaluates the tests and makes sure the expressions match the expected 
values
eval_tests = suiteResults (checkResult (eval . parse))

-- Takes all the tests and displays symbolic bytes codes for each
generate_tests = suiteResults (showResult (generate . parse))

-- Takes all tests and generates a list of bytes representing them
compile_tests = suiteResults (showResult compile)

-- Execute the binary bytecode for each test, and make sure the results 
are expected
interpret_tests = suiteResults (checkResult (fromIntegral . interpret [] 
. compile))
\end{code}


More information about the Haskell-Cafe mailing list