[Haskell-cafe] Roman to Decimal Algorithms

Sun Jun 7 05:59:34 EDT 2009

I recently played in a code golf Roman to Decimal challenge (in Perl,
Python, Ruby and PHP). In playing this game, I found some interesting
short algorithms for converting from Roman Numerals to Decimal. Because
I think some of these are new, I'd like to present them here, in case
they are not really new after all, or in case there are problems with
these algorithms that I've overlooked.

I'd like to eventually write up a "rosetta code" article comparing the
implementation of these algorithms in various languages, including
Haskell. Since I'm only a very occasional Haskell programmer, I thought
it best to get feedback from Haskell experts before inflicting any of
my Haskell code on a wider audience. Hence this post.

To keep this post reasonably brief, note that these algorithms are for
"modern" Roman Numerals only, limited to the range 1-3999, and with no
error checking. That is, the code below assumes that the input is always
a well formed Roman Numeral. I have tested each algorithm against every
modern Roman Numeral in the range 1-3999.

As a starting point, note this "HaskellWiki" function:

romanToInt :: String -> Int
romanToInt = fst
              . foldr (\p (t,s) -> if p >= s then (t+p,p) else (t-p,p)) (0,0)
              . map (fromJust . flip lookup (zip "IVXLCDM" [1,5,10,50,100,500,1000]))

taken from http://haskell.cs.yale.edu/haskellwiki/Roman_numerals.
I'm going to essentially duplicate the functionality of this code.

{-# OPTIONS_GHC -fglasgow-exts -Wall #-}
import Data.Char (toUpper)

rtoa :: Char -> Int
rtoa 'M' = 1000
rtoa 'D' =  500
rtoa 'C' =  100
rtoa 'L' =   50
rtoa 'X' =   10
rtoa 'V' =    5
rtoa 'I' =    1
rtoa r   = error $ "Invalid rtoa char:" ++ show r

urtoa :: Char -> Int
urtoa = rtoa . toUpper

romanToInt :: String -> Int
romanToInt = foldl1 (\t n -> t+n-t`mod`n*2) . map urtoa

The essential difference between this solution and the HaskellWiki
one is the use of the "running total" for state rather than the
"previous value". I see this as an improvement mainly because it
is shorter -- though you might argue that it is less clear.

An alternative way to express the romanToInt function is:

romanToInt = foldl (\t c -> t+(urtoa c)-t`mod`(urtoa c)*2) 0

I'm open to persuasion as to which is better Haskell style.

Since I'm not playing golf anymore, the rtoa function above, though
hardly short, seemed to me to be the simplest and clearest way to
express converting a single Roman Numeral to its corresponding
arabic number. Again, suggestions for the "best"/most efficient
way to do this in Haskell are most welcome.

A common, and often winning technique, in golf is to perform these
sorts of conversions by concocting a "magic formula". For fun, I
rewrote rtoa using a magic formula I used in the golf game:

rtoa c = 10^(205558`mod`(ord c)`mod`7)`mod`9995

I'm not suggesting that magic formulae are useful outside of golf and
this second rtoa function, though shorter, is much less clear. I might
add that this particular magic formula appears to be less useful in
Haskell golf than the other languages because `mod` is five times
longer than the % operator of the other languages. :)

By way of explanation, notice that this formula:

 205558`mod`(ord c)`mod`7

maps I->0, X->1, C->2, M->3, V->4, L->5, D->6 as shown below:

 Roman   m      10^m   10^m`mod`9995
 -----   -      -----  -------------
   M     3       1000      1000
   D     6    1000000       500
   C     2        100       100
   L     5     100000        50
   X     1         10        10
   V     4      10000         5
   I     0          1         1

Noticing this, you can replace the 205558`mod`(ord c)`mod`7 magic
formula with a function that returns a string index (index() in
Perl and Python). I am sometimes overwhelmed by the quantity and
richness of all the functions in the GHC Haskell libraries.
I eventually found a Haskell solution that seemed to work:

{-# OPTIONS_GHC -fglasgow-exts -XOverloadedStrings -Wall #-}
import Data.ByteString.Char8 (elemIndex)
import Data.Maybe (fromJust)

rtoa c = 10^(fromJust (elemIndex c "IXCMVLD"))`mod`9995

I got this to work by trial and error and have no clue what this
"-XOverloadedStrings" and "Data.ByteString.Char8" business really
means. If there is a better Haskell way of finding the numeric
index of a particular character in a string, please let me know.

Alternatively, you could write the rtoa function using an approach
taken from the original HaskellWiki solution:

rtoa = fromJust . flip lookup (zip "IVXLCDM" [1,5,10,50,100,500,1000])

What is your recommendation as to the "best"/most efficient way
of writing the rtoa function in Haskell?

Finally, here is an example complete test program.
Suggestions for improving the style of this code are welcome.

{-# OPTIONS_GHC -fglasgow-exts -Wall #-}
import Data.Char (toUpper)
import Data.List (concat, intersperse)

rtoa :: Char -> Int
rtoa 'M' = 1000
rtoa 'D' =  500
rtoa 'C' =  100
rtoa 'L' =   50
rtoa 'X' =   10
rtoa 'V' =    5
rtoa 'I' =    1
rtoa r   = error $ "Invalid rtoa char:" ++ show r

urtoa :: Char -> Int
urtoa = rtoa . toUpper

-- one is derived from http://haskell.cs.yale.edu/haskellwiki/Roman_numerals
one :: String -> Int
one = fst . foldr (\p (t,s) -> if p >= s then (t+p,p) else (t-p,p)) (0,0) . map urtoa

two :: String -> Int
two = foldl1 (\t n -> t+n-t`mod`n*2) . map urtoa

three :: String -> Int
three = foldl (\t c -> t+(urtoa c)-t`mod`(urtoa c)*2) 0

myshow :: String -> (String -> Int) -> String -> String
myshow capt fn val = capt ++ val ++ " " ++ (show (fn val))

testdata :: [String]
testdata =  [ "I", "xiv", "DXVI", "CMLXIII" ]

main :: IO ()
main = do
  putStrLn $ (concat $ intersperse "\n" (map (\c -> (myshow "one  : " one  c))  testdata))
  putStrLn $ (concat $ intersperse "\n" (map (\c -> (myshow "two  : " two c))   testdata))
  putStrLn $ (concat $ intersperse "\n" (map (\c -> (myshow "three: " three c)) testdata))

Cheers,
/-\

      Need a Holiday? Win a $10,000 Holiday of your choice. Enter now.http://us.lrd.yahoo.com/_ylc=X3oDMTJxN2x2ZmNpBF9zAzIwMjM2MTY2MTMEdG1fZG1lY2gDVGV4dCBMaW5rBHRtX2xuawNVMTEwMzk3NwR0bV9uZXQDWWFob28hBHRtX3BvcwN0YWdsaW5lBHRtX3BwdHkDYXVueg--/SIG=14600t3ni/**http%3A//au.rd.yahoo.com/mail/tagline/creativeholidays/*http%3A//au.docs.yahoo.com/homepageset/%3Fp1=other%26p2=au%26p3=mailtagline