[Haskell-cafe] Re: Proper round-trip HughesPJ/Parsec for Doubles?
Christian Maeder
Christian.Maeder at dfki.de
Fri Feb 26 07:18:28 EST 2010
Andy Gimblett schrieb:
> Hi Christian,
[...]
>> It may make sense to use something like readMaybe (which is missing in
>> the Prelude) instead of "read" to allow the parser to fail more nicely.
>
> It seems to be kicking up reasonable errors as it is, e.g.:
>
> *Main> parse aFloat "" "2e-h"
> Left (line 1, column 4):
> unexpected "h"
> expecting digit
yes, this is fine, because you reject "-h", but suppose it was passed to
read (due to a programming error).
> I haven't seen any uncaught exceptions propagating, if that's what
> you're worried about...?
Yes, "read" will always work for you. But you could use a parser
*almost* as simple as
many1 $ oneOf "NaInfity+-.eE0123456789"
and ask "readMaybe" if the parsed String can be read as Double.
> So here's what I have now:
>
> float' :: TokenParser st -> GenParser Char st Double
> float' t =
> do n <- maybeChar "-"
> spaces
> fs <- choice [symbol t "NaN",
> symbol t "Infinity",
> do whole <- many1 digit
> frac <- option "" $ do char '.'
> ds <- many1 digit
> return $ '.' : ds
> ex <- option "" $ do choice [char 'e', char 'E']
> s <- maybeChar "+-"
> ds <- many1 digit
> return $ concat ["e", s, ds]
> return $ concat [whole, frac, ex]
> ]
> whiteSpace t
> return $ read $ n ++ fs
> where maybeChar :: String -> GenParser Char st String
> maybeChar as = option "" (choice (map char as) >>= \a -> return
> [a])
I would omit handling of spaces (that's a separate lexing step). It's
enough to be able to parse those numbers, that are possible results of
"show" (for round-trip).
"symbol t" could be replaced by "(try . string)" in order to get rid of
the TokenParser (that I don't like).
Spaces following an initial minus sign are quite unusual and rather
indicate that the sign does not belong to number, but that the sign is a
separate operation.
>>> You can also break it immediately before do, which I think is
>>> sometimes more clear.
>>
>> If not an extra space is added following "do" this leads to an "odd"
>> indentation of at least one line.
>
> I'm curious: which line in the above is indented oddly? Oh, wait: you
> don't mean odd as in "strange", do you? You mean odd as in "not even"?
> So, e.g. the "spaces" line starts at column 5? What's wrong with that?
Right, again a matter of taste.
Cheers Christian
P.S. below is my parser for tptp numbers (as comparison).
It rejects leading zeros, but allows an initial + sign.
"fmap read real" would work if the input does not start with +
(or is NaN or Infinity).
-- | does not allow leading zeros
natural :: Parser String
natural = string "0" <|> many1 digit
decimal :: Parser String
decimal = do
s <- option "" $ string "+" <|> string "-"
ds <- natural
return $ s ++ ds
real :: Parser String
real = do
d <- decimal
f <- option "" $ do
p <- char '.'
n <- many1 digit
return $ p : n
e <- option "" $ do
x <- char 'e' <|> char 'E'
g <- decimal
return $ x : g
return $ d ++ f ++ e
More information about the Haskell-Cafe
mailing list