[Haskell-cafe] Re: Proper round-trip HughesPJ/Parsec for Doubles?

Thu Feb 25 07:15:07 EST 2010

Hi Christian,

On 24 Feb 2010, at 13:24, Christian Maeder wrote:

> I hope you don't mind if I make some style comments to your "final"  
> version.

Not at all - thanks!

> 1. break the line after "do"
> (to avoid a layout change when change name or arguments of float' or
> rename the variable "e")

I'm not convinced by this; perhaps while editing the code it's useful,  
but those changes don't happen very often, and when they do, any half- 
decent editor ought to be able to handle making the change  
consistently.  I do sometimes drop the do to the next line, but  
usually in order to keep things within 80 columns.  I think this is  
somewhat a matter of personal taste though.  More on this at the end...

> 2. The "t :: TokenParser st" is only used for the white spaces.
> This should be done separately (use "lexeme" from the TokenParser if  
> you
> really need to). Just using "spaces" is also an alternative.

OK - but what I'm trying to do here is create something I can use as a  
drop-in replacement for float from Text.ParserCombinators.Parsec.Token  
- in which case it shouldn't be done separately, I think?

> 3. "liftCtoS" is only applied to '-', so an "optSign" would suffice.
>  optSign = option "" $ fmap (: []) (char '-')

Agreed - although I resurrect it later as maybeChar (see below),  
matching against a choice of characters (to handle +/-) or returning  
"" if empty.

> (read also allows a capital 'E' and a '+' before the exponent, but no
> initial '+' sign.

OK: didn't catch this because show doesn't (it seems) ever write them  
like that.  Thanks.

> The decimal point is optional.

Same comment.  :-)  Fixed below, although I remove this optionality  
for my application (for now) because (I think) I want to be explicit  
about int vs float...

> Also "NaN" and "Infinity" can be read, both possibly preceded by a  
> '-' sign followed by
> spaces. But you may restrict yourself to the possible outputs of show,
> which would include "NaN" and "Infinity", though.)

OK.  Indeed, it seems an initial '-' can be followed by spaces for  
other cases, e.g. "- 2e4", so have implemented that more general  
form.  Adding the NaN and Infinity cases gives us another level of  
indent, and pushes us close enough to 80 columns that I've dropped the  
outermost do to the next line.

> It may make sense to use something like readMaybe (which is missing in
> the Prelude) instead of "read" to allow the parser to fail more  
> nicely.

It seems to be kicking up reasonable errors as it is, e.g.:

*Main> parse aFloat "" "2e-h"
Left (line 1, column 4):
unexpected "h"
expecting digit

I haven't seen any uncaught exceptions propagating, if that's what  
you're worried about...?

> Btw I observed the following problem with read (that readMaybe would
> also not solve). http://hackage.haskell.org/trac/ghc/ticket/3897

Ah, well that's out of scope for me, I fear.  :-)

So here's what I have now:

float' :: TokenParser st -> GenParser Char st Double
float' t =
   do n <- maybeChar "-"
      spaces
      fs <- choice [symbol t "NaN",
                    symbol t "Infinity",
                    do whole <- many1 digit
                       frac <- option "" $ do char '.'
                                              ds <- many1 digit
                                              return $ '.' : ds
                       ex <- option "" $ do choice [char 'e', char 'E']
                                            s <- maybeChar "+-"
                                            ds <- many1 digit
                                            return $ concat ["e", s, ds]
                       return $ concat [whole, frac, ex]
                   ]
      whiteSpace t
      return $ read $ n ++ fs
   where maybeChar :: String -> GenParser Char st String
         maybeChar as = option "" (choice (map char as) >>= \a ->  
return [a])

>> You can also break it immediately before do, which I think is
>> sometimes more clear.
>
> If not an extra space is added following "do" this leads to an "odd"
> indentation of at least one line.

I'm curious: which line in the above is indented oddly?  Oh, wait: you  
don't mean odd as in "strange", do you?  You mean odd as in "not  
even"?  So, e.g. the "spaces" line starts at column 5?  What's wrong  
with that?

Cheers!

-Andy

--
Andy Gimblett
http://gimbo.org.uk/