Haskell report typo (bad law for readsPrec)

Patrik Jansson patrikj@cs.chalmers.se
Mon, 18 Jun 2001 10:48:04 +0200 (MET DST)


In short: change the eq. for readsPrec in appendix D.4

In detail:

From "The Haskell 98 Report: Derived Instances: D.4"

...
   The function readsPrec d s accepts a precedence level d (a number from
   0 to 10) and a string s, and attempts to parse a value from the front
   of the string, returning a list of (parsed value, remaining string)
   pairs. If there is no successful parse, the returned list is empty. It
   should be the case that

     fst (head (readsPrec d (showsPrec d x r))) == x

   That is, readsPrec should be able to parse the string produced by
   showsPrec, and should deliver the value that showsPrec started with.
...

Note that the equation stated is not true unless the string r starts with
some token-ending combination.

Counter-examples:

> testreadshow :: (Show a, Read a) => a -> String -> a
> testreadshow x r = fst (head (readsPrec 0 (showsPrec 0 x r)))

A simple one that just fails: the problem is that "Nothingx" can't be
parsed.

> bad1 :: Maybe Int
> bad1  = testreadshow Nothing "x"
> test1 = bad1 == Nothing

Using this we can make the law really false by inventing a datatype
where one constructor is a prefix of another constructor:

> data T = Bad | Badger deriving (Show, Read, Eq)
> bad2  = testreadshow Bad "ger"
> test2 = bad2 == Bad

Actually there is a predefined type with lots of constructors (Int)
where many values are prefixes of others:

> bad3 :: Int
> bad3  = testreadshow 1 "2"
> test3 = bad3 == 1


> main = do print (bad3, test3)
>           print (bad2, test2)
>           print (bad1, test1)

OK, so the falsity of the claim is proved - what could we write instead?

The sentence explaining the equation is correct:

   ... readsPrec should be able to parse the string produced by
   showsPrec, and should deliver the value that showsPrec started with.

Thus a simple way out is to remove the equation and the preceding
sentence leaving the paragraph as:
[Alternative A:]

...
   The function readsPrec d s accepts a precedence level d (a number from
   0 to 10) and a string s, and attempts to parse a value from the front
   of the string, returning a list of (parsed value, remaining string)
   pairs. If there is no successful parse, the returned list is empty.
   Function readsPrec should be able to parse the string produced by
   showsPrec, and should deliver the value that showsPrec started with.
...

Another simple alternative is to weaken the the equation by replacing
the variable r by the empty string:
[Alternative B:]

...
   The function readsPrec d s accepts a precedence level d (a number from
   0 to 10) and a string s, and attempts to parse a value from the front
   of the string, returning a list of (parsed value, remaining string)
   pairs. If there is no successful parse, the returned list is empty. It
   should be the case that

     fst (head (readsPrec d (showsPrec d x ""))) == x

   That is, readsPrec should be able to parse the string produced by
   showsPrec, and should deliver the value that showsPrec started with.
...

[Alternative C:]
A third alternative would be to keep the equation unchanged but to add
the requirement that the string variable r should start with a "token
ending sequence of characters". But what does "token" mean? As an
explanation it might be OK, but the report should be pretty formal and
self-contained so I think this is might be undesirable.

A more complicated alternative would be to state (in words or in
equations) a more detailed correctness requirement, but I think that
would lead too far. The Haskell Show and Read classes are not detailed
enough to capture this - we would need a standardized level of
tokenizing (lexing) and this is clearly out of the scope for any
report changes at this stage.

My vote is for [Alternative B] - weaken the equation, but keep it as a
help to understand the interplay between showsPrec and readsPrec.

/Patrik Jansson