Robustness of instance Read Char
Peter Thiemann
thiemann@informatik.uni-freiburg.de
15 Oct 2001 12:45:16 +0200
Folks,
my code has unwillingly been forced to read a large string generated
by show. This turned out to be a robustness test because the effect is
a stack overflow (with Hugs as well as with GHC) and, of course, this
error happened in a CGI script.
If you want to try the effect yourself, just take a file "foo" of,
say, 150k and type this into you hungry Hugs prompt:
readFile "foo" >>= \s -> putStr (read (show foo))
Digging down into the prelude code (taken from Hugs's prelude file),
you find this:
> instance Read Char where
> readsPrec p = readParen False
> (\r -> [(c,t) | ('\'':s,t) <- lex r,
> (c,"\'") <- readLitChar s ])
> readList = readParen False (\r -> [(l,t) | ('"':s, t) <- lex r,
> (l,_) <- readl s ])
> where readl ('"':s) = [("",s)]
> readl ('\\':'&':s) = readl s
> readl s = [(c:cs,u) | (c ,t) <- readLitChar s,
> (cs,u) <- readl t ]
which means that the parser reading this string has the ability to
fail and to backtrack *at every single character*. While this might be
useful in the general case, it certainly causes our little one-line
program to die.
Unfortunately, in my real program, the String is embedded in a data
type which is deriving Read, so that writing the specific instance of
read is a major pain. Two things would help me in this situation:
1. some kind-hearted maintainer of a particularly well-behaved Haskell
implementation might put in a more efficient definition in the
instance Read Char (or convince me that backtracking inside of
reading a String is a useful gadget). The following code will do:
readListChar :: String -> [(String, String)]
readListChar =
return . readListChar' . dropWhile isSpace
readListChar' ('\"':rest) =
readListChar'' rest
readListChar'' ('\"':rest) =
("",rest)
readListChar'' rest =
let (c, s') = head (readLitChar rest)
(s, s'') = readListChar'' s'
in (c:s, s'')
{- clearly, taking the head should be guarded and a proper error
message generated -}
2. provide a way of locally replacing the offending instance of Read
with something else. [urgh, a language extension]
Any suggestions or comments?
-Peter