Strange bug (in ghc?)

Hampus Ram
Sat, 14 Jun 2003 16:19:29 +0200


The strangest bug I've ever experienced has shown up in my code, and I
badly need some help finding it.

I'm currently writing a small lexer generator using Template Haskell. It
is supposed to fit in nicely with Happy and is therefore of the
"threaded monadic" type. However, it does not work, or at least not when
the number of regexps gets too big and I compile the code. Using it with
ghci is never a problem, but when compiled (no matter which optimisation
level I use) it often hangs in the middle of lexing (the problem is not
the generation of the code, but the generated code itself).

Given the very same input, it hangs (eating 100% CPU without doing
anything visible) at very different stages. Sometimes it outputs
almost all tokens and sometimes it doesn't read more than one or two.
This is strange behaviour from a pure function but may be due to

To add to the strangeness, the following code does not work (the
function lexer is the TH-generated function; imports omitted):

main = do [arg] <- getArgs
          str <- readFile arg
          putStrLn $ show $ lexer cont str 0 0

cont t s c l 
     = case t of 
            Token.To_Error -> error ""
            Token.To_Eof -> []
            _ -> t : lexer cont s c l

However, if you change "case t of" to the more esoteric
"case unsafePerformIO (putStrLn (show t) >> return t) of" it works like
a charm. The same goes if you plug it into Happy: with normal
parameters it does not work, but generating it with -d does the trick.
Something strange with laziness comes to mind, but I can't think of what.
The second thing I can think of is the compiler inferring strange types
somehow. All TH generated code is untyped, but that should not give any
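
To make the two variants concrete, here is a minimal self-contained
sketch of them side by side. Everything except the two case expressions
is an assumption made for illustration: the Token type, the hand-written
stand-in for lexer, and the reading of the two Int arguments as column
and line are all hypothetical, and this toy lexer will not reproduce the
hang. It only shows the shape of the code being compared.

```haskell
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical token type standing in for the generated Token module.
data Token = To_Word String | To_Eof | To_Error
  deriving (Show, Eq)

-- Hypothetical hand-written stand-in for the TH-generated lexer.
-- Same shape as in the post: a continuation, the remaining input and
-- two counters (assumed here to be column and line).
lexer :: (Token -> String -> Int -> Int -> a) -> String -> Int -> Int -> a
lexer k [] c l = k To_Eof [] c l
lexer k (x:xs) c l
  | x == '\n' = lexer k xs 0 (l + 1)
  | x == ' '  = lexer k xs (c + 1) l
  | x `elem` ['a'..'z'] =
      let (w, rest) = span (`elem` ['a'..'z']) (x:xs)
      in k (To_Word w) rest (c + length w) l
  | otherwise = k To_Error xs (c + 1) l

-- The continuation as posted (token names adapted to the stand-in).
cont :: Token -> String -> Int -> Int -> [Token]
cont t s c l =
  case t of
    To_Error -> error ""
    To_Eof   -> []
    _        -> t : lexer cont s c l

-- The variant that works: the scrutinee is wrapped in unsafePerformIO,
-- so each token is printed at the moment the case forces it.
contTraced :: Token -> String -> Int -> Int -> [Token]
contTraced t s c l =
  case unsafePerformIO (putStrLn (show t) >> return t) of
    To_Error -> error ""
    To_Eof   -> []
    _        -> t : lexer contTraced s c l

main :: IO ()
main = do
  let input = "foo bar\nbaz"
  print (lexer cont input 0 0)
  print (lexer contTraced input 0 0)
```

With the stand-in lexer both continuations yield the same token list;
the only difference is the side effect of printing each token as it is
forced.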

What can it be? Is it some error in ghc, perhaps something connected
with Template Haskell (I do have a slightly different version that
generates a .hs file instead, and that has not failed me yet)? Does
anybody know what's wrong, or can anyone give me some pointers on how I
can find out? I would rather not read External Core or the like, since
the problem only occurs for large lexers (which currently generate a few
thousand lines, since they use functions rather than tables).

It's the very same with both ghc 6.0 and yesterday's CVS version, and
-dcore-lint does not say anything.

I should perhaps also say that "hangs" means that it hasn't produced any
output after running for about one hour at ~100% CPU. That seems far too
long for one to suspect that it just takes a lot of time to produce
output...

/Hampus - very confused

If you need more information (such as the internals of "lexer") just say
so; I think this letter is big enough now :)


"It is never too late to give up"