File Header Pragmas in Lexer

Alan & Kim Zimmerman alan.zimm at gmail.com
Wed Oct 29 19:54:14 UTC 2014


Ok, to answer my own question, I changed nested_comment to

    nested_comment :: P (RealLocated Token) -> Action
    nested_comment cont span buf len = do
      input <- getInput
      go (reverse $ lexemeToString buf len) (1::Int) input

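The accumulator now starts off with the already lexed part (reversed,
because go builds up the comment text in reverse), so the returned
comment includes the pragma prefix.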
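
To illustrate the idea outside of GHC, here is a minimal, self-contained sketch. It is not the code in Lexer.x: scanComment and its String-based interface are hypothetical stand-ins for nested_comment and the lexer's buffer, and I am assuming the closing "-}" is dropped from the result, as in the tokens quoted below. It only shows how seeding the reversed accumulator with the already matched lexeme keeps the prefix.

```haskell
-- Toy model of a nested-comment scanner (hypothetical, not GHC's Lexer.x).
-- It is entered after the opening lexeme has been matched, so the nesting
-- depth starts at 1. The accumulator holds the comment text in reverse.
scanComment
  :: String                  -- ^ already lexed prefix used to seed the accumulator
  -> String                  -- ^ remaining input after that prefix
  -> Maybe (String, String)  -- ^ (comment text, unconsumed input)
scanComment seed = go (reverse seed) (1 :: Int)
  where
    go acc n ('-':'}':rest)
      | n == 1    = Just (reverse acc, rest)          -- outermost close: done
      | otherwise = go ('}':'-':acc) (n - 1) rest     -- inner close: keep it
    go acc n ('{':'-':rest) = go ('-':'{':acc) (n + 1) rest
    go acc n (c:rest)       = go (c:acc) n rest
    go _   _ []             = Nothing                 -- unterminated comment

main :: IO ()
main = do
  -- Input remaining after the rule has matched "{-# LANGUAGE".
  let rest = " PatternSynonyms #-}\nmodule Main where"
  print (scanComment "" rest)               -- old behaviour: prefix is lost
  print (scanComment "{-# LANGUAGE" rest)   -- seeded: prefix is kept
```

With an empty seed the comment text is " PatternSynonyms #", matching the tokens I was seeing; with the seed it becomes "{-# LANGUAGE PatternSynonyms #".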


On Wed, Oct 29, 2014 at 9:04 PM, Alan & Kim Zimmerman <alan.zimm at gmail.com>
wrote:

> As part of my ongoing efforts to round-trip source code, I have bumped
> into an issue around file header pragmas, e.g.
>
>     {-# LANGUAGE PatternSynonyms #-}
>     {-# Language DeriveFoldable #-}
>     {-# options_ghc -w #-}
>
>
> In normal mode, when not called from headerInfo, the file header pragmas
> are lexed enough to generate a warning about an invalid pragma if enabled,
> and then lexed to completion and returned as an `ITblockComment` if
> `Opt_KeepRawTokenStream` is enabled.
>
> The relevant Alex rule is
>
>     <0> {
>       -- In the "0" mode we ignore these pragmas
>       "{-#"  $whitechar* $pragmachar+ / { known_pragma fileHeaderPrags }
>                          { nested_comment lexToken }
>     }
>
> The problem is that the tokens returned are
>
>     ITblockComment " PatternSynonyms #"
>     ITblockComment " DeriveFoldable #"
>     ITblockComment " -w #"
>
> It is not possible to reproduce the original comment from these.
>
>
> It looks like nested_comment ignores what has been lexed so far:
>
>     nested_comment :: P (RealLocated Token) -> Action
>     nested_comment cont span _str _len = do
>     ...
>
> So my question is: is there any way to make the returned comment include
> the prefix part? Perhaps with a specific variation of nested_comment that
> uses str and len.
>
> Alan
>