[Haskell-cafe] help w/ improving custom Stream for parsec

Maurizio Vitale mrz.vtl at gmail.com
Thu Apr 9 15:57:14 UTC 2015


I'll take a look (I did in the past, but for some reason decided that a
separate preprocessor was not the way I wanted to go)

If you know the answer, otherwise I'll do my own research:
If cpphs expands #defines in pass 2, how does pass 1 know what to
include/skip?
E.g.
#define P
#ifdef P
#include "foo.h"
#endif

and even if only expansion is done in pass2 (but definition takes effect in
pass1), you'd still have problems w/
#define FILE "foo.h"
#include FILE

so it looks to me that full macro proecessing must be done in the first
pass.

On Thu, Apr 9, 2015 at 8:22 AM, Malcolm Wallace <malcolm.wallace at me.com>
wrote:

> In cpphs (a Haskell implementation of the C preprocessor), there are two
> passes: the first pass splits the input into lines, and interprets #if and
> #include directives, leaving all other lines either untouched, or
> eliminated when in the wrong branch of a #if conditional.  Its output is a
> list of lines, each of which is paired with its original source position
> (i.e. file and line number).  Because #includes are transitive, the state
> for the first pass contains a little stack of contexts, that is, the file
> that triggered the inclusion, and its line number, so that once the
> included file is finished, the annotations can be popped to return to their
> enclosing position.
>
> The second pass accepts the [(Posn,String)] emitted by the first pass, and
> interprets/expands #define macros and their usage sites.  It uses the
> positional annotations to give better warning/error messages in the case
> that, for instance, a #define is syntactically not-well-formed.  Macro
> expansion needs to treat the input as words rather than lines, but the
> macro definitions obviously are lexed and parsed differently, so it is
> useful to identify them on separate lines before splitting the non-#define
> lines into words.
>
> Regards,
>     Malcolm
>
> On 9 Apr 2015, at 16:01, Maurizio Vitale wrote:
>
> > Thanks Malcolm,
> >
> > I did consider the two pass approach (and actually having pass 1
> returning a stream of tokens annotated with position information)
> > I'm keeping that option open, especially because for speed one might
> implement the first pass with Attoparsec and the rest with parsec.
> > How would you keep track of macro expansions and source positions in
> order to provide nice error messages?
> > Do you know of anything on hackage that does something similar (either
> the two pass  or the custom stream approach)?
> >
> > Again, thanks. I'm still playing with alternatives before implementing
> the real language (which, for the curious, is SystemVerilog) so my barrier
> to trying out and benchmark different approaches is at this moment very low.
> > And the real goal is to learn Haskell,don't care much if I'll have a
> full Verilog parser/elaborator; so playing with alternatives is very much
> useful.
> > The language has also interesting features that make compiling separate
> files in parallel very challenging, so that's another area I want to play
> with before being too invested.
> >
> >
> > On Thu, Apr 9, 2015 at 2:05 AM, Malcolm Wallace <malcolm.wallace at me.com>
> wrote:
> > I think what you really need is a two-pass parser.  The first parser
> consumes the input stream, and copies it to the output stream with files
> inserted, and macros expanded.  The second parser consumes the
> already-preprocessed stream, and does whatever you like with it.
> >
> > Regards,
> >     Malcolm
> >
> > On 7 Apr 2015, at 17:25, Maurizio Vitale wrote:
> >
> > > I need a custom stream that supports insertion of include files and
> expansions of macros.
> > > I also want to be able to give nice error messages (think of clang
> macro-expansion backtrace), so I cannot use the standard trick of
> concatenating included files and expanded macros to the current input with
> setInput/getInput (I think I can't maybe there's a way of keeping a more
> complex "position" and since the use in producing an error backtrac is
> rare, it migth be worth exploring; if anybody has ideas here, I'm listening)
> > >
> > > Assuming I need a more compelx stream, this is what I have (Macro and
> File both have a string argument, but it will be more compicated, a list of
> expansions for Macro for instance).
> > >
> > > Is there a better way for doing this?
> > > What are the performance implications with backtracking? I'll be
> benchmarking it, but if people see obvious problems, let me know.
> > >
> > > Thanks a lot,
> > >   Maurizio
> > >
> > > {-# LANGUAGE FlexibleInstances #-}
> > > {-# LANGUAGE FlexibleContexts #-}
> > > {-# LANGUAGE InstanceSigs #-}
> > > {-# LANGUAGE MultiParamTypeClasses #-}
> > >
> > > module Parsing where
> > >
> > > import Text.Parsec
> > >
> > > type Parser s m = ParsecT s () m
> > >
> > > data VStream = File String | Macro String deriving Show
> > >
> > > newtype StreamStack = StreamStack [VStream] deriving Show
> > >
> > > instance (Monad m) ⇒ Stream VStream m Char where
> > >   uncons ∷ VStream -> m (Maybe (Char, VStream))
> > >   uncons (File (a:as)) = return $ Just (a, File as)
> > >   uncons (File []) = return Nothing
> > >   uncons (Macro (a:as)) = return $ Just (a, File as)
> > >   uncons (Macro []) = return Nothing
> > >
> > >
> > >
> > > instance (Monad m) => Stream StreamStack  m Char where
> > >   uncons (StreamStack []) = return Nothing
> > >   uncons (StreamStack (s:ss)) =
> > >     case uncons s of
> > >      Nothing → uncons $ StreamStack ss
> > >      Just Nothing → uncons $ StreamStack ss
> > >      Just (Just (c, File s')) → return $ Just (c, StreamStack (File
> s': ss))
> > >      Just (Just (c, Macro s')) → return $ Just (c, StreamStack (Macro
> s':ss))
> > >
> > > _______________________________________________
> > > Haskell-Cafe mailing list
> > > Haskell-Cafe at haskell.org
> > > http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> >
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20150409/35096e6e/attachment.html>


More information about the Haskell-Cafe mailing list