arrows
Wolfgang Jeltsch
wolfgang@jeltsch.net
29 Jun 2002 23:43:54 +0200
On Saturday, 2002-06-08, 00:52, CEST, Ashley Yakeley wrote:
> At 2002-06-07 06:18, Wolfgang Jeltsch wrote:
> > Of course! My Parser type.
>
> What's its implementation?
Hello again,
at the time you asked me the question above, my Parser type wasn't
implemented yet. Until now I worked hard (in my free time) on an
implementation for it. Unfortunately, I didn't find one that is Haskell
98 compatible although I thought this would have to be possible. My
implementation uses explicit universal quantification.
You can view the source code of my Parsing module via the following URI:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/seaweed/software/
Haskell/independent/Seaweed/Core/Parsing.hs?rev=HEAD&
content-type=text/vnd.viewcvs-markup
The parser module uses a morphism/arrow module I've written based on
Ross Paterson's module Arrow. The source code can be viewed via the
following URI:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/seaweed/software/
Haskell/independent/Seaweed/Core/Categories.hs?rev=HEAD&
content-type=text/vnd.viewcvs-markup
The parser module provides most of its operations via class instances.
It follows a short description of the meaning of certain member
variables for parsers:
(>>=): sequencing of parsers
return: returning values
fail: generation of always failing parsers
mzero: always failing parser
mplus: parsing using alternative parsers
fmap: applying a function to the result of a parser
(>>>): providing high-level tokens and parsing using these tokens
identity: reading a single token
pure: reading a single token and applying a function to it
Let me explain the implementation in short. A parser is represented by a
"monad state transformer" similar to what is described by StateT from
Andy Gill's Haskell Monad Template Library. A type ParserState is
introduced to describe the state of such transformers.
To simplify things a bit, let's take a simpler Parser type which doesn't
use monad state transformers but simple state transformers instead. This
makes the type ParserState simpler. You can think of a parser state as
an infinite list of substate-token pairs now. The tokens denote the
input stream. The substates may be used to encapsulate parser states of
a subordinate parsing process. When a token is read, the corresponding
token-substate pair is discarded.
Let me explain the use of substates in detail. If a parser is executed,
an input stream in form of an infinite list is provided. (I don't allow
finite lists here to make some things a bit easier an more elegant. You
can easily produce an infinite list from a finite one by applying
something like (++ repeat Nothing) . map Just to the finite list.) The
input stream is converted to a parser state by combining each token with
a () substate and the parser's state transformer is applied to the
resulting state.
Now let's combine two parsers parserOne and parserTwo with the >>>
operator and call the result combinationParser. This means that
combinationParser uses parserOne to produce tokens parserTwo consumes.
In order to realize this, combinationParser invokes parserOne repeatedly
and pairs each parsed token with the state parserOne was in before this
token was parsed. Then it runs parserTwo with the list of the calculated
pairs as its initial state. That means that the parser states of
parserOne get the substates of parserTwo's parser states.
Encapsulating parserOne's states in parserTwo's states is needed only
for the following reason: When combinationParser has parsed its value,
it has to be in the state parserOne was in after parsing the last token
parserTwo has consumed. If the states of parserOne wouldn't be included
in parserTwo's states as substates, combinationParser would have no way
of determining the state it has to be in after finishing. It would only
now the tokens parserTwo has left but not the lower-level tokens the
multiple invocations of parserOne have left.
Let me finish for now. I'd like to hear your comments.
> Ashley Yakeley, Seattle WA
Wolfgang