[Haskell-cafe] A very nontrivial parser
Andrew Coppin
andrewcoppin at btinternet.com
Wed Jul 4 16:41:59 EDT 2007
Well, I eventually got it to work correctly... (!)
My goal is to be able to stack multiple parsers one on top of the other
- but be able to *change* the stack half way through parsing if needed.
This I eventually succeeded in doing. The external interface is fairly
simple, but the type signatures are NOT. (!!)
My basic idea was to abstract the data source that a parser gets its
data from:
class Source s where
empty :: s x -> Bool
fetch :: s x -> (x, s x)
instance Source [] where -- Nice syntax... :-S
empty = null
fetch xs = (head xs, tail xs)
Now I can define a parser type. But... uh... there's a slight glitch.
What I *want* to say is
Parser state in out = ...
But what I ended up with is
newtype Parser state src x y = Parser ((state, src x) -> (state, src
x, y))
I then make Parser a monad, write some functions to get/set the state
parameter, and
token_get :: (Source src) => Parser state src x x
token_get = Parser (\(state, tokens) -> let (t,ts) = fetch tokens in
(state, ts, t))
Anyway, all of that more or less works. Then I begin the utterly
psychopathic stuff:
data Stack state0 src0 t0 t1 = ...
instance Source (Stack state0 src0 t0) where ...
stacked :: st0 -> Parser st0 src0 t0 [t1] -> st1 -> Parser st1 (Stack
st0 src0 t0) t1 x -> Parser st9 src0 t0 x
By this point, my brain is in total agony! >_<
But, almost unbelievably, all this psychotic code actually *works*...
(Well, there were a few bugs, they're fixed now.)
Essentially, I have the "stacked" function, where if I do
x <- stacked foo parser1 bar parser2
y <- parser3
then it runs parser2, but it uses parser1 to transform the data first.
Which is what I actually wanted in the first place... Most critically,
when parser2 *stops* demanding tokens, parser3 is run, picking up from
where parser1 left off. (Confused yet? Wait til you see the code to
implement this insanity!)
One problem remains... That pesky source type. Every time I mention a
parser, I have to say what kind of course object it reads from - even
though all parsers work with *any* source object! (That's the whole
point of the Source class.) I really want to get rid of this. (See, for
example, the type signature for "stacked". Yuck!) Also, every time I
write a simple parser, I get a compile-time error saying something about
a "monomorphism restriction" or something... If I add an explicit type
it goes away, but it's very annoying to keep typing things like
test7 :: (Source src) => Parser state src Int Int
and so forth. And I can't help thinking if I could just get *rid* of
that stupid source type in the signature, there wouldn't be a problem...
Anybody have a solution to this?
More information about the Haskell-Cafe
mailing list