Things and limitations...

Tom Pledger Tom.Pledger@peace.com
Tue, 15 May 2001 16:50:02 +1200


Juan Carlos Arevalo Baeza writes:
 :
 |     First, about classes of heavily parametric types. Can't be done, I 
 | believe. At least, I haven't been able to. What I was trying to do (as an 
 | exercise to myself) was reconverting Graham Hutton and Erik Meijer's 
 | monadic parser library into a class. Basically, I was trying to convert the 
 | static:
 | 
 | ---
 | newtype Parser a = P (String -> [(a,String)])
 | item :: Parser Char
 | force :: Parser a -> Parser a
 | first :: Parser a -> Parser a
 | papply :: Parser a -> String -> [(a,String)]
 | ---
 | 
 | ---
 | class (MonadPlus (p s v)) => Parser p where
 |      item :: p s v v
 |      force :: p s v a -> p s v a
 |      first :: p s v a -> p s v a
 |      papply :: p s v a -> s -> [(a,s)]
 | ---
 | 
 |     I have at home the actual code I tried to make work, so I can't just 
 | copy/paste it, but it looked something like this. Anyway, this class would 
 | allow me to define parsers that parse any kind of thing ('s', which was 
 | 'String' in the original lib), from which you can extract any kind of 
 | element ('v', which was 'Char') and parse it into arbitrary types (the 
 | original parameter 'a'). For example, with this you could parse, say, a 
 | recursive algebraic data structure into something else.
 | 
 |     Nhc98 wouldn't take it. I assume this is NOT proper Haskell. The 
 | questions are: Is this doable? If so, how? Is this not recommendable? If 
 | not, why?

I did something similar recently, but took the approach of adding more
parameters to newtype Parser, rather than converting it into a class.
Here's how it begins:

    type    Indent          = Int
    type    IL a            = [(a, Indent)]
    newtype Parser a m b    = P (Indent -> IL a -> m (b, Indent, IL a))

    instance Monad m => Monad (Parser a m) where
        return v            = P (\ind inp -> return (v, ind, inp))
        (P p) >>= f         = P (\ind inp -> do (v, ind', inp') <- p ind inp
                                                let (P p') = f v
                                                p' ind' inp')
        fail s              = P (\ind inp -> fail s)

    instance MonadPlus m => MonadPlus (Parser a m) where
        mzero               = P (\ind inp -> mzero)
        (P p) `mplus` (P q) = P (\ind inp -> (p ind inp `mplus` q ind inp))

    item                   :: MonadPlus m => Parser a m a
    item                    = P p
                            where
                                p ind []            = mzero
                                p ind ((x, i):inp)
                                    | i < ind       = mzero
                                    | otherwise     = return (x, ind, inp)

This differs from Hutton's and Meijer's original in these regards:

  - It's generalised over the input token type: the `a' in
    `Parser a m b' is not necessarily Char.

  - It's generalised over the MonadPlus type in which the result is
    given: the `m' in `Parser a m b' is not necessarily [].

  - It's specialised for parsing with a layout rule: there's an
    indentation level in the state, and each input token is expected
    to be accompanied by an indentation level.

You could try something similar for your generalisations:

    newtype Parser ct r = P (ct -> [(r, ct)])
    -- ct: collection of tokens, r: result

    instance SuitableCollection ct => Monad (Parser ct)
        where ...

    instance SuitableCollection ct => MonadPlus (Parser ct)
        where ...

    item   :: Collects ct t => Parser ct t
    force  :: Parser ct r -> Parser ct r
    first  :: Parser ct r -> Parser ct r
    papply :: Parser ct r -> ct -> [(r, ct)]

The `SuitableCollection' class is pretty hard to define, though.
Either it constrains its members to be list-shaped, or it prevents you
from reusing functions like `item'.  Hmmm... I think I've just
stumbled across your reason for treating Parser as a class.

When the input isn't list-shaped, is the activity still called
parsing?  Or is it a generalised fold (of the input type) and unfold
(of the result type)?

Regards,
Tom