[Haskell-cafe] Parsec: using two different parser for the same
string
Dan Weston
westondan at imageworks.com
Wed Aug 5 19:33:34 EDT 2009
Of course, since ParsecT s u m is a functor, feel free to use fmap
instead of parsecMap. Then you don't need to import from Text.Parsec.Prim.
And in hindsight, I might prefer the name (<:>) or cons to (<>) for the
first function, but now I'm just obsessing. :)
Dan
Dan Weston wrote:
> I think parsecMap does the job here:
>
> -----------------------
> import Text.ParserCombinators.Parsec hiding ((<|>))
> import Text.Parsec.Prim(parsecMap)
> import Control.Applicative((<|>))
> import Control.Arrow((|||),(&&&))
>
> -- Tagged (:)
> (<>) :: Either Char Char -> Either String String -> Either String String
> Left a <> Left b = Left (a:b)
> Left a <> Right b = Left (a:b)
> Right a <> Left b = Left (a:b)
> Right a <> Right b = Right (a:b)
>
> -- Tagged concat
> stringParser :: [Either Char Char] -> Either String String
> stringParser = foldr (<>) (Right "")
>
> -- Parse Integer if properly tagged, keeping unparsed string
> maybeToInteger :: Either String String -> (Maybe Integer, String)
> maybeToInteger = (const Nothing ||| Just . read) &&& (id ||| id)
>
> -- Tagged-choice parser
> intOrStringParser = parsecMap (maybeToInteger . stringParser)
> $ many1 (parsecMap Right digit <|> parsecMap Left (noneOf ";)"))
>
> -- Parse between parentheses
> intOrStringListParser = between (char '(')
> (char ')')
> (sepBy1 intOrStringParser (char ';'))
> -----------------------
>
> Then you get a tagged version of each string, along with the string itself:
>
> *P> parseTest intOrStringListParser $ "(1;2w4;8;85)"
> [(Just 1,"1"),(Nothing,"2w4"),(Just 8,"8"),(Just 85,"85")]
>
> There may be some parsecMap-fold fusion optimization possible, though I
> haven't looked into that.
>
> Dan
>
> Paul Sujkov wrote:
>> Hi everybody,
>>
>> suppose I have two different parsers: one just reads the string, and
>> another one parses some values from it. E.g.:
>>
>> parseIntList :: Parser [Integer]
>> parseIntList = do
>> char '('
>> res <- liftM (map read) (sepBy1 (many1 digit) (char ';'))
>> char ')'
>> return res
>>
>> parseIntString :: Parser String
>> parseIntString = manyTill anyChar eof
>>
>> so for some input like this - "(1;2;3;4)" - I will have two different
>> result:
>>
>> *Parlog> parseTest parseIntList "(1;2;3;4)"
>> [1,2,3,4]
>> *Parlog> parseTest parseIntString "(1;2;3;4)"
>> "(1;2;3;4)"
>>
>> but the thing that I actually want is something like Parser ([Integer],
>> String) - results from both parsers at a time, no matter whether one of
>> them fails or not:
>>
>> *Parlog> parseTest parseIntListAndString "(1;2;3;4)"
>> ([1,2,3,4], "(1;2;3;4)")
>>
>> it is impossible at first sight, because first parser to use will
>> consume all the input, and there will be nothing to parse for the second one
>>
>> Parsec contains "choice" function, but it is implemented via <|> and
>> that is mplus - so it tries second alternative only if the first one
>> fails. Is it possible to use two parsers for the same string (with
>> try-like backtracking, no input actually consumed till the second parser
>> finishes)? I can assume only dirty hacks with the GenParser internals -
>> manual position storing and backtracking - but that is obviously not good
>>
>> however, my first attempt to solve the problem was kind a like that: to
>> parse string to String, and then to use it as an input for the next
>> level parse call:
>>
>> parseIntListAndString :: Parser ([Integer], String)
>> parseIntListAndString = do
>> str <- parseIntString
>> return (res str, str)
>> where res str = case (parse parseIntList "" str) of
>> Left err -> []
>> Right val -> val
>>
>> but the problems with such a method began when I switched from Parser to
>> GenParser with user state: function parseIntList have to update the
>> state, but it can't have the same state as the parseIntListAndString any
>> more: it has it's own. I can explicitly pass the state from
>> parseIntListAndString to parseIntList, but I see no suitable way for the
>> parseIntList to update it. I can return the updated state value from the
>> parseIntList function, and call setState on a result - but it seems
>> rather ugly to mee. However, if nothing else will do, that is an alternative
>>
>> it is of course possible to use two different parsers sequentially, but
>> it is also very ineffective: I need to use such multiple parsing on a
>> relatively small substring of the actual input, so little backtracking
>> would be a much nicier approach. Any suggestions?
>>
>> --
>> Regards, Paul Sujkov
>>
>
More information about the Haskell-Cafe
mailing list