Skip for ReadP/ReadPrec

Zemyla zemyla at gmail.com
Mon Aug 21 16:34:42 UTC 2017


It occurs to me that, when it comes to the ReadP/ReadPrec parser
combinators in base, that a common use pattern is to use "look", parse
the next value and the number of characters taken from the string, and
then use "get" repeatedly to skip ahead that number.

Something along the lines of:

readSkip :: (String -> Maybe (a, Int)) -> ReadP a
readSkip prs = do
    s <- look
    case prs s of
        Nothing -> pfail
        Just (a, n) -> let
            go i | i <= 0 = return a
            go i = get >> go (i - 1)
            in n `seq` go n

It's also the sort of thing that munch, munch1, and skipSpaces do a
lot, and as skipSpaces at the very least is common, it should be
optimized.

My thought was adding a constructor to the internal P type, like so:

data P a = ... -- existing cases
    | Skip {-# UNPACK #-} !Int (P a)

And a smart constructor, such as:

skipP :: Int -> P a -> P a
skipP n p | n `seq` p `seq` False = undefined
skipP n p | n <= 0 = p
skipP _ Fail = Fail
skipP n (Skip m p) = Skip (m + n) p
skipP n p = Skip n p

Skips would be combined in the (>>=) and (<|>) functions:

    Skip n p >>= f = skipP n (p >>= f)

    Skip m p <|> Skip n q = case compare m n of
      LT -> skipP m (p <|> skipP (n - m) q)
      EQ -> skipP m (p <|> q)
      GT -> skipP n (skipP (m - n) p <|> q)
    Skip m p <|> Get f = Get $ \c -> skipP (m - 1) p <|> f c -- and
similarly backwards
    Skip m p <|> Look f = Look $ \s -> Skip m p <|> f s -- and
similarly backwards

This would also allow for an optimization in the Look + Get case:

    Look fl <|> Get fg = Look $ \s -> case s of
        [] -> fl []
        c:_ -> fl s <|> skipP 1 (fg c)

The only thing that would be exported would be an actual skip function:

skip :: Int -> ReadP ()
skip n | n `seq` False = undefined
skip n = R $ \c -> skipP n $ c ()

And the Skip constructor can be used instead of the "discard"-like
functions in munch, munch1, (<++), and skipSpaces.

A "skip" function could also be included in
Text.ParserCombinators.ReadPrec, but since that module is imported
unqualified more often, it might collide with user-defined functions;
even if it isn't exported, it can be emulated with

skip :: Int -> ReadPrec ()
skip n = readP_to_Prec $ const $ ReadP.skip n

Incidentally, if the invariant of the list in the Final constructor
being non-empty is supposed to be enforced, then shouldn't the
constructor for it be something along the lines of:

    Final (a, String) [(a, String)]

?


More information about the Libraries mailing list