[Haskell-cafe] attpoarsec and recursive parsing
Olaf Klinke
olf at aatal-apotheke.de
Sat Nov 23 17:05:00 UTC 2024
On Fri, 2024-11-22 at 11:05 +0100, PICCA Frederic-Emmanuel wrote:
> Hello, I end up with this solution
>
> in order to parse this data type.
>
> If you have some advice in order to simplify or beautify the Parser part do not hesitate.
>
> Cheers
>
> Fred
>
> data MaskLocation = MaskLocation Text
> | MaskLocation'Tmpl Text
> | MaskLocation'Or MaskLocation MaskLocation
> deriving (Eq, Generic, Show)
> deriving anyclass (FromJSON, ToJSON)
>
> instance FieldEmitter MaskLocation where
> fieldEmitter (MaskLocation t) = t
> fieldEmitter (MaskLocation'Tmpl t) = t
> fieldEmitter (MaskLocation'Or l r) = fieldEmitter l <> " | " <> fieldEmitter r
>
> instance FieldParsable MaskLocation where
> fieldParser = do
> let loc :: Text -> MaskLocation
> loc t = if "{scannumber:" `Data.Text.isInfixOf` t
> then MaskLocation'Tmpl (strip t)
> else MaskLocation (strip t)
>
> t <- takeTill (== '|')
> if t == ""
> then fail "MaskLocation is Empty"
> else do mc <- peekChar
> case mc of
> Nothing -> pure $ loc t
> Just '|' -> do
> _ <- char '|' -- extract the '|' char
> MaskLocation'Or (loc t) <$> fieldParser
> Just c -> fail ("MaskLocation " <> [c])
The peekChar seems suspicious. That is what backtracking parsers
abstract over. Attoparsec always backtracks, in other parser libraries
there is the try combinator. You match on the '|' char using peekChar
and then later discard it by the _ <- char '|' parser fragment.
Also note that MaskLocation is a tree (with two different types of
leaves), but your fieldParser will only ever produce trees that have a
single leaf in the left branch. These trees are therefore equivalent to
non-empty linked lists. Therefore you might as well use sepBy1:
type P = Data.Attoparsec.Text.Parser
loc :: Text -> Either Text Text
loc t = if ("{scannumber:" `Data.Text.isInfixOf` t)
then Left (strip t)
else Right (strip t)
-- note: either id id ~ fieldEmitter
locationOrTempl :: P (Either Text Text)
locationOrTempl = fmap loc (takeWhile1 (/= '|'))
sep :: P Text
sep = fromString "|" -- IsString instance of P
maskLocations :: P [Either Text Text]
maskLocations = locationOrTempl `sepBy1` sep
Olaf
More information about the Haskell-Cafe
mailing list