[Haskell-beginners] Parsec, parsing 'free text'

Stephen Tetley stephen.tetley at gmail.com
Sun Mar 11 00:33:04 CET 2012


Hi Franco

The best "solution" is really to work out a grammar of text strings
and write simpler productions that handle it.

Otherwise you can treat it as a "lexing" problem but then the results
get messy as you have found out.
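
For just the free-text / format-string split, a grammar-style version in Parsec might look roughly like the sketch below. It is only an illustration, not an answer to the full problem: it assumes format strings are delimited by '<' and '>' and never nest, and the names freeText, formatted, textP and parseText are my own placeholders rather than anything from your code.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data Text1 = FreeText String | Formatted String
  deriving (Eq, Ord, Show)

-- One production per kind of chunk.

-- Free text: one or more characters up to (not including) the next '<'.
freeText :: Parser Text1
freeText = FreeText <$> many1 (noneOf "<")

-- Format string: everything between '<' and the next '>'.
-- Assumption: format strings do not nest.
formatted :: Parser Text1
formatted = Formatted <$> between (char '<') (char '>') (many (noneOf ">"))

-- A text is any sequence of chunks, up to the end of the input.
textP :: Parser [Text1]
textP = many (formatted <|> freeText) <* eof

parseText :: String -> Either ParseError [Text1]
parseText = parse textP "<input>"
```

Because freeText uses many1, this version never produces an empty FreeText, and a missing '>' shows up as a Left parse error rather than a runtime error.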

It's a bit late in the UK, and though I've looked at the code I
haven't worked out an answer yet. I'll have a proper look tomorrow if
no one else has answered, but here is my first step. This is a "lexing"
solution, but written directly rather than with Parsec. It is easier to
write a "lexing" solution like this as two mutually recursive functions,
one for each lexer state: consuming free text, or consuming a format
string.


data Text1 = FreeText String | Formatted String
  deriving (Eq,Ord,Show)

type Text = [Text1]


-- The type of /accumulator/.
type Acc  = ShowS

-- We want to grow Strings from the right.
snoc :: Acc -> Char -> Acc
snoc ss c = ss . (c:)

toString :: Acc -> String
toString = ($ "")

empty :: Acc
empty = id


runText :: String -> Text
runText = text empty


-- Minor problem - generates empty FreeText if the accumulator is
-- empty, this can be easily fixed at some loss of clarity.
--
text :: Acc -> String -> Text
text ac []       = [FreeText (toString ac)]
text ac ('<':cs) = FreeText (toString ac) : formatted empty cs
text ac (c:cs)   = text (ac `snoc` c) cs


formatted :: Acc -> String -> Text
formatted _  []       = error "missing terminator for formatting"
formatted ac ('>':cs) = Formatted (toString ac) : text empty cs
formatted ac (c:cs)   = formatted (ac `snoc` c) cs


demo01 :: Text
demo01 = runText "[ someconditions | this is some <red - formatted> text.]"
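
For reference, demo01 evaluates to three chunks: FreeText "[ someconditions | this is some ", Formatted "red - formatted" and FreeText " text.]". The empty-FreeText problem mentioned in the comment on `text` only shows up when a format string starts the input or directly follows another one. One possible fix is sketched below. It is a self-contained variant that uses ShowS directly instead of the Acc helpers, and freeChunk is a hypothetical helper name of mine.

```haskell
data Text1 = FreeText String | Formatted String
  deriving (Eq, Ord, Show)

-- Hypothetical helper: emit a FreeText chunk only when the
-- accumulated string is non-empty.
freeChunk :: ShowS -> [Text1] -> [Text1]
freeChunk ac rest = case ac "" of
  "" -> rest
  s  -> FreeText s : rest

-- Lexer state 1: consuming free text.
text :: ShowS -> String -> [Text1]
text ac []       = freeChunk ac []
text ac ('<':cs) = freeChunk ac (formatted id cs)
text ac (c:cs)   = text (ac . (c:)) cs

-- Lexer state 2: consuming a format string.
formatted :: ShowS -> String -> [Text1]
formatted _  []       = error "missing terminator for formatting"
formatted ac ('>':cs) = Formatted (ac "") : text id cs
formatted ac (c:cs)   = formatted (ac . (c:)) cs

runText :: String -> [Text1]
runText = text id
```

With this variant, runText "<b>bold</b>" gives [Formatted "b",FreeText "bold",Formatted "/b"], with no empty FreeText at either end.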


