[Haskell-cafe] parsing exercise
Sebastian Fischer
fischer at nii.ac.jp
Sun Jan 23 10:39:39 CET 2011
On Sun, Jan 23, 2011 at 4:31 PM, Chung-chieh Shan
<ccshan at post.harvard.edu> wrote:
> Maybe Text.Show.Pretty.parseValue in the pretty-show package can help?
>
That's what I was looking for, thanks!
On Sun, Jan 23, 2011 at 5:23 PM, Stephen Tetley <stephen.tetley at gmail.com>
wrote:
> I don't think you can do this "simply" as you think you would always
> have to build a parse tree.
Isn't it enough to maintain a stack of open parens, brackets, char- and
string-terminators and escape chars? Below is my attempt at solving the
problem without an expression parser.
> In practice, if you follow the skeleton syntax tree style you might
> find "not caring" about the details of syntax is almost as much work
> as caring about them. I've tried a couple of times to make a skeleton
> parser that does paren nesting and little else, but always given up
> and just used a proper parser as the skeleton parser never seemed
> robust.
>
Indeed, I doubt that the implementation below is robust, and it's too tricky
to be easily maintainable. I include it for reference. Let me know if you
spot an obvious mistake.
Sebastian
splitTLC :: String -> [String]
splitTLC = parse ""

type Stack = String

parse :: Stack -> String -> [String]
parse _  ""     = []
parse st (c:cs) = next c st $ parse (updStack c st) cs

next :: Char -> Stack -> [String] -> [String]
next c []    xs = if c==',' then [] : xs else c <: xs
next c (_:_) xs = c <: xs

infixr 0 <:

(<:) :: Char -> [String] -> [String]
c <: []     = [[c]]
c <: (x:xs) = (c:x):xs

updStack :: Char -> Stack -> Stack
updStack char stack =
  case (char,stack) of
    -- char is an escaped character
    (_   , '\\':xs) -> xs        -- the next character is not escaped
    -- char is the escape character
    ('\\', xs)      -> '\\':xs   -- push it on the stack
    -- char is the string terminator
    ('"' , '"':xs)  -> xs        -- closes current string literal
    ('"' , '\'':xs) -> '\'':xs   -- ignored inside character
    ('"' , xs)      -> '"':xs    -- opens a new string
    -- char is the character terminator
    ('\'', '\'':xs) -> xs        -- closes current character literal
    ('\'', '"':xs)  -> '"':xs    -- ignored inside string
    ('\'', xs)      -> '\'':xs   -- opens a new character
    -- parens and brackets
    (_   , '"':xs)  -> '"':xs    -- are ignored inside strings
    (_   , '\'':xs) -> '\'':xs   -- and characters
    ('(' , xs)      -> '(':xs    -- new opening paren
    (')' , '(':xs)  -> xs        -- closing paren
    ('[' , xs)      -> '[':xs    -- opening bracket
    (']' , '[':xs)  -> xs        -- closing bracket
    -- other characters don't modify the stack (ignoring record syntax)
    (_   , xs)      -> xs
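For comparison, here is a minimal sketch of the same stack idea using a
dedicated context type instead of a String stack. The names Ctx, step and
splitTopLevel are hypothetical and do not appear in the code above. The sketch
makes two assumptions that differ from that code: a backslash counts as an
escape only inside a string or character literal, and an empty input or a
trailing comma yields an empty field. It is an illustration under those
assumptions, not a robust parser.

```haskell
-- Hypothetical reformulation; Ctx, step and splitTopLevel are
-- illustrative names, not part of the original post.

-- Contexts that can be open while scanning the input.
data Ctx = Paren | Bracket | StrLit | ChrLit | Escape
  deriving (Eq, Show)

-- Update the stack of open contexts for one input character.
step :: Char -> [Ctx] -> [Ctx]
step _    (Escape : st)     = st           -- escaped character: drop the marker
step '\\' st@(StrLit : _)   = Escape : st  -- assumption: escapes only inside literals
step '\\' st@(ChrLit : _)   = Escape : st
step '"'  (StrLit : st)     = st           -- closes the string literal
step '\'' (ChrLit : st)     = st           -- closes the character literal
step _    st@(StrLit : _)   = st           -- everything else is inert in a string
step _    st@(ChrLit : _)   = st           -- and in a character literal
step '"'  st                = StrLit : st
step '\'' st                = ChrLit : st
step '('  st                = Paren : st
step '['  st                = Bracket : st
step ')'  (Paren : st)      = st
step ']'  (Bracket : st)    = st
step _    st                = st           -- unmatched closers are ignored

-- Split at commas that occur while no context is open.
-- Assumption: "" yields [""], and a trailing comma yields a trailing "".
splitTopLevel :: String -> [String]
splitTopLevel = go []
  where
    go _  []       = [[]]
    go st (c : cs)
      | c == ',' && null st = [] : rest
      | otherwise           = prepend c rest
      where
        rest = go (step c st) cs
    prepend ch (x : xs) = (ch : x) : xs
    prepend ch []       = [[ch]]
```

With these definitions, splitTopLevel "(1,2),[3,4],\"a,b\",','" evaluates to
["(1,2)","[3,4]","\"a,b\"","','"]: only the three commas outside parens,
brackets and literals act as separators.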