[Haskell-cafe] please comment on my parser, can I do this cleaner?

Thomas Hartman tphyahoo at gmail.com
Tue Jun 9 14:29:09 EDT 2009


All I want to do is split on commas, but not the commas inside () or <> tags.

I have been wanting to master parsec for a long time and this simple
exercise looked like a good place to start.

The code below does the right thing. Am I missing any tricks to make
this simpler/neater?

Thanks, thomas.

thartman at ubuntu:~/perlArena>cat splitEm.
splitEm.hs   splitEm.hs~  splitEm.pl   splitEm.pl~
thartman at ubuntu:~/perlArena>cat splitEm.hs
{-# LANGUAGE ScopedTypeVariables #-}
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Char
import Text.PrettyPrint (vcat, render, text)
import Data.List.Split hiding (sepBy, chunk)
import Text.ParserCombinators.Parsec.Token

import Debug.Trace
import Debug.Trace.Helpers

-- this works, but is there a neater/cleaner way?
main = ripInputsXs (toEof splitter) "splitter" [ goodS, badS ]

-- I need a way to split on commas, but not the commas inside '<>' or
'()' characters
goodS = "<*2>FOO<2,1>,<*3>(SigB<8:0:2>,BAR),<*2>Siga<2:0>,Sigb<8,7,6,5,0>"
badS = "<*2)FOO<2,1>,<*3>(SigB<8:0:2>,BAR),<*2>Siga<2:0>,Sigb<8,7,6,5,0>"
-- the first < matches a ), so reject this


splitter = do
  chunks :: [String] <- toEof (many chunk)
  let pieces = map concat $ splitOn [","] chunks
  return pieces -- chunks
  where
    atom = string ","
           <|> ( many1 $ noneOf "()<>" )
    chunk = parenExpr <|>  atom
    parenExpr :: GenParser Char st [Char]
    parenExpr = let paren p = betweenInc (char '(' ) (char ')' ) p
                                <|> betweenInc (char '<' ) (char '>' )
p
                in paren $ option "" $ do ps <- many1 $ parenExpr <|> atom
                                          return . concat $ ps

betweenInc o' c' p' = do
  o <- o'
  p <- p'
  c <- c'
  return $ [o] ++ p ++ [c]

toEof p' = do
        r <- p'
        eof
        return r






ripInputs prs prsName xs = mapM_ (putStrLn . show . parse prs prsName ) xs
ripInputsXs prs prsName xs = mapM_ (putStrLn . showXs . parse prs prsName ) xs
  where showXs v = case v of
          Left e -> show e
          Right xs -> render . vcat . map text $ xs


More information about the Haskell-Cafe mailing list