[Haskell-beginners] Parsing a file
Roger Mason
rmason at mun.ca
Wed Jan 26 12:10:10 UTC 2022
Hello,
Warning: long post.
I've worked my way through various parsing tutorials using either Parsec
or ReadP. I reached a point where I need to try parsing one of the
types of file for which I'm writing the parser. I have written parsers
for various parts of this file:
eqatoms.out:
======================================================
Species : 1 (Si)
atom 1 is equivalent to atom(s)
1 2 3
atom 2 is equivalent to atom(s)
1 2 3
atom 3 is equivalent to atom(s)
1 2 3
Species : 2 (O)
atom 1 is equivalent to atom(s)
1 2 3 4 5 6
atom 2 is equivalent to atom(s)
1 2 3 4 5 6
atom 3 is equivalent to atom(s)
1 2 3 4 5 6
atom 4 is equivalent to atom(s)
1 2 3 4 5 6
atom 5 is equivalent to atom(s)
1 2 3 4 5 6
atom 6 is equivalent to atom(s)
1 2 3 4 5 6
======================================================
These are my imports:
======================================================
import qualified Text.Parsec as Parsec
import Text.Parsec ((<?>))
import Control.Applicative
import Control.Monad.Identity (Identity)
======================================================
These are my parsers:
======================================================
:{
species :: Parsec.Parsec String () (String,String)
species = do
  --Parsec.char 'S'
  Parsec.string "Species"
  Parsec.spaces
  Parsec.char ':'
  Parsec.spaces
  digits <- Parsec.many1 Parsec.digit
  Parsec.spaces
  Parsec.char '('
  id <- Parsec.many1 Parsec.letter
  return (id,digits)
:}
:{
atom = do
  Parsec.spaces
  Parsec.string "atom"
  Parsec.spaces
  digits <- Parsec.digit
  return digits
:}
:{
equivalents = do
  Parsec.spaces
  digits <- Parsec.digit
  return digits
:}
======================================================
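A minimal sketch of how parsers like these could be combined to cover the whole file, assuming the file contains nothing beyond the layout shown above. The names `lexeme`, `number`, `speciesHeader`, `atomEntry`, `speciesBlock` and `eqatomsFile` are hypothetical and chosen only for illustration. Each piece skips the whitespace that follows it, so a parser never fails after having consumed only blanks or newlines:

```haskell
import qualified Text.Parsec as Parsec
import Text.Parsec.String (Parser)

-- Run a parser, then skip trailing whitespace (newlines included).
lexeme :: Parser a -> Parser a
lexeme p = p <* Parsec.spaces

-- One or more digits, read as an Int.
number :: Parser Int
number = read <$> lexeme (Parsec.many1 Parsec.digit)

-- "Species : 1 (Si)" gives ("Si", 1).
speciesHeader :: Parser (String, Int)
speciesHeader = do
  _ <- lexeme (Parsec.string "Species")
  _ <- lexeme (Parsec.char ':')
  n <- number
  _ <- Parsec.char '('
  name <- Parsec.many1 Parsec.letter
  _ <- lexeme (Parsec.char ')')
  return (name, n)

-- "atom 1 is equivalent to atom(s)" plus the number line after it
-- gives (1, [1,2,3]).
atomEntry :: Parser (Int, [Int])
atomEntry = do
  _ <- lexeme (Parsec.string "atom")
  n <- number
  -- Match the fixed words one at a time so the spacing between them is free.
  mapM_ (lexeme . Parsec.string) ["is", "equivalent", "to", "atom(s)"]
  eqs <- Parsec.many1 number
  return (n, eqs)

-- A species header followed by at least one atom entry.
speciesBlock :: Parser ((String, Int), [(Int, [Int])])
speciesBlock = (,) <$> speciesHeader <*> Parsec.many1 atomEntry

-- The whole file: one or more species blocks, then end of input.
eqatomsFile :: Parser [((String, Int), [(Int, [Int])])]
eqatomsFile = Parsec.spaces *> Parsec.many1 speciesBlock <* Parsec.eof
```

With the file contents in a `String` called `contents`, this would be run as `Parsec.parse eqatomsFile "eqatoms.out" contents`.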
Some simple tests:
======================================================
src = "oops"
Parsec.parse species src "Species : 1 (Si)"
Right ("Si","1")
src = "Parsing_File/eqatoms.out"
Parsec.parse atom src "atom 5 is equivalent to atom(s)"
Right '5'
src = "Parsing_File/eqatoms.out"
Parsec.parse (Parsec.many1 equivalents) src " 1 2 3 4 5 6"
Right "123456"
======================================================
So, the individual parsers work as intended. However, parsing an actual
file does not work.
I define a function to return the file contents:
======================================================
:{
input = do
  eqatoms <- readFile "Parsing_File/eqatoms.out"
  return eqatoms
:}
======================================================
A test shows that my reader works:
======================================================
input
: Species : 1 (Si)\n atom 1 is equivalent to atom(s)\n 1 2
3\n atom 2 is equivalent to atom(s)\n 1 2 3\n atom 3 is
equivalent to atom(s)\n 1 2 3\n\nSpecies : 2 (O)\n atom 1 is
equivalent to atom(s)\n 1 2 3 4 5 6\n atom 2 is
equivalent to atom(s)\n 1 2 3 4 5 6\n atom 3 is
equivalent to atom(s)\n 1 2 3 4 5 6\n atom 4 is
equivalent to atom(s)\n 1 2 3 4 5 6\n atom 5 is
equivalent to atom(s)\n 1 2 3 4 5 6\n atom 6 is
equivalent to atom(s)\n 1 2 3 4 5 6\n
======================================================
I attempt to parse the input:
======================================================
:{
main = do
  eqatoms <- readFile "Parsing_File/eqatoms.out"
  Parsec.parse species "test species" eqatoms
  return
:}
<interactive>:250:3: error:
    * Couldn't match type `Either Parsec.ParseError' with `IO'
      Expected type: IO (String, String)
        Actual type: Either Parsec.ParseError (String, String)
    * In a stmt of a 'do' block:
        Parsec.parse species "test species" eqatoms
      In the expression:
        do eqatoms <- readFile "Parsing_File/eqatoms.out"
           Parsec.parse species "test species" eqatoms
           return
      In an equation for `main':
          main
            = do eqatoms <- readFile "Parsing_File/eqatoms.out"
                 Parsec.parse species "test species" eqatoms
                 return
<interactive>:251:3: error:
    * Couldn't match expected type `IO b'
                  with actual type `a0 -> m0 a0'
    * Probable cause: `return' is applied to too few arguments
      In a stmt of a 'do' block: return
      In the expression:
        do eqatoms <- readFile "Parsing_File/eqatoms.out"
           Parsec.parse species "test species" eqatoms
           return
      In an equation for `main':
          main
            = do eqatoms <- readFile "Parsing_File/eqatoms.out"
                 Parsec.parse species "test species" eqatoms
                 return
    * Relevant bindings include
        main :: IO b (bound at <interactive>:248:3)
======================================================
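One possible reading of the two errors: `Parsec.parse` is a pure function whose result is an `Either Parsec.ParseError (String, String)`, so it cannot stand alone as a statement in an `IO` do block, and `return` needs an argument. A minimal sketch of `main` restructured under that reading follows. The helper name `parseSpecies` is hypothetical, and printing the result is only an assumption about what the program should do with it:

```haskell
import qualified Text.Parsec as Parsec

-- Same parser as in the post, repeated so the sketch is self-contained.
species :: Parsec.Parsec String () (String, String)
species = do
  _ <- Parsec.string "Species"
  Parsec.spaces
  _ <- Parsec.char ':'
  Parsec.spaces
  digits <- Parsec.many1 Parsec.digit
  Parsec.spaces
  _ <- Parsec.char '('
  name <- Parsec.many1 Parsec.letter
  return (name, digits)

-- Pure step: no IO involved, the result is an Either.
-- (parseSpecies is a hypothetical helper name.)
parseSpecies :: String -> Either Parsec.ParseError (String, String)
parseSpecies = Parsec.parse species "test species"

main :: IO ()
main = do
  eqatoms <- readFile "Parsing_File/eqatoms.out"
  -- Inspect the Either with case instead of running it as an IO statement.
  case parseSpecies eqatoms of
    Left err     -> print err     -- parse failed: show the error
    Right result -> print result  -- parse succeeded: show the pair
```

Keeping the parse in a pure function leaves `readFile` as the only `IO` action, and the `case` expression is where the `Either` is turned back into something `main` can do.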
Can someone please help me to get this to work?
Thanks for reading to the end of this very long post.
Roger