From rmason at mun.ca Wed Jan 26 12:10:10 2022
From: rmason at mun.ca (Roger Mason)
Date: Wed, 26 Jan 2022 08:40:10 -0330
Subject: [Haskell-beginners] Parsing a file
Message-ID:

Hello,

Warning: long post.

I've worked my way through various parsing tutorials using either Parsec
or ReadP. I reached a point where I need to try parsing one of the
types of file for which I'm writing the parser. I have written parsers
for various parts of this file:

eqatoms.out:
======================================================

Species : 1 (Si)
 atom 1 is equivalent to atom(s)
 1 2 3
 atom 2 is equivalent to atom(s)
 1 2 3
 atom 3 is equivalent to atom(s)
 1 2 3

Species : 2 (O)
 atom 1 is equivalent to atom(s)
 1 2 3 4 5 6
 atom 2 is equivalent to atom(s)
 1 2 3 4 5 6
 atom 3 is equivalent to atom(s)
 1 2 3 4 5 6
 atom 4 is equivalent to atom(s)
 1 2 3 4 5 6
 atom 5 is equivalent to atom(s)
 1 2 3 4 5 6
 atom 6 is equivalent to atom(s)
 1 2 3 4 5 6

======================================================

These are my imports:
======================================================
import qualified Text.Parsec as Parsec

import Text.Parsec ((<?>))

import Control.Applicative

import Control.Monad.Identity (Identity)
======================================================

These are my parsers:
======================================================
:{
species :: Parsec.Parsec String () (String,String)
species = do
  --Parsec.char 'S'
  Parsec.string "Species"
  Parsec.spaces
  Parsec.char ':'
  Parsec.spaces
  digits <- Parsec.many1 Parsec.digit
  Parsec.spaces
  Parsec.char '('
  id <- Parsec.many1 Parsec.letter
  return (id,digits)
:}

:{
atom = do
  Parsec.spaces
  Parsec.string "atom"
  Parsec.spaces
  digits <- Parsec.digit
  return digits
:}

:{
equivalents = do
  Parsec.spaces
  digits <- Parsec.digit
  return digits
:}
======================================================

Some simple tests:
======================================================
src = "oops"
Parsec.parse species src "Species : 1 (Si)"

Right ("Si","1")

src = "Parsing_File/eqatoms.out"
Parsec.parse atom src "atom 5 is equivalent to atom(s)"

Right '5'

src = "Parsing_File/eqatoms.out"
Parsec.parse (Parsec.many1 equivalents) src " 1 2 3 4 5 6"

: Right "123456"
======================================================

So, the individual parsers work as intended. However, parsing an actual
file does not work.

I define a function to return the file contents:
======================================================
:{
input = do
  eqatoms <- readFile "Parsing_File/eqatoms.out"
  return eqatoms
:}
======================================================

A test shows that my reader works:
======================================================
input

: Species : 1 (Si)\n atom 1 is equivalent to atom(s)\n 1 2 3\n atom 2 is equivalent to atom(s)\n 1 2 3\n atom 3 is equivalent to atom(s)\n 1 2 3\n\nSpecies : 2 (O)\n atom 1 is equivalent to atom(s)\n 1 2 3 4 5 6\n atom 2 is equivalent to atom(s)\n 1 2 3 4 5 6\n atom 3 is equivalent to atom(s)\n 1 2 3 4 5 6\n atom 4 is equivalent to atom(s)\n 1 2 3 4 5 6\n atom 5 is equivalent to atom(s)\n 1 2 3 4 5 6\n atom 6 is equivalent to atom(s)\n 1 2 3 4 5 6\n

======================================================

I attempt to parse the input:
======================================================
:{
main = do
  eqatoms <- readFile "Parsing_File/eqatoms.out"
  Parsec.parse species "test species" eqatoms
  return
:}

Prelude Parsec Text.Parsec Control.Applicative Control.Monad.Identity|
Prelude Parsec Text.Parsec Control.Applicative Control.Monad.Identity|
Prelude Parsec Text.Parsec Control.Applicative Control.Monad.Identity|
Prelude Parsec Text.Parsec Control.Applicative Control.Monad.Identity|
Prelude Parsec Text.Parsec Control.Applicative Control.Monad.Identity|
<interactive>:250:3: error:
    ,* Couldn't match type `Either Parsec.ParseError' with `IO'
      Expected type: IO (String, String)
        Actual type: Either Parsec.ParseError (String, String)
    ,* In a stmt of a 'do' block:
        Parsec.parse species "test species" eqatoms
      In the expression:
        do eqatoms <- readFile "Parsing_File/eqatoms.out"
           Parsec.parse species "test species" eqatoms
           return
      In an equation for `main':
          main
            = do eqatoms <- readFile "Parsing_File/eqatoms.out"
                 Parsec.parse species "test species" eqatoms
                 return

<interactive>:251:3: error:
    ,* Couldn't match expected type `IO b'
                  with actual type `a0 -> m0 a0'
    ,* Probable cause: `return' is applied to too few arguments
      In a stmt of a 'do' block: return
      In the expression:
        do eqatoms <- readFile "Parsing_File/eqatoms.out"
           Parsec.parse species "test species" eqatoms
           return
      In an equation for `main':
          main
            = do eqatoms <- readFile "Parsing_File/eqatoms.out"
                 Parsec.parse species "test species" eqatoms
                 return
    ,* Relevant bindings include
        main :: IO b (bound at <interactive>:248:3)
======================================================

Can someone please help me to get this to work?

Thanks for reading to the end of this very long post.
Roger

From fa-ml at ariis.it Wed Jan 26 12:58:41 2022
From: fa-ml at ariis.it (Francesco Ariis)
Date: Wed, 26 Jan 2022 13:58:41 +0100
Subject: [Haskell-beginners] Parsing a file
In-Reply-To:
References:
Message-ID:

Hello Roger,

On 26 January 2022 at 08:40, Roger Mason wrote:
> I attempt to parse the input:
> ======================================================
> :{
> main = do
>   eqatoms <- readFile "Parsing_File/eqatoms.out"
>   Parsec.parse species "test species" eqatoms
>   return
> :}
>
> Prelude Parsec Text.Parsec Control.Applicative Control.Monad.Identity|
> Prelude Parsec Text.Parsec Control.Applicative Control.Monad.Identity|
> Prelude Parsec Text.Parsec Control.Applicative Control.Monad.Identity|
> Prelude Parsec Text.Parsec Control.Applicative Control.Monad.Identity|
> Prelude Parsec Text.Parsec Control.Applicative Control.Monad.Identity|
> <interactive>:250:3: error:
>     ,* Couldn't match type `Either Parsec.ParseError' with `IO'
>       Expected type: IO (String, String)
>         Actual type: Either Parsec.ParseError (String, String)
>     ,* In a stmt of a 'do' block:
>         Parsec.parse species "test species" eqatoms
>       In the expression:
>         do eqatoms <- readFile "Parsing_File/eqatoms.out"
>            Parsec.parse species "test species" eqatoms
>            return
>       In an equation for `main':
>           main
>             = do eqatoms <- readFile "Parsing_File/eqatoms.out"
>                  Parsec.parse species "test species" eqatoms
>                  return

The problem is with this line

>   Parsec.parse species "test species" eqatoms

`parse` returns an Either, so you should pattern match on
its `Left` and `Right` (using `case` or the `either` function).
This has to be done inside a `let` too, because parse is a pure
function.
Does that help?
-F

From rmason at mun.ca Wed Jan 26 13:32:04 2022
From: rmason at mun.ca (Roger Mason)
Date: Wed, 26 Jan 2022 10:02:04 -0330
Subject: [Haskell-beginners] Parsing a file
In-Reply-To:
References:
Message-ID:

Hello Francesco,

Thanks for your response.

Francesco Ariis writes:

>
> The problem is with this line
>
>>   Parsec.parse species "test species" eqatoms
>
> `parse` returns an Either, so you should pattern match on
> its `Left` and `Right` (using `case` or the `either` function).
> This has to be done inside a `let` too, because parse is a pure
> function.
> Does that help?

I'll need to check exactly how to use case for this, but before I do I
have this question.

=Parsec.parse species "test species" "this that"= worked fine in my
tests. Why has `parse` changed its return type when invoked as

=Parsec.parse species "test species" eqatoms=?

That is confusing (and off-putting) and makes it hard to test one's code.

Thanks,
Roger

From fa-ml at ariis.it Wed Jan 26 14:43:43 2022
From: fa-ml at ariis.it (Francesco Ariis)
Date: Wed, 26 Jan 2022 15:43:43 +0100
Subject: [Haskell-beginners] Parsing a file
In-Reply-To:
References:
Message-ID:

On 26 January 2022 at 10:02, Roger Mason wrote:
> I'll need to check exactly how to use case for this, but before I do I
> have this question.
>
> =Parsec.parse species "test species" "this that"= worked fine in my
> tests.
> Why has `parse` changed its return type when invoked as
>
> =Parsec.parse species "test species" eqatoms=
>
> That is confusing (and off-putting) and makes it hard to test one's code.

They work fine because, I suspect, you ran them *inside ghci* (which is
totally fine). When you are dealing with main (or any function inside
the IO monad) you need as usual to make things typecheck by providing
the correct data.

Simple quiz: do you understand why

    main :: IO ()
    main = "prova"

does not work (nor compile) while

    main :: IO ()
    main = putStrLn "prova"

does?
If you are not sure read the "I/O" section from Real World Haskell [1]
or any introductory material about Haskell and IO.
Keep learning & fire here again if you are not sure!
-F

[1] http://book.realworldhaskell.org/read/io.html

From toad3k at gmail.com Wed Jan 26 17:29:40 2022
From: toad3k at gmail.com (David McBride)
Date: Wed, 26 Jan 2022 12:29:40 -0500
Subject: [Haskell-beginners] Parsing a file
In-Reply-To:
References:
Message-ID:

These two pieces of code are not equivalent.

:{
input = do
  eqatoms <- readFile "Parsing_File/eqatoms.out"
  return eqatoms
:}

:{
main = do
  eqatoms <- readFile "Parsing_File/eqatoms.out"
  Parsec.parse species "test species" eqatoms
  return
:}

You will have to do something like

:{
main = do
  eqatoms <- readFile "Parsing_File/eqatoms.out"
  let res = Parsec.parse species "test species" eqatoms
  putStrLn (show res)
  return ()
:}

And you will eventually clean it up by using a case statement to
distinguish between success and error.

From rmason at mun.ca Thu Jan 27 12:21:01 2022
From: rmason at mun.ca (Roger Mason)
Date: Thu, 27 Jan 2022 08:51:01 -0330
Subject: [Haskell-beginners] Parsing a file
In-Reply-To:
References:
Message-ID:

Hello Francesco,

Thanks for your help.

Francesco Ariis writes:

>
> Simple quiz: do you understand why
>
>     main :: IO ()
>     main = "prova"
>

That returns a String rather than an "IO something".

> does not work (nor compile) while
>
>     main :: IO ()
>     main = putStrLn "prova"
>
> does?

putStrLn returns an "IO something" and so is compatible with the type of
main.

Thank you for the help & encouragement.

Roger

From rmason at mun.ca Thu Jan 27 12:25:02 2022
From: rmason at mun.ca (Roger Mason)
Date: Thu, 27 Jan 2022 08:55:02 -0330
Subject: [Haskell-beginners] Parsing a file
In-Reply-To:
References:
Message-ID:

Hello David,

David McBride writes:

> These two pieces of code are not equivalent.
>
> :{
> input = do
>   eqatoms <- readFile "Parsing_File/eqatoms.out"
>   return eqatoms
> :}
>
> :{
> main = do
>   eqatoms <- readFile "Parsing_File/eqatoms.out"
>   Parsec.parse species "test species" eqatoms
>   return
> :}

Thank you for the help. Now I understand better the error that I made.

Roger
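
For reference, the clean-up suggested in the replies (keep `parse` pure, then use `case` on its `Either` result inside `main`) might look roughly like the sketch below. It assumes the parsec package is installed. The names `parseSpecies` and `name` are made up for illustration, and the sample text is inlined so that the program runs without the file. It is one possible arrangement, not the only one.

```haskell
module Main where

import qualified Text.Parsec as Parsec

-- The species parser from the thread: reads a header such as
-- "Species : 1 (Si)" and returns (element symbol, species number).
species :: Parsec.Parsec String () (String, String)
species = do
  _ <- Parsec.string "Species"
  Parsec.spaces
  _ <- Parsec.char ':'
  Parsec.spaces
  digits <- Parsec.many1 Parsec.digit
  Parsec.spaces
  _ <- Parsec.char '('
  name <- Parsec.many1 Parsec.letter
  return (name, digits)

-- Hypothetical helper: a pure function, so it can be tested at the
-- ghci prompt exactly like the earlier one-line tests.
parseSpecies :: String -> Either Parsec.ParseError (String, String)
parseSpecies = Parsec.parse species "eqatoms.out"

main :: IO ()
main = do
  -- Sample text inlined for the sketch; with the real file this would be
  --   eqatoms <- readFile "Parsing_File/eqatoms.out"
  let eqatoms = "Species : 1 (Si)\n atom 1 is equivalent to atom(s)\n 1 2 3\n"
  -- Every branch of the case is an IO action, so the do block typechecks.
  case parseSpecies eqatoms of
    Left err             -> putStrLn ("parse error: " ++ show err)
    Right (name, number) -> putStrLn ("species " ++ number ++ " is " ++ name)
```

With the inlined sample this should print "species 1 is Si". The point is that the `Either` never appears as a statement of its own in the `IO` do block: it is consumed by `case`, and each branch yields an `IO ()`.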