[Haskell-beginners] calling impure functions from pure code

Fri Oct 12 15:39:41 CEST 2012

There's a better option in my opinion.  Use the monad transformer
capability of the parser you are using (I'm assuming you are using parsec
for parsing).

If you check the Hackage docs for parsec, you'll see that ParsecT s u m is
an instance of MonadIO whenever the underlying monad m is.  That means at
any point during the parsing you can call liftIO $ <any IO action> and use
the result in your parsing.  Here's an example of what that might look like.

import Control.Monad.IO.Class (MonadIO, liftIO)
import Text.Parsec

-- The underlying monad m only has to support IO, so liftIO works in here.
parseTvStuff :: (MonadIO m) => ParsecT String u m (Char, Maybe ())
parseTvStuff = do
  _ <- string "tvshow:"
  c <- anyChar
  -- putStrLn stands in for the real work: run an http request, parse the
  -- result, and store the result in morestuff as a Maybe.
  morestuff <- if c == 'x'
    then fmap Just $ liftIO $ putStrLn "run an http request here"
    else return Nothing
  return (c, morestuff)
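
Because the parser above is polymorphic in its underlying monad, it has to
be run with runParserT rather than parse, and the result comes back inside
that monad. Here is a minimal sketch of picking IO. The parser is repeated so
the block compiles on its own; runTvParser and the "<input>" source name are
names I made up for illustration.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)
import Text.Parsec

-- Same shape as the parser above, repeated so this block stands alone.
parseTvStuff :: MonadIO m => ParsecT String u m (Char, Maybe ())
parseTvStuff = do
  _ <- string "tvshow:"
  c <- anyChar
  morestuff <- if c == 'x'
    then fmap Just $ liftIO $ putStrLn "pretend http request here"
    else return Nothing
  return (c, morestuff)

-- Hypothetical helper: choose IO as the underlying monad, () as user state.
-- The IO in the result type is where the liftIO actions end up running.
runTvParser :: String -> IO (Either ParseError (Char, Maybe ()))
runTvParser = runParserT parseTvStuff () "<input>"
```

Calling runTvParser "tvshow:x" in GHCi should print the message and then
give back a Right value; input without the "tvshow:" prefix gives a Left
with the parse error.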

So you will run an http request if you get back something that seems like
it could be worth further parsing.  Then you just parse that stuff with a
separate parser and store it in your data structure and continue parsing
the rest of the first page with the original parser if you wish.
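
The paragraph above could be sketched roughly as follows. Everything here is
an assumption made for illustration: fetchPage is a stub standing in for a
real HTTP client call, and the "title|link" and "summary:" formats are
invented, not the real page layout. The point is only the shape: fetch with
liftIO, run a second parser on the fetched text, then carry on.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)
import Text.Parsec

-- Hypothetical stand-in for an HTTP request; returns a canned body.
fetchPage :: String -> IO String
fetchPage url = return ("summary:details for " ++ url)

-- Separate parser for the fetched detail page (made-up format).
detailParser :: Monad m => ParsecT String u m String
detailParser = string "summary:" *> many anyChar

-- Parses "title" or "title|link"; follows the link when there is one.
program :: MonadIO m => ParsecT String u m (String, Maybe String)
program = do
  title <- many1 (noneOf "|\n")
  link  <- optionMaybe (char '|' *> many1 (noneOf "\n"))
  summary <- case link of
    Nothing  -> return Nothing
    Just url -> do
      body   <- liftIO (fetchPage url)
      -- Run the second parser on the fetched text, in IO.
      result <- liftIO (runParserT detailParser () url body)
      -- A failure in the detail page becomes a failure of the outer parser.
      either (parserFail . show) (return . Just) result
  return (title, summary)

-- Hypothetical helper to run the outer parser in IO.
runProgram :: String -> IO (Either ParseError (String, Maybe String))
runProgram = runParserT program () "<listing>"
```

This way the function never hands back a half-filled program: the summary is
either fetched during the parse or is Nothing because there was no link. The
price is that the parser can no longer be run as pure code.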

On Fri, Oct 12, 2012 at 9:28 AM, Emmanuel Touzery <etouzery at gmail.com> wrote:

> Hi,
>
>
>> when parsing the string representing a page, you could
>> save all the links you encounter.
>>
>> After the parsing you would load the linked pages and start
>> again parsing.
>>
>> You would redo this until no more links are returned or a
>> maximum deepness is reached.
>>
>
> Thanks for the tip. That sounds much more reasonable than what I
> mentioned. It seems a bit "spaghetti" to me though in a way (but maybe I
> just have to get used to the Haskell way).
>
> To be more specific about what I want to do: I want to parse TV programs.
> On the first page I have the daily listing for a channel. start/end hour,
> title, category, and link or not.
> To fully parse one TV program I can follow the link if it's present and
> get the extra info which is there (summary, pictures..).
>
> So the first scheme that comes to mind is a method which takes the DOM
> tree of the daily page and returns the list of programs for that day.
>
> Instead, what I must then do, is to return the incomplete programs: the
> data object would have the link filled in, if it's available, but the
> summary, picture... would be empty.
> Then I have a "second pass" in the caller function, where for programs
> which have a link, I would fetch the extra page, and call a second
> function, which will fill in the extra data (thankfully if pictures are
> present I only store their URL so it would stop there, no need for a third
> pass for pictures).
>
> It annoys me that the first function returns "incomplete" objects... It
> somehow feels wrong.
>
> Now that I mentioned my problem with more details, maybe you can think of
> a better way of doing that?
>
> And otherwise I guess this is the policy when writing Haskell code:
> absolutely avoid spreading impure/IO tainted code, even if it maybe
> negatively affects the general structure of the program?
>
> Thanks again for the tip though! That's definitely what I'll do if nothing
> better is suggested. It is actually probably the best way to do that if you
> want to separate IO from "pure" code.
>
> Emmanuel