[Haskell-beginners] calling impure functions from pure code

Emmanuel Touzery etouzery at gmail.com
Fri Oct 12 15:28:39 CEST 2012


Hi,

> when parsing the string representing a page, you could
> save all the links you encounter.
>
> After the parsing you would load the linked pages and start
> again parsing.
>
> You would redo this until no more links are returned or a
> maximum depth is reached.

Thanks for the tip. That sounds much more reasonable than what I
mentioned. It still seems a bit "spaghetti" to me, though (but maybe I
just have to get used to the Haskell way).
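
To check that I understand the suggestion, here is a minimal sketch of
the loop as I picture it. Everything in it is an assumption made for
illustration: the names (extractLinks, crawl), the toy page format
where a link is a line starting with "link:", and the fact that the
fetching action is passed in as a parameter instead of doing real HTTP.

```haskell
import Data.List (isPrefixOf)

type Url = String

-- Pure part: collect the links found in one page.
-- Toy format (an assumption): a link is a line starting with "link:".
extractLinks :: String -> [Url]
extractLinks page = [ drop 5 l | l <- lines page, "link:" `isPrefixOf` l ]

-- Impure driver: fetch the given pages, extract their links with the
-- pure function, then repeat on those links. It stops when there are no
-- more links or when the remaining depth reaches zero.
-- Note: pages already visited are not tracked, so only the depth limit
-- protects against cycles.
crawl :: (Url -> IO String) -> Int -> [Url] -> IO [(Url, String)]
crawl _ _ [] = pure []
crawl fetch depth urls
  | depth <= 0 = pure []
  | otherwise  = do
      pages <- mapM fetch urls
      let found = zip urls pages
          next  = concatMap extractLinks pages
      rest <- crawl fetch (depth - 1) next
      pure (found ++ rest)
```

Only crawl lives in IO here; extractLinks stays pure and can be tested
on plain strings.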

To be more specific about what I want to do: I want to parse TV
programs. On the first page I have the daily listing for a channel:
start/end hour, title, category, and optionally a link.
To fully parse one TV program I can follow the link, if it's present, and
get the extra info found there (summary, pictures...).

So the first scheme that comes to mind is a function which takes the DOM
tree of the daily page and returns the list of programs for that day.

Instead, what I must do is return incomplete programs: the
data object would have the link filled in, if one is available, but the
summary, pictures... would be empty.
Then I have a "second pass" in the caller, where for each program
that has a link I fetch the extra page and call a second
function which fills in the extra data (thankfully, if pictures are
present I only store their URLs, so it stops there: no need for a
third pass for pictures).
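
To make the two passes concrete, here is a minimal sketch of what I
have in mind. All the names (Program, parseDaily, addDetails, loadDay,
fakeFetch) are made up for illustration, and instead of a real DOM tree
I assume a toy text format: one program per line in the daily page
("start|end|title|category|link"), and a detail page whose first line
is the summary and whose remaining lines are picture URLs. The fetching
action is passed in as a parameter, so no HTTP library is assumed.

```haskell
type Url = String

data Program = Program
  { progStart    :: String
  , progEnd      :: String
  , progTitle    :: String
  , progCategory :: String
  , progLink     :: Maybe Url     -- present only for some programs
  , progSummary  :: Maybe String  -- empty after the first pass
  , progPictures :: [Url]         -- empty after the first pass
  } deriving (Show, Eq)

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- First pass (pure): the daily page gives "incomplete" programs.
-- Lines that do not have exactly five fields are skipped.
parseDaily :: String -> [Program]
parseDaily page = [ p | Just p <- map parseLine (lines page) ]
  where
    parseLine l = case splitOn '|' l of
      [s, e, t, c, u] ->
        Just (Program s e t c (if null u then Nothing else Just u) Nothing [])
      _ -> Nothing

-- Second pass (pure): fill in the extra data from a detail page.
addDetails :: String -> Program -> Program
addDetails page p = case lines page of
  []         -> p
  (s : pics) -> p { progSummary = Just s, progPictures = pics }

-- The caller (impure): the only place where pages are fetched.
loadDay :: (Url -> IO String) -> Url -> IO [Program]
loadDay fetch dailyUrl = do
  daily <- fetch dailyUrl
  mapM complete (parseDaily daily)
  where
    complete p = case progLink p of
      Nothing -> pure p
      Just u  -> do
        page <- fetch u
        pure (addDetails page p)

-- Stand-in for a real download, just to try the code without a network.
fakeFetch :: Url -> IO String
fakeFetch "daily"     = pure "20:00|21:00|News|info|\n21:00|22:30|Film|movie|film-page"
fakeFetch "film-page" = pure "A summary.\npic1.jpg\npic2.jpg"
fakeFetch _           = pure ""
```

In GHCi, loadDay fakeFetch "daily" runs both passes; parseDaily and
addDetails can also be called directly on plain strings.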

It annoys me that the first function returns "incomplete" objects... It 
somehow feels wrong.

Now that I have described my problem in more detail, maybe you can think
of a better way of doing that?

Otherwise, I guess this is the policy when writing Haskell code:
avoid spreading impure/IO-tainted code at all costs, even if it may
negatively affect the general structure of the program?

Thanks again for the tip though! That's definitely what I'll do if
nothing better is suggested. It is probably the best way to go
if you want to separate IO from "pure" code.

Emmanuel


