[Haskell-cafe] Converting wiki pages into pdf
mukesh tiwari
mukeshtiwari.iiitm at gmail.com
Thu Sep 8 21:57:41 CEST 2011
I tried to use the PDF-generation facilities. I wrote a script which
generates the rendering URL. When I paste the rendering URL into a
browser it generates the download file, but when I try to get the
tags from it, the result is empty. Could someone please tell me what
is wrong with the code?
Thank You
Mukesh Tiwari
import Network.HTTP
import Text.HTML.TagSoup
import Data.Maybe
parseHelp :: Tag String -> Maybe String
parseHelp ( TagOpen _ attrs )
    | any ( \( _ , v ) -> v == "Download a PDF version of this wiki page" ) attrs =
        -- prepend the host to the relative href of the matching tag
        fmap ( "http://en.wikipedia.org" ++ ) ( lookup "href" attrs )
    | otherwise = Nothing
parseHelp _ = Nothing
parse :: [ Tag String ] -> Maybe String
parse [] = Nothing
parse ( x : xs )
    | isTagOpen x = case parseHelp x of
        Just s  -> Just s
        Nothing -> parse xs
    | otherwise = parse xs
main :: IO ()
main = do
    x <- getLine
    tags_1 <- fmap parseTags $ getResponseBody =<< simpleHTTP ( getRequest x ) -- open the wiki page
    let lst = head . sections ( ~== "<div class=portal id=p-coll-print_export>" ) $ tags_1
        url = fromJust . parse $ lst -- rendering url
    putStrLn url
    tags_2 <- fmap parseTags $ getResponseBody =<< simpleHTTP ( getRequest url )
    print tags_2
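For what it's worth, the link-extraction part can be checked in isolation,
without any HTTP traffic. Below is a minimal sketch of what parseHelp/parse
do; SimpleTag, Open, and pdfLink are stand-ins invented for illustration
(TagSoup's real Tag type is richer), so this mirrors the logic only, not the
library API:

```haskell
-- Base-only stand-in for TagSoup's Tag: an open tag with its
-- attribute list, or anything else.
data SimpleTag = Open String [(String, String)] | Other
    deriving Show

-- Find the first open tag whose attributes carry the PDF-download
-- title, and return its href prefixed with the wiki host.
pdfLink :: [SimpleTag] -> Maybe String
pdfLink [] = Nothing
pdfLink ( Open _ attrs : rest )
    | any ( (== "Download a PDF version of this wiki page") . snd ) attrs =
        fmap ( "http://en.wikipedia.org" ++ ) ( lookup "href" attrs )
    | otherwise = pdfLink rest
pdfLink ( _ : rest ) = pdfLink rest

main :: IO ()
main = print ( pdfLink sample )
    -- prints Just "http://en.wikipedia.org/w/index.php?title=Special:Book&render=1"
  where
    sample =
        [ Other
        , Open "a" [ ( "href" , "/w/index.php?title=Special:Book&render=1" )
                   , ( "title" , "Download a PDF version of this wiki page" ) ]
        ]
```

If this pure part behaves as expected, the empty tags_2 more likely comes
from what the second HTTP request actually returns (for instance a redirect
or a non-HTML body), which is worth inspecting before parsing.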