[Haskell-cafe] Converting wiki pages into pdf
mukesh tiwari
mukeshtiwari.iiitm at gmail.com
Thu Sep 8 14:49:46 CEST 2011
Is it possible to automate this process using Haskell, rather than
clicking and downloading manually?
Thank You
Mukesh Tiwari
On Thu, Sep 8, 2011 at 6:11 PM, Max Rabkin <max.rabkin at gmail.com> wrote:
> This doesn't answer your Haskell question, but Wikipedia has
> PDF-generation facilities ("Books"). Take a look at
> http://en.wikipedia.org/wiki/Help:Book (for single articles, just use
> the "download PDF" option in the sidebar).
>
> --Max
>
> On Thu, Sep 8, 2011 at 14:34, mukesh tiwari
> <mukeshtiwari.iiitm at gmail.com> wrote:
> > Hello all
> > I am trying to write a Haskell program that downloads HTML pages from
> > Wikipedia, including images, and converts them into PDF. I wrote a
> > small script:
> >
> > import Network.HTTP
> > import Data.Maybe
> > import Data.List
> >
> > main = do
> >   x <- getLine
> >   -- open url
> >   htmlpage <- getResponseBody =<< simpleHTTP ( getRequest x )
> >   --print.words $ htmlpage
> >   let ind_1 = fromJust . ( \n -> findIndex ( n `isPrefixOf` ) .
> >                 tails $ htmlpage ) $ "<!-- content -->"
> >       ind_2 = fromJust . ( \n -> findIndex ( n `isPrefixOf` ) .
> >                 tails $ htmlpage ) $ "<!-- /content -->"
> >       tmphtml = drop ind_1 $ take ind_2 htmlpage
> >   writeFile "down.html" tmphtml
> >
> > and it works fine, except that some symbols do not render as they
> > should. Could someone please suggest how to accomplish this
> > task?
> >
> > Thank you
> > Mukesh Tiwari
> >
> > _______________________________________________
> > Haskell-Cafe mailing list
> > Haskell-Cafe at haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
>
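
The marker-based extraction in the quoted script can be sketched as a
small pure function. This is only an illustration under stated
assumptions: the name `between` is hypothetical, it returns Nothing
instead of crashing (as fromJust would) when a marker is missing, and,
unlike the quoted script, it leaves the opening marker out of the
result. It does not fetch anything or address the rendering problem.
One possible cause of the badly rendered symbols, which is an assumption
and not something confirmed in this thread, is that the response body is
handled as a String without UTF-8 decoding.

```haskell
import Data.List (findIndex, isPrefixOf, tails)

-- | Hypothetical helper: the text strictly between the first occurrence
-- of 'open' and the next occurrence of 'close' after it.
-- Returns Nothing if either marker is missing.
between :: String -> String -> String -> Maybe String
between open close page = do
  start <- findIndex (open `isPrefixOf`) (tails page)
  let rest = drop (start + length open) page   -- skip past the opening marker
  end <- findIndex (close `isPrefixOf`) (tails rest)
  return (take end rest)

main :: IO ()
main = do
  -- A made-up page fragment, standing in for the downloaded HTML.
  let page = "<html><!-- content --><p>hello</p><!-- /content --></html>"
  case between "<!-- content -->" "<!-- /content -->" page of
    Nothing   -> putStrLn "markers not found"
    Just body -> putStrLn body
```

With the sample string above, main prints <p>hello</p>.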