[Haskell-cafe] Converting wiki pages into pdf

Max Rabkin max.rabkin at gmail.com
Thu Sep 8 14:41:20 CEST 2011


This doesn't answer your Haskell question, but Wikipedia has
PDF-generation facilities ("Books"). Take a look at
http://en.wikipedia.org/wiki/Help:Book (for single articles, just use
the "download PDF" option in the sidebar).

--Max

On Thu, Sep 8, 2011 at 14:34, mukesh tiwari
<mukeshtiwari.iiitm at gmail.com> wrote:
> Hello all
> I am trying to write a Haskell program that downloads HTML pages from
> Wikipedia, including images, and converts them to PDF. I wrote a
> small script:
>
> import Network.HTTP
> import Data.Maybe
> import Data.List
>
> main = do
>        x <- getLine
>        htmlpage <- getResponseBody =<< simpleHTTP ( getRequest x ) -- open url
>        --print.words $ htmlpage
>        let ind_1 = fromJust . ( \n -> findIndex ( n `isPrefixOf`) . tails $ htmlpage ) $ "<!-- content -->"
>            ind_2 = fromJust . ( \n -> findIndex ( n `isPrefixOf`) . tails $ htmlpage ) $ "<!-- /content -->"
>            tmphtml = drop ind_1 $ take ind_2  htmlpage
>        writeFile "down.html" tmphtml
>
> and it works fine, except that some symbols are not rendered as they
> should be. Could someone please suggest how to accomplish this
> task?
>
> Thank you
> Mukesh Tiwari
>
>
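
On the symbols that do not render in the quoted script: one possible cause, which is an assumption and not something confirmed in this thread, is that the String interface of Network.HTTP hands back each byte of the response as one Char, and writeFile then re-encodes those Chars using the locale encoding, so multi-byte UTF-8 characters get mangled. A minimal sketch of a workaround is to write the Chars back out as raw bytes through a binary-mode handle. The helper names indexOf, sliceBetween and writeRaw are hypothetical, and the page is a hard-coded sample so the sketch runs without network access.

```haskell
import Data.List (findIndex, isPrefixOf, tails)
import System.IO (IOMode (WriteMode), hPutStr, withBinaryFile)

-- Index of the first occurrence of needle in haystack, if any.
indexOf :: String -> String -> Maybe Int
indexOf needle = findIndex (needle `isPrefixOf`) . tails

-- Text from the start marker (inclusive) up to the end marker (exclusive),
-- as in the quoted script, but returning Nothing instead of crashing
-- when a marker is missing or the markers are out of order.
sliceBetween :: String -> String -> String -> Maybe String
sliceBetween start end page = do
  i <- indexOf start page
  j <- indexOf end page
  if i <= j then Just (drop i (take j page)) else Nothing

-- Write each Char as a single byte. A binary-mode handle does no
-- locale encoding, so Chars in the range 0-255 go out unchanged.
writeRaw :: FilePath -> String -> IO ()
writeRaw path body = withBinaryFile path WriteMode (\h -> hPutStr h body)

main :: IO ()
main = do
  -- Hypothetical sample: "\195\169" stands for the two UTF-8 bytes of an
  -- accented e, as they would appear if each byte arrived as one Char.
  let page = "junk<!-- content -->caf\195\169<!-- /content -->junk"
  case sliceBetween "<!-- content -->" "<!-- /content -->" page of
    Nothing   -> putStrLn "content markers not found"
    Just body -> writeRaw "down.html" body
```

In the quoted script, page would be the htmlpage value returned by getResponseBody. If the page is not served as UTF-8, or the bytes are already being decoded somewhere else, this will not help; in that case fetching the body as a ByteString and decoding it explicitly is probably the safer route.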


