[Haskell-cafe] Simple HTTP lib for Windows?
Daniel McAllansmith
dm.maillists at gmail.com
Sat Jan 27 16:04:52 EST 2007
On Sunday 28 January 2007 09:14, Neil Mitchell wrote:
> Hi Alistair,
>
> > > Is there a simple way to get the contents of a webpage using Haskell on
> > > a Windows box?
> >
> > This isn't exactly what you want, but it gets you partway there. Not
> > sure if LineBuffering or NoBuffering is the best option. Line
> > buffering should be fine for just text output, but if you request a
> > binary object (like an image) then you have to read exactly the number
> > of bytes specified, and no more.
>
> This works great for haskell.org, unfortunately it doesn't work as
> well with the rest of the web universe.
>
> With www.google.com I get: Program error: <handle>: IO.hGetChar:
> illegal operation
>
> With www.slashdot.org I get: 501 Not Implemented returned
>
> www.msnbc.msn.com works fine.
>
> Any ideas why?
At the very least it's missing the HTTP version on the request line, and you
almost always need to send a Host header.
For a start you could try changing client to:
client server port page = do
    h <- connectTo server (PortNumber port)
    hSetBuffering h NoBuffering
    putStrLn "send request"
    -- hPutStrLn appends "\n", so the explicit "\r" completes each CRLF
    hPutStrLn h ("GET " ++ page ++ " HTTP/1.1\r")
    hPutStrLn h ("Host: " ++ server ++ "\r")
    -- a single empty line ends the header block
    hPutStrLn h "\r"
    putStrLn "wait for response"
    readResponse h
    putStrLn ""
Note that I haven't tried this, or the rest of Alistair's code, at all, so the
usual 30 day money back guarantee doesn't apply. It certainly won't handle
redirects.
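As a rough sketch of the same idea, the request can be built as one string
with explicit CRLF line endings, so nothing depends on what hPutStrLn or the
handle does with newlines. The name buildGetRequest is made up for this
example, and the "Connection: close" header is my own assumption: it asks the
server to close the socket after the response, which helps if the response is
read until end of file. Like the code above, this is untested against real
servers.

```haskell
-- | Build a minimal HTTP/1.1 GET request as a single string.
-- Every line, including the empty one that ends the headers,
-- is terminated with an explicit CRLF.
buildGetRequest :: String -> String -> String
buildGetRequest host path =
  concatMap (++ "\r\n")
    [ "GET " ++ path ++ " HTTP/1.1"  -- request line with the HTTP version
    , "Host: " ++ host               -- almost always needed
    , "Connection: close"            -- assumption: no persistent connection
    , ""                             -- empty line ends the header block
    ]
```

In client, the three hPutStrLn calls could then become a single
hPutStr h (buildGetRequest server page).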
> Are there any alternatives to read in a file off the
> internet (i.e. wget but as a library)
The HTTP library sort of works most of the time, but there are several bugs
that cause it to fail on many 'in the wild' webservers.
HXT has a wrapper around a command line invocation of cURL. It works better.
There is still a problem with redirects, but that's an easy enough fix.
I doubt that it would be very easy to extract it from the surrounding HXT
framework though.
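For what it's worth, shelling out to curl directly takes only a few lines.
This is not HXT's code, just a minimal sketch that assumes a curl executable
on the PATH and a version of the process package that provides readProcess.
The names curlArgs and fetchWithCurl are invented for the example. The
--location flag tells curl to follow redirects itself.

```haskell
import System.Process (readProcess)

-- | Arguments for a quiet curl run that follows redirects.
curlArgs :: String -> [String]
curlArgs url = ["--silent", "--location", url]

-- | Fetch a URL by running curl and returning what it writes to stdout.
-- Assumes curl is installed and on the PATH. readProcess raises an
-- IOError if curl exits with a non-zero status.
fetchWithCurl :: String -> IO String
fetchWithCurl url = readProcess "curl" (curlArgs url) ""
```

Something like fetchWithCurl "http://www.haskell.org/" >>= putStr should then
print the page. The output is read as text, so this is only suitable for
pages, not for binary objects such as images.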
It would be nice to have a binding to libcurl.
Daniel