[Haskell-cafe] playing around with network.curl

Michael Litchard michael at schmong.org
Wed Oct 20 21:31:37 EDT 2010


I'm afraid I'll need a more complete example. When I try to modify the
code above (after correcting the conditional tests), I get the
following:
* About to connect() to github.com port 443 (#0)
*   Trying 207.97.227.239... * connected
* Connected to github.com (207.97.227.239) port 443 (#0)
* found 142 certificates in /etc/ssl/certs/ca-certificates.crt
* 	 server certificate verification SKIPPED
* 	 common name: *.github.com (matched)
* 	 server certificate expiration date OK
* 	 server certificate activation date OK
* 	 certificate public key: RSA
* 	 certificate version: #3
* 	 subject: O=*.github.com,OU=Domain Control Validated,CN=*.github.com
* 	 start date: Fri, 11 Dec 2009 05:02:36 GMT
* 	 expire date: Thu, 11 Dec 2014 05:02:36 GMT
* 	 issuer: C=US,ST=Arizona,L=Scottsdale,O=GoDaddy.com\,
Inc.,OU=http://certificates.godaddy.com/repository,CN=Go Daddy Secure
Certification Authority,serialNumber=07969287
* 	 compression: NULL
* 	 cipher: AES-128-CBC
* 	 MAC: SHA1
> GET /session HTTP/1.1
Host: github.com
Accept: */*

< HTTP/1.1 302 Found
< Server: nginx/0.7.67
< Date: Thu, 21 Oct 2010 01:20:13 GMT
< Content-Type: text/html; charset=utf-8
< Connection: keep-alive
< Status: 302 Found
< Location: http://github.com/session
< X-Runtime: 0ms
< Content-Length: 91
* Added cookie _github_ses="BAh7BiIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--884981fc5aa85daf318eeff084d98e2cff92578f"
for domain github.com, path /, expire 1577865600
< Set-Cookie: _github_ses=BAh7BiIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--884981fc5aa85daf318eeff084d98e2cff92578f;
path=/; expires=Wed, 01 Jan 2020 08:00:00 GMT; HttpOnly
< Cache-Control: no-cache
<
* Ignoring the response-body
* Connection #0 to host github.com left intact
* Issue another request to this URL: 'http://github.com/session'
* About to connect() to github.com port 80 (#1)
*   Trying 207.97.227.239... * connected
* Connected to github.com (207.97.227.239) port 80 (#1)
> GET /session HTTP/1.1
Host: github.com
Accept: */*
Cookie: _github_ses=BAh7BiIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--884981fc5aa85daf318eeff084d98e2cff92578f

< HTTP/1.1 302 Found
< Server: nginx/0.7.67
< Date: Thu, 21 Oct 2010 01:20:13 GMT
< Content-Type: text/html; charset=utf-8
< Connection: keep-alive
< Status: 302 Found
< Location: http://github.com/login
< X-Runtime: 1ms
< Content-Length: 89
* Replaced cookie
_github_ses="BAh7BzoRbG9jYWxlX2d1ZXNzMCIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--e10506e0f6935897cafe4f56774e20aa35e579a5"
for domain github.com, path /, expire 1577865600
< Set-Cookie: _github_ses=BAh7BzoRbG9jYWxlX2d1ZXNzMCIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--e10506e0f6935897cafe4f56774e20aa35e579a5;
path=/; expires=Wed, 01 Jan 2020 08:00:00 GMT; HttpOnly
< Cache-Control: no-cache
<
* Ignoring the response-body
* Connection #1 to host github.com left intact
* Issue another request to this URL: 'http://github.com/login'
* Re-using existing connection! (#1) with host github.com
* Connected to github.com (207.97.227.239) port 80 (#1)
> GET /login HTTP/1.1
Host: github.com
Accept: */*
Cookie: _github_ses=BAh7BzoRbG9jYWxlX2d1ZXNzMCIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--e10506e0f6935897cafe4f56774e20aa35e579a5

< HTTP/1.1 302 Found
< Server: nginx/0.7.67
< Date: Thu, 21 Oct 2010 01:20:13 GMT
< Content-Type: text/html; charset=utf-8
< Connection: keep-alive
< Status: 302 Found
< Location: https://github.com/login
< X-Runtime: 0ms
< Content-Length: 90
< Cache-Control: no-cache
<
* Ignoring the response-body
* Connection #1 to host github.com left intact
* Issue another request to this URL: 'https://github.com/login'
* Re-using existing connection! (#0) with host github.com
* Connected to github.com (207.97.227.239) port 443 (#0)
> GET /login HTTP/1.1
Host: github.com
Accept: */*
Cookie: _github_ses=BAh7BzoRbG9jYWxlX2d1ZXNzMCIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--e10506e0f6935897cafe4f56774e20aa35e579a5

< HTTP/1.1 200 OK
< Server: nginx/0.7.67
< Date: Thu, 21 Oct 2010 01:20:13 GMT
< Content-Type: text/html; charset=utf-8
< Connection: keep-alive
< Status: 200 OK
< ETag: "00d0a01a93453f641cecb212b6487496"
< X-Runtime: 10ms
< Content-Length: 15622
* Added cookie csrf_id="f3ff252d5245dc4da10f0c75badbd10c" for domain
github.com, path /, expire 0
< Set-Cookie: csrf_id=f3ff252d5245dc4da10f0c75badbd10c; path=/
< Cache-Control: private, max-age=0, must-revalidate
<
* Connection #0 to host github.com left intact
*** Exception: Failed to log in: CurlOK -- HTTP/1.1 302 Found
*Main> * Closing connection #1
* Closing connection #0


Here's what the code looks like. Notice there's a conditional test for
a status of 302, but that test fails where it should have succeeded.
Removing the CurlFollowLocation constructor fixes that problem, but I
am left with the problem I began with.
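
One possible reading of the trace, sketched in plain Haskell: with
redirects followed, the status the test sees may be that of the last
response in the chain (200 here), while the status line printed in the
error message comes from the first response. This is an assumption
about how respStatus behaves, not something confirmed against the curl
package. The names traceStatuses, observedStatus and statusTestFails
are hypothetical helpers for illustration, not part of Network.Curl.

```haskell
-- Status codes of the four responses in the trace above, in order.
traceStatuses :: [Int]
traceStatuses = [302, 302, 302, 200]

-- | Hypothetical model of the status a caller observes.
-- ASSUMPTION: when redirects are followed, the reported status is the
-- one from the last response; otherwise it is the first response's.
observedStatus :: Bool -> [Int] -> Maybe Int
observedStatus _     []          = Nothing
observedStatus True  chain       = Just (last chain)
observedStatus False (first : _) = Just first

-- | The same comparison used in the login test: True means the test
-- takes the error branch.
statusTestFails :: Int -> Bool
statusTestFails status = status /= 302
```

Under that assumption, the test takes the error branch whenever
CurlFollowLocation is on, and passes when it is off, which matches the
behaviour described above.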


> import Network.Curl
> import System.Environment (getArgs)
> import Text.Regex.Posix

> -- | Standard options used for all requests. @CurlVerbose True@ prints
> -- lots of info on STDERR; remove it for quieter output.
> opts = [ CurlCookieJar "cookies" , CurlVerbose True, CurlFollowLocation True]

> -- | Additional options to simulate submitting the login form.
> loginOptions user pass =
>   CurlPostFields [ "login=" ++ user, "password=" ++ pass ] : method_POST

> main = withCurlDo $ do
>   -- Get username and password from command line arguments (will cause
>   -- pattern match failure if incorrect number of args provided).
>   [user, pass] <- getArgs

>   -- Initialize curl instance.
>   curl <- initialize
>   setopts curl opts

>   -- POST request to login.
>   r <- do_curl_ curl "https://github.com/session" (loginOptions user pass)
>     :: IO CurlResponse
>   if respCurlCode r /= CurlOK || respStatus r /= 302
>     then error $ "Failed to log in: "
>                ++ show (respCurlCode r) ++ " -- " ++ respStatusLine r
>     else do
>       -- GET request to fetch account page.
>       r <- do_curl_ curl ("https://github.com/account") method_GET
>         :: IO CurlResponse
>       if respCurlCode r /= CurlOK || respStatus r /= 302
>         then error $ "Failed to retrieve account page: "
>                    ++ show (respCurlCode r) ++ " -- " ++ respStatusLine r
>         else putStrLn $ extractToken $ respBody r

> -- | Extracts the token from GitHub account HTML page.
> extractToken body = head' "GitHub token not found" xs
>   where
>     head' msg l = if null l then error msg else head l
>     (_,_,_,xs)  = body =~ "github\\.token (.+)"
>                :: (String, String, String,[String])
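
As a side note on extractToken, here is a minimal sketch of a variant
that returns Nothing instead of calling error when the token is
missing. It uses only base rather than Text.Regex.Posix, and it
approximates the pattern by taking the rest of the line after the first
"github.token " marker that has something after it. The name
extractTokenMaybe is hypothetical, and the newline handling is an
assumption about what the regex was meant to capture.

```haskell
import Data.List  (stripPrefix, tails)
import Data.Maybe (listToMaybe)

-- | Hypothetical variant of extractToken: Nothing when no token is found.
-- ASSUMPTION: the token is whatever follows "github.token " up to the
-- end of that line.
extractTokenMaybe :: String -> Maybe String
extractTokenMaybe body = listToMaybe
  [ token
  | suffix     <- tails body                              -- every starting position
  , Just after <- [stripPrefix "github.token " suffix]    -- marker found here?
  , let token = takeWhile (/= '\n') after                 -- rest of the line
  , not (null token)                                      -- mirror (.+): need one char
  ]
```

The caller can then decide what to do with a missing token, for
example printing a message with maybe instead of aborting.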

On Wed, Oct 20, 2010 at 4:33 PM, Paulo Tanimoto <ptanimoto at gmail.com> wrote:
> Hi Michael,
>
> On Wed, Oct 20, 2010 at 6:19 PM, Michael Litchard <michael at schmong.org> wrote:
>> I'm using this tutorial as a guide
>> http://flygdynamikern.blogspot.com/2009/03/extended-sessions-with-haskell-curl.html
>>
>> github has changed since this was posted, but I have managed a
>> successful login. Now I am faced with dealing with a re-direct.
>> I found this constructor
>> CurlFollowLocation Bool
>>
>> on this page
>> http://hackage.haskell.org/packages/archive/curl/1.3.5/doc/html/Network-Curl-Opts.html
>>
>> It seems to do what I want, but I am not clear on how to use it. Could
>> someone provide an example?
>>
>> Thanks. End goal is to snarf the cookie that establishes the session.
>
> I think it is what you're probably expecting.  Just add
> CurlFollowLocation True to the list of options.  Those get "applied"
> by setopts in main.
>
>> opts = [ CurlCookieJar "cookies", CurlFollowLocation True ]
>
> I've used that before with no problems.  Do you need a complete example?
>
> Paulo
>

