[Haskell-cafe] playing around with network.curl

Paulo Tanimoto ptanimoto at gmail.com
Thu Oct 21 00:13:50 EDT 2010


On Wed, Oct 20, 2010 at 8:31 PM, Michael Litchard <michael at schmong.org> wrote:
> I'm afraid I'll need a more complete example. When I try to modify the
> code above (after correcting the conditional tests), I get the
> following:
> * About to connect() to github.com port 443 (#0)
> *   Trying 207.97.227.239... * connected
> * Connected to github.com (207.97.227.239) port 443 (#0)
> * found 142 certificates in /etc/ssl/certs/ca-certificates.crt
> *        server certificate verification SKIPPED
> *        common name: *.github.com (matched)
> *        server certificate expiration date OK
> *        server certificate activation date OK
> *        certificate public key: RSA
> *        certificate version: #3
> *        subject: O=*.github.com,OU=Domain Control Validated,CN=*.github.com
> *        start date: Fri, 11 Dec 2009 05:02:36 GMT
> *        expire date: Thu, 11 Dec 2014 05:02:36 GMT
> *        issuer: C=US,ST=Arizona,L=Scottsdale,O=GoDaddy.com\,
> Inc.,OU=http://certificates.godaddy.com/repository,CN=Go Daddy Secure
> Certification Authority,serialNumber=07969287
> *        compression: NULL
> *        cipher: AES-128-CBC
> *        MAC: SHA1
>> GET /session HTTP/1.1
> Host: github.com
> Accept: */*
>
> < HTTP/1.1 302 Found
> < Server: nginx/0.7.67
> < Date: Thu, 21 Oct 2010 01:20:13 GMT
> < Content-Type: text/html; charset=utf-8
> < Connection: keep-alive
> < Status: 302 Found
> < Location: http://github.com/session
> < X-Runtime: 0ms
> < Content-Length: 91
> * Added cookie _github_ses="BAh7BiIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--884981fc5aa85daf318eeff084d98e2cff92578f"
> for domain github.com, path /, expire 1577865600
> < Set-Cookie: _github_ses=BAh7BiIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--884981fc5aa85daf318eeff084d98e2cff92578f;
> path=/; expires=Wed, 01 Jan 2020 08:00:00 GMT; HttpOnly
> < Cache-Control: no-cache
> <
> * Ignoring the response-body
> * Connection #0 to host github.com left intact
> * Issue another request to this URL: 'http://github.com/session'
> * About to connect() to github.com port 80 (#1)
> *   Trying 207.97.227.239... * connected
> * Connected to github.com (207.97.227.239) port 80 (#1)
>> GET /session HTTP/1.1
> Host: github.com
> Accept: */*
> Cookie: _github_ses=BAh7BiIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--884981fc5aa85daf318eeff084d98e2cff92578f
>
> < HTTP/1.1 302 Found
> < Server: nginx/0.7.67
> < Date: Thu, 21 Oct 2010 01:20:13 GMT
> < Content-Type: text/html; charset=utf-8
> < Connection: keep-alive
> < Status: 302 Found
> < Location: http://github.com/login
> < X-Runtime: 1ms
> < Content-Length: 89
> * Replaced cookie
> _github_ses="BAh7BzoRbG9jYWxlX2d1ZXNzMCIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--e10506e0f6935897cafe4f56774e20aa35e579a5"
> for domain github.com, path /, expire 1577865600
> < Set-Cookie: _github_ses=BAh7BzoRbG9jYWxlX2d1ZXNzMCIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--e10506e0f6935897cafe4f56774e20aa35e579a5;
> path=/; expires=Wed, 01 Jan 2020 08:00:00 GMT; HttpOnly
> < Cache-Control: no-cache
> <
> * Ignoring the response-body
> * Connection #1 to host github.com left intact
> * Issue another request to this URL: 'http://github.com/login'
> * Re-using existing connection! (#1) with host github.com
> * Connected to github.com (207.97.227.239) port 80 (#1)
>> GET /login HTTP/1.1
> Host: github.com
> Accept: */*
> Cookie: _github_ses=BAh7BzoRbG9jYWxlX2d1ZXNzMCIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--e10506e0f6935897cafe4f56774e20aa35e579a5
>
> < HTTP/1.1 302 Found
> < Server: nginx/0.7.67
> < Date: Thu, 21 Oct 2010 01:20:13 GMT
> < Content-Type: text/html; charset=utf-8
> < Connection: keep-alive
> < Status: 302 Found
> < Location: https://github.com/login
> < X-Runtime: 0ms
> < Content-Length: 90
> < Cache-Control: no-cache
> <
> * Ignoring the response-body
> * Connection #1 to host github.com left intact
> * Issue another request to this URL: 'https://github.com/login'
> * Re-using existing connection! (#0) with host github.com
> * Connected to github.com (207.97.227.239) port 443 (#0)
>> GET /login HTTP/1.1
> Host: github.com
> Accept: */*
> Cookie: _github_ses=BAh7BzoRbG9jYWxlX2d1ZXNzMCIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNoSGFzaHsABjoKQHVzZWR7AA%3D%3D--e10506e0f6935897cafe4f56774e20aa35e579a5
>
> < HTTP/1.1 200 OK
> < Server: nginx/0.7.67
> < Date: Thu, 21 Oct 2010 01:20:13 GMT
> < Content-Type: text/html; charset=utf-8
> < Connection: keep-alive
> < Status: 200 OK
> < ETag: "00d0a01a93453f641cecb212b6487496"
> < X-Runtime: 10ms
> < Content-Length: 15622
> * Added cookie csrf_id="f3ff252d5245dc4da10f0c75badbd10c" for domain
> github.com, path /, expire 0
> < Set-Cookie: csrf_id=f3ff252d5245dc4da10f0c75badbd10c; path=/
> < Cache-Control: private, max-age=0, must-revalidate
> <
> * Connection #0 to host github.com left intact
> *** Exception: Failed to log in: CurlOK -- HTTP/1.1 302 Found
> *Main> * Closing connection #1
> * Closing connection #0
>
>
> Here's what the code looks like. Notice there's a conditional test for a
> status of 302, but that test fails where it should have succeeded.
> Removing the CurlFollowLocation constructor fixes that problem, but then
> I am left with the problem I began with.
>
>
>> import Network.Curl
>> import System.Environment (getArgs)
>> import Text.Regex.Posix
>
>> -- | Standard options used for all requests. @CurlVerbose True@ makes
>> -- libcurl print the request/response trace; remove it to silence that.
>> opts = [ CurlCookieJar "cookies", CurlVerbose True, CurlFollowLocation True ]
>
>> -- | Additional options to simulate submitting the login form.
>> loginOptions user pass =
>>   CurlPostFields [ "login=" ++ user, "password=" ++ pass ] : method_POST
>
>> main = withCurlDo $ do
>>   -- Get username and password from command line arguments (will cause
>>   -- pattern match failure if incorrect number of args provided).
>>   [user, pass] <- getArgs
>
>>   -- Initialize curl instance.
>>   curl <- initialize
>>   setopts curl opts
>
>>   -- POST request to login.
>>   r <- do_curl_ curl "https://github.com/session" (loginOptions user pass)
>>     :: IO CurlResponse
>>   if respCurlCode r /= CurlOK || respStatus r /= 302
>>     then error $ "Failed to log in: "
>>                ++ show (respCurlCode r) ++ " -- " ++ respStatusLine r
>>     else do
>>       -- GET request to fetch account page.
>>       r <- do_curl_ curl ("https://github.com/account") method_GET
>>         :: IO CurlResponse
>>       if respCurlCode r /= CurlOK || respStatus r /= 302
>>         then error $ "Failed to retrieve account page: "
>>                    ++ show (respCurlCode r) ++ " -- " ++ respStatusLine r
>>         else putStrLn $ extractToken $ respBody r
>> -- | Extracts the token from GitHub account HTML page.
>> extractToken body = head' "GitHub token not found" xs
>>   where
>>     head' msg l = if null l then error msg else head l
>>     (_,_,_,xs)  = body =~ "github\\.token (.+)"
>>                :: (String, String, String,[String])
>
> On Wed, Oct 20, 2010 at 4:33 PM, Paulo Tanimoto <ptanimoto at gmail.com> wrote:
>> Hi Michael,
>>
>> On Wed, Oct 20, 2010 at 6:19 PM, Michael Litchard <michael at schmong.org> wrote:
>>> I'm using this tutorial as a guide
>>> http://flygdynamikern.blogspot.com/2009/03/extended-sessions-with-haskell-curl.html
>>>
>>> github has changed since this was posted, but I have managed a
>>> successful login. Now I am faced with dealing with a re-direct.
>>> I found this constructor
>>> CurlFollowLocation Bool
>>>
>>> on this page
>>> http://hackage.haskell.org/packages/archive/curl/1.3.5/doc/html/Network-Curl-Opts.html
>>>
>>> It seems to do what I want, but I am not clear on how to use it. Could
>>> someone provide an example?
>>>
>>> Thanks. End goal is to snarf the cookie that establishes the session.
>>
>> I think it does what you're expecting.  Just add
>> CurlFollowLocation True to the list of options.  Those get "applied"
>> by setopts in main.
>>
>>> opts = [ CurlCookieJar "cookies", CurlFollowLocation True ]
>>
>> I've used that before with no problems.  Do you need a complete example?
>>
>> Paulo
>>
>

I don't think this fixes the problem, but shouldn't the first link be
"https://github.com/login" instead of "session"?
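
As for the end goal mentioned earlier in the thread (grabbing the session
cookie), here is a rough, untested-against-GitHub sketch of pulling a cookie
value out of the response headers. It assumes the headers are available as
(name, value) pairs, which I believe is the shape respHeaders gives for a
CurlResponse. The names cookieValue and trim are made up for illustration,
and the header values in main are shortened placeholders, not real cookies.

```haskell
import Data.Char (isSpace, toLower)
import Data.List (isPrefixOf)
import Data.Maybe (listToMaybe)

-- | Trim leading and trailing whitespace (hypothetical helper).
trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- | Look up the value of the named cookie among response headers given
-- as (name, value) pairs. If the cookie is set more than once, as can
-- happen across a redirect chain, the last value wins.
cookieValue :: String -> [(String, String)] -> Maybe String
cookieValue name headers = listToMaybe (reverse matches)
  where
    prefix  = name ++ "="
    matches =
      [ drop (length prefix) firstField
      | (h, v) <- headers
      , map toLower (trim h) == "set-cookie"
        -- Only the first ';'-separated field carries name=value;
        -- the rest are attributes such as path and expires.
      , let firstField = takeWhile (/= ';') (trim v)
      , prefix `isPrefixOf` firstField
      ]

main :: IO ()
main = do
  -- Placeholder headers, assumed to mirror what the library hands back.
  let headers =
        [ ("Content-Type", " text/html; charset=utf-8")
        , ("Set-Cookie", " _github_ses=abc123; path=/; HttpOnly")
        , ("Set-Cookie", " csrf_id=f3ff; path=/")
        ]
  print (cookieValue "_github_ses" headers)  -- Just "abc123"
  print (cookieValue "missing" headers)      -- Nothing
```

In the real program you would pass the headers of the response in place of
the placeholder list. Since CurlCookieJar is already set, libcurl should also
write the cookies to the "cookies" file when the handle is cleaned up, so
reading that file is another option.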

