[Haskell-cafe] Contributing to http-conduit

Myles C. Maxfield myles.maxfield at gmail.com
Fri Feb 3 05:32:22 CET 2012


Alright. After reading the spec, I have these questions / concerns:

   - The spec supports the "Max-Age" cookie attribute, which Web.Cookies
   doesn't.
      - I see two possible solutions to this. The first is to have
      parseSetCookie take a UTCTime as an argument which will represent the
      current time so it can populate the setCookieExpires field by adding the
      Max-Age attribute to the current time. Alternatively, that function can
      return an IO SetCookie so it can ask for the current time by
itself (which
      I think is inferior to taking the current time as an argument). Note that
      the spec says to prefer Max-Age over Expires.
      - Add a field to SetCookie of type Maybe DiffTime which represents
      the Max-Age attribute
   - Cookie code should be aware of the Public Suffix
List<http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat>
as
   a part of its domain verification. The cookie code only needs to be able to
   tell if a specific string is in the list (W.Ascii -> Bool)
      - I propose making an entirely unrelated package, public-suffix-list,
      with a module Network.PublicSuffixList, which will expose this
function, as
      well as functions about parsing the list itself. Thoughts?
   - Web.Cookie doesn't have a "secure-only" attribute. Adding one in is
   straightforward enough.
   - The spec describes cookies as a property of HTTP, not of the World
   Wide Web. Perhaps "Web.Cookie" should be renamed? Just a thought; it
   doesn't really matter to me.

As for Network.HTTP.Conduit.Cookie, the spec describes in section 5.3
"Storage Model" what fields a Cookie has. Here is my proposal for the
functions it will expose:

   - receiveSetCookie :: SetCookie -> Req.Request m -> UTCTime -> Bool ->
   CookieJar -> CookieJar
      - Runs the algorithm described in section 5.3 "Storage Model"
      - The UTCTime is the current-time, the Bool is whether or not the
      caller is an HTTP-based API (as opposed to JavaScript or anything else)
   - updateCookieJar :: Res.Response a -> Req.Request m -> UTCTime ->
   CookieJar -> (CookieJar, Res.Response a)
      - Applies "receiveSetCookie" to a Response. The output CookieJar is
      stripped of any Set-Cookie headers.
      - Specifies "True" for the Bool in receiveSetCookie
   - computeCookieString :: Req.Request m -> CookieJar -> UTCTime -> Bool
   -> (W.Ascii, CookieJar)
      - Runs the algorithm described in section 5.4 "The Cookie Header"
      - The UTCTime and Bool are the same as in receiveSetCookie
   - insertCookiesIntoRequest :: Req.Request m -> CookieJar -> UTCTime ->
   (Req.Request m, CookieJar)
      - Applies "computeCookieString" to a Request. The output cookie jar
      has updated last-accessed-times.
      - Specifies "True" for the Bool in computeCookieString
   - evictExpiredCookies :: CookieJar -> UTCTime -> CookieJar
      - Runs the algorithm described in the last part of section 5.3
      "Storage Model"

This will make the relevant part of 'http' look like:

    go count req'' cookie_jar'' = do
        now <- liftIO $ getCurrentTime
        let (req', cookie_jar') = insertCookiesIntoRequest req''
(evictExpiredCookies cookie_jar'' now) now
        res' <- httpRaw req' manager
        let (cookie_jar, res) = updateCookieJar res' req' now cookie_jar'
        case getRedirectedRequest req' (responseHeaders res) (W.statusCode
(statusCode res)) of
            Just req -> go (count - 1) req cookie_jar
            Nothing -> return res

I plan to not allow for a user-supplied cookieFilter function. If they want
that functionality, they can re-implement the redirection-following logic.

Any thoughts on any of this?

Thanks,
Myles

On Wed, Feb 1, 2012 at 5:19 PM, Myles C. Maxfield
<myles.maxfield at gmail.com>wrote:

> Nope. I'm not. The RFC is very explicit about how to handle cookies. As
> soon as I'm finished making sense of it (in terms of Haskell) I'll send
> another proposal email.
>  On Feb 1, 2012 3:25 AM, "Michael Snoyman" <michael at snoyman.com> wrote:
>
>> You mean you're *not* making this proposal?
>>
>> On Wed, Feb 1, 2012 at 7:30 AM, Myles C. Maxfield
>> <myles.maxfield at gmail.com> wrote:
>> > Well, this is embarrassing. Please disregard my previous email. I should
>> > learn to read the RFC *before* submitting proposals.
>> >
>> > --Myles
>> >
>> >
>> > On Tue, Jan 31, 2012 at 6:37 PM, Myles C. Maxfield
>> > <myles.maxfield at gmail.com> wrote:
>> >>
>> >> Here are my initial ideas about supporting cookies. Note that I'm using
>> >> Chrome for ideas since it's open source.
>> >>
>> >> Network/HTTP/Conduit/Cookies.hs file
>> >> Exporting the following symbols:
>> >>
>> >> type StuffedCookie = SetCookie
>> >>
>> >> A regular SetCookie can have Nothing for its Domain and Path
>> attributes. A
>> >> StuffedCookie has to have these fields set.
>> >>
>> >> type CookieJar = [StuffedCookie]
>> >>
>> >> Chrome's cookie jar is implemented as (the C++ equivalent of) Map
>> W.Ascii
>> >> StuffedCookie. The key is the "eTLD+1" of the domain, so lookups for
>> all
>> >> cookies for a given domain are fast.
>> >> I think I'll stay with just a list of StuffedCookies just to keep it
>> >> simple. Perhaps a later revision can implement the faster map.
>> >>
>> >> getRelevantCookies :: Request m -> CookieJar -> UTCTime -> (CookieJar,
>> >> Cookies)
>> >>
>> >> Gets all the cookies from the cookie jar that should be set for the
>> given
>> >> Request.
>> >> The time argument is whatever "now" is (it's pulled out of the
>> function so
>> >> the function can remain pure and easily testable)
>> >> The function will also remove expired cookies from the cookie jar
>> (given
>> >> what "now" is) and return the filtered cookie jar
>> >>
>> >> putRelevantCookies :: Request m -> CookieJar -> [StuffedCookie] ->
>> >> CookieJar
>> >>
>> >> Insert cookies from a server response into the cookie jar.
>> >> The first argument is only used for checking to see which cookies are
>> >> valid (which cookies match the requested domain, etc, so site1.comcan't set
>> >> a cookie for site2.com)
>> >>
>> >> stuffCookie :: Request m -> SetCookie -> StuffedCookie
>> >>
>> >> If the SetCookie's fields are Nothing, fill them in given the Request
>> from
>> >> which it originated
>> >>
>> >> getCookies :: Response a -> ([SetCookie], Response a)
>> >>
>> >> Pull cookies out of a server response. Return the response with the
>> >> Set-Cookie headers filtered out
>> >>
>> >> putCookies :: Request a -> Cookies -> Request a
>> >>
>> >> A wrapper around renderCookies. Inserts some cookies into a request.
>> >> Doesn't overwrite cookies that are already set in the request
>> >>
>> >> These functions will be exported from Network.HTTP.Conduit as well, so
>> >> callers can use them to re-implement redirection chains
>> >> I won't implement a cookie filtering function (like what
>> Network.Browser
>> >> has)
>> >>
>> >> If you want to have arbitrary handling of cookies, re-implement
>> >> redirection following. It's not very difficult if you use the API
>> provided,
>> >> and the 'http' function is open source so you can use that as a
>> reference.
>> >>
>> >> I will implement the functions according to RFC 6265
>> >> I will also need to write the following functions. Should they also be
>> >> exported?
>> >>
>> >> canonicalizeDomain :: W.Ascii -> W.Ascii
>> >>
>> >> turns "..a.b.c..d.com..." to "a.b.c.d.com"
>> >> Technically necessary for domain matching (Chrome does it)
>> >> Perhaps unnecessary for a first pass? Perhaps we can trust users for
>> now?
>> >>
>> >> domainMatches :: W.Ascii -> W.Ascii -> Maybe W.Ascii
>> >>
>> >> Does the first domain match against the second domain?
>> >> If so, return the prefix of the first that isn't in the second
>> >>
>> >> pathMatches :: W.Ascii -> W.Ascii -> Bool
>> >>
>> >> Do the paths match?
>> >>
>> >> In order to implement domain matching, I have to have knowledge of
>> >> the Public Suffix List so I know that sub1.sub2.pvt.k12.wy.us can set
>> a
>> >> cookie for sub2.pvt.k12.wy.us but not for k12.wy.us (because
>> pvt.k12.wy.us
>> >> is a "suffix"). There are a variety of ways to implement this.
>> >>
>> >> As far as I can tell, Chrome does it by using a script (which a human
>> >> periodically runs) which parses the list at creates a .cc file that is
>> >> included in the build.
>> >>
>> >> I might be wrong about the execution of the script; it might be a build
>> >> step. If it is a build step, however, it is suspicious that a build
>> target
>> >> would try to download a file...
>> >>
>> >> Any more elegant ideas?
>> >>
>> >> Feedback on any/all of the above would be very helpful before I go off
>> >> into the weeds on this project.
>> >>
>> >> Thanks,
>> >> Myles C. Maxfield
>> >>
>> >> On Sat, Jan 28, 2012 at 8:17 PM, Michael Snoyman <michael at snoyman.com>
>> >> wrote:
>> >>>
>> >>> Thanks, looks great! I've merged it into the Github tree.
>> >>>
>> >>> On Sat, Jan 28, 2012 at 8:36 PM, Myles C. Maxfield
>> >>> <myles.maxfield at gmail.com> wrote:
>> >>> > Ah, yes, you're completely right. I completely agree that moving the
>> >>> > function into the Maybe monad increases readability. This kind of
>> >>> > function
>> >>> > is what the Maybe monad was designed for.
>> >>> >
>> >>> > Here is a revised patch.
>> >>> >
>> >>> >
>> >>> > On Sat, Jan 28, 2012 at 8:28 AM, Michael Snoyman <
>> michael at snoyman.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> On Sat, Jan 28, 2012 at 1:20 AM, Myles C. Maxfield
>> >>> >> <myles.maxfield at gmail.com> wrote:
>> >>> >> > the fromJust should never fail, beceause of the guard statement:
>> >>> >> >
>> >>> >> >     | 300 <= code && code < 400 && isJust l'' && isJust l' =
>> Just $
>> >>> >> > req
>> >>> >> >
>> >>> >> > Because of the order of the && operators, it will only evaluate
>> >>> >> > fromJust
>> >>> >> > after it makes sure that the argument isJust. That function in
>> >>> >> > particular
>> >>> >> > shouldn't throw any exceptions - it should only return Nothing.
>> >>> >> >
>> >>> >> > Knowing that, I don't quite think I understand what your concern
>> is.
>> >>> >> > Can
>> >>> >> > you
>> >>> >> > elaborate?
>> >>> >>
>> >>> >> You're right, but I had to squint really hard to prove to myself
>> that
>> >>> >> you're right. That's the kind of code that could easily be broken
>> in
>> >>> >> future updates by an unwitting maintainer (e.g., me). To protect
>> the
>> >>> >> world from me, I'd prefer if the code didn't have the fromJust.
>> This
>> >>> >> might be a good place to leverage the Monad instance of Maybe.
>> >>> >>
>> >>> >> Michael
>> >>> >
>> >>> >
>> >>
>> >>
>> >
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20120202/681b054a/attachment.htm>


More information about the Haskell-Cafe mailing list