[Haskell-cafe] Contributing to http-conduit

Michael Snoyman michael at snoyman.com
Fri Feb 3 07:53:50 CET 2012


Looks good to me too. I agree with Aristid: let's make the change to
cookie itself. Do you want to send a pull request? I'm also
considering making the SetCookie constructor hidden like we have for
Request, so that if in the future we realize we need to add some other
settings, it doesn't break the API.

Chris: I would recommend compiling it into the module. Best bet would
likely being converting the source file to Haskell source.

Michael

On Fri, Feb 3, 2012 at 6:32 AM, Myles C. Maxfield
<myles.maxfield at gmail.com> wrote:
> Alright. After reading the spec, I have these questions / concerns:
>
> The spec supports the "Max-Age" cookie attribute, which Web.Cookies doesn't.
>
> I see two possible solutions to this. The first is to have parseSetCookie
> take a UTCTime as an argument which will represent the current time so it
> can populate the setCookieExpires field by adding the Max-Age attribute to
> the current time. Alternatively, that function can return an IO SetCookie so
> it can ask for the current time by itself (which I think is inferior to
> taking the current time as an argument). Note that the spec says to prefer
> Max-Age over Expires.
> Add a field to SetCookie of type Maybe DiffTime which represents the Max-Age
> attribute
>
> Cookie code should be aware of the Public Suffix List as a part of its
> domain verification. The cookie code only needs to be able to tell if a
> specific string is in the list (W.Ascii -> Bool)
>
> I propose making an entirely unrelated package, public-suffix-list, with a
> module Network.PublicSuffixList, which will expose this function, as well as
> functions about parsing the list itself. Thoughts?
>
> Web.Cookie doesn't have a "secure-only" attribute. Adding one in is
> straightforward enough.
> The spec describes cookies as a property of HTTP, not of the World Wide Web.
> Perhaps "Web.Cookie" should be renamed? Just a thought; it doesn't really
> matter to me.
>
> As for Network.HTTP.Conduit.Cookie, the spec describes in section 5.3
> "Storage Model" what fields a Cookie has. Here is my proposal for the
> functions it will expose:
>
> receiveSetCookie :: SetCookie -> Req.Request m -> UTCTime -> Bool ->
> CookieJar -> CookieJar
>
> Runs the algorithm described in section 5.3 "Storage Model"
> The UTCTime is the current-time, the Bool is whether or not the caller is an
> HTTP-based API (as opposed to JavaScript or anything else)
>
> updateCookieJar :: Res.Response a -> Req.Request m -> UTCTime -> CookieJar
> -> (CookieJar, Res.Response a)
>
> Applies "receiveSetCookie" to a Response. The output CookieJar is stripped
> of any Set-Cookie headers.
> Specifies "True" for the Bool in receiveSetCookie
>
> computeCookieString :: Req.Request m -> CookieJar -> UTCTime -> Bool ->
> (W.Ascii, CookieJar)
>
> Runs the algorithm described in section 5.4 "The Cookie Header"
> The UTCTime and Bool are the same as in receiveSetCookie
>
> insertCookiesIntoRequest :: Req.Request m -> CookieJar -> UTCTime ->
> (Req.Request m, CookieJar)
>
> Applies "computeCookieString" to a Request. The output cookie jar has
> updated last-accessed-times.
> Specifies "True" for the Bool in computeCookieString
>
> evictExpiredCookies :: CookieJar -> UTCTime -> CookieJar
>
> Runs the algorithm described in the last part of section 5.3 "Storage Model"
>
> This will make the relevant part of 'http' look like:
>
>     go count req'' cookie_jar'' = do
>         now <- liftIO $ getCurrentTime
>         let (req', cookie_jar') = insertCookiesIntoRequest req''
> (evictExpiredCookies cookie_jar'' now) now
>         res' <- httpRaw req' manager
>         let (cookie_jar, res) = updateCookieJar res' req' now cookie_jar'
>         case getRedirectedRequest req' (responseHeaders res) (W.statusCode
> (statusCode res)) of
>             Just req -> go (count - 1) req cookie_jar
>             Nothing -> return res
>
> I plan to not allow for a user-supplied cookieFilter function. If they want
> that functionality, they can re-implement the redirection-following logic.
>
> Any thoughts on any of this?
>
> Thanks,
> Myles
>
> On Wed, Feb 1, 2012 at 5:19 PM, Myles C. Maxfield <myles.maxfield at gmail.com>
> wrote:
>>
>> Nope. I'm not. The RFC is very explicit about how to handle cookies. As
>> soon as I'm finished making sense of it (in terms of Haskell) I'll send
>> another proposal email.
>>
>> On Feb 1, 2012 3:25 AM, "Michael Snoyman" <michael at snoyman.com> wrote:
>>>
>>> You mean you're *not* making this proposal?
>>>
>>> On Wed, Feb 1, 2012 at 7:30 AM, Myles C. Maxfield
>>> <myles.maxfield at gmail.com> wrote:
>>> > Well, this is embarrassing. Please disregard my previous email. I
>>> > should
>>> > learn to read the RFC *before* submitting proposals.
>>> >
>>> > --Myles
>>> >
>>> >
>>> > On Tue, Jan 31, 2012 at 6:37 PM, Myles C. Maxfield
>>> > <myles.maxfield at gmail.com> wrote:
>>> >>
>>> >> Here are my initial ideas about supporting cookies. Note that I'm
>>> >> using
>>> >> Chrome for ideas since it's open source.
>>> >>
>>> >> Network/HTTP/Conduit/Cookies.hs file
>>> >> Exporting the following symbols:
>>> >>
>>> >> type StuffedCookie = SetCookie
>>> >>
>>> >> A regular SetCookie can have Nothing for its Domain and Path
>>> >> attributes. A
>>> >> StuffedCookie has to have these fields set.
>>> >>
>>> >> type CookieJar = [StuffedCookie]
>>> >>
>>> >> Chrome's cookie jar is implemented as (the C++ equivalent of) Map
>>> >> W.Ascii
>>> >> StuffedCookie. The key is the "eTLD+1" of the domain, so lookups for
>>> >> all
>>> >> cookies for a given domain are fast.
>>> >> I think I'll stay with just a list of StuffedCookies just to keep it
>>> >> simple. Perhaps a later revision can implement the faster map.
>>> >>
>>> >> getRelevantCookies :: Request m -> CookieJar -> UTCTime -> (CookieJar,
>>> >> Cookies)
>>> >>
>>> >> Gets all the cookies from the cookie jar that should be set for the
>>> >> given
>>> >> Request.
>>> >> The time argument is whatever "now" is (it's pulled out of the
>>> >> function so
>>> >> the function can remain pure and easily testable)
>>> >> The function will also remove expired cookies from the cookie jar
>>> >> (given
>>> >> what "now" is) and return the filtered cookie jar
>>> >>
>>> >> putRelevantCookies :: Request m -> CookieJar -> [StuffedCookie] ->
>>> >> CookieJar
>>> >>
>>> >> Insert cookies from a server response into the cookie jar.
>>> >> The first argument is only used for checking to see which cookies are
>>> >> valid (which cookies match the requested domain, etc, so site1.com
>>> >> can't set
>>> >> a cookie for site2.com)
>>> >>
>>> >> stuffCookie :: Request m -> SetCookie -> StuffedCookie
>>> >>
>>> >> If the SetCookie's fields are Nothing, fill them in given the Request
>>> >> from
>>> >> which it originated
>>> >>
>>> >> getCookies :: Response a -> ([SetCookie], Response a)
>>> >>
>>> >> Pull cookies out of a server response. Return the response with the
>>> >> Set-Cookie headers filtered out
>>> >>
>>> >> putCookies :: Request a -> Cookies -> Request a
>>> >>
>>> >> A wrapper around renderCookies. Inserts some cookies into a request.
>>> >> Doesn't overwrite cookies that are already set in the request
>>> >>
>>> >> These functions will be exported from Network.HTTP.Conduit as well, so
>>> >> callers can use them to re-implement redirection chains
>>> >> I won't implement a cookie filtering function (like what
>>> >> Network.Browser
>>> >> has)
>>> >>
>>> >> If you want to have arbitrary handling of cookies, re-implement
>>> >> redirection following. It's not very difficult if you use the API
>>> >> provided,
>>> >> and the 'http' function is open source so you can use that as a
>>> >> reference.
>>> >>
>>> >> I will implement the functions according to RFC 6265
>>> >> I will also need to write the following functions. Should they also be
>>> >> exported?
>>> >>
>>> >> canonicalizeDomain :: W.Ascii -> W.Ascii
>>> >>
>>> >> turns "..a.b.c..d.com..." to "a.b.c.d.com"
>>> >> Technically necessary for domain matching (Chrome does it)
>>> >> Perhaps unnecessary for a first pass? Perhaps we can trust users for
>>> >> now?
>>> >>
>>> >> domainMatches :: W.Ascii -> W.Ascii -> Maybe W.Ascii
>>> >>
>>> >> Does the first domain match against the second domain?
>>> >> If so, return the prefix of the first that isn't in the second
>>> >>
>>> >> pathMatches :: W.Ascii -> W.Ascii -> Bool
>>> >>
>>> >> Do the paths match?
>>> >>
>>> >> In order to implement domain matching, I have to have knowledge of
>>> >> the Public Suffix List so I know that sub1.sub2.pvt.k12.wy.us can set
>>> >> a
>>> >> cookie for sub2.pvt.k12.wy.us but not for k12.wy.us (because
>>> >> pvt.k12.wy.us
>>> >> is a "suffix"). There are a variety of ways to implement this.
>>> >>
>>> >> As far as I can tell, Chrome does it by using a script (which a human
>>> >> periodically runs) which parses the list at creates a .cc file that is
>>> >> included in the build.
>>> >>
>>> >> I might be wrong about the execution of the script; it might be a
>>> >> build
>>> >> step. If it is a build step, however, it is suspicious that a build
>>> >> target
>>> >> would try to download a file...
>>> >>
>>> >> Any more elegant ideas?
>>> >>
>>> >> Feedback on any/all of the above would be very helpful before I go off
>>> >> into the weeds on this project.
>>> >>
>>> >> Thanks,
>>> >> Myles C. Maxfield
>>> >>
>>> >> On Sat, Jan 28, 2012 at 8:17 PM, Michael Snoyman <michael at snoyman.com>
>>> >> wrote:
>>> >>>
>>> >>> Thanks, looks great! I've merged it into the Github tree.
>>> >>>
>>> >>> On Sat, Jan 28, 2012 at 8:36 PM, Myles C. Maxfield
>>> >>> <myles.maxfield at gmail.com> wrote:
>>> >>> > Ah, yes, you're completely right. I completely agree that moving
>>> >>> > the
>>> >>> > function into the Maybe monad increases readability. This kind of
>>> >>> > function
>>> >>> > is what the Maybe monad was designed for.
>>> >>> >
>>> >>> > Here is a revised patch.
>>> >>> >
>>> >>> >
>>> >>> > On Sat, Jan 28, 2012 at 8:28 AM, Michael Snoyman
>>> >>> > <michael at snoyman.com>
>>> >>> > wrote:
>>> >>> >>
>>> >>> >> On Sat, Jan 28, 2012 at 1:20 AM, Myles C. Maxfield
>>> >>> >> <myles.maxfield at gmail.com> wrote:
>>> >>> >> > the fromJust should never fail, beceause of the guard statement:
>>> >>> >> >
>>> >>> >> >     | 300 <= code && code < 400 && isJust l'' && isJust l' =
>>> >>> >> > Just $
>>> >>> >> > req
>>> >>> >> >
>>> >>> >> > Because of the order of the && operators, it will only evaluate
>>> >>> >> > fromJust
>>> >>> >> > after it makes sure that the argument isJust. That function in
>>> >>> >> > particular
>>> >>> >> > shouldn't throw any exceptions - it should only return Nothing.
>>> >>> >> >
>>> >>> >> > Knowing that, I don't quite think I understand what your concern
>>> >>> >> > is.
>>> >>> >> > Can
>>> >>> >> > you
>>> >>> >> > elaborate?
>>> >>> >>
>>> >>> >> You're right, but I had to squint really hard to prove to myself
>>> >>> >> that
>>> >>> >> you're right. That's the kind of code that could easily be broken
>>> >>> >> in
>>> >>> >> future updates by an unwitting maintainer (e.g., me). To protect
>>> >>> >> the
>>> >>> >> world from me, I'd prefer if the code didn't have the fromJust.
>>> >>> >> This
>>> >>> >> might be a good place to leverage the Monad instance of Maybe.
>>> >>> >>
>>> >>> >> Michael
>>> >>> >
>>> >>> >
>>> >>
>>> >>
>>> >
>
>



More information about the Haskell-Cafe mailing list