[Haskell-cafe] Contributing to http-conduit

Myles C. Maxfield myles.maxfield at gmail.com
Mon Feb 6 22:07:36 CET 2012


After all these commits have been flying around, I have yet another
question:

the 'HTTP' package defines Network.Browser which is a State monad which
keeps state about a "browser" (i.e. a cookie jar, a proxy, redirection
parameters, etc.) It would be pretty straightforward to implement this kind
of functionality on top of http-conduit.

I was originally going to do it and release it as its own package, but it
may be beneficial to add such a module to the existing http-conduit
package. Should I add it in to the existing package, or release it as its
own package?

--Myles

On Mon, Feb 6, 2012 at 12:15 AM, Michael Snoyman <michael at snoyman.com>wrote:

> Just an FYI for everyone: Myles sent an (incredibly thorough) pull
> request to handle cookies:
>
> https://github.com/snoyberg/http-conduit/pull/13
>
> Thanks!
>
> On Sun, Feb 5, 2012 at 8:20 AM, Myles C. Maxfield
> <myles.maxfield at gmail.com> wrote:
> > 1. The spec defines a grammar for the attributes. They're in uppercase.
> > 2. Yes - 1.3 is the first version that lists DiffTime as an instance of
> > RealFrac (so I can use the 'floor' function to pull out the number of
> > seconds to render it)
> > 3. I'll see what I can do.
> >
> > --Myles
> >
> >
> > On Sat, Feb 4, 2012 at 9:06 PM, Michael Snoyman <michael at snoyman.com>
> wrote:
> >>
> >> Looks good, a few questions/requests:
> >>
> >> 1. Is there a reason to upper-case all the attributes?
> >> 2. Is the time >= 1.3 a requirements? Because that can cause a lot of
> >> trouble for people.
> >> 3. Can you send the patch as a Github pull request? It's easier to
> >> track that way.
> >>
> >> Michael
> >>
> >> On Sat, Feb 4, 2012 at 1:21 AM, Myles C. Maxfield
> >> <myles.maxfield at gmail.com> wrote:
> >> > Here is the patch to Web.Cookie. I didn't modify the tests at all
> >> > because
> >> > they were already broken - they looked like they hadn't been updated
> >> > since
> >> > SetCookie only had 5 parameters. I did verify by hand that the patch
> >> > works,
> >> > though.
> >> >
> >> > Thanks,
> >> > Myles
> >> >
> >> >
> >> > On Thu, Feb 2, 2012 at 11:26 PM, Myles C. Maxfield
> >> > <myles.maxfield at gmail.com> wrote:
> >> >>
> >> >> Alright, I'll make a small patch that adds 2 fields to SetCookie:
> >> >> setCookieMaxAge :: Maybe DiffTime
> >> >> setCookieSecureOnly :: Bool
> >> >>
> >> >> I've also gotten started on those cookie functions. I'm currently
> >> >> writing
> >> >> tests for them.
> >> >>
> >> >> @Chris: The best advice I can give is that Chrome (what I'm using as
> a
> >> >> source on all this) has the data baked into a .cc file. However, they
> >> >> have
> >> >> directions in a README and a script which will parse the list and
> >> >> generate
> >> >> that source file. I recommend doing this. That way, the Haskell
> module
> >> >> would
> >> >> have 2 source files: one file that reads the list and generates the
> >> >> second
> >> >> file, which is a very large source file that contains each element in
> >> >> the
> >> >> list. The list should export `elem`-type queries. I'm not quite sure
> >> >> how to
> >> >> handle wildcards that appear in the list - that part is up to you.
> >> >> Thanks
> >> >> for helping out with this :]
> >> >>
> >> >> --Myles
> >> >>
> >> >>
> >> >> On Thu, Feb 2, 2012 at 10:53 PM, Michael Snoyman <
> michael at snoyman.com>
> >> >> wrote:
> >> >>>
> >> >>> Looks good to me too. I agree with Aristid: let's make the change to
> >> >>> cookie itself. Do you want to send a pull request? I'm also
> >> >>> considering making the SetCookie constructor hidden like we have for
> >> >>> Request, so that if in the future we realize we need to add some
> other
> >> >>> settings, it doesn't break the API.
> >> >>>
> >> >>> Chris: I would recommend compiling it into the module. Best bet
> would
> >> >>> likely being converting the source file to Haskell source.
> >> >>>
> >> >>> Michael
> >> >>>
> >> >>> On Fri, Feb 3, 2012 at 6:32 AM, Myles C. Maxfield
> >> >>> <myles.maxfield at gmail.com> wrote:
> >> >>> > Alright. After reading the spec, I have these questions /
> concerns:
> >> >>> >
> >> >>> > The spec supports the "Max-Age" cookie attribute, which
> Web.Cookies
> >> >>> > doesn't.
> >> >>> >
> >> >>> > I see two possible solutions to this. The first is to have
> >> >>> > parseSetCookie
> >> >>> > take a UTCTime as an argument which will represent the current
> time
> >> >>> > so
> >> >>> > it
> >> >>> > can populate the setCookieExpires field by adding the Max-Age
> >> >>> > attribute
> >> >>> > to
> >> >>> > the current time. Alternatively, that function can return an IO
> >> >>> > SetCookie so
> >> >>> > it can ask for the current time by itself (which I think is
> inferior
> >> >>> > to
> >> >>> > taking the current time as an argument). Note that the spec says
> to
> >> >>> > prefer
> >> >>> > Max-Age over Expires.
> >> >>> > Add a field to SetCookie of type Maybe DiffTime which represents
> the
> >> >>> > Max-Age
> >> >>> > attribute
> >> >>> >
> >> >>> > Cookie code should be aware of the Public Suffix List as a part of
> >> >>> > its
> >> >>> > domain verification. The cookie code only needs to be able to tell
> >> >>> > if a
> >> >>> > specific string is in the list (W.Ascii -> Bool)
> >> >>> >
> >> >>> > I propose making an entirely unrelated package,
> public-suffix-list,
> >> >>> > with a
> >> >>> > module Network.PublicSuffixList, which will expose this function,
> as
> >> >>> > well as
> >> >>> > functions about parsing the list itself. Thoughts?
> >> >>> >
> >> >>> > Web.Cookie doesn't have a "secure-only" attribute. Adding one in
> is
> >> >>> > straightforward enough.
> >> >>> > The spec describes cookies as a property of HTTP, not of the World
> >> >>> > Wide
> >> >>> > Web.
> >> >>> > Perhaps "Web.Cookie" should be renamed? Just a thought; it doesn't
> >> >>> > really
> >> >>> > matter to me.
> >> >>> >
> >> >>> > As for Network.HTTP.Conduit.Cookie, the spec describes in section
> >> >>> > 5.3
> >> >>> > "Storage Model" what fields a Cookie has. Here is my proposal for
> >> >>> > the
> >> >>> > functions it will expose:
> >> >>> >
> >> >>> > receiveSetCookie :: SetCookie -> Req.Request m -> UTCTime -> Bool
> ->
> >> >>> > CookieJar -> CookieJar
> >> >>> >
> >> >>> > Runs the algorithm described in section 5.3 "Storage Model"
> >> >>> > The UTCTime is the current-time, the Bool is whether or not the
> >> >>> > caller
> >> >>> > is an
> >> >>> > HTTP-based API (as opposed to JavaScript or anything else)
> >> >>> >
> >> >>> > updateCookieJar :: Res.Response a -> Req.Request m -> UTCTime ->
> >> >>> > CookieJar
> >> >>> > -> (CookieJar, Res.Response a)
> >> >>> >
> >> >>> > Applies "receiveSetCookie" to a Response. The output CookieJar is
> >> >>> > stripped
> >> >>> > of any Set-Cookie headers.
> >> >>> > Specifies "True" for the Bool in receiveSetCookie
> >> >>> >
> >> >>> > computeCookieString :: Req.Request m -> CookieJar -> UTCTime ->
> Bool
> >> >>> > ->
> >> >>> > (W.Ascii, CookieJar)
> >> >>> >
> >> >>> > Runs the algorithm described in section 5.4 "The Cookie Header"
> >> >>> > The UTCTime and Bool are the same as in receiveSetCookie
> >> >>> >
> >> >>> > insertCookiesIntoRequest :: Req.Request m -> CookieJar -> UTCTime
> ->
> >> >>> > (Req.Request m, CookieJar)
> >> >>> >
> >> >>> > Applies "computeCookieString" to a Request. The output cookie jar
> >> >>> > has
> >> >>> > updated last-accessed-times.
> >> >>> > Specifies "True" for the Bool in computeCookieString
> >> >>> >
> >> >>> > evictExpiredCookies :: CookieJar -> UTCTime -> CookieJar
> >> >>> >
> >> >>> > Runs the algorithm described in the last part of section 5.3
> >> >>> > "Storage
> >> >>> > Model"
> >> >>> >
> >> >>> > This will make the relevant part of 'http' look like:
> >> >>> >
> >> >>> >     go count req'' cookie_jar'' = do
> >> >>> >         now <- liftIO $ getCurrentTime
> >> >>> >         let (req', cookie_jar') = insertCookiesIntoRequest req''
> >> >>> > (evictExpiredCookies cookie_jar'' now) now
> >> >>> >         res' <- httpRaw req' manager
> >> >>> >         let (cookie_jar, res) = updateCookieJar res' req' now
> >> >>> > cookie_jar'
> >> >>> >         case getRedirectedRequest req' (responseHeaders res)
> >> >>> > (W.statusCode
> >> >>> > (statusCode res)) of
> >> >>> >             Just req -> go (count - 1) req cookie_jar
> >> >>> >             Nothing -> return res
> >> >>> >
> >> >>> > I plan to not allow for a user-supplied cookieFilter function. If
> >> >>> > they
> >> >>> > want
> >> >>> > that functionality, they can re-implement the
> redirection-following
> >> >>> > logic.
> >> >>> >
> >> >>> > Any thoughts on any of this?
> >> >>> >
> >> >>> > Thanks,
> >> >>> > Myles
> >> >>> >
> >> >>> > On Wed, Feb 1, 2012 at 5:19 PM, Myles C. Maxfield
> >> >>> > <myles.maxfield at gmail.com>
> >> >>> > wrote:
> >> >>> >>
> >> >>> >> Nope. I'm not. The RFC is very explicit about how to handle
> >> >>> >> cookies.
> >> >>> >> As
> >> >>> >> soon as I'm finished making sense of it (in terms of Haskell)
> I'll
> >> >>> >> send
> >> >>> >> another proposal email.
> >> >>> >>
> >> >>> >> On Feb 1, 2012 3:25 AM, "Michael Snoyman" <michael at snoyman.com>
> >> >>> >> wrote:
> >> >>> >>>
> >> >>> >>> You mean you're *not* making this proposal?
> >> >>> >>>
> >> >>> >>> On Wed, Feb 1, 2012 at 7:30 AM, Myles C. Maxfield
> >> >>> >>> <myles.maxfield at gmail.com> wrote:
> >> >>> >>> > Well, this is embarrassing. Please disregard my previous
> email.
> >> >>> >>> > I
> >> >>> >>> > should
> >> >>> >>> > learn to read the RFC *before* submitting proposals.
> >> >>> >>> >
> >> >>> >>> > --Myles
> >> >>> >>> >
> >> >>> >>> >
> >> >>> >>> > On Tue, Jan 31, 2012 at 6:37 PM, Myles C. Maxfield
> >> >>> >>> > <myles.maxfield at gmail.com> wrote:
> >> >>> >>> >>
> >> >>> >>> >> Here are my initial ideas about supporting cookies. Note that
> >> >>> >>> >> I'm
> >> >>> >>> >> using
> >> >>> >>> >> Chrome for ideas since it's open source.
> >> >>> >>> >>
> >> >>> >>> >> Network/HTTP/Conduit/Cookies.hs file
> >> >>> >>> >> Exporting the following symbols:
> >> >>> >>> >>
> >> >>> >>> >> type StuffedCookie = SetCookie
> >> >>> >>> >>
> >> >>> >>> >> A regular SetCookie can have Nothing for its Domain and Path
> >> >>> >>> >> attributes. A
> >> >>> >>> >> StuffedCookie has to have these fields set.
> >> >>> >>> >>
> >> >>> >>> >> type CookieJar = [StuffedCookie]
> >> >>> >>> >>
> >> >>> >>> >> Chrome's cookie jar is implemented as (the C++ equivalent of)
> >> >>> >>> >> Map
> >> >>> >>> >> W.Ascii
> >> >>> >>> >> StuffedCookie. The key is the "eTLD+1" of the domain, so
> >> >>> >>> >> lookups
> >> >>> >>> >> for
> >> >>> >>> >> all
> >> >>> >>> >> cookies for a given domain are fast.
> >> >>> >>> >> I think I'll stay with just a list of StuffedCookies just to
> >> >>> >>> >> keep
> >> >>> >>> >> it
> >> >>> >>> >> simple. Perhaps a later revision can implement the faster
> map.
> >> >>> >>> >>
> >> >>> >>> >> getRelevantCookies :: Request m -> CookieJar -> UTCTime ->
> >> >>> >>> >> (CookieJar,
> >> >>> >>> >> Cookies)
> >> >>> >>> >>
> >> >>> >>> >> Gets all the cookies from the cookie jar that should be set
> for
> >> >>> >>> >> the
> >> >>> >>> >> given
> >> >>> >>> >> Request.
> >> >>> >>> >> The time argument is whatever "now" is (it's pulled out of
> the
> >> >>> >>> >> function so
> >> >>> >>> >> the function can remain pure and easily testable)
> >> >>> >>> >> The function will also remove expired cookies from the cookie
> >> >>> >>> >> jar
> >> >>> >>> >> (given
> >> >>> >>> >> what "now" is) and return the filtered cookie jar
> >> >>> >>> >>
> >> >>> >>> >> putRelevantCookies :: Request m -> CookieJar ->
> [StuffedCookie]
> >> >>> >>> >> ->
> >> >>> >>> >> CookieJar
> >> >>> >>> >>
> >> >>> >>> >> Insert cookies from a server response into the cookie jar.
> >> >>> >>> >> The first argument is only used for checking to see which
> >> >>> >>> >> cookies
> >> >>> >>> >> are
> >> >>> >>> >> valid (which cookies match the requested domain, etc, so
> >> >>> >>> >> site1.com
> >> >>> >>> >> can't set
> >> >>> >>> >> a cookie for site2.com)
> >> >>> >>> >>
> >> >>> >>> >> stuffCookie :: Request m -> SetCookie -> StuffedCookie
> >> >>> >>> >>
> >> >>> >>> >> If the SetCookie's fields are Nothing, fill them in given the
> >> >>> >>> >> Request
> >> >>> >>> >> from
> >> >>> >>> >> which it originated
> >> >>> >>> >>
> >> >>> >>> >> getCookies :: Response a -> ([SetCookie], Response a)
> >> >>> >>> >>
> >> >>> >>> >> Pull cookies out of a server response. Return the response
> with
> >> >>> >>> >> the
> >> >>> >>> >> Set-Cookie headers filtered out
> >> >>> >>> >>
> >> >>> >>> >> putCookies :: Request a -> Cookies -> Request a
> >> >>> >>> >>
> >> >>> >>> >> A wrapper around renderCookies. Inserts some cookies into a
> >> >>> >>> >> request.
> >> >>> >>> >> Doesn't overwrite cookies that are already set in the request
> >> >>> >>> >>
> >> >>> >>> >> These functions will be exported from Network.HTTP.Conduit as
> >> >>> >>> >> well, so
> >> >>> >>> >> callers can use them to re-implement redirection chains
> >> >>> >>> >> I won't implement a cookie filtering function (like what
> >> >>> >>> >> Network.Browser
> >> >>> >>> >> has)
> >> >>> >>> >>
> >> >>> >>> >> If you want to have arbitrary handling of cookies,
> re-implement
> >> >>> >>> >> redirection following. It's not very difficult if you use the
> >> >>> >>> >> API
> >> >>> >>> >> provided,
> >> >>> >>> >> and the 'http' function is open source so you can use that
> as a
> >> >>> >>> >> reference.
> >> >>> >>> >>
> >> >>> >>> >> I will implement the functions according to RFC 6265
> >> >>> >>> >> I will also need to write the following functions. Should
> they
> >> >>> >>> >> also be
> >> >>> >>> >> exported?
> >> >>> >>> >>
> >> >>> >>> >> canonicalizeDomain :: W.Ascii -> W.Ascii
> >> >>> >>> >>
> >> >>> >>> >> turns "..a.b.c..d.com..." to "a.b.c.d.com"
> >> >>> >>> >> Technically necessary for domain matching (Chrome does it)
> >> >>> >>> >> Perhaps unnecessary for a first pass? Perhaps we can trust
> >> >>> >>> >> users
> >> >>> >>> >> for
> >> >>> >>> >> now?
> >> >>> >>> >>
> >> >>> >>> >> domainMatches :: W.Ascii -> W.Ascii -> Maybe W.Ascii
> >> >>> >>> >>
> >> >>> >>> >> Does the first domain match against the second domain?
> >> >>> >>> >> If so, return the prefix of the first that isn't in the
> second
> >> >>> >>> >>
> >> >>> >>> >> pathMatches :: W.Ascii -> W.Ascii -> Bool
> >> >>> >>> >>
> >> >>> >>> >> Do the paths match?
> >> >>> >>> >>
> >> >>> >>> >> In order to implement domain matching, I have to have
> knowledge
> >> >>> >>> >> of
> >> >>> >>> >> the Public Suffix List so I know that
> sub1.sub2.pvt.k12.wy.us
> >> >>> >>> >> can
> >> >>> >>> >> set
> >> >>> >>> >> a
> >> >>> >>> >> cookie for sub2.pvt.k12.wy.us but not for k12.wy.us (because
> >> >>> >>> >> pvt.k12.wy.us
> >> >>> >>> >> is a "suffix"). There are a variety of ways to implement
> this.
> >> >>> >>> >>
> >> >>> >>> >> As far as I can tell, Chrome does it by using a script
> (which a
> >> >>> >>> >> human
> >> >>> >>> >> periodically runs) which parses the list at creates a .cc
> file
> >> >>> >>> >> that is
> >> >>> >>> >> included in the build.
> >> >>> >>> >>
> >> >>> >>> >> I might be wrong about the execution of the script; it might
> be
> >> >>> >>> >> a
> >> >>> >>> >> build
> >> >>> >>> >> step. If it is a build step, however, it is suspicious that a
> >> >>> >>> >> build
> >> >>> >>> >> target
> >> >>> >>> >> would try to download a file...
> >> >>> >>> >>
> >> >>> >>> >> Any more elegant ideas?
> >> >>> >>> >>
> >> >>> >>> >> Feedback on any/all of the above would be very helpful
> before I
> >> >>> >>> >> go
> >> >>> >>> >> off
> >> >>> >>> >> into the weeds on this project.
> >> >>> >>> >>
> >> >>> >>> >> Thanks,
> >> >>> >>> >> Myles C. Maxfield
> >> >>> >>> >>
> >> >>> >>> >> On Sat, Jan 28, 2012 at 8:17 PM, Michael Snoyman
> >> >>> >>> >> <michael at snoyman.com>
> >> >>> >>> >> wrote:
> >> >>> >>> >>>
> >> >>> >>> >>> Thanks, looks great! I've merged it into the Github tree.
> >> >>> >>> >>>
> >> >>> >>> >>> On Sat, Jan 28, 2012 at 8:36 PM, Myles C. Maxfield
> >> >>> >>> >>> <myles.maxfield at gmail.com> wrote:
> >> >>> >>> >>> > Ah, yes, you're completely right. I completely agree that
> >> >>> >>> >>> > moving
> >> >>> >>> >>> > the
> >> >>> >>> >>> > function into the Maybe monad increases readability. This
> >> >>> >>> >>> > kind
> >> >>> >>> >>> > of
> >> >>> >>> >>> > function
> >> >>> >>> >>> > is what the Maybe monad was designed for.
> >> >>> >>> >>> >
> >> >>> >>> >>> > Here is a revised patch.
> >> >>> >>> >>> >
> >> >>> >>> >>> >
> >> >>> >>> >>> > On Sat, Jan 28, 2012 at 8:28 AM, Michael Snoyman
> >> >>> >>> >>> > <michael at snoyman.com>
> >> >>> >>> >>> > wrote:
> >> >>> >>> >>> >>
> >> >>> >>> >>> >> On Sat, Jan 28, 2012 at 1:20 AM, Myles C. Maxfield
> >> >>> >>> >>> >> <myles.maxfield at gmail.com> wrote:
> >> >>> >>> >>> >> > the fromJust should never fail, beceause of the guard
> >> >>> >>> >>> >> > statement:
> >> >>> >>> >>> >> >
> >> >>> >>> >>> >> >     | 300 <= code && code < 400 && isJust l'' && isJust
> >> >>> >>> >>> >> > l' =
> >> >>> >>> >>> >> > Just $
> >> >>> >>> >>> >> > req
> >> >>> >>> >>> >> >
> >> >>> >>> >>> >> > Because of the order of the && operators, it will only
> >> >>> >>> >>> >> > evaluate
> >> >>> >>> >>> >> > fromJust
> >> >>> >>> >>> >> > after it makes sure that the argument isJust. That
> >> >>> >>> >>> >> > function
> >> >>> >>> >>> >> > in
> >> >>> >>> >>> >> > particular
> >> >>> >>> >>> >> > shouldn't throw any exceptions - it should only return
> >> >>> >>> >>> >> > Nothing.
> >> >>> >>> >>> >> >
> >> >>> >>> >>> >> > Knowing that, I don't quite think I understand what
> your
> >> >>> >>> >>> >> > concern
> >> >>> >>> >>> >> > is.
> >> >>> >>> >>> >> > Can
> >> >>> >>> >>> >> > you
> >> >>> >>> >>> >> > elaborate?
> >> >>> >>> >>> >>
> >> >>> >>> >>> >> You're right, but I had to squint really hard to prove to
> >> >>> >>> >>> >> myself
> >> >>> >>> >>> >> that
> >> >>> >>> >>> >> you're right. That's the kind of code that could easily
> be
> >> >>> >>> >>> >> broken
> >> >>> >>> >>> >> in
> >> >>> >>> >>> >> future updates by an unwitting maintainer (e.g., me). To
> >> >>> >>> >>> >> protect
> >> >>> >>> >>> >> the
> >> >>> >>> >>> >> world from me, I'd prefer if the code didn't have the
> >> >>> >>> >>> >> fromJust.
> >> >>> >>> >>> >> This
> >> >>> >>> >>> >> might be a good place to leverage the Monad instance of
> >> >>> >>> >>> >> Maybe.
> >> >>> >>> >>> >>
> >> >>> >>> >>> >> Michael
> >> >>> >>> >>> >
> >> >>> >>> >>> >
> >> >>> >>> >>
> >> >>> >>> >>
> >> >>> >>> >
> >> >>> >
> >> >>> >
> >> >>
> >> >>
> >> >
> >
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20120206/b050db82/attachment-0001.htm>


More information about the Haskell-Cafe mailing list