[web-devel] Upgrade to Happstack 6 - request bodies

Jeremy Shaw jeremy at n-heptane.com
Wed Jul 27 20:29:26 CEST 2011


Hello,

I have uploaded happstack-server 6.2.2 which fixes the issue you reported.

You should now be able to use lookCookieValue for a POST/PUT request
without having to call decodeBody.

The only time you need to call decodeBody now is if you are calling a
function that actually looks at the form-data. For example:

  lookText "somefield"

That looks at both the query string and the form-data in the request
body, so it does require decodeBody to have been called, but only for
a PUT/POST request that could actually contain form data. For a GET
request, only the query string is checked, since there cannot be any
form data in the body.

If the request body *could* contain form-data that you have not
decoded yet, and you still want to get the query string value, you can
do:

 queryString $ lookText "somefield"

That will only examine the query string and does not require you to
call decodeBody.
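
To make the difference between the two lookups concrete, here is a
small toy model using only base. It is a sketch of the rule described
above, not the happstack-server implementation: ToyRequest, lookBoth,
lookQueryOnly and findField are hypothetical names, and searching the
query string before the form-data is an assumption of the model.

```haskell
-- Toy model only: none of these names come from happstack-server.
data Method = GET | POST | PUT deriving (Eq, Show)

data ToyRequest = ToyRequest
    { toyMethod :: Method
    , toyQuery  :: [(String, String)]        -- query string pairs, always available
    , toyBody   :: Maybe [(String, String)]  -- Nothing = body not decoded yet
    }

-- Shared helper: find a field or report it missing.
findField :: String -> [(String, String)] -> Either String String
findField name = maybe (Left ("missing field: " ++ name)) Right . lookup name

-- Models a bare `lookText name`: query string plus form-data.
-- For POST/PUT the body must have been decoded first.
lookBoth :: String -> ToyRequest -> Either String String
lookBoth name rq
    | toyMethod rq `elem` [POST, PUT] =
        case toyBody rq of
          Nothing     -> Left "request body has not been decoded yet"
          Just inputs -> findField name (toyQuery rq ++ inputs)
    | otherwise = findField name (toyQuery rq)

-- Models `queryString $ lookText name`: the body is never consulted,
-- so it does not matter whether it has been decoded.
lookQueryOnly :: String -> ToyRequest -> Either String String
lookQueryOnly name rq = findField name (toyQuery rq)
```

In this model, a POST with an undecoded body makes lookBoth fail while
lookQueryOnly still succeeds, which is the situation the queryString
wrapper is meant for.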

So, I believe the behavior is now sensible, and what was originally intended:

 1. you can get the raw request body by simply calling takeRequestBody
 2. you can get cookies and query string values without calling
    decodeBody for all GET/POST/PUT/HEAD requests, regardless of the
    content-type of the Request
 3. you only need to call decodeBody if you actually want to look at
    the decoded form-data in the request body
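
As an illustration of point 1, the take-once behaviour of an
MVar-backed body can be sketched with base alone. This is a toy model
and not happstack-server code: BodyHolder, mkBodyHolder and
takeHeldBody are hypothetical names, and using tryTakeMVar is an
assumption about how such a take could be written.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, tryTakeMVar)

-- Hypothetical stand-in for a Request whose body lives in an MVar.
-- The body is a plain String here to keep the sketch self-contained.
newtype BodyHolder = BodyHolder { heldBody :: MVar String }

-- Build a holder with the body already in place.
mkBodyHolder :: String -> IO BodyHolder
mkBodyHolder body = BodyHolder <$> newMVar body

-- Take the body out and leave the MVar empty, so the holder no longer
-- references the body. A second call finds nothing and returns Nothing.
takeHeldBody :: BodyHolder -> IO (Maybe String)
takeHeldBody = tryTakeMVar . heldBody
```

The first call to takeHeldBody yields Just the body and every later
call yields Nothing, which is why the raw body and the decoded
form-data cannot both be obtained from the same request without
putting the body back.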

Does the new (correct) behavior seem sensible and logical to you now?
If not, do you have suggestions for how to improve it?

Thanks for your report!
- jeremy

On Wed, Jul 27, 2011 at 11:21 AM, Jeremy Shaw <jeremy at n-heptane.com> wrote:
> Hello,
>
> I think you have uncovered a bug. But first let's clear up how things
> are supposed to work.
>
> In previous versions of Happstack, the body was always decoded if the
> content-type was multipart/form-data or
> application/x-www-form-urlencoded. At the same time you also could
> access the entire raw request body.
>
> From an ease-of-use point of view, that was wonderful. But it also
> meant that the entire request body was likely to be forced into
> RAM. If a Request contained file uploads, then the entire file
> contents would be forced into RAM. That is clearly not acceptable.
>
> If the request body is just a field in the Request like:
>
>  data Request = Request { body :: Lazy.ByteString, ... }
>
> then there is no way to avoid forcing the whole request body into RAM.
> Even if you use Lazy.writeFile to write the file to disk, the Request
> itself will still have a reference to the body and so none of the data
> can be garbage collected until after the Response is sent and the
> whole Request gets garbage collected.
>
> So, instead we need to introduce an MVar which allows us to detach the
> body contents from the Request type so that it can be garbage
> collected separately. So that is why we have:
>
> data Request = Request { rqBody :: MVar RqBody, ... }
>
> To get the request body, you need only call:
>
>> takeRequestBody ::  MonadIO m => Request -> m (Maybe RqBody)
>
> Which will get the RqBody from the Request and leave the MVar empty.
> That ensures that you can lazily consume the input as it comes over
> the network and everything will be nicely garbage collected.
>
> Sometimes the request body contains form data that we want to decode
> so we can examine it using the look* functions. decodeBody will use
> takeRequestBody to get the request body and stick the decoded data in:
>
> data Request = Request { ..., rqInputsBody :: MVar [(String, Input)] }
>
> Since takeRequestBody can only be called once, that means our
> application can either have access to the raw request body or it can
> have access to the decoded form data. This is to avoid a space leak.
>
> Fortunately, decodeBody will only call takeRequestBody if the
> content-type is multipart/form-data or
> application/x-www-form-urlencoded (and the request method is PUT or
> POST). Usually you are going to be calling takeRequestBody when the
> content-type is something else -- so you won't have a problem calling
> decodeBody and then calling takeRequestBody later. (If you really need
> access to the raw request and the decoded form data, you can do a
> takeRequestBody, then put the request body back and call
> decodeBody, but that will almost certainly force the entire request
> body into RAM. There is no avoiding that.)
>
> So, here is why I believe you are having an error with 'lookCookieValue'.
>
> It is my guess that lookCookieValue is failing for a POST or PUT
> request where the content-type is *not* multipart/form-data or
> application/x-www-form-urlencoded.
>
> If we look at the definition for askRqEnv we  have:
>
>> instance (MonadIO m) => HasRqData (ServerPartT m) where
>>     askRqEnv =
>>         do rq <- askRq
>>            mbi <- liftIO $ if ((rqMethod rq) == POST) || ((rqMethod rq) == PUT)
>>                            then readInputsBody rq
>>                            else return (Just [])
>>            case mbi of
>>              Nothing   -> escape $ internalServerError (toResponse "askRqEnv failed because the request body has not been decoded yet. Try using 'decodeBody'.")
>>              (Just bi) -> return (rqInputsQuery rq, bi, rqCookies rq)
>
> We see here that when you call askRqEnv and the request method is POST
> or PUT, it expects to be able to find the decoded request body data in
> Request.
>
> But, that is not right. If the content-type is not
> multipart/form-data or application/x-www-form-urlencoded, then there
> aren't any request body inputs to decode. But that should not preclude
> you from looking at the request parameters or the cookies.
>
> I am not sure what the best fix here is. Here are some options:
>
>  1. avoid checking if decodeBody has been called at all. We can simply
> try to call readInputsBody. If it returns Nothing then we just set the
> body inputs to []. Unfortunately, if you fail to call decodeBody, then
> even though your form-data is submitted, look won't be able to find it
> and will not tell you why. That seems really confusing and unfriendly.
>
>  2. modify the check so that it also checks the content-type. That
> would fix the problem most of the time. But, it does not handle the
> case where the request body *is* multipart/form-data, but you don't
> want to decode it for some reason. For example, maybe you want to look
> at the cookie and validate the user before decoding the request body
> because different users have different maximum upload size file
> quotas.
>
>  3. modify the check so that it only runs when you call look functions
> that actually examine the request body. Normally, functions like
> lookText will look at the query string and the body. But, doing,
> queryString $ lookText, allows you to limit that search to just the
> query string.
>
>  4. when decodeBody is called, but the content-type is not for
> form-data, it should set rqInputsBody to [].
>
> I think the right solution might be a combination of 3 & 4. We only
> report an error when you try to access fields from the body, but have
> made no attempt to actually decode the body.
>
> With that change, your code should work fine, even if you don't call
> decodeBody. That should give you the behavior that you expect (and
> that was intended).
>
> I will fix that now.
>
> - jeremy
>
> On Wed, Jul 27, 2011 at 4:46 AM, Sebastiaan Visser <haskell at fvisser.nl> wrote:
>> Hello all,
>>
>> We have some problems upgrading one of our server applications to Happstack 6; lots of things related to request bodies and request parameters have changed. Unfortunately, the new situation doesn't seem that straightforward to me.
>>
>> In the previous version of Happstack we were able to access the request body directly using the `rqBody` function. In the new situation we first have to decode the body using `decodeBody` and then access it by reading from an MVar. The utility functions accessing these variables ensure the body is only requested once; all subsequent calls will error.
>>
>> We're pretty sure we're accessing the body only once and after decoding it, but we still get the following error: "askRqEnv failed because the request body has not been decoded yet. Try using 'decodeBody'." This happens when accessing the cookie using 'readCookieValue'. Isn't it strange that accessing the cookie value uses the request body at all?
>>
>> So the pattern we use is:
>>
>>>   do decodeBody (defaultBodyPolicy "/tmp/" (10 * 1024 * 1024) (10 * 1024 * 1024) (1024 * 1024))
>>>      liftIO (print 1)
>>>      c <- lookCookieValue "tid"
>>>      liftIO (print 2)
>>>      b <- takeRequestBody
>>>      ...
>>
>> And the result is:
>>
>>>   1
>>>   askRqEnv failed because the request body has not been decoded yet. Try using 'decodeBody'.
>> (this happens using happstack-server-6.0.3, in 6.1.6 the same error string is sent to the client)
>>
>> And then it stops, the handler crashes and the request fails. The migration guide[1] tells me "To simulate the old behavior, simple call decodeBody in your top-level handler for every request.", but this is clearly not enough.
>>
>> Does anyone know how to fix this in the correct way? I cannot help but think it is a bit strange that it takes me so much time to convince a web framework to hand over the request body. This should be easy stuff, right?
>>
>>
>> Thanks,
>> Sebastiaan
>>
>> [1] http://code.google.com/p/happstack/wiki/Happstack6Migration


