[web-devel] wai-logger

Simon Meier iridcode at gmail.com
Fri Aug 19 14:24:58 CEST 2011


Hi Kazu,

2011/8/19 Kazu Yamamoto <kazu at iij.ad.jp>:
> Hello,
>
>> 2. Would it make sense to use a Builder (from blaze-builder) in place
>> of LogStr? Though based on the black magic you have going on under the
>> surface there, I'm guessing you did it this way for a reason ;).
>
> I have implemented a Builder version:
>
>        git://github.com/kazu-yamamoto/wai-logger.git
>        branch: blaze
>
> Attached code shows how to use this version.
>
> Please read the implementation of hPutBuilder:
>
>        https://github.com/kazu-yamamoto/wai-logger/blob/blaze/Network/Wai/Logger/IO.hs
>
> In this version, I guess that intermediate ByteString is not fused, so
> one unnecessary intermediate ByteString is created with
> toByteString. Thus, this version is slower than the original. Since
> I'm a beginner of blaze-builder, I don't know how we can directly copy
> Builder to Handle's buffer. Suggestions would be appreciated.
>
> --Kazu
>
> {-# LANGUAGE OverloadedStrings #-}
>
> module Main where
>
> import Blaze.ByteString.Builder (fromByteString)
> import Control.Monad.IO.Class (liftIO)
> import Data.ByteString.Char8
> import Network.HTTP.Types (status200)
> import Network.Wai
> import Network.Wai.Handler.Warp
> import Network.Wai.Logger
> import System.IO
>
> main :: IO ()
> main = do
>    initHandle stdout -- set blockBuffering
>    dref <- dateInit
>    run 3000 $ logApp dref
>
> logApp :: DateRef -> Application
> logApp dref req = do
>    date <- liftIO $ getDate dref
>    let status = status200
>        len = 4
>    liftIO $ hPutBuilder stdout $ apacheFormatBuilder date req status (Just len)
>    liftIO $ hFlush stdout  -- delete here to see if block buffered
>    return $ ResponseBuilder status
>                             [("Content-Type", "text/plain")
>                             ,("Content-Length", pack (show len))]
>           $ fromByteString "PONG"
>

I should have read this mail before answering the other one. There are
two problems with this benchmark:

1. toByteString :: Builder -> S.ByteString   is more expensive than
toLazyByteString :: Builder -> L.ByteString  because it first
generates a chunked output, which is then packed. This function was a
big mistake to put into blaze-builder, as people expect performance
that doesn't exist.

2. When comparing the speed of the two solutions, I would go for an
isolated benchmark that measures the encoding time only. As far as I
understand your setup above, you're also measuring quite some
server-overhead and do not allow the builders to amortize the buffer
allocation time, as very short responses are generated. See my copy of
the other mail for more explanations:

-- begin: copy from mail on the other thread --

Hi Kazu,

2011/8/19 Kazu Yamamoto <kazu at iij.ad.jp>:
> Hello Simon,
>
>> if you're going for such a solution, then why not use difference lists
>> or even better bytestring builders as your representation of the
>> not-yet-flushed logging journal? Bytestring builders (from the
>> blaze-builder library) support a fast append and fast serialization
>> from a number of formats to bytestring.
>
> Difference lists are not necessary at this moment because a list is
> generated at once in the apacheFormat:
>
>        http://hackage.haskell.org/packages/archive/wai-logger/0.0.0/doc/html/src/Network-Wai-Logger-Format.html#apacheFormat
>
> If my understanding is correct, blaze-builder does not help
> hPutLogStr. What I want to do is directly copy ByteString or List to
> *Handle's buffer*.
>
>        http://hackage.haskell.org/packages/archive/wai-logger/0.0.0/doc/html/src/Network-Wai-Logger-IO.html#hPutLogStr

I see. Currently, builders cannot be executed directly on a Handle's
buffer. This is functionality I wanted to have for a long time, but
have not gotten around to implement it.

Using bytestring builders you could avoid creating the intermediate
[LogStr] list. You should get a performance benefit, when describing
your log-message directly as a mapping to a builder and executing this
builder on the handle's buffer, as this avoids the indirections from
the list- and the LogStr-cells. Copying the byteStrings directly also
works for builders using the 'copyByteString' function. You would get
a further performance benefit, if you could avoid creating
intermediate String values. For example, the new builder in the
bytestring library provides functions for the decimal encoding of
numbers directly into the output buffer using a fast C-implementation.

The development repository of the new bytestring builders is available
here [1]. Its API is finished, benchmarks look good, and a
documentation draft exists. Hence, it would be cheap to give it a go
and see how fast you could produce the log-messages using the new
bytestring builders. I'd use criterion to compare

 mapM_ yourLogMessageWriter logMessageList

against

 whnf (L.length . toLazyByteString . mconcat . map
builderLogMessageWriter) logMessageList

where

 logMessageList :: [(ZonedDate, Request, Status, Maybe Integer)]
 logMessageList = replicate 10000 ( your-message-params)

This should be a fair comparison, as both implementations work on
similarly large buffers. If that shows that builders are beneficial,
then we can think about implementing output on a Handle's buffer
directly.

best regards,
Simon

PS: The new bytestring builder will very likely be released with the
next GHC in November.

[1] https://github.com/meiersi/bytestring

-- end: copy --


best regards,
Simon



More information about the web-devel mailing list