[web-devel] Data.Word8 (word8 library)

Michael Snoyman michael at snoyman.com
Thu Sep 20 13:10:23 CEST 2012


On Thu, Sep 20, 2012 at 11:41 AM, Kazu Yamamoto <kazu at iij.ad.jp> wrote:
> Hello,
>
> ByteString is an array of Word8, but it seems to me that people tend to
> use the Char interface in Data.ByteString.Char8 instead of the Word8
> interface in Data.ByteString. Since the functions defined in
> Data.ByteString.Char8 convert Word8 to Char and Char to Word8, they have
> unnecessary overhead. Yes, the overhead is negligible in many cases,
> but I would like to remove it for high-performance servers.
>
> Why do people use Data.ByteString.Char8? I guess that there are two
> reasons:
>
> - There are no standard utility functions for Word8, such as "isUpper"
> - Numeric literals (e.g. 72 for 'H') are not readable
>
> To fix these problems, I implemented the Data.Word8 module and
> uploaded the word8 library to Hackage:
>
>         http://hackage.haskell.org/packages/archive/word8/0.0.0/doc/html/Data-Word8.html
>
> If Michael and Bas like this, I would like to modify warp and
> case-insensitive to use the word8 library. What do people think of this?
>
> My concern is that the character names start with "_". Some people may
> dislike this convention, but I do not have a better idea at the moment.
> Suggestions are welcome.
>
> --Kazu

Sounds good to me. I put together a simple benchmark to compare the
performance of toLower, and the results are encouraging (the Word8
version is roughly 8x faster on this input):

benchmarking Char8
mean: 38.04527 us, lb 37.94080 us, ub 38.12774 us, ci 0.950
std dev: 470.9770 ns, lb 364.8254 ns, ub 748.3015 ns, ci 0.950

benchmarking Word8
mean: 4.807265 us, lb 4.798199 us, ub 4.816563 us, ci 0.950
std dev: 47.20958 ns, lb 41.51181 ns, ub 55.07049 ns, ci 0.950

I want to try throwing one more idea into the mix; I'll post
updates when I have them.

So to answer your question: I'd be happy to include word8 in warp :).

Michael


{-# LANGUAGE OverloadedStrings #-}
import Criterion.Main
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as S8
import qualified Data.Char
import qualified Data.Word8

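main :: IO ()
main = do
    -- The benchmark input is this program's own source file.
    input <- S.readFile "bench.hs"
    defaultMain
        -- S.length forces the mapped ByteString to WHNF, so the whole map runs.
        [ bench "Char8" $ whnf (S.length . S8.map Data.Char.toLower) input
        , bench "Word8" $ whnf (S.length . S.map Data.Word8.toLower) input
        ]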
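
For anyone who wants to see the idea without installing word8, here is a
rough sketch of what an ASCII-only isUpper/toLower on Word8 could look
like, using only base and bytestring. The names isUpperAscii and
toLowerAscii are hypothetical, and the underscore-prefixed constants only
imitate the naming convention Kazu describes; the actual definitions in
Data.Word8 may differ (for example, they may cover more than ASCII).

```haskell
import qualified Data.ByteString as S
import Data.Word (Word8)

-- Named byte constants, imitating the "_"-prefixed convention from the
-- original post. These are defined locally for illustration only.
_A, _Z :: Word8
_A = 0x41
_Z = 0x5a

-- Hypothetical helper: True for the bytes 'A'..'Z' (ASCII only).
isUpperAscii :: Word8 -> Bool
isUpperAscii w = _A <= w && w <= _Z

-- Hypothetical helper: shift 'A'..'Z' down to 'a'..'z', leaving every
-- other byte untouched. No Word8 <-> Char conversion is involved.
toLowerAscii :: Word8 -> Word8
toLowerAscii w
  | isUpperAscii w = w + 0x20
  | otherwise      = w

main :: IO ()
main =
  -- The bytes of "HeLLO" become the bytes of "hello".
  print (S.unpack (S.map toLowerAscii (S.pack [72, 101, 76, 76, 79])))
  -- prints [104,101,108,108,111]
```

Because the mapped function works on Word8 directly, S.map can be used
as-is, which is the same shape as the "Word8" case in the benchmark.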


