[Haskell-cafe] How to speedup generically parsing sum types?

Bas van Dijk v.dijk.bas at gmail.com
Thu Nov 3 11:16:20 CET 2011


Hello,

I recently added default generic implementations of toJSON and
parseJSON to the aeson package. Now I'm optimizing them. Here are some
benchmark results that compare:

* th: toJSON and fromJSON generated by template-haskell. Can be
compared to hand-written code. Should be the fastest of all.

* syb: toJSON and fromJSON from the Data.Aeson.Generic module. Uses
the Data type class.

* generic: my toJSON and fromJSON using GHC Generics.

The benchmark itself can be found here:
https://github.com/basvandijk/aeson/blob/optimizations/benchmarks/AesonCompareAutoInstances.hs

toJSON
======

D/toJSON/th                 3.631734 us
D/toJSON/syb                32.66679 us
D/toJSON/generic            3.371868 us

BigRecord/toJSON/th         8.982990 us
BigRecord/toJSON/syb        48.90737 us
BigRecord/toJSON/generic    8.971597 us

BigProduct/toJSON/th        1.578259 us
BigProduct/toJSON/syb       29.21153 us
BigProduct/toJSON/generic   1.623115 us

BigSum/toJSON/th            51.81214 ns
BigSum/toJSON/syb           1.256708 us
BigSum/toJSON/generic       71.32851 ns


fromJSON
========

D/fromJSON/th               7.017204 us
D/fromJSON/syb              23.46567 us
D/fromJSON/generic          7.968974 us

BigRecord/fromJSON/th       8.513789 us
BigRecord/fromJSON/syb      36.64501 us
BigRecord/fromJSON/generic  10.07809 us

BigProduct/fromJSON/th      2.430677 us
BigProduct/fromJSON/syb     17.97764 us
BigProduct/fromJSON/generic 2.201130 us

BigSum/fromJSON/th          414.8699 ns
BigSum/fromJSON/syb         4.113170 us
BigSum/fromJSON/generic     13.62614 us !!!


As can be seen, in most cases the GHC Generics implementation is much
faster than SYB and just as fast as TH. I'm impressed by how well GHC
optimizes the code!

Unfortunately the last benchmark, generically parsing a big sum type,
is much slower. The code for parsing sums, which can be found here:

https://github.com/basvandijk/aeson/blob/optimizations/Data/Aeson/Types/Internal.hs#L1059

is basically this:


instance (GFromSum a, GFromSum b) => GFromJSON (a :+: b) where
    gParseJSON (Object (M.toList -> [keyVal])) = gParseSum keyVal
    gParseJSON v = typeMismatch "sum (:+:)" v
    {-# INLINE gParseJSON #-}


class GFromSum f where
    gParseSum :: Pair -> Parser (f a)

instance (GFromSum a, GFromSum b) => GFromSum (a :+: b) where
    gParseSum keyVal = (L1 <$> gParseSum keyVal) <|>
                       (R1 <$> gParseSum keyVal)
    {-# INLINE gParseSum #-}

instance (Constructor c, GFromJSON a, ConsFromJSON a) =>
    GFromSum (C1 c a) where
    gParseSum (key, value)
        | key == pack (conName (undefined :: t c a p)) =
            gParseJSON value
        | otherwise = notFound $ unpack key
    {-# INLINE gParseSum #-}


notFound :: String -> Parser a
notFound key = fail $ "The key \"" ++ key ++ "\" was not found"
{-# INLINE notFound #-}


Any idea how to make it faster?

Regards,

Bas



More information about the Haskell-Cafe mailing list