[Haskell-cafe] How to speedup generically parsing sum types?

José Pedro Magalhães jpm at cs.uu.nl
Thu Nov 3 12:49:17 CET 2011


Hi Bas,

First of all, thanks for these numbers. I have previously compared the
performance of GP libs [1] and your results confirm what I would expect,
except for that last one, BigSum/fromJSON/generic.

It's good that you're using INLINE pragmas on the generic function already.
What I would also try:
- Compile with -O2 and -fno-spec-constr-count (this last one is
particularly important)
- Add {-# INLINE [1] #-} pragmas to the to/from methods of your Generic
instances.

To add these INLINE pragmas you will have to give your own instances of
Generic (so you can't derive them). I'd suggest you get hold of them with
-ddump-deriv, copy-paste and add the pragmas, just for testing purposes.
The phase is important: you first want to make sure you inline the generic
function definition, and only then the from/to.

Please keep me posted on the effects of these suggestions. In particular,
if the INLINE pragmas on the from/to methods are essential, I'll be happy
to add them to the derived instances.


Cheers,
Pedro

[1] http://dreixel.net/research/pdf/ogie.pdf

2011/11/3 Bas van Dijk <v.dijk.bas at gmail.com>

> Hello,
>
> I recently added default generic implementations of toJSON and
> parseJSON to the aeson package. Now I'm optimizing them. Here are some
> benchmark results that compare:
>
> * th: toJSON and fromJSON generated by template-haskell. Can be
> compared to hand-written code. Should be the fastest of all.
>
> * syb: toJSON and fromJSON from the Data.Aeson.Generic module. Uses
> the Data type class.
>
> * generic: my toJSON and fromJSON using GHC Generics.
>
> The benchmark itself can be found here:
>
> https://github.com/basvandijk/aeson/blob/optimizations/benchmarks/AesonCompareAutoInstances.hs
>
> toJSON
> ======
>
> D/toJSON/th                 3.631734 us
> D/toJSON/syb                32.66679 us
> D/toJSON/generic            3.371868 us
>
> BigRecord/toJSON/th         8.982990 us
> BigRecord/toJSON/syb        48.90737 us
> BigRecord/toJSON/generic    8.971597 us
>
> BigProduct/toJSON/th        1.578259 us
> BigProduct/toJSON/syb       29.21153 us
> BigProduct/toJSON/generic   1.623115 us
>
> BigSum/toJSON/th            51.81214 ns
> BigSum/toJSON/syb           1.256708 us
> BigSum/toJSON/generic       71.32851 ns
>
>
> fromJSON
> ========
>
> D/fromJSON/th               7.017204 us
> D/fromJSON/syb              23.46567 us
> D/fromJSON/generic          7.968974 us
>
> BigRecord/fromJSON/th       8.513789 us
> BigRecord/fromJSON/syb      36.64501 us
> BigRecord/fromJSON/generic  10.07809 us
>
> BigProduct/fromJSON/th      2.430677 us
> BigProduct/fromJSON/syb     17.97764 us
> BigProduct/fromJSON/generic 2.201130 us
>
> BigSum/fromJSON/th          414.8699 ns
> BigSum/fromJSON/syb         4.113170 us
> BigSum/fromJSON/generic     13.62614 us !!!
>
>
> As can be seen, in most cases the GHC Generics implementation is much
> faster than SYB and just as fast as TH. I'm impressed by how well GHC
> optimizes the code!
>
> Unfortunately the last benchmark, generically parsing a big sum type,
> is much slower. The code for parsing sums, which can be found here:
>
>
> https://github.com/basvandijk/aeson/blob/optimizations/Data/Aeson/Types/Internal.hs#L1059
>
> is basically this:
>
>
> instance (GFromSum a, GFromSum b) => GFromJSON (a :+: b) where
>    gParseJSON (Object (M.toList -> [keyVal])) = gParseSum keyVal
>    gParseJSON v = typeMismatch "sum (:+:)" v
>    {-# INLINE gParseJSON #-}
>
>
> class GFromSum f where
>    gParseSum :: Pair -> Parser (f a)
>
> instance (GFromSum a, GFromSum b) => GFromSum (a :+: b) where
>    gParseSum keyVal = (L1 <$> gParseSum keyVal) <|>
>                       (R1 <$> gParseSum keyVal)
>    {-# INLINE gParseSum #-}
>
> instance (Constructor c, GFromJSON a, ConsFromJSON a) =>
>    GFromSum (C1 c a) where
>    gParseSum (key, value)
>        | key == pack (conName (undefined :: t c a p)) =
>            gParseJSON value
>        | otherwise = notFound $ unpack key
>    {-# INLINE gParseSum #-}
>
>
> notFound :: String -> Parser a
> notFound key = fail $ "The key \"" ++ key ++ "\" was not found"
> {-# INLINE notFound #-}
>
>
> Any idea how to make it faster?
>
> Regards,
>
> Bas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20111103/226d77dd/attachment.htm>


More information about the Haskell-Cafe mailing list