[Haskell-cafe] Optimizing performance problems with Aeson rendering large Text arrays
Oliver Charles
ollie at ocharles.org.uk
Fri Feb 1 11:06:19 CET 2013
Hello,
In summary, I'm working on an application that responds to a user's query, a
sequence index, with the union of a list of UUIDs that have "changed" since
that same sequence index, split into 6 sections. I wish to respond to these
queries via JSON to provide an easy-to-use web service, and for the most part,
what I have works.
The problem I am having is that profiling seems to show that the majority of
my application's time is spent encoding this to JSON, and that the application
is only 60% productive, with 40% of allocations happening in Data.Aeson.encode
(and friends).
Here's an overview of what I'm doing; the full code can be found at the end of
this email.
I am storing my data in memory as an IntMap, from sequence index to a
changeset:

IntMap ChangeSet
Where a ChangeSet is essentially a tuple of HashSets of UUIDs:
data ChangeSet = ChangeSet { artistChanges :: !(HashSet MBID)
, labelChanges :: !(HashSet MBID)
, recordingChanges :: !(HashSet MBID)
, releaseChanges :: !(HashSet MBID)
, releaseGroupChanges :: !(HashSet MBID)
, workChanges :: !(HashSet MBID)
}
deriving (Generic)
The MBID newtype is just a newtype around Text, but you can only create MBIDs
by parsing a UUID with fromString - just to enforce a bit more correctness,
without the cost of having to serialize a UUID value back to JSON.
When I query, I use splitLookup on the IntMap to get the requested changeset
by sequence index, along with all future changesets. I union all of these, and
then render the response back to the client:
    let (_, !cs, !futureCs) = IntMap.splitLookup csId changeSets
    writeLBS $ encode $ mconcat $
      catMaybes $ map Just (IntMap.elems futureCs) ++ [ cs ]
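For reference, splitLookup partitions the map around the key and also returns
the value stored at the key itself, if any:

    ghci> IntMap.splitLookup 2 (IntMap.fromList [(1,"a"),(2,"b"),(3,"c")])
    (fromList [(1,"a")],Just "b",fromList [(3,"c")])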
None of this shows up in profiling, however; here's what I see:
Thu Jan 31 17:03 2013 Time and Allocation Profiling Report (Final)
Main +RTS -p -RTS
total time = 4.75 secs (4748 ticks @ 1000 us, 1 processor)
total alloc = 4,329,582,160 bytes (excludes profiling overheads)
COST CENTRE      MODULE                              %time %alloc

encode           Data.Aeson.Encode                    23.5   17.4
string           Data.Aeson.Encode                    18.5   35.1
break            Data.Aeson.Encode                    17.5    2.3
mconcat          Main                                 15.1    9.7
fromValue/Array  Data.Aeson.Encode                     9.2   14.8
toJSON           Main                                  5.7    9.0
send.loop        Snap.Internal.Http.Server.HttpPort    3.0    0.0
mapIter          Snap.Iteratee                         2.1    2.3
parseJSON        Main                                  1.7    3.0
writeLBS         Snap.Internal.Types                   1.1    4.9
COST CENTRE       MODULE               no.  entries   %time %alloc   %time %alloc

MAIN              MAIN                 216        0     0.0    0.0   100.0  100.0
main              Main                 433        0     0.0    0.0   100.0  100.0
main.since        Main                1063        1     0.0    0.0    75.3   78.7
encode            Data.Aeson.Encode   1391        0    23.5   17.4    75.3   78.7
fromValue/Object  Data.Aeson.Encode   1395      254     0.0    0.0    46.0   52.2
fromValue/Array   Data.Aeson.Encode   1420      757     7.9   12.0    36.4   25.7
fromValue/String  Data.Aeson.Encode   1422  3089095     0.7    0.0    28.5   13.7
string            Data.Aeson.Encode   1423  3089095    10.2   11.4    27.8   13.7
break             Data.Aeson.Encode   1425  3089095    17.5    2.3    17.5    2.3
string            Data.Aeson.Encode   1396      884     8.3   23.7     9.6   26.6
fromValue/Array   Data.Aeson.Encode   1421        0     1.3    2.9     1.3    2.9
fromValue/String  Data.Aeson.Encode   1424        0     0.0    0.0     0.0    0.0
break             Data.Aeson.Encode   1397      884     0.0    0.0     0.0    0.0
toJSON            Main                1393      127     5.7    9.0     5.7    9.0
Unless I'm reading this incorrectly, this shows that 75% of the time is spent
in encode, along with almost 80% of my allocations. While the performance of
my application is actually satisfactory (I respond in around 0.04s), I'd still
like to do better - if only for the practical experience of learning how to
optimize.
Any ideas what I can do about this? I feel like I might get better performance
if fromValue/Array knew that I had a vector of Text values that could just be
intercalated with ", " - but I have no idea how the internals of Text work, so
this might really perform the same as the fold that is currently used.
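As a rough sketch of that idea (untested, and the encodeMBIDs helper here is
hypothetical, not part of my code), something like this could render one of
the UUID arrays directly with Data.Text.Lazy.Builder, skipping the
per-character escaping pass, which is safe only because UUIDs contain nothing
but hex digits and hyphens:

import Data.Monoid ((<>))
import Data.Text (Text)
import qualified Data.HashSet as HashSet
import qualified Data.Text.Lazy as LT
import qualified Data.Text.Lazy.Builder as B

-- Hypothetical: render a set of UUID strings as a JSON array without
-- Aeson's character-escaping pass (UUIDs never need escaping).
encodeMBIDs :: HashSet.HashSet Text -> LT.Text
encodeMBIDs s =
  B.toLazyText $ B.singleton '[' <> go (HashSet.toList s) <> B.singleton ']'
  where
    go []       = mempty
    go (x : xs) = quoted x <> mconcat [ B.singleton ',' <> quoted y | y <- xs ]
    quoted t    = B.singleton '"' <> B.fromText t <> B.singleton '"'

Whether that actually beats the fold Aeson uses, I don't know - it would need
benchmarking.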
I am compiling for benchmarking purposes with:
ghc -Wall -fno-warn-orphans -Werror -O2 -rtsopts \
-hide-package hashable-1.2.0.5 Main.hs
And I run with:
./Main
Though I have tried with +RTS -A2M, -A4M and -A8M - none of which seem to make
a huge difference. -N2 seems to perform worse with a single connection, though
it might perform better for many concurrent connections - I haven't yet
looked.
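For the record, a run with a larger allocation area looks like:

./Main +RTS -A4M -RTS

(and trying -N2 additionally requires compiling with -threaded).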
I'd love to hear your thoughts on what I can do to get this even faster! If
you need any more information, don't hesitate to ask.
Thanks,
- ocharles
--------------------
Here is the code I am using. If you want to run this, you will need a
directory 'change-sets', which contains JSON files named according to the
pattern (\d+).json. I've tarred up my test directory and uploaded it to
http://ocharles.org.uk/change-sets.tar.gz, if you want to try running this.
You will need to install:
cabal install aeson unordered-containers containers text uuid
snap-core snap-server
I run a benchmark with this Perl script, which just does 30 requests and then
prints out the average request time.
#!/usr/bin/perl
use 5.10.0;
use List::Util qw( sum );
use LWP::UserAgent;
use Time::HiRes qw( gettimeofday tv_interval );

my $lwp = LWP::UserAgent->new;
my @intervals;
for my $i (1..30) {
    my $t0 = [ gettimeofday ];
    $lwp->get('http://0.0.0.0:8000/since/65527');
    push @intervals, tv_interval($t0);
}

# Average request time over the number of samples taken.
say sum(@intervals) / @intervals;
I think that's it!
{-# LANGUAGE BangPatterns, FlexibleContexts, GeneralizedNewtypeDeriving,
OverloadedStrings, TypeOperators #-}
import Prelude hiding (readFile)
import Control.Applicative
import Control.Monad (forM, mzero)
import Data.Aeson
import Data.ByteString.Lazy (readFile)
import Data.IntMap.Strict (IntMap)
import Data.Hashable
import Data.List
import Data.Maybe (catMaybes, fromMaybe)
import Data.Monoid
import Data.HashSet (HashSet)
import Data.Text (Text)
import Data.UUID
import Snap.Core
import Snap.Http.Server
import System.Directory
import System.FilePath
import qualified Data.IntMap as IntMap
import qualified Data.HashSet as HashSet
import qualified Data.Text as Text
import qualified Data.Text.Encoding as Encoding
data ChangeSet = ChangeSet { artistChanges :: !(HashSet MBID)
, labelChanges :: !(HashSet MBID)
, recordingChanges :: !(HashSet MBID)
, releaseChanges :: !(HashSet MBID)
, releaseGroupChanges :: !(HashSet MBID)
, workChanges :: !(HashSet MBID)
}
newtype MBID = MBID Text
deriving (Eq, Show, Ord, Hashable, ToJSON)
instance Monoid ChangeSet where
  mempty = ChangeSet mempty mempty mempty mempty mempty mempty
  a `mappend` b = ChangeSet
    { artistChanges       = artistChanges a       `mappend` artistChanges b
    , labelChanges        = labelChanges a        `mappend` labelChanges b
    , recordingChanges    = recordingChanges a    `mappend` recordingChanges b
    , releaseChanges      = releaseChanges a      `mappend` releaseChanges b
    , releaseGroupChanges = releaseGroupChanges a `mappend` releaseGroupChanges b
    , workChanges         = workChanges a         `mappend` workChanges b
    }
  mconcat as = ChangeSet
    { artistChanges       = HashSet.unions (map artistChanges as)
    , labelChanges        = HashSet.unions (map labelChanges as)
    , recordingChanges    = HashSet.unions (map recordingChanges as)
    , releaseChanges      = HashSet.unions (map releaseChanges as)
    , releaseGroupChanges = HashSet.unions (map releaseGroupChanges as)
    , workChanges         = HashSet.unions (map workChanges as)
    }
instance FromJSON ChangeSet where
parseJSON (Object j) = j .: "data" >>= go
where go o = ChangeSet <$> (HashSet.fromList <$> o .: "artist")
<*> (HashSet.fromList <$> o .: "label")
<*> (HashSet.fromList <$> o .: "recording")
<*> (HashSet.fromList <$> o .: "release")
<*> (HashSet.fromList <$> o .: "release_group")
<*> (HashSet.fromList <$> o .: "work")
parseJSON _ = mzero
instance ToJSON ChangeSet where
toJSON c =
object [
"data" .= object
[ "artist" .= artistChanges c
, "label" .= labelChanges c
, "recording" .= recordingChanges c
, "release" .= releaseChanges c
, "release_group" .= releaseGroupChanges c
, "work" .= workChanges c
]
]
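-- For reference, this instance produces JSON shaped like:
--   {"data": {"artist": [...], "label": [...], "recording": [...],
--             "release": [...], "release_group": [...], "work": [...]}}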
instance FromJSON MBID where
  parseJSON (String s) =
    maybe mzero (const $ return (MBID s)) $ fromString (Text.unpack s)
  parseJSON _ = mzero
loadChangeSets :: FilePath -> IO (IntMap ChangeSet)
loadChangeSets d = do
  changeSetFiles <- filter (isSuffixOf ".json") <$> getDirectoryContents d
  fmap IntMap.fromList $ forM changeSetFiles $ \f -> do
    -- Recover the sequence index by stripping the ".json" suffix.
    let csId = read $ reverse $ drop 5 $ reverse f
    changeSet <- fromMaybe (error "Failed to decode") . decode' <$>
                   readFile (d </> f)
    return (csId, changeSet)
main :: IO ()
main = do
    changeSets <- loadChangeSets "change-sets"
    quickHttpServe $ route [("/since/:x", since changeSets)]
  where
    since changeSets = do
      Just csId <- fmap (read . Text.unpack . Encoding.decodeUtf8) <$>
                     getParam "x"
      -- Split the map at the requested index, then union the matching
      -- changeset with everything after it.
      let (_, !cs, !futureCs) = IntMap.splitLookup csId changeSets
      writeLBS $ encode $ mconcat $
        catMaybes $ map Just (IntMap.elems futureCs) ++ [ cs ]