[Haskell-cafe] Fast JSON validation - reducing allocations
dct25-561bs at mythic-beasts.com
Thu May 11 16:12:15 UTC 2017
We have a stream of ~900byte JSON documents (lazy ByteStrings) that we
would like to validate (NB not parse) as quickly as possible. We just need
to check that the bytes are a syntactically valid JSON document and then
pass them elsewhere; we do not care what the content is. The documents are
encoded in UTF-8 and have no extraneous whitespace.
We have made a number of attempts:
1. Use Aeson to parse the documents and then discard the result. This was
actually pretty good: Criterion clocked it at ~30us per document.
2. A hand-written Attoparsec parser. This did quite a bit worse: ~70us per
3. Use Get (from the binary package), which was a bit messier as it was
factored to avoid all backtracking, but managed to get down to around ~20us
4. Build a table-based, Word8-labelled DFA (with a minor hack to deal with
values-within-values) and run it with ByteString.Lazy.foldl'. Also around
~20us per document.
We didn't pursue the Attoparsec idea any further, but the other three all
allocate ~85kB per document, and this now seems to be the limiting factor
for performance. In more detail, looking at the threadscope output, we see
it parse 5 or 6 documents in ~150us (allocating ~500kB) and then run the
GC, which takes about another 150us. The GC stats indicate that it's only
copying a couple of kB of live data each time, so most of that 500kB is
Now 85kB per document is ~96B per byte, which is only a pointer-and-a-half.
It's not entirely surprising that there's this much allocation going on
each time round the loop, but we'd like to try and get rid of it. I've
tried the usual strictness tricks and have peered in a mystified fashion at
the output of -ddump-simpl but am none the wiser.
There's a copy of the relevant code for option 4 at
https://github.com/DaveCTurner/json-validator. I've hacked around with it a
bit since producing the numbers above, so it's now apparently a bit slower
than Aeson but allocates less too (~65kB).
Could anyone help, e.g. by pointing me at the bit in the Core that is
allocating within the main loop?
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Haskell-Cafe