[Haskell-cafe] Aeson memory use

Thu Aug 9 13:01:07 UTC 2018

On 08/08/2018 08:59 PM, Claude Heiland-Allen wrote:
> Hi,
> 
> The test.data is very repetitive:
> 
> {"1":["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"],"10":["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"],...} 
> 
> 
> Perhaps (after parsing (which might fuse enough to avoid a memory 
> spike), otherwise during parsing might require modifications to aeson?) 
> you could compress it by interning the symbols using a `Map Text Text` 
> to generate one canonical `Text` object for each unique string.
> 
> `pack "a" == pack "a"` under `Eq` but they might be different `Text` 
> objects.
> 
> You might also need to `copy` the `Text` objects, which might be slices 
> referencing the input.
> 

I tried Text.copy, though only in the real code, not in this example. It 
didn't seem to help.
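
For concreteness, here is a minimal sketch of the interning idea quoted 
above, assuming the parsed value has the shape `Map Text [Text]` as in 
test.data. The names `Pool`, `intern` and `internAll` are made up for 
illustration; none of them come from aeson or text. It runs after 
parsing, so it cannot avoid a memory spike during parsing itself, and 
`mapAccumL` is lazy, so real code would probably need to force the 
result before the original value can be collected.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Text (Text)
import qualified Data.Text as T
import           Data.Traversable (mapAccumL)

-- | Hypothetical symbol table: every distinct string maps to the one
-- canonical copy that all later occurrences will share.
type Pool = Map Text Text

-- | Return the canonical copy of a string, adding it to the pool the
-- first time it is seen.
intern :: Pool -> Text -> (Pool, Text)
intern pool t =
  case Map.lookup t pool of
    Just canonical -> (pool, canonical)
    Nothing ->
      -- T.copy detaches the string from any larger buffer it may be
      -- a slice of (for example the parser input).
      let c = T.copy t
      in c `seq` (Map.insert c c pool, c)

-- | Intern every sample in the map, threading one pool through all
-- the lists so equal strings end up as the same heap object.
internAll :: Map Text [Text] -> Map Text [Text]
internAll = snd . mapAccumL (mapAccumL intern) Map.empty
```

The result is equal to the input under `Eq`; the only intended 
difference is that repeated strings share one `Text` object on the 
heap. Whether that reduces residency in practice would have to be 
checked with heap profiling.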

The code I'm actually trying to optimize builds a map from IP 
addresses to a collection of short text samples, with potentially 
hundreds of thousands to millions of records. We use IPRTable from 
the iproute package rather than Data.Map, but the two occupy 
approximately the same amount of memory, so I used Data.Map in the 
example.