[Haskell-cafe] How on Earth Do You Reason about Space?

Ketil Malde ketil at malde.org
Thu Jun 2 13:37:40 CEST 2011


Since frequency counts are an important use of map-like data structures,
I did a brief test of the available options.  First using regular
strings for input, and Data.Map.fromListWith - i.e. the operational bit being:

  freqs :: [String] -> M.Map String Int
  freqs = M.fromListWith (+) . map (,1)

This runs on a 14M corpus consisting of King James Bible, collected
works of Shakespeare, and War and Peace.

./freqstr1 +RTS -s 
   5,093,386,368 bytes allocated in the heap
   2,682,667,904 bytes copied during GC
     261,110,368 bytes maximum residency (20 sample(s))
       9,018,000 bytes maximum slop
             623 MB total memory in use (10 MB lost due to fragmentation)
./freqstr1 +RTS -s  21.43s user 0.78s system 99% cpu 22.285 total

Kinda expensive, 250MB to store word frequencies of 14MB text.

Now, changing to 

  freqs :: [String] -> M.Map String Int
  freqs = foldl' (\m w -> M.insertWith' (+) w 1 m) M.empty

i.e. using strict insertion, avoiding the buildup of lazy thunks for the
counts.

./freqstr2 +RTS -s  -- strings, using strict insertion
   4,754,110,096 bytes allocated in the heap
   2,089,527,240 bytes copied during GC
      27,039,112 bytes maximum residency (66 sample(s))
         613,192 bytes maximum slop
              80 MB total memory in use (2 MB lost due to fragmentation)
./freqstr2 +RTS -s  17.48s user 0.13s system 99% cpu 17.665 total

This reduced maximam memory consumption to one tenth, still bigger than
input corpus, but clearly not too bad.  A bit faster, too, in spite of
probably doing more work.

Using ByteStrings instead, first fromListWith:

./freq +RTS -s
(Just 77432,113931)
   3,880,059,568 bytes allocated in the heap
   1,467,507,808 bytes copied during GC
     174,573,752 bytes maximum residency (14 sample(s))
       8,222,600 bytes maximum slop
             385 MB total memory in use (6 MB lost due to fragmentation)
./freq +RTS -s  14.26s user 0.49s system 99% cpu 14.798 total

About half the memroy of Strings, and 25% faster.  With strict insertion:

./freq2 +RTS -s   -- map using strict insertion
   3,761,614,312 bytes allocated in the heap
     849,806,000 bytes copied during GC
      23,950,328 bytes maximum residency (35 sample(s))
       2,376,904 bytes maximum slop
              58 MB total memory in use (1 MB lost due to fragmentation)
./freq2 +RTS -s  11.14s user 0.13s system 99% cpu 11.295 total

Parallel to the String case, this is a lot more frugal with memory, and
30% faster.  Now, I tried Data.HashMap from the hashmap library:

./freqH1 +RTS -s    -- hashmap using fromListWith
   4,552,922,784 bytes allocated in the heap
   2,990,287,536 bytes copied during GC
     401,247,912 bytes maximum residency (14 sample(s))
      42,098,016 bytes maximum slop
             957 MB total memory in use (15 MB lost due to fragmentation)
./freqH1 +RTS -s  15.68s user 1.53s system 99% cpu 17.277 total

./freqH2 +RTS -s   -- hashmap using foldl' insertWith
   4,518,146,968 bytes allocated in the heap
   2,986,973,352 bytes copied during GC
     394,502,832 bytes maximum residency (14 sample(s))
      41,020,040 bytes maximum slop
             957 MB total memory in use (15 MB lost due to fragmentation)
./freqH2 +RTS -s  15.86s user 1.62s system 99% cpu 17.537 total

HashMap doesn't provide a strict insertWith, so this is similar to the
lazy insertions above.  A bit worse, actually, probably due to the
overhead of hashing.

Then, I discovered that Johan's hashmap is a different library, and
thought I'd try that too for completeness.

./freqHS +RTS -s  -- hashmap strict (unordered-containers)
   2,628,628,752 bytes allocated in the heap
     945,571,872 bytes copied during GC
      26,635,744 bytes maximum residency (32 sample(s))
       2,433,504 bytes maximum slop
              66 MB total memory in use (1 MB lost due to fragmentation)

./freqHS +RTS -s  6.90s user 0.16s system 99% cpu 7.082 total

Memory residency like the other strict versions, but really fast,
probably due to faster comparisons of hash values vs comparisons of
strings. 

Conclusion: make sure you are using a strict map, and if your keys are
strings or otherwise have expensive comparisons, unordered-containers is
the library for you.

-k

PS: I also tried mapping 'copy' on the input words to avoid storing
large slices of the input, but it only worsened things:

./freqHS3 +RTS -s 
(Just 77432,113931)
   3,109,585,024 bytes allocated in the heap
     936,724,184 bytes copied during GC
      87,831,888 bytes maximum residency (19 sample(s))
       8,835,440 bytes maximum slop
             164 MB total memory in use (3 MB lost due to fragmentation)
./freqHS3 +RTS -s  12.71s user 0.31s system 99% cpu 13.060 total

Perhaps if you managed to only copy new words it would look better?

PPS: I tried to be careful juggling the results around, but there's
always the possiblity of a mistake.  Caveat lector!  (Or should that be
'cave scriptor'?)

PPPS: There are some small interface annoyances around, it'd be nice if
I could experiment with this by only changing the imports.  Would it be
possible to agree on an interface to map-like structures?
-- 
If I haven't seen further, it is by standing in the footprints of giants



More information about the Haskell-Cafe mailing list