[Haskell-cafe] How on Earth Do You Reason about Space?
Ketil Malde
ketil at malde.org
Thu Jun 2 13:37:40 CEST 2011
Since frequency counts are an important use of map-like data structures,
I did a brief test of the available options. First using regular
strings for input, and Data.Map.fromListWith - i.e. the operational bit being:
freqs :: [String] -> M.Map String Int
freqs = M.fromListWith (+) . map (,1)
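(For reference, a minimal self-contained sketch of what this first program could look like; reading the corpus from stdin and printing the number of distinct words are my additions, not necessarily what freqstr1 actually did:)

{-# LANGUAGE TupleSections #-}
module Main where

import qualified Data.Map as M

freqs :: [String] -> M.Map String Int
freqs = M.fromListWith (+) . map (,1)

main :: IO ()
main = do
  ws <- words `fmap` getContents   -- read the whole corpus from stdin
  print (M.size (freqs ws))        -- number of distinct words (forces the map's spine)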
This runs on a 14M corpus consisting of King James Bible, collected
works of Shakespeare, and War and Peace.
./freqstr1 +RTS -s
5,093,386,368 bytes allocated in the heap
2,682,667,904 bytes copied during GC
261,110,368 bytes maximum residency (20 sample(s))
9,018,000 bytes maximum slop
623 MB total memory in use (10 MB lost due to fragmentation)
./freqstr1 +RTS -s 21.43s user 0.78s system 99% cpu 22.285 total
Kinda expensive: 250MB to store the word frequencies of a 14MB text.

Now, changing to
freqs :: [String] -> M.Map String Int
freqs = foldl' (\m w -> M.insertWith' (+) w 1 m) M.empty
i.e. using strict insertion, avoiding the buildup of lazy thunks for the
counts.
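(Again a runnable sketch. Note that insertWith' belongs to the older Data.Map API; in current containers the equivalent is insertWith from Data.Map.Strict, which is what the sketch uses. The stdin plumbing is my own scaffolding:)

module Main where

import Data.List (foldl')
import qualified Data.Map.Strict as M

freqs :: [String] -> M.Map String Int
freqs = foldl' (\m w -> M.insertWith (+) w 1 m) M.empty  -- value-strict: no thunk per count

main :: IO ()
main = do
  ws <- words `fmap` getContents
  print (M.size (freqs ws))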
./freqstr2 +RTS -s -- strings, using strict insertion
4,754,110,096 bytes allocated in the heap
2,089,527,240 bytes copied during GC
27,039,112 bytes maximum residency (66 sample(s))
613,192 bytes maximum slop
80 MB total memory in use (2 MB lost due to fragmentation)
./freqstr2 +RTS -s 17.48s user 0.13s system 99% cpu 17.665 total
This reduced maximum memory consumption to one tenth: still bigger than the
input corpus, but clearly not too bad. A bit faster, too, in spite of
probably doing more work.
Using ByteStrings instead, first fromListWith:
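(The ByteString variant presumably looks like this; splitting with Data.ByteString.Char8.words and the stdin-based main are my assumptions:)

{-# LANGUAGE TupleSections #-}
module Main where

import qualified Data.ByteString.Char8 as B
import qualified Data.Map as M

freqs :: [B.ByteString] -> M.Map B.ByteString Int
freqs = M.fromListWith (+) . map (,1)

main :: IO ()
main = do
  ws <- B.words `fmap` B.getContents   -- strict ByteString words share slices of the input
  print (M.size (freqs ws))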
./freq +RTS -s
(Just 77432,113931)
3,880,059,568 bytes allocated in the heap
1,467,507,808 bytes copied during GC
174,573,752 bytes maximum residency (14 sample(s))
8,222,600 bytes maximum slop
385 MB total memory in use (6 MB lost due to fragmentation)
./freq +RTS -s 14.26s user 0.49s system 99% cpu 14.798 total
About half the memory of Strings, and 25% faster. With strict insertion:
./freq2 +RTS -s -- map using strict insertion
3,761,614,312 bytes allocated in the heap
849,806,000 bytes copied during GC
23,950,328 bytes maximum residency (35 sample(s))
2,376,904 bytes maximum slop
58 MB total memory in use (1 MB lost due to fragmentation)
./freq2 +RTS -s 11.14s user 0.13s system 99% cpu 11.295 total
Parallel to the String case, this is a lot more frugal with memory, and
30% faster. Now, I tried Data.HashMap from the hashmap library:
./freqH1 +RTS -s -- hashmap using fromListWith
4,552,922,784 bytes allocated in the heap
2,990,287,536 bytes copied during GC
401,247,912 bytes maximum residency (14 sample(s))
42,098,016 bytes maximum slop
957 MB total memory in use (15 MB lost due to fragmentation)
./freqH1 +RTS -s 15.68s user 1.53s system 99% cpu 17.277 total
./freqH2 +RTS -s -- hashmap using foldl' insertWith
4,518,146,968 bytes allocated in the heap
2,986,973,352 bytes copied during GC
394,502,832 bytes maximum residency (14 sample(s))
41,020,040 bytes maximum slop
957 MB total memory in use (15 MB lost due to fragmentation)
./freqH2 +RTS -s 15.86s user 1.62s system 99% cpu 17.537 total
HashMap doesn't provide a strict insertWith, so this is similar to the
lazy insertions above. A bit worse, actually, probably due to the
overhead of hashing.
Then I discovered that Johan's hash map lives in a different library
(unordered-containers), and thought I'd try that too for completeness.
./freqHS +RTS -s -- hashmap strict (unordered-containers)
2,628,628,752 bytes allocated in the heap
945,571,872 bytes copied during GC
26,635,744 bytes maximum residency (32 sample(s))
2,433,504 bytes maximum slop
66 MB total memory in use (1 MB lost due to fragmentation)
./freqHS +RTS -s 6.90s user 0.16s system 99% cpu 7.082 total
Memory residency like the other strict versions, but really fast,
probably due to faster comparisons of hash values vs comparisons of
strings.
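(For completeness, here is roughly what the unordered-containers version looks like, written against the current Data.HashMap.Strict module; the module layout may have differed slightly back then, and the main is again mine:)

module Main where

import Data.List (foldl')
import qualified Data.ByteString.Char8 as B
import qualified Data.HashMap.Strict as HM

freqs :: [B.ByteString] -> HM.HashMap B.ByteString Int
freqs = foldl' (\m w -> HM.insertWith (+) w 1 m) HM.empty  -- strict in the counts

main :: IO ()
main = do
  ws <- B.words `fmap` B.getContents
  print (HM.size (freqs ws))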
Conclusion: make sure you are using a strict map, and if your keys are
strings or otherwise have expensive comparisons, unordered-containers is
the library for you.
-k
PS: I also tried mapping 'copy' on the input words to avoid storing
large slices of the input, but it only worsened things:
./freqHS3 +RTS -s
(Just 77432,113931)
3,109,585,024 bytes allocated in the heap
936,724,184 bytes copied during GC
87,831,888 bytes maximum residency (19 sample(s))
8,835,440 bytes maximum slop
164 MB total memory in use (3 MB lost due to fragmentation)
./freqHS3 +RTS -s 12.71s user 0.31s system 99% cpu 13.060 total
Perhaps if you managed to copy only the new words it would look better?
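(A speculative sketch of that idea: look the word up first and only copy it when it is genuinely new, so keys already in the map keep their storage. Untested, just to illustrate:)

module FreqCopy where

import Data.List (foldl')
import qualified Data.ByteString.Char8 as B
import qualified Data.HashMap.Strict as HM

copyInsert :: HM.HashMap B.ByteString Int -> B.ByteString -> HM.HashMap B.ByteString Int
copyInsert m w
  | w `HM.member` m = HM.adjust (+1) w m        -- existing word: bump the count, keep the stored key
  | otherwise       = HM.insert (B.copy w) 1 m  -- new word: copy it out of the big input slice

freqsCopy :: [B.ByteString] -> HM.HashMap B.ByteString Int
freqsCopy = foldl' copyInsert HM.empty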
PPS: I tried to be careful juggling the results around, but there's
always the possibility of a mistake. Caveat lector! (Or should that be
'cave scriptor'?)
PPPS: There are some small interface annoyances around; it'd be nice if
I could experiment with this by only changing the imports. Would it be
possible to agree on an interface to map-like structures?
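(Purely hypothetically, I imagine something as small as this would already cover the frequency-count use case; nothing like it exists in the libraries above, it is just to illustrate the kind of interface I mean:)

{-# LANGUAGE MultiParamTypeClasses #-}
module FreqClass where

import Data.List (foldl')

class MapLike map k where
  empty      :: map k v
  insertWith :: (v -> v -> v) -> k -> v -> map k v -> map k v

-- the benchmark written once, against any instance
freqs :: MapLike map k => [k] -> map k Int
freqs = foldl' (\m w -> insertWith (+) w 1 m) empty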
--
If I haven't seen further, it is by standing in the footprints of giants