[Haskell-beginners] Code help requested

Tue Jan 12 13:46:03 EST 2010

Am Dienstag 12 Januar 2010 17:44:12 schrieb David Frey:
> These are the numbers I got once I modified your Haskell program to only
> print out 4 results the way the Ruby program does.
>
> Your Haskell version: ~10.0 s
> My Haskell version: ~2.5 s
> Your Ruby version (Ruby 1.8): ~4.6 s
> Your Ruby version (Ruby 1.9): ~4.2 s

I sincerely hope your input file is smaller than mine :)

>
> This is my version of your program:
>
> import Control.Monad (liftM)
> import Data.Function (on)
> import Data.List (sort, sortBy)
> import qualified Data.ByteString.Char8 as B
> import qualified Data.Map as Map
>
> -- Given as stdin
> -- presents
> -- serpents
> -- no
> -- on
> -- whatever
> -- Expected Output:
> -- [["serpents","presents"],["on","no"]]
>
>
> main = do
>     input <- liftM B.lines B.getContents
>     let wordMap = buildMap $ map B.unpack input
>     print $ take 4 (listAnagrams wordMap)
>
>
> buildMap words = let
>     entries = map (\x -> (sort x, [x])) words
>     in Map.fromListWith (++) entries
>
>
> listAnagrams wordMap = let
>     anagrams = (Map.elems . Map.filter (\x -> length x > 1)) wordMap
>     in sortBy (flip (compare `on` length)) anagrams
>
>
> I found that the performance improved when I used ByteStrings to read
> the input and then unpacked to regular strings before creating the Map. 
> For some reason, using BytesStrings everywhere made the program slower. 
> Can anyone tell me why?

Yes. ByteString's sort is a bucket-sort. It allocates an array of 
256*sizeof(CSize) bytes and counts the occurrences of each character.
That's fine for long ByteStrings, but for short ByteStrings like those we 
consider here, allocating a bucket-array of 1K or 2K is incredibly much.

Sorting plain Strings is faster (not very much, though) and uses (much) 
less memory if they are short.

You can further speed up your programme if you put lists of ByteStrings in 
your Map (less memory, less GC) and unpack them only for sorting (and 
finally for output):

main = do
    input <- liftM B.lines B.getContents
    let wordMap = buildMap input
    print $ take 4 (listAnagrams wordMap)

buildMap words = let
    entries = map (\x -> (B.pack . sort $ B.unpack x, [x])) words
    in Map.fromListWith (++) entries

listAnagrams wordMap = let
    anagrams = (Map.elems . Map.filter (\x -> length x > 1)) wordMap
    -- a small speedup can be obtained by not using length:
    -- Map.filter (not . null . drop 1)
    -- or Map.filter (\l -> case l of { (_:_:_) -> True; _ -> False })
    -- if there are many long lists in the map, the speedup will become
    -- significant
    in map (map B.unpack) $ sortBy (flip (compare `on` length)) anagrams

Yours:
$ ./DFAnagrams +RTS -sstderr < /usr/share/dict/words > /dev/null                                                                                         
./DFAnagrams +RTS -sstderr                                                                        
   1,218,862,708 bytes allocated in the heap                                                      
     544,113,420 bytes copied during GC                                                           
      98,018,856 bytes maximum residency (10 sample(s))                                           
         768,552 bytes maximum slop                                                               
             211 MB total memory in use (2 MB lost due to fragmentation)                          

  Generation 0:  2315 collections,     0 parallel,  1.98s,  2.01s elapsed
  Generation 1:    10 collections,     0 parallel,  1.28s,  1.53s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    3.60s  (  3.61s elapsed)
  GC    time    3.26s  (  3.54s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    6.86s  (  7.15s elapsed)

  %GC time      47.5%  (49.4% elapsed)

  Alloc rate    338,176,931 bytes per MUT second

  Productivity  52.4% of total user, 50.3% of total elapsed

Modified:
$ ./DFBAnagrams +RTS -sstderr < /usr/share/dict/words > /dev/null                                                                                        
./DFBTAnagrams +RTS -sstderr                                                                      
   1,108,946,552 bytes allocated in the heap
     237,869,304 bytes copied during GC
      41,907,844 bytes maximum residency (10 sample(s))
       4,374,152 bytes maximum slop
              89 MB total memory in use (1 MB lost due to fragmentation)

  Generation 0:  2091 collections,     0 parallel,  1.14s,  1.19s elapsed
  Generation 1:    10 collections,     0 parallel,  0.40s,  0.50s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    2.90s  (  2.90s elapsed)
  GC    time    1.54s  (  1.69s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    4.44s  (  4.59s elapsed)

  %GC time      34.7%  (36.8% elapsed)

  Alloc rate    382,369,915 bytes per MUT second

  Productivity  65.2% of total user, 63.1% of total elapsed

versus 26.25s for the Ruby version (ruby 1.8.7).

Yay!

>
> Dave