[Haskell-cafe] Re: Haskell version of Norvig's Python Spelling Corrector

Sun Apr 22 12:55:09 EDT 2007

Bryan O'Sullivan <bos at serpentine.com> writes:

> After this switch, I found that spellchecking one word still took 4x
> as long in Haskell as Norvig's Python program.  Since I was checking
> only one word in each case, essentially all of the runtime was taken
> up by building the word frequency map.
>
> train = foldl' updateMap M.empty . map lower . mkWords
>     where updateMap model word = M.insertWith' (+) word 1 model
>           mkWords = filter (not . B.null) . X.splitWith isNotAlpha
>           lower !s = X.map toLower s
>           isNotAlpha !c = c < 0x41 || (c > 0x5a && c < 0x61) || c > 0x7a
>           toLower !c | c >= 0x41 && c <= 0x5a = c + 0x20
>                      | otherwise = c

After reading Bryan's post, I switched my right fold to a strict left
fold and almost tripled my original speed.  Could someone provide some
guidelines on when to use each?  I thought foldr should be used when
building some large data structure such as the map I build.

Bryan, out of curiosity, is a non bytestring version of your code
faster?

Thanks,
Pete