[Haskell-cafe] Re: Haskell version of Norvig's Python Spelling Corrector

Ketil Malde ketil at ii.uib.no
Sun Apr 22 09:12:15 EDT 2007


On Sun, 2007-04-22 at 00:25 -0400, Pete Kazmier wrote:
> Pete Kazmier <pete-expires-20070615 at kazmier.com> writes:
> 
> > I'd love to see other Haskell implementations as well if anyone has a
> > few moments to spare.  Admittedly, it took me several hours to get my
> > version working, but I'm a Haskell newbie.  Unfortunately, I think it
> > runs as slow as it took me to write it!

Hm - nobody suggested using ByteStrings yet?  String is notoriously
wasteful, and replacing it with ByteString normally gives a quite
significant speedup.

Worse - and this is true for ByteStrings, too - String comparisons are
O(n), which means lookups in Sets and Maps are expensive.  A trie (i.e,
a search tree where each internal node corresponds to a word prefix, and
has one branch per letter in the alphabet) will give you lookup that
scales with word size (and possibly alphabet size).

Instead of generating the (huge) list of misspelled words, you could
calculate edit distance to each correctly spelled word?  With a bound on
allowable mistakes, this is a linear time operation using the standard
dynamic programming algorithm.

-k





More information about the Haskell-Cafe mailing list